author: Yanqin Jin <yanqin@fb.com> 2022-12-13 21:45:00 -0800
committer: Facebook GitHub Bot <facebook-github-bot@users.noreply.github.com> 2022-12-13 21:45:00 -0800
commit: c93ba7db5ddc33f69f8f049cc59454985b17dc46
tree: 8161ef502a8d76611bf6fbc66302eb2015595844 /db/db_impl/db_impl_write.cc
parent: 98d5db5c2ebed6ade6ff424215d7deae67f4593b
Revise LockWAL/UnlockWAL implementation (#11020)
Summary:
RocksDB has two public APIs: `DB::LockWAL()`/`DB::UnlockWAL()`. The current implementation acquires and releases the internal `DBImpl::log_write_mutex_`.

According to the comment on `DBImpl::log_write_mutex_`: https://github.com/facebook/rocksdb/blob/7.8.fb/db/db_impl/db_impl.h#L2287:L2288

> Note: to avoid deadlock, if needed to acquire both log_write_mutex_ and mutex_, the order should be first mutex_ and then log_write_mutex_.

This limits how applications can use the `LockWAL()` API. After `LockWAL()` returns OK, the application should not perform any operation that acquires `mutex_`. Currently, the use case of `LockWAL()` is MyRocks implementing the MySQL storage engine handlerton `lock_hton_log` interface. The operation that MyRocks performs after `LockWAL()` is `GetSortedWalFiles()`, which acquires not only `mutex_` but also `log_write_mutex_`.

There are two issues:
1. Applications using these two APIs may hang if one thread calls `GetSortedWalFiles()` after calling `LockWAL()`, because `log_write_mutex_` is not recursive.
2. Two threads may deadlock due to lock order inversion.

To fix these issues, we can modify the implementation of `LockWAL()` so that it does not keep `log_write_mutex_` held until `UnlockWAL()`. To achieve the goal of locking the WAL, we can instead manually inject a write stall so that all future writes will be stopped.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/11020

Test Plan: make check

Reviewed By: ajkr

Differential Revision: D41785203

Pulled By: riversand963

fbshipit-source-id: 5ccb7a9c6eb9a2c3fa80fd2c399cc2568b8f89ce
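The idea in the summary (stop writers with a stall instead of holding a mutex between two API calls) can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not RocksDB code: the class `StallGate` and its methods `Lock`, `Unlock`, `Append`, `TryAppend`, and `Snapshot` are hypothetical names made up for illustration, and the real change goes through `DBImpl::DelayWrite()` and the write controller rather than a counter like this one.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical model, NOT RocksDB code: "locking" the log is a counted stall
// guarded by a mutex, instead of a mutex that stays held between two calls.
class StallGate {
 public:
  // Analogue of LockWAL(): raise the stall, then return with no mutex held.
  void Lock() {
    std::lock_guard<std::mutex> guard(mu_);
    ++stall_count_;
  }

  // Analogue of UnlockWAL(): drop one stall and wake any blocked writers.
  void Unlock() {
    {
      std::lock_guard<std::mutex> guard(mu_);
      assert(stall_count_ > 0);
      --stall_count_;
    }
    cv_.notify_all();
  }

  // Blocking write: waits for as long as a stall is in effect.
  void Append(const std::string& record) {
    std::unique_lock<std::mutex> guard(mu_);
    cv_.wait(guard, [this] { return stall_count_ == 0; });
    log_.push_back(record);
  }

  // Non-blocking write: returns false instead of waiting when stalled.
  bool TryAppend(const std::string& record) {
    std::lock_guard<std::mutex> guard(mu_);
    if (stall_count_ > 0) return false;
    log_.push_back(record);
    return true;
  }

  // Analogue of a reader such as GetSortedWalFiles(): it takes the same
  // mutex, which is safe after Lock() because Lock() did not keep it held.
  std::vector<std::string> Snapshot() {
    std::lock_guard<std::mutex> guard(mu_);
    return log_;
  }

 private:
  std::mutex mu_;
  std::condition_variable cv_;
  int stall_count_ = 0;           // number of outstanding Lock() calls
  std::vector<std::string> log_;  // stand-in for the write-ahead log
};
```

In this model a thread can call `Lock()` and then `Snapshot()` without hanging on a non-recursive mutex, and no mutex is held across calls, so there is no ordering between two mutexes for another thread to invert.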
Diffstat (limited to 'db/db_impl/db_impl_write.cc')
-rw-r--r-- db/db_impl/db_impl_write.cc | 10
1 file changed, 10 insertions, 0 deletions
diff --git a/db/db_impl/db_impl_write.cc b/db/db_impl/db_impl_write.cc
index a597c168d..cbeab046f 100644
--- a/db/db_impl/db_impl_write.cc
+++ b/db/db_impl/db_impl_write.cc
@@ -924,6 +924,15 @@ Status DBImpl::WriteImplWALOnly(
       write_thread->ExitAsBatchGroupLeader(write_group, status);
       return status;
     }
+  } else {
+    InstrumentedMutexLock lock(&mutex_);
+    Status status = DelayWrite(/*num_bytes=*/0ull, write_options);
+    if (!status.ok()) {
+      WriteThread::WriteGroup write_group;
+      write_thread->EnterAsBatchGroupLeader(&w, &write_group);
+      write_thread->ExitAsBatchGroupLeader(write_group, status);
+      return status;
+    }
   }
 
   WriteThread::WriteGroup write_group;
@@ -1762,6 +1771,7 @@ uint64_t DBImpl::GetMaxTotalWalSize() const {
 // REQUIRES: this thread is currently at the front of the writer queue
 Status DBImpl::DelayWrite(uint64_t num_bytes,
                           const WriteOptions& write_options) {
+  mutex_.AssertHeld();
   uint64_t time_delayed = 0;
   bool delayed = false;
   {