diff options
author | Yanqin Jin <yanqin@fb.com> | 2021-04-19 18:10:23 -0700 |
---|---|---|
committer | Facebook GitHub Bot <facebook-github-bot@users.noreply.github.com> | 2021-04-19 18:11:13 -0700 |
commit | a376c220665bce2035d587e3380cb4a771ad1c61 (patch) | |
tree | bb61c41d21df2014cfba861fc87e1a5d339ce407 /utilities | |
parent | 0c6e4674a64d07a00f9960f72cf155b34c8f1308 (diff) |
Handle rename() failure in non-local FS (#8192)
Summary:
In a distributed environment, a file `rename()` operation can succeed on server (remote)
side, but the client can somehow return non-ok status to RocksDB. Possible reasons include
network partition, connection issue, etc. This happens in `rocksdb::SetCurrentFile()`, which
can be called in `LogAndApply() -> ProcessManifestWrites()` if RocksDB tries to switch to a
new MANIFEST. We currently always delete the new MANIFEST if an error occurs.
This is problematic in distributed world. If the server-side successfully updates the CURRENT
file via renaming, then a subsequent `DB::Open()` will try to look for the new MANIFEST and fail.
As a fix, we can track the execution result of IO operations on the new MANIFEST.
- If IO operations on the new MANIFEST fail, then we know the CURRENT must point to the original
MANIFEST. Therefore, it is safe to remove the new MANIFEST.
- If IO operations on the new MANIFEST all succeed, but somehow we end up in the clean up
code block, then we do not know whether CURRENT points to the new or old MANIFEST. (For local
POSIX-compliant FS, it should still point to old MANIFEST, but it does not matter if we keep the
new MANIFEST.) Therefore, we keep the new MANIFEST.
- Any future `LogAndApply()` will switch to a new MANIFEST and update CURRENT.
- If process reopens the db immediately after the failure, then the CURRENT file can point
to either the new MANIFEST or the old one, both of which exist. Therefore, recovery can
succeed and ignore the other.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/8192
Test Plan: make check
Reviewed By: zhichao-cao
Differential Revision: D27804648
Pulled By: riversand963
fbshipit-source-id: 9c16f2a5ce41bc6aadf085e48449b19ede8423e4
Diffstat (limited to 'utilities')
-rw-r--r-- | utilities/backupable/backupable_db_test.cc | 8 |
1 files changed, 4 insertions, 4 deletions
diff --git a/utilities/backupable/backupable_db_test.cc b/utilities/backupable/backupable_db_test.cc index b8e3c9d15..b4c62c7df 100644 --- a/utilities/backupable/backupable_db_test.cc +++ b/utilities/backupable/backupable_db_test.cc @@ -2718,19 +2718,19 @@ TEST_F(BackupableDBTest, GarbageCollectionBeforeBackup) { OpenDBAndBackupEngine(true); ASSERT_OK(backup_chroot_env_->CreateDirIfMissing(backupdir_ + "/shared")); - std::string file_five = backupdir_ + "/shared/000008.sst"; + std::string file_five = backupdir_ + "/shared/000009.sst"; std::string file_five_contents = "I'm not really a sst file"; - // this depends on the fact that 00008.sst is the first file created by the DB + // this depends on the fact that 00009.sst is the first file created by the DB ASSERT_OK(file_manager_->WriteToFile(file_five, file_five_contents)); FillDB(db_.get(), 0, 100); - // backup overwrites file 000008.sst + // backup overwrites file 000009.sst ASSERT_OK(backup_engine_->CreateNewBackup(db_.get(), true)); std::string new_file_five_contents; ASSERT_OK(ReadFileToString(backup_chroot_env_.get(), file_five, &new_file_five_contents)); - // file 000008.sst was overwritten + // file 000009.sst was overwritten ASSERT_TRUE(new_file_five_contents != file_five_contents); CloseDBAndBackupEngine(); |