diff options
author | Hui Xiao <huixiao@fb.com> | 2024-04-22 14:07:34 -0700 |
---|---|---|
committer | Facebook GitHub Bot <facebook-github-bot@users.noreply.github.com> | 2024-04-22 14:07:34 -0700 |
commit | 7d83b4e3e5d7742657e7d9a1fd078b69719e0217 (patch) | |
tree | 95f8a0483a0611431077f838ee8c9c22415358d4 /table | |
parent | bcfe4a0dcfbd5aa44263c4a4bb0acddee725d5b0 (diff) |
Fix file checksum mismatch due to padded bytes when block_align=true (#12542)
Summary:
**Context/Summary:**
When `BlockBasedTableOptions::block_align=true`, we pad bytes to align blocks https://github.com/facebook/rocksdb/blob/d41e568b1cc67e8a248dce7197b8a8aebaf3bb2f/table/block_based/block_based_table_builder.cc#L1415-L1421.
Those bytes are not included in generating the file checksum upon file creation. But `VerifyFileChecksums()` includes those bytes in generating the file check to compare against the checksum generating upon file creation. Therefore a file checksum mismatch is returned in `VerifyFileChecksums()`.
We decided to include those padded bytes in generating the checksum upon file creation.
Bonus: also fix surrounding code to use actual padded bytes for verification - see https://github.com/facebook/rocksdb/pull/12542#discussion_r1571429163
Pull Request resolved: https://github.com/facebook/rocksdb/pull/12542
Test Plan:
- New UT
- Benchmark
```
TEST_TMPDIR=/dev/shm ./db_bench --benchmarks=fillseq[-X300] --num=100000 --block_align=1 --compression_type=none
```
Pre-PR:
fillseq [AVG 300 runs] : 422857 (± 3942) ops/sec; 46.8 (± 0.4) MB/sec
Post-PR:
fillseq [AVG 300 runs] : 424707 (± 3799) ops/sec; 47.0 (± 0.4) MB/sec
Reviewed By: ajkr
Differential Revision: D56168447
Pulled By: hx235
fbshipit-source-id: 96209ef950d42943d336f11968ae3fcf9872fc2c
Diffstat (limited to 'table')
-rw-r--r-- | table/table_test.cc | 23 |
1 files changed, 23 insertions, 0 deletions
diff --git a/table/table_test.cc b/table/table_test.cc index 02a8d899d..ba44b95b3 100644 --- a/table/table_test.cc +++ b/table/table_test.cc @@ -5505,6 +5505,29 @@ TEST_P(BlockBasedTableTest, BlockAlignTest) { table_reader.reset(); } +TEST_P(BlockBasedTableTest, FixBlockAlignMismatchedFileChecksums) { + Options options; + options.create_if_missing = true; + options.compression = kNoCompression; + options.file_checksum_gen_factory = GetFileChecksumGenCrc32cFactory(); + BlockBasedTableOptions bbto; + bbto.block_align = true; + bbto.block_size = 1024; + options.table_factory.reset(NewBlockBasedTableFactory(bbto)); + const std::string kDBPath = + test::PerThreadDBPath("block_align_padded_bytes_verify_file_checksums"); + ASSERT_OK(DestroyDB(kDBPath, options)); + DB* db; + ASSERT_OK(DB::Open(options, kDBPath, &db)); + ASSERT_OK(db->Put(WriteOptions(), "k1", "v1")); + ASSERT_OK(db->Flush(FlushOptions())); + // Before the fix, VerifyFileChecksums() will fail as padded bytes from + // aligning blocks are used to generate the checksum to compare against the + // one not generated by padded bytes + ASSERT_OK(db->VerifyFileChecksums(ReadOptions())); + delete db; +} + TEST_P(BlockBasedTableTest, PropertiesBlockRestartPointTest) { BlockBasedTableOptions bbto = GetBlockBasedTableOptions(); bbto.block_align = true; |