author Jay Zhuang <zjay@meta.com> 2022-09-29 19:43:55 -0700
committer Facebook GitHub Bot <facebook-github-bot@users.noreply.github.com> 2022-09-29 19:43:55 -0700
commit f3cc66632b9e9fda4822e7512beb523d78179205
tree ba7cef6a851a95aefeb619431245b934d21d154d /options/cf_options.h
parent 47b57a37317fb31219eb9838643ac7576924ba4f
Align compaction output file boundaries to the next level ones (#10655)
Summary:
Try to align the compaction output file boundaries to the next level ones (grandparent level), to reduce the level compaction write-amplification.

In level compaction, there is "wasted" data at the beginning and end of the output level files. Aligning the file boundaries can avoid such "wasted" compaction. With this PR, compaction tries to align the non-bottommost level file boundaries to the next level's file boundaries. It may cut a file at such a boundary when the file is large enough (at least 50% of target_file_size), and it cuts a file before it grows too large (2x target_file_size).

db_bench shows about a 12.56% reduction in compaction writes (285.90 GB down to 249.97 GB):
```
TEST_TMPDIR=/data/dbbench2 ./db_bench --benchmarks=fillrandom,readrandom -max_background_jobs=12 -num=400000000 -target_file_size_base=33554432

# baseline:
Flush(GB): cumulative 25.882, interval 7.216
Cumulative compaction: 285.90 GB write, 162.36 MB/s write, 269.68 GB read, 153.15 MB/s read, 2926.7 seconds

# with this change:
Flush(GB): cumulative 25.882, interval 7.753
Cumulative compaction: 249.97 GB write, 141.96 MB/s write, 233.74 GB read, 132.74 MB/s read, 2534.9 seconds
```

The compaction simulator shows a similar result (14% with 100G random data).

As a side effect, with this PR the SST file size can exceed target_file_size, but it is capped at 2x target_file_size. There will also be more small files. Here are the file size statistics when loading 100GB with a target file size of 32MB:
```
           baseline       this_PR
count  1.656000e+03  1.705000e+03
mean   3.116062e+07  3.028076e+07
std    7.145242e+06  8.046139e+06
```

The feature is enabled by default. To revert to the old behavior, disable it with `AdvancedColumnFamilyOptions.level_compaction_dynamic_file_size = false`.

Also includes https://github.com/facebook/rocksdb/issues/1963 to cut a file before a skippable grandparent file. This helps use cases such as a user adding 2 or more non-overlapping data ranges at the same time: it can reduce the overlap between the 2 datasets in the lower levels.
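
A minimal sketch of the cut rule described in the summary, for illustration only. It is not the code from this PR: `CutDecisionInput` and `ShouldCutOutputFile` are hypothetical names, the "old behavior" branch (cut as soon as the target size is reached) is an assumption, and the real compaction logic may consider more conditions than the two thresholds quoted above.

```cpp
#include <cstdint>

// Hypothetical inputs for deciding whether to close the current compaction
// output file before adding the next key. Names are illustrative only.
struct CutDecisionInput {
  uint64_t current_output_size;  // bytes already written to the output file
  uint64_t target_file_size;     // configured target_file_size for the level
  bool at_grandparent_boundary;  // next key starts a new grandparent file
  bool dynamic_file_size;        // mirrors level_compaction_dynamic_file_size
};

// Returns true if the current output file should be finished now.
bool ShouldCutOutputFile(const CutDecisionInput& in) {
  if (!in.dynamic_file_size) {
    // Assumed old behavior: cut once the target size is reached.
    return in.current_output_size >= in.target_file_size;
  }
  // Hard cap from the summary: never grow past 2x target_file_size.
  if (in.current_output_size >= 2 * in.target_file_size) {
    return true;
  }
  // Aligned cut: only at a grandparent file boundary, and only if the file
  // is already at least 50% of target_file_size.
  return in.at_grandparent_boundary &&
         in.current_output_size >= in.target_file_size / 2;
}
```

Under these assumptions, a 16MB output file with a 32MB target is cut when the next key crosses a grandparent boundary, while a 32MB file that is not at a boundary keeps growing until it reaches a boundary or the 64MB cap. That matches the side effect noted in the summary: some files exceed target_file_size and some are smaller.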
Pull Request resolved: https://github.com/facebook/rocksdb/pull/10655

Reviewed By: cbi42

Differential Revision: D39552321

Pulled By: jay-zhuang

fbshipit-source-id: 640d15f159ab0cd973f2426cfc3af266fc8bdde2
Diffstat (limited to 'options/cf_options.h')
-rw-r--r-- options/cf_options.h 2
1 files changed, 2 insertions, 0 deletions
diff --git a/options/cf_options.h b/options/cf_options.h
index 47de8e7ae..da6b7252a 100644
--- a/options/cf_options.h
+++ b/options/cf_options.h
@@ -64,6 +64,8 @@ struct ImmutableCFOptions {
   bool level_compaction_dynamic_level_bytes;
+  bool level_compaction_dynamic_file_size;
+
   int num_levels;
   bool optimize_filters_for_hits;