#+BIBLIOGRAPHY: refs.bib
:ID: 29f7085e-53a3-4d70-90a7-e3437ee99775
:ID: 3e7bb82e-7dae-476a-8ed0-18c361c9bd1b
BTRFS is a Linux filesystem based on copy-on-write, allowing for
efficient snapshots and clones.

It uses B-trees as its main on-disk data structure. The design goal is
to work well for many use cases and workloads. To this end, much
effort has been directed to maintaining even performance as the
filesystem ages, rather than trying to support a particular narrow

Linux filesystems are installed on smartphones as well as enterprise
servers. This entails challenges on many different fronts.

- Scalability :: The filesystem must scale in many dimensions: disk
  space, memory, and CPUs.

- Data integrity :: Losing data is not an option, and much effort is
  expended to safeguard the content. This includes checksums, metadata
  duplication, and RAID support built into the filesystem.

- Disk diversity :: The system should work well with SSDs and hard
  disks. It is also expected to be able to use an array of different
  sized disks, which poses challenges to the RAID and striping
*** [2023-08-08 Tue] btrfs performance speculation ::
:ID: 2b662144-97b2-4736-a8fc-bc8f861b9829
- [[https://www.percona.com/blog/taking-a-look-at-btrfs-for-mysql/]]
  - zfs outperforms immensely, but potential misconfiguration on btrfs side (virt+cow
- https://www.ctrl.blog/entry/btrfs-vs-ext4-performance.html
  - see the follow up comment on this post
    - https://www.reddit.com/r/archlinux/comments/o2gc42/is_the_performance_hit_of_btrfs_serious_is_it/

      I’m the author of OP’s first link. I use BtrFS today. I often shift lots of
      de-duplicatable data around, and benefit greatly from file cloning. The data is actually
      the same data that caused the slow performance in the article. BtrFS and file cloning
      now performs this task quicker than a traditional file system. (Hm. It’s time for a

      In a laptop with one drive: it doesn’t matter too much unless you do work that benefit
      from file cloning or snapshots. This will likely require you to adjust your tooling and
      workflow. I’ve had to rewrite the software I use every day to make it take advantage of
      the capabilities of a more modern file system. You won’t benefit much from the data
      recovery and redundancy features unless you’ve got two storage drives in your laptop and
      can setup redundant data copies.

      on similar hardware to mine?

      It’s not a question about your hardware as much as how you use it. The bad performance I
      documented was related to lots and lots of simultaneous random reads and writes. This
      might not be representative of how you use your computer.
- https://dl.acm.org/doi/fullHtml/10.1145/3386362
  - this is about distributed file systems (in this case Ceph) - they argue against
    basing DFS on ondisk-format filesystems (XFS ext4) - developed BlueStore as
    backend, which runs directly on raw storage hardware.
  - this is a good approach, but expensive (2 years in development) and risky
  - better approach is to take advantage of a powerful enough existing ondisk-FS
    format and pair it with supporting modules which abstract away the 'distributed'
  - the strategy presented here is critical for enterprise-grade hardware where the
    ondisk filesystem becomes the bottleneck that you're looking to optimize
- https://lore.kernel.org/lkml/cover.1676908729.git.dsterba@suse.com/
  - linux 6.3 patch by David Sterba [2023-02-20 Mon]
  - btrfs continues to show improvements in the linux kernel, ironing out the kinks
  - makes it hard to compare benchmarks tho :/
:ID: 9ddb1caf-014a-4d0f-972e-82028e8be286
- see this WIP k-ext for macos: [[https://github.com/relalis/macos-btrfs][macos-btrfs]]
  - maybe we can help out with the VFS/mount support
:ID: 7535f844-330b-4f9c-b2a1-4578d64acbf7
- [[https://btrfs.readthedocs.io/en/latest/dev/On-disk-format.html][on-disk-format]]
  - 'btrfs consists entirely of several trees. the trees use copy-on-write.'
  - trees are stored in nodes which belong to a level in the b-tree structure.
  - internal nodes contain refs to other internal nodes on the /next/ level OR
    - to leaf nodes when the level reaches 0.
  - leaf nodes contain various item types depending on the tree.
  - 0x00:8 uint = objectid, each tree has its own set of object IDs
  - 0x08:1 uint = item type
  - 0x09:8 uint = offset, meaning depends on the item type.
  - fields are unsigned
  - primary superblock is located at 0x10000 (64 KiB)
  - mirror copies of the superblock are located at physical addresses 0x4000000 (64
    MiB) and 0x4000000000 (256 GiB), if valid. copies are updated simultaneously.
  - during mount only the first superblock at 0x10000 is read, an error causes mount to
    fail.
  - BTRFS only recognizes disks with a valid 0x10000 superblock.
  - stored at the start of every node
  - data following it depends on whether it is an internal or leaf node.
  - node header followed by a number of key pointers
  - 0x11:8 uint = block number
  - 0x19:8 uint = generation
  - leaf nodes contain header followed by items
  - 0x11:4 uint = data offset relative to end of header (0x65)
  - 0x15:4 uint = data size
  - holds ROOT_ITEMs, ROOT_REFs, and ROOT_BACKREFs for every tree other than itself.
  - used to find the other trees and to determine the subvol structure.
  - holds items for the 'root tree directory'. its logical address is stored in the superblock
  - free ids: BTRFS_FIRST_FREE_OBJECTID=256ULL:BTRFS_LAST_FREE_OBJECTID=-256ULL
  - all other object ids are reserved for internal use
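
The field layouts in these notes can be sketched with Python's =struct= module. This is a minimal sketch, not code from btrfs-progs: it assumes little-endian fields and reads the key pointer and item offsets as hexadecimal (0x11, 0x19, 0x15, header size 0x65). The helper names (=parse_key=, =parse_key_ptr=, =parse_item=, =is_free_objectid=) are hypothetical and the sample values are made up.

#+begin_src python
import struct

# Assumed little-endian layouts, following the offsets in the notes.
KEY = struct.Struct("<QBQ")        # objectid, item type, offset: 17 bytes (0x11)
KEY_PTR = struct.Struct("<QBQQQ")  # key, block number, generation: 33 bytes
ITEM = struct.Struct("<QBQII")     # key, data offset, data size: 25 bytes
HEADER_SIZE = 0x65                 # node header length in bytes

# Primary superblock followed by its two mirror copies.
SUPERBLOCK_OFFSETS = (0x10000, 0x4000000, 0x4000000000)

BTRFS_FIRST_FREE_OBJECTID = 256
BTRFS_LAST_FREE_OBJECTID = -256 & 0xFFFFFFFFFFFFFFFF  # -256ULL wrapped to u64


def parse_key(buf, pos=0):
    """Decode a 17-byte key starting at pos."""
    objectid, item_type, offset = KEY.unpack_from(buf, pos)
    return {"objectid": objectid, "type": item_type, "offset": offset}


def parse_key_ptr(buf, pos=0):
    """Decode an internal-node key pointer: key, block number, generation."""
    objectid, item_type, offset, block, generation = KEY_PTR.unpack_from(buf, pos)
    return {
        "key": {"objectid": objectid, "type": item_type, "offset": offset},
        "block": block,
        "generation": generation,
    }


def parse_item(buf, pos=0):
    """Decode a leaf item: key, data offset, data size."""
    objectid, item_type, offset, data_offset, data_size = ITEM.unpack_from(buf, pos)
    return {
        "key": {"objectid": objectid, "type": item_type, "offset": offset},
        # The stored offset is relative to the end of the node header.
        "data_start": HEADER_SIZE + data_offset,
        "data_size": data_size,
    }


def is_free_objectid(objectid):
    """True if the id falls in the range not reserved for internal use."""
    return BTRFS_FIRST_FREE_OBJECTID <= objectid <= BTRFS_LAST_FREE_OBJECTID


if __name__ == "__main__":
    raw = ITEM.pack(256, 1, 0, 3000, 160)  # made-up sample values
    print(parse_item(raw))
    print([hex(o) for o in SUPERBLOCK_OFFSETS], hex(BTRFS_LAST_FREE_OBJECTID))
#+end_src

Using fixed =struct.Struct= objects keeps the byte sizes checkable (=KEY.size=, =ITEM.size=) against the offsets written in the notes.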
*** send-stream format
:ID: 1d1a6211-c91f-48ae-8113-0ddada286cee
- [[https://btrfs.readthedocs.io/en/latest/dev/dev-send-stream.html][send stream format]]
  - Send stream format represents a linear sequence of commands describing actions to be
    performed on the target filesystem (receive side), created on the source filesystem
  - The stream is currently used in two ways: to generate a stream representing a
    standalone subvolume (full mode) or a difference between two snapshots of the same
    subvolume (incremental mode).
  - The stream can be generated using a set of other subvolumes to look for extent
    references that could lead to a more efficient stream by transferring only the
    references and not full data.
  - The stream format is abstracted from on-disk structures (though it may share some
    BTRFS specifics), the stream instructions could be generated by other means than the
  - it's a checksum+TLV
    - header: u32 len, u16 cmd, u32 crc32c
    - data: type, length, raw data
  - the v2 protocol supports the encoded commands
  - the commands are kinda clunky - need to MKFILE/MKDIR then RENAME to create
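
The checksum+TLV framing noted above can be sketched in Python as follows. This is a rough sketch under stated assumptions: little-endian fields, a header =len= that counts only the attribute bytes after the header, and u16 widths for the TLV type and length (the notes only say "type, length, raw data"). The function names =parse_command= and =build_command= and the numeric command and attribute values are hypothetical. The stream-level header is skipped and the crc32c field is carried but not verified.

#+begin_src python
import struct

CMD_HEADER = struct.Struct("<IHI")  # u32 len, u16 cmd, u32 crc32c: 10 bytes
TLV_HEADER = struct.Struct("<HH")   # u16 type, u16 length (assumed widths)


def build_command(cmd, attrs, crc=0):
    """Serialize one command from (type, bytes) attribute pairs."""
    payload = b"".join(TLV_HEADER.pack(t, len(v)) + v for t, v in attrs)
    return CMD_HEADER.pack(len(payload), cmd, crc) + payload


def parse_command(buf, pos=0):
    """Parse one command at pos; return (cmd, crc, attrs, next_pos)."""
    length, cmd, crc = CMD_HEADER.unpack_from(buf, pos)
    pos += CMD_HEADER.size
    end = pos + length
    if end > len(buf):
        raise ValueError("truncated command")
    attrs = []
    while pos < end:
        attr_type, attr_len = TLV_HEADER.unpack_from(buf, pos)
        pos += TLV_HEADER.size
        if pos + attr_len > end:
            raise ValueError("attribute runs past end of command")
        attrs.append((attr_type, bytes(buf[pos:pos + attr_len])))
        pos += attr_len
    return cmd, crc, attrs, end


if __name__ == "__main__":
    # Two made-up commands back to back, walked in sequence.
    stream = build_command(3, [(15, b"o256-7-0")]) + build_command(9, [(15, b"a"), (16, b"b")])
    pos = 0
    while pos < len(stream):
        cmd, crc, attrs, pos = parse_command(stream, pos)
        print(cmd, attrs)
#+end_src

Returning =next_pos= from =parse_command= lets a caller walk the linear sequence of commands without knowing their sizes in advance.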
*** [2023-08-09 Wed] ioctls
:ID: be04dc90-86a6-46c5-9dcf-25519ebed34d
- https://docs.kernel.org/userspace-api/ioctl/ioctl-number.html
  - Btrfs filesystem ioctls; some have been lifted to vfs/generic
  - defined in fs/btrfs/ioctl.h and linux/fs.h
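
To make the ioctl numbering concrete, here is a small Python sketch of the generic Linux =_IOC()= encoding. It assumes the asm-generic bit layout (some architectures use different direction bits), and the helper names =ioc= and =decode= are hypothetical. The magic 0x94 is the Btrfs entry in the ioctl-number table; =FICLONE= is one of the ioctls that moved from Btrfs to =linux/fs.h= and is defined there as =_IOW(0x94, 9, int)=.

#+begin_src python
# Generic _IOC() layout: nr in bits 0-7, type in 8-15, size in 16-29, dir in 30-31.
_IOC_NRSHIFT = 0
_IOC_TYPESHIFT = 8
_IOC_SIZESHIFT = 16
_IOC_DIRSHIFT = 30
_IOC_NONE, _IOC_WRITE, _IOC_READ = 0, 1, 2

BTRFS_IOCTL_MAGIC = 0x94


def ioc(direction, magic, nr, size):
    """Pack direction, magic (type), command number and arg size into an ioctl number."""
    return (
        (direction << _IOC_DIRSHIFT)
        | (size << _IOC_SIZESHIFT)
        | (magic << _IOC_TYPESHIFT)
        | (nr << _IOC_NRSHIFT)
    )


def decode(number):
    """Split an ioctl number back into its four fields."""
    return {
        "dir": (number >> _IOC_DIRSHIFT) & 0x3,
        "size": (number >> _IOC_SIZESHIFT) & 0x3FFF,
        "magic": (number >> _IOC_TYPESHIFT) & 0xFF,
        "nr": (number >> _IOC_NRSHIFT) & 0xFF,
    }


# _IOW(0x94, 9, int): write direction, 4-byte int argument.
FICLONE = ioc(_IOC_WRITE, BTRFS_IOCTL_MAGIC, 9, 4)

if __name__ == "__main__":
    print(hex(FICLONE), decode(FICLONE))
#+end_src

Decoding an unknown number with =decode= and checking for magic 0x94 is a quick way to tell whether an ioctl seen in a trace belongs to the Btrfs range.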
:ID: 219cb1b5-7dff-4800-8ad0-2a19309a9e9f
- core component of TrueNAS software
:ID: f2167ca1-f751-4ee3-a398-4cf7fff6b57c
:ID: 698fb02f-dc73-40be-8bac-0af3a03c39c6
:ID: 8c6cf1e4-1555-4270-a101-40b6fbb0a1f9
-- [cite/t/f:@xfs-scalability]
:ID: e3701458-b333-44e3-b6f2-12861d6287ed
:ID: dfe51b59-5f9d-4e7d-86f1-27e51453ae1f
-- [cite/t/f:@hd-failure-ml]
:ID: 9c9a5470-d0c2-49f1-9dc9-d0d62c841a19
-- [cite/t/f:@smart-ssd-qp]
-- [cite/t/f:@ssd-perf-opt]
:ID: 1414bda3-7fae-4fe0-ae65-8a5ed05ad822
-- [cite/t/f:@flash-openssd-systems]
:ID: 95e44402-9235-4b4b-a772-b91d78e38a6b
-- [cite/t/f:@nvme-ssd-ux]
-- [[https://nvmexpress.org/specifications/][specifications]]
:ID: 5639429c-1c9d-4cf0-b69c-cc45528cac50
-- [cite/t/f:@zns-usenix]

Zoned Storage is an open source, standards-based initiative to enable data centers to
scale efficiently for the zettabyte storage capacity era. There are two technologies
behind Zoned Storage, Shingled Magnetic Recording (SMR) in ATA/SCSI HDDs and Zoned
Namespaces (ZNS) in NVMe SSDs.

-- [[https://zonedstorage.io/][zonedstorage.io]]
-- $465 8tb 2.5"? [[https://www.serversupply.com/SSD/PCI-E/7.68TB/WESTERN%20DIGITAL/WUS4BB076D7P3E3_332270.htm][retail]]
:ID: b8539369-0e0f-4f23-be8f-cd38be031bac
-- [cite/t/f:@emmc-mobile-io]
:ID: b244015e-f3e2-4837-8186-a2f5edef1f14
:ID: 935f67e0-eef6-4913-9dcc-8530129be37c
:ID: 134d256a-f7b3-4603-846c-b6c9bad2d708
- [[https://elixir.bootlin.com/linux/latest/source/Documentation/userspace-api/ioctl/ioctl-number.rst][ioctl-numbers]]
:ID: a3b9e17a-75a6-4aca-bf96-b713bc2ded43
:ID: 4e7e4fb5-55f7-4036-b568-b84cefa45de8
:ID: 861e5180-14b4-47c9-a779-fe25c0428d7e
- [[https://crates.io/crates/nix][crates.io]]
:ID: 320428ab-2d0f-4390-978f-c89907f8d0f4
- [[https://crates.io/crates/memmap2][crates.io]]
:ID: 1cf1597d-2f42-4b92-b8fc-a88c649f7cbf
- [[https://crates.io/crates/zstd][crates.io]]
:ID: 1f8fae07-2fbb-4a35-8269-ea436f846193
- [[https://crates.io/crates/rocksdb][crates.io]]
:ID: ec55c0e1-7862-4f05-a6ee-b59ffc68a8ff
- [[https://crates.io/crates/tokio][crates.io]]
:ID: bcf1904b-184e-4cae-86e7-5fcf57762944
- [[https://crates.io/crates/tracing][crates.io]]
**** tracing-subscriber
:ID: 85f6ed51-f3f3-489c-911a-e90c4974048e
- [[https://crates.io/crates/tracing-subscriber][crates.io]]
:ID: 29e0fb8d-e35a-4e47-9b11-45cc4019e2db
- [[https://crates.io/crates/axum][crates.io]]
:ID: 8e5a71ed-85d6-4562-be3a-9261ab376a0e
- [[https://crates.io/crates/tower][crates.io]]
:ID: f6f24187-53b1-408e-b3ac-a101c9ba3040
- [[https://crates.io/crates/uuid][crates.io]]
:ID: c09c812a-e884-4a28-ac4b-4f997ad2e932
:ID: 990d862d-80b1-4620-aa6a-d5e1a2c23517
- [[https://github.com/rust-lang/rust/issues/109736][tracking-issue]]
*** {BTreeMap,BTreeSet}::extract_if
:ID: 15bcf475-336a-4ed0-9b1d-921414c4ff9a
- [[https://github.com/rust-lang/rust/issues/70530][tracking-issue]]
:ID: 5aac7727-f53a-4414-9d6b-2cb50fb45c87
:ID: 043ab5da-6f3f-47ee-b9cf-ba8f0c7bb87c
- [[https://gitlab.common-lisp.net/asdf/asdf][gitlab.common-lisp.net]]
- [[https://asdf.common-lisp.dev/][common-lisp.dev]]
- [[https://github.com/fare/asdf/blob/master/doc/best_practices.md][best-practices]]
** Reference Projects
:ID: f25d3c51-7338-484c-9068-31c1a4c7a565
:ID: 23dcbfef-b703-4dc5-a60a-9f2be66e32f2
- [[https://github.com/stumpwm/stumpwm][github]]
:ID: bfbb355d-2b09-4450-b39a-368a5f685d77
- [[https://github.com/atlas-engineer/nyxt][github]]
:ID: 19929b73-2a4c-43f1-b04c-ec88dfa209bd
- [[https://github.com/kaveh808/kons-9][github]]
:ID: 935dbb0f-2f04-46c1-b250-48dea359398d
- [[https://github.com/vindarel/cl-torrents][github]]
:ID: 2da96a46-f71c-436a-ab60-8f2a30469b15
- [[https://github.com/froggey/Mezzano][github]]
:ID: 38459f44-90e3-4fcc-a829-27c60e28b2cd
- [[https://github.com/whily/yalo][github]]
:ID: d2835614-461c-4bdd-8f25-a055b51797f4
- [[https://github.com/ledger/cl-ledger][github]]
:ID: 8aa32222-31a9-4dc5-999c-7f10a9649d9f
- [[https://github.com/lem-project/lem][github]]
:ID: 977ccbf4-ca2f-466b-9420-105df90cfcdc
- [[https://github.com/kindista/kindista][github]]
:ID: c745e1c8-a675-4cfe-bb7f-30916b9198dd
- [[https://github.com/ryukinix/lisp-chat][github]]
:ID: 9a03c3b2-e9b6-4ab8-a1c5-3517374afbf0
#+print_bibliography: