-
Notifications
You must be signed in to change notification settings - Fork 36.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bug-fix macos: give free bytes to F_PREALLOCATE #17887
Conversation
So this code has been wrong for 7 years? Why do we only notice the effect now? |
It’s likely this issue only started occurring with the introduction of APFS, which is quite recent. |
Old edit:
That is probably more likely. |
On my machine (APFS encrypted volume), I was seeing rev00002.dat hitting 20 MB around block 135k. After this patch (running up to block 191106, then shutting down) I see $ du -ch *.dat
128M blk00000.dat
128M blk00001.dat
128M blk00002.dat
128M blk00003.dat
128M blk00004.dat
128M blk00005.dat
128M blk00006.dat
128M blk00007.dat
128M blk00008.dat
128M blk00009.dat
128M blk00010.dat
128M blk00011.dat
128M blk00012.dat
128M blk00013.dat
128M blk00014.dat
128M blk00015.dat
128M blk00016.dat
48M blk00017.dat
20M rev00000.dat
17M rev00001.dat
17M rev00002.dat
17M rev00003.dat
17M rev00004.dat
18M rev00005.dat
18M rev00006.dat
19M rev00007.dat
19M rev00008.dat
18M rev00009.dat
18M rev00010.dat
19M rev00011.dat
19M rev00012.dat
19M rev00013.dat
19M rev00014.dat
18M rev00015.dat
19M rev00016.dat
5.1M rev00017.dat
2.5G total Master (~same block, after shutdown): $ du -ch *.dat
128M blk00000.dat
128M blk00001.dat
128M blk00002.dat
128M blk00003.dat
128M blk00004.dat
128M blk00005.dat
128M blk00006.dat
128M blk00007.dat
128M blk00008.dat
128M blk00009.dat
128M blk00010.dat
128M blk00011.dat
128M blk00012.dat
128M blk00013.dat
128M blk00014.dat
128M blk00015.dat
128M blk00016.dat
144M blk00017.dat
19M rev00000.dat
17M rev00001.dat
16M rev00002.dat
17M rev00003.dat
17M rev00004.dat
18M rev00005.dat
18M rev00006.dat
36M rev00007.dat
51M rev00008.dat
18M rev00009.dat
18M rev00010.dat
18M rev00011.dat
162M rev00012.dat
67M rev00013.dat
126M rev00014.dat
51M rev00015.dat
67M rev00016.dat
21M rev00017.dat
3.0G total Both runs were after
Looking closely at the run on master, the files are blowing up 2x in size until they are no longer used, at which point they are sometimes truncated to their actual size. The blow-up applies to blk files as well, but these are always truncated down after use (see blk00017.dat, which is too large because it isn't finished yet; this was after shutting Bitcoin Core down). Running on this PR, the 2x blowup is not occurring for either files. Running macos 10.14.6 (Mojave). User is running Catalina, so perhaps I would see the problem more often if I upgraded. |
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsNo conflicts as of last run. |
Enlightening read: https://www.mail-archive.com/[email protected]/msg00226.html |
Tested on MacOS 10.15.2 on an APFS volume and the problem disappears. Also tested on a mounted HFS+ disk volume and the problem disappears. |
Concept ACK |
I ran this on machine 2 in the issue report and can confirm that there is no blowup.
Tested on an APFS unencrypted disk image with 3 GB in size and let it run until the image was full:
Tested on a Mac OS Extended (HFS+) unencrypted disk image with 3 GB in size:
|
I also built a Bitcoin-Qt.dmg and started a fresh download on machine 1 in the issue report, to verify that I can successfully complete the initial block download on the Mac mini with 480 GB of free disk space at the start. ETA 3 days. |
I've just finished a from scratch sync using this PR + 17892, on an macOS (APFS) machine. Data directory is the expected size. No longer seeing very large rev*.dat files, most are between 16 and 21mb. |
The IBD is now complete and the estimate is spot on. 280 GB of disk space is used. The rev*.dat file sizes are in the range 12-27 MB.
|
Looking a bit more into the rev files I see what is most likely the truncation issue @kallewoof discovered in #17890
|
Thanks! Yeah, we will have to do a one time run-through of all the rev files to clean up the excess use, or we will end up with wasted space for everyone.. I'm honestly not sure why this isn't affecting other systems though. The finalize issue is not mac specific (only the over-preallocating issue is). |
This made me realize that this actually is a bug in the operating system. I couldn't explain why this didn't happen to earlier versions, but I believe I can now. The bug is that ftruncate does not return pre-allocated disk space. Before, we would over-preallocate and then immediately truncate back down to the expected size. This worked in pre-APFS land, but APFS has a bug which leaves the pre-allocated space as occupied by the file, even though we call ftruncate. I will make a minimal test case displaying this behavior and submit a report to Apple. Edit: https://openradar.appspot.com/radar?id=4930713610616832 Edit: I updated the bug report above with information on |
src/util/system.cpp
Outdated
fst.fst_bytesalloc = 0; | ||
if (fcntl(fileno(file), F_PREALLOCATE, &fst) == -1) { | ||
fst.fst_flags = F_ALLOCATEALL; | ||
fcntl(fileno(file), F_PREALLOCATE, &fst); | ||
} | ||
ftruncate(fileno(file), fst.fst_length); | ||
ftruncate(fileno(file), (off_t)offset + length); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: could static_cast
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Code review ACK 5de93dc also checked |
5de93dc
to
2db88bd
Compare
@arvidn Any chance you could comment here, given you patched an APFS related issue in libtorrent some time ago? |
This patch changes the behavior of
It doesn't seem like this patch correctly implements the documented behavior. It might fix an issue in how If I understand the mailing list post Dominic Giampaolo correctly, in order to implement the documented behavior of
and if |
Edit: nevermind, I see what you're saying, but I believe offset is already (supposed to be) equal to st_size, so I don't think it matters. |
Right, this patch makes that assumption. The windows implementation does not make this assumption, nor does the documentation state it will make this assumption. |
I see what you mean. I don't think it matters in this case either way, but it's worth pointing out. |
Either way, that's unrelated to the question which is how to properly pre-allocate (and more importantly, how to return unused space at the end) on APFS. |
The macos manpage for fcntl (for F_PEOFPOSMODE) states: > Allocate from the physical end of file. In this case, fst_length indicates the number of newly allocated bytes desired.
2db88bd
to
75163f4
Compare
My sizes using this pull request, running Catalina and an unencrypted APFS disk: Running master: All file sizes for both this PR and master don't seem any different while running or after shutdown. |
@bg002h |
@Sjors Thx for the tip! I reformatted my NVME raid to APFS, encrypted, case sensitive since the above. For master, I see: running this PR (pastern is down): |
using master, and checking du repeatedly during blk1, rev1 gets pretty big before being trimmed back: cg@Brians-MacBook-Pro blocks % du -ch * The behavior is not observed using this PR: bcg@Brians-MacBook-Pro blocks % du -ch * |
Strange, I double checked and still not seeing any large rev files with |
|
FWIW I'm able to reproduce this on gitian-built clients as far back as 0.12. |
code review ACK 75163f4 Thanks everyone for testing! |
75163f4 bug-fix macos: give free bytes to F_PREALLOCATE (Karl-Johan Alm) Pull request description: The macos manpage for `fcntl` (for `F_PEOFPOSMODE`) states: > Allocate from the physical end of file. In this case, fst_length indicates the number of newly allocated bytes desired. This would result in the rev files being essentially pre-allocating 2x their necessary size (this is the case for block files as well, but these are flushed down to their right sizes every time) as they would pre-allocate `pos + length` **free** bytes, rather than allocating `length` bytes after `pos`, as expected. Fixes #17827. ACKs for top commit: eriknylund: ACK 75163f4 built locally. All tests passing. Manual test as per my previous comment above on an older commit, using an APFS unencrypted disk image with 3 GB. laanwj: code review ACK 75163f4 Tree-SHA512: 105c8d56c20acad8febdf0583f1e5721b63376ace325a7a62c2e4b15a442c7131404ed604c32c0cda716791d7ca5aa9f5b6a774ff86e39838bc7e87ca3c42760
The macos manpage for fcntl (for F_PEOFPOSMODE) states: > Allocate from the physical end of file. In this case, fst_length indicates the number of newly allocated bytes desired. Github-Pull: bitcoin#17887 Rebased-From: 75163f4
Being backported to 0.19 in #17988. |
daf2fff test: add missing #include to fix compiler errors (Karl-Johan Alm) c8ad23c bug-fix macos: give free bytes to F_PREALLOCATE (Karl-Johan Alm) Pull request description: We're about ready to do a [0.19.1 release](https://github.com/bitcoin/bitcoin/milestone/44); so I've opened this to collect the last remaining backports. If there's something that's been missed / or isn't tagged ["Needs backport (0.19)"](https://github.com/bitcoin/bitcoin/issues?q=is%3Aopen+label%3A%22Needs+backport+%280.19%29%22) that you think should be, please comment. Currently backports: * #17887 - bug-fix macos: give free bytes to F_PREALLOCATE * #17980 - test: add missing #include to fix compiler errors ACKs for top commit: laanwj: ACK daf2fff Tree-SHA512: 8438f992d0c39315a4da4c3f8ab5c92acefada4b0ba5a5cec3775dea2541492d386bf4e7c9e76e1494a6d4cf16a9205287d27ffd23d9c3056f213d733605eeee
75163f4 bug-fix macos: give free bytes to F_PREALLOCATE (Karl-Johan Alm) Pull request description: The macos manpage for `fcntl` (for `F_PEOFPOSMODE`) states: > Allocate from the physical end of file. In this case, fst_length indicates the number of newly allocated bytes desired. This would result in the rev files being essentially pre-allocating 2x their necessary size (this is the case for block files as well, but these are flushed down to their right sizes every time) as they would pre-allocate `pos + length` **free** bytes, rather than allocating `length` bytes after `pos`, as expected. Fixes bitcoin#17827. ACKs for top commit: eriknylund: ACK 75163f4 built locally. All tests passing. Manual test as per my previous comment above on an older commit, using an APFS unencrypted disk image with 3 GB. laanwj: code review ACK 75163f4 Tree-SHA512: 105c8d56c20acad8febdf0583f1e5721b63376ace325a7a62c2e4b15a442c7131404ed604c32c0cda716791d7ca5aa9f5b6a774ff86e39838bc7e87ca3c42760
- [0.19] wallet: Reset reused transactions cache bitcoin#18083 - 0.19: Backports bitcoin#17792 - psbt: handle unspendable psbts bitcoin#17524 - qt: Fix comparison function signature bitcoin#17634 - psbt: check that various indexes and amounts are within bounds bitcoin#17156 - [0.19] psbt: check that various indexes and amounts are within bounds bitcoin#18079 - [0.19] Final backports for 0.19.1 bitcoin#17988 - Bug: IsUsedDestination shouldn't use key id as script id for ScriptHash bitcoin#17924 - qt: Fix deprecated QCharRef usage bitcoin#18101 - gui: Throttle GUI update pace when -reindex bitcoin#18121 - gui: Fix race in WalletModel::pollBalanceChanged bitcoin#18123 - gui: Fix unintialized WalletView::progressDialog bitcoin#18062 - Bugfix: GUI: Hide the HD/encrypt icons earlier so they get re-shown if another wallet is open bitcoin#18007 - bug-fix macos: give free bytes to F_PREALLOCATE bitcoin#17887 - test: add missing #include to fix compiler errors bitcoin#17980 - zmq: Fix due to invalid argument and multiple notifiers bitcoin#17445
Apple responded to my bug report regarding this, in case people are curious: https://openradar.appspot.com/radar?id=4930713610616832 |
Summary: The Initial Block Download (IBD) on macOS Catalina uses a lot more disk space than expected. The macos manpage for fcntl (for F_PEOFPOSMODE) states: > Allocate from the physical end of file. In this case, fst_length indicates the number of newly allocated bytes desired. This would result in the rev files being essentially pre-allocating 2x their necessary size (this is the case for block files as well, but these are flushed down to their right sizes every time) as they would pre-allocate pos + length free bytes, rather than allocating length bytes after pos, as expected. Note: more explanations on the issue from Apple in this bug report https://openradar.appspot.com/radar?id=4930713610616832 Backport of Bitcoin Core [[bitcoin/bitcoin#17887 | PR17887]] Test Plan: `ninja check` to verify that the patch does not accidentaly break anything else. But I do not have a MacOS to run the tests on, and try to replicate the issue. Reviewers: O1 Bitcoin ABC, #bitcoin_abc, deadalnix Reviewed By: O1 Bitcoin ABC, #bitcoin_abc, deadalnix Subscribers: deadalnix Differential Revision: https://reviews.bitcoinabc.org/D7526
75163f4 bug-fix macos: give free bytes to F_PREALLOCATE (Karl-Johan Alm) Pull request description: The macos manpage for `fcntl` (for `F_PEOFPOSMODE`) states: > Allocate from the physical end of file. In this case, fst_length indicates the number of newly allocated bytes desired. This would result in the rev files being essentially pre-allocating 2x their necessary size (this is the case for block files as well, but these are flushed down to their right sizes every time) as they would pre-allocate `pos + length` **free** bytes, rather than allocating `length` bytes after `pos`, as expected. Fixes bitcoin#17827. ACKs for top commit: eriknylund: ACK 75163f4 built locally. All tests passing. Manual test as per my previous comment above on an older commit, using an APFS unencrypted disk image with 3 GB. laanwj: code review ACK 75163f4 Tree-SHA512: 105c8d56c20acad8febdf0583f1e5721b63376ace325a7a62c2e4b15a442c7131404ed604c32c0cda716791d7ca5aa9f5b6a774ff86e39838bc7e87ca3c42760
75163f4 bug-fix macos: give free bytes to F_PREALLOCATE (Karl-Johan Alm) Pull request description: The macos manpage for `fcntl` (for `F_PEOFPOSMODE`) states: > Allocate from the physical end of file. In this case, fst_length indicates the number of newly allocated bytes desired. This would result in the rev files being essentially pre-allocating 2x their necessary size (this is the case for block files as well, but these are flushed down to their right sizes every time) as they would pre-allocate `pos + length` **free** bytes, rather than allocating `length` bytes after `pos`, as expected. Fixes bitcoin#17827. ACKs for top commit: eriknylund: ACK 75163f4 built locally. All tests passing. Manual test as per my previous comment above on an older commit, using an APFS unencrypted disk image with 3 GB. laanwj: code review ACK 75163f4 Tree-SHA512: 105c8d56c20acad8febdf0583f1e5721b63376ace325a7a62c2e4b15a442c7131404ed604c32c0cda716791d7ca5aa9f5b6a774ff86e39838bc7e87ca3c42760
75163f4 bug-fix macos: give free bytes to F_PREALLOCATE (Karl-Johan Alm) Pull request description: The macos manpage for `fcntl` (for `F_PEOFPOSMODE`) states: > Allocate from the physical end of file. In this case, fst_length indicates the number of newly allocated bytes desired. This would result in the rev files being essentially pre-allocating 2x their necessary size (this is the case for block files as well, but these are flushed down to their right sizes every time) as they would pre-allocate `pos + length` **free** bytes, rather than allocating `length` bytes after `pos`, as expected. Fixes bitcoin#17827. ACKs for top commit: eriknylund: ACK 75163f4 built locally. All tests passing. Manual test as per my previous comment above on an older commit, using an APFS unencrypted disk image with 3 GB. laanwj: code review ACK 75163f4 Tree-SHA512: 105c8d56c20acad8febdf0583f1e5721b63376ace325a7a62c2e4b15a442c7131404ed604c32c0cda716791d7ca5aa9f5b6a774ff86e39838bc7e87ca3c42760
daf2fff test: add missing #include to fix compiler errors (Karl-Johan Alm) c8ad23c bug-fix macos: give free bytes to F_PREALLOCATE (Karl-Johan Alm) Pull request description: We're about ready to do a [0.19.1 release](https://github.com/bitcoin/bitcoin/milestone/44); so I've opened this to collect the last remaining backports. If there's something that's been missed / or isn't tagged ["Needs backport (0.19)"](https://github.com/bitcoin/bitcoin/issues?q=is%3Aopen+label%3A%22Needs+backport+%280.19%29%22) that you think should be, please comment. Currently backports: * bitcoin#17887 - bug-fix macos: give free bytes to F_PREALLOCATE * bitcoin#17980 - test: add missing #include to fix compiler errors ACKs for top commit: laanwj: ACK daf2fff Tree-SHA512: 8438f992d0c39315a4da4c3f8ab5c92acefada4b0ba5a5cec3775dea2541492d386bf4e7c9e76e1494a6d4cf16a9205287d27ffd23d9c3056f213d733605eeee
daf2fff test: add missing #include to fix compiler errors (Karl-Johan Alm) c8ad23c bug-fix macos: give free bytes to F_PREALLOCATE (Karl-Johan Alm) Pull request description: We're about ready to do a [0.19.1 release](https://github.com/bitcoin/bitcoin/milestone/44); so I've opened this to collect the last remaining backports. If there's something that's been missed / or isn't tagged ["Needs backport (0.19)"](https://github.com/bitcoin/bitcoin/issues?q=is%3Aopen+label%3A%22Needs+backport+%280.19%29%22) that you think should be, please comment. Currently backports: * bitcoin#17887 - bug-fix macos: give free bytes to F_PREALLOCATE * bitcoin#17980 - test: add missing #include to fix compiler errors ACKs for top commit: laanwj: ACK daf2fff Tree-SHA512: 8438f992d0c39315a4da4c3f8ab5c92acefada4b0ba5a5cec3775dea2541492d386bf4e7c9e76e1494a6d4cf16a9205287d27ffd23d9c3056f213d733605eeee
daf2fff test: add missing #include to fix compiler errors (Karl-Johan Alm) c8ad23c bug-fix macos: give free bytes to F_PREALLOCATE (Karl-Johan Alm) Pull request description: We're about ready to do a [0.19.1 release](https://github.com/bitcoin/bitcoin/milestone/44); so I've opened this to collect the last remaining backports. If there's something that's been missed / or isn't tagged ["Needs backport (0.19)"](https://github.com/bitcoin/bitcoin/issues?q=is%3Aopen+label%3A%22Needs+backport+%280.19%29%22) that you think should be, please comment. Currently backports: * bitcoin#17887 - bug-fix macos: give free bytes to F_PREALLOCATE * bitcoin#17980 - test: add missing #include to fix compiler errors ACKs for top commit: laanwj: ACK daf2fff Tree-SHA512: 8438f992d0c39315a4da4c3f8ab5c92acefada4b0ba5a5cec3775dea2541492d386bf4e7c9e76e1494a6d4cf16a9205287d27ffd23d9c3056f213d733605eeee
daf2fff test: add missing #include to fix compiler errors (Karl-Johan Alm) c8ad23c bug-fix macos: give free bytes to F_PREALLOCATE (Karl-Johan Alm) Pull request description: We're about ready to do a [0.19.1 release](https://github.com/bitcoin/bitcoin/milestone/44); so I've opened this to collect the last remaining backports. If there's something that's been missed / or isn't tagged ["Needs backport (0.19)"](https://github.com/bitcoin/bitcoin/issues?q=is%3Aopen+label%3A%22Needs+backport+%280.19%29%22) that you think should be, please comment. Currently backports: * bitcoin#17887 - bug-fix macos: give free bytes to F_PREALLOCATE * bitcoin#17980 - test: add missing #include to fix compiler errors ACKs for top commit: laanwj: ACK daf2fff Tree-SHA512: 8438f992d0c39315a4da4c3f8ab5c92acefada4b0ba5a5cec3775dea2541492d386bf4e7c9e76e1494a6d4cf16a9205287d27ffd23d9c3056f213d733605eeee
The macos manpage for
fcntl
(forF_PEOFPOSMODE
) states:This would result in the rev files being essentially pre-allocating 2x their necessary size (this is the case for block files as well, but these are flushed down to their right sizes every time) as they would pre-allocate
pos + length
free bytes, rather than allocatinglength
bytes afterpos
, as expected.Fixes #17827.