make system more responsive by using the fadvise DONTNEED #252
Comments
fadvise DONTNEED basically tells the kernel that the data in the cache is not needed anymore, but the data is already in the cache at that point. If the data is actually read only once, I think the solution would be to bypass the kernel cache by opening the file with O_DIRECT. |
https://github.com/ThomasWaldmann/attic/commits/o_direct I did some O_DIRECT changes there (read the commit comments). Somehow I still see the cache growing rather quickly - I suspect it is due to writes (I only changed input file reads to use O_DIRECT). Note: I gave up the O_DIRECT route. It is just a pain to use due to the alignment limitations imposed by O_DIRECT and python not supporting that. |
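For readers wondering what the alignment pain looks like in practice, here is a minimal sketch (not the code from the o_direct branch). It assumes Linux and relies on the fact that an anonymous `mmap` buffer is page-aligned, which is one way to satisfy O_DIRECT's buffer alignment from pure Python. The function name `read_direct` and the fallback behaviour are made up for illustration.

```python
import mmap
import os


def read_direct(path, buf_size=1024 * 1024):
    """Read a whole file, bypassing the page cache via O_DIRECT where possible.

    buf_size must be a multiple of the device's logical block size.
    Falls back to a normal buffered read if O_DIRECT is unavailable or fails.
    """
    flags = os.O_RDONLY | getattr(os, "O_DIRECT", 0)
    try:
        fd = os.open(path, flags)
    except OSError:
        fd = None  # e.g. the filesystem rejects O_DIRECT
    if fd is not None:
        buf = mmap.mmap(-1, buf_size)  # anonymous mapping, page-aligned
        chunks = []
        try:
            while True:
                n = os.readv(fd, [buf])  # read straight into the aligned buffer
                chunks.append(buf[:n])
                if n < buf_size:  # short read: end of file reached
                    break
            return b"".join(chunks)
        except OSError:
            pass  # alignment or filesystem problem, use the fallback below
        finally:
            buf.close()
            os.close(fd)
    with open(path, "rb") as f:
        return f.read()
```

Even in this small form, every read has to go through one fixed-size aligned buffer and gets copied out again, which hints at why the approach is awkward for a chunker that wants variable-sized reads.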
See PR #279 for posix_fadvise based solution, it works (on linux, py >= 3.3). \o/ Note: |
Is POSIX_FADV_DONTNEED really what we want? Just because we know that we will not need a specific piece of data again it is not our business to tell the kernel to remove it from the cache. We have no way of knowing if the data was originally loaded by us or by another process and how actively used it is. Has anyone checked what (if any) posix_fadvise settings are used by other backup solutions (In the default configuration)? |
from http://linux.die.net/man/2/posix_fadvise : "The advice applies to a (not necessarily existent) region starting at offset and extending for len bytes (or until the end of the file if len is 0) within the file referred to by fd. The advice is not binding; it merely constitutes an expectation on behalf of the application." The phrasing "programs can ... announce an intention" and "constitutes an expectation on behalf of the application" rather clearly means to me that the scope of this is "application", not "system-wide". So the advice of the application "dontneed" is correct in our case. I didn't do specific performance measurements, but I watched how the cache behaved:
|
As far as the specification goes, I'd agree with your interpretation. The actual implementation is, however, what ultimately counts. From what I could tell, Linux currently responds to a DONTNEED fadvise by immediately invalidating the pages, regardless of their use by any other process. One could argue that this isn't in the spirit of the specification of posix_fadvise(), but that doesn't change the fact that such a behavior is undesirable for any backup program. The issue with your observation of cache usage is that you can't infer anything from it. Assume for a moment that DONTNEED does in fact evict cache pages: Would anything change in your observation? |
Rsync does use fadvise. This page seems relevant: http://insights.oetiker.ch/linux/fadvise.html
Dmitry Astapov |
This page seems to talk about a patch for rsync. Has this been accepted upstream? |
I believe that in the middle of that page it says "the patch has been
Dmitry Astapov |
Hmm. Maybe I spoke too soon. Looking at https://tobi.oetiker.ch/patches/,
Dmitry Astapov |
It is pretty trivial to test. Download rsync and check. Nope, not in rsync here.
|
you need to grep again for fadvise (with "s"). |
Heh, well that is embarrassing. However the typo was in the email, I had
|
take a look at the bup side before jumping into that ship. i heard they discovered performance problems where such policies would actually remove good contents from the cache that was unrelated to backups, seriously impacting performance on production servers. it is quite possible that |
@anarcat do you have some more specific info? fadvise acts on a open filehandle (that belongs to the specific file opened by the backup process for reading). I could imagine that a simplistic fadvise kernel implementation kills the cached blocks of THAT file for all processes, but even that would be better than not using fadvise because of the lower cache flooding pressure of all the files that are not used by any other process and that do not end up / remain in the cache when using fadvise. |
the https://www.percona.com/blog/2010/04/02/fadvise-may-be-not-what-you-expect/ it's still subject to discussion on the bup mailing list, but please do be careful about this - i don't believe it is process-specific... it's nice to optimise attic: but if it's done at the expense of the rest of the system that is being backed up, that doesn't sound like a good tradeoff. :) |
apparently, the issue came up in this thread: https://groups.google.com/d/msg/bup-list/TXfSAgD9-ZM/saofDu1CdxcJ where bup would trash the sqlite cache of a big file, which had to be reloaded in memory, which was basically breaking the site... |
@anarcat I looked through the first 3 links. Lots of guessing and gut feelings (I can do that, too, and I even posted reasons why I think it is good, while they didn't really reason about why they think it's worse than without). I didn't find anything about fadvise in the google groups link from the previous post, did you post the wrong url? |
the last link was where they discovered the issue apparently. for me it makes sense that trashing the cache will have a performance impact. when you load a page in the kernel VM and tell the kernel to drop it when you close the FD, it will drop the page - it seems logical to me. the fact that another process was using it at the same time probably doesn't change anything. but that's just me. |
See my April 14 comment. |
It shouldn't be too hard to test this experimentally. E.g., on a system with 8GB RAM, create eight 1GB files named
Then do exactly the same thing, but using a version of attic patched to use To address the concerns about losing data that you want in the cache, run the test a third time, this time including the file For the record, my instinct is with TW here. Why would the kernel provide this feature if it could trash the cache used by other processes? But the only way to know is to test. And from what I've read, it may depend on kernel version, so multiple testers is probably a good idea. Maybe someone can provide a program that does the analog of |
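For the "is this file still in the cache?" part of such a test, one option is `mincore(2)` on a mapping of the file. The following is only a sketch under assumptions: it targets Linux, calls libc through `ctypes`, and the helper name `cached_pages` is made up here. Newer kernels may restrict what `mincore` reports for files the caller does not own, so it is best run on files you created yourself.

```python
import ctypes
import mmap
import os


def cached_pages(path):
    """Return (resident_pages, total_pages) for a file, using mincore(2)."""
    size = os.path.getsize(path)
    if size == 0:
        return 0, 0
    total = (size + mmap.PAGESIZE - 1) // mmap.PAGESIZE
    libc = ctypes.CDLL(None, use_errno=True)
    with open(path, "rb") as f:
        # ACCESS_COPY gives a private, writable mapping; ctypes needs a
        # writable buffer to take its address. No page is touched here.
        mm = mmap.mmap(f.fileno(), size, access=mmap.ACCESS_COPY)
    try:
        view = (ctypes.c_ubyte * size).from_buffer(mm)
        vec = (ctypes.c_ubyte * total)()  # one status byte per page
        rc = libc.mincore(
            ctypes.c_void_p(ctypes.addressof(view)), ctypes.c_size_t(size), vec
        )
        err = ctypes.get_errno()
        del view  # release the buffer export so the mapping can be closed
        if rc != 0:
            raise OSError(err, os.strerror(err))
        # The lowest bit of each byte says whether the page is resident.
        return sum(b & 1 for b in vec), total
    finally:
        mm.close()
```

Calling `cached_pages()` on the test files before and after each backup run would give the numbers needed to compare the runs, without reading the files (which would itself pull them into the cache).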
This article provides a very good explanation of the complexity of using It is indeed the case that FADV_DONTNEED will purge the file from the cache I agree that this behavior isn't very helpful, but that is how it is. It seems to me that the mincore hack in the article is not worth using.
|
@jdchristensen that test just shows if fadvise dontneed removes the file from cache (or not). while that is a bit interesting, more interesting is comparing the effect from permanently flooding the cache with a lot of data only needed once (attic without fadvise) vs. avoiding to flood the cache (attic with fadvise). @jbms I've read that article back then (but didn't want to put a lot of [C] code, like shown there, for a maybe negligible effect). |
@ThomasWaldmann I proposed running attic three times. The difference between run 1 and run 2 would exactly show that not flooding the cache gives an improvement for other applications (dd, in this case). The difference between run 1 and run 3 would show whether fadvise removes a file from the cache that was already there. Both bits of information seem important for this discussion. If the information at |
👎 We shouldn't DONTNEED the user's files.
It's possible Linux can still be improved - as suggested by the un-merged patch which implements NOREUSE as a gentler alternative. (Or that it's regressed :). Everyone has this problem. If we don't have the resources to test this properly (or implement the If we had any reason to be worried in the first place, we could keep the DONTNEED on attic files only. It shouldn't hurt anyone else; it'll hurt us but probably not where we care. It could account for about half the cache buildup (when we're not working with virtual machine image files or similar). |
Sorry, but I can't follow what you wanted to say. But I am doing the fadvise DONTNEED thing in borg (after practically seeing beneficial effects), so comparing attic vs. borg (or borg with/without that call) should be easy. I'll change things / accept pull requests for borg as soon as there is practical proof that change is needed / beneficial. |
@ThomasWaldmann Since you have both attic and borg with fadvise handy, you could run the three tests I proposed and see if there is a problem. |
@jdchristensen here are the results, completely as expected for me. http://paste.thinkmo.de/rNXe3Mmm#fadv_test.txt The problem is just that they are not that helpful in deciding whether fadvise DONTNEED is helpful or not. With fadvise DONTNEED, they show that the cache isn't killed by the backup process (as expected). With fadvise DONTNEED, it kills files from the cache which are backed up (as expected, due to the simple implementation in linux). So one might think one is as good as the other, but I still think fadvise DONTNEED is way better: it avoids flooding the cache with useless data (potentially for hours), and that benefit outweighs the harm of dropping the currently backed-up file from the cache when that file is in use (it can be cached again a second later). The tests you proposed can't show that, though (and any simple test might be a bit unrealistic compared to real system behaviour). |
hmm... well, the problem described earlier is specifically with stuff like mysql databases that get totally flushed out of the cache, having a major performance impact on the whole system. from what i understand, that performance problem is confirmed by those tests? since this is a corner case (and linux doesn't deal well with this (yet)), maybe we should make DONTNEED optional somehow? |
For what it's worth, I've found rsync also causes linux to fragment memory like no tomorrow creating cache entries for tiny files and then releasing some (but not all) of them over the next few hours. If you're not going to use the data again right away, it makes sense to avoid that. YMMV, but you might check system memory usage - a significant free proportion and lots of small blocks in /proc/buddyinfo that can't be compacted with compact_memory but are when you drop_caches are a sign of fragmentation wastage. Note however that fragmentation can take time to build up - days in some cases, depending on how actively the system is used. |
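To make the /proc/buddyinfo check a bit more concrete, here is a small sketch that parses its output and converts the per-order block counts into free pages. The function names are made up, and the sample text in the usage note is invented data, not a measurement from any system in this thread.

```python
def parse_buddyinfo(text):
    """Map (node, zone) to the list of free-block counts, one per order."""
    zones = {}
    for line in text.splitlines():
        parts = line.split()
        # Expected shape: Node 0, zone Normal <count order 0> <count order 1> ...
        if len(parts) < 5 or parts[0] != "Node" or parts[2] != "zone":
            continue
        node = int(parts[1].rstrip(","))
        zones[(node, parts[3])] = [int(count) for count in parts[4:]]
    return zones


def free_pages_by_order(counts):
    """A block of order n spans 2**n pages, so scale each count accordingly."""
    return [count << order for order, count in enumerate(counts)]
```

On a live system one would call `parse_buddyinfo(open("/proc/buddyinfo").read())`. If nearly all free pages sit in the lowest orders while the high orders are at or near zero, that matches the fragmentation pattern described above.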
fadvise SEQUENTIAL is supposed to help as well, since about 2009. Unfortunately not - at least not as well as DONTNEED. Negative results published for any other project looking at this :). I guess the DB case was a worst-case problem because there's suddenly a lot that needs reading back, and the reads will be very random (lots of disk seeks). Btw with DONTNEED we're purging the entire file after reading each chunk - that "live database" case is really going to hate us. (Not that I think it was a good case; for a database he should really have been using LVM snapshots). The internet says you can hack O_DIRECT reads from pure python (using mincore() looks pretty ugly especially with the mmap(). Maybe the performance side isn't too bad, if you avoid actually touching any of the pages (and you can batch the calls up a bit). |
@ThomasWaldmann Thanks for doing the experiment! Now we know for sure how linux handles DONTNEED (at least for your kernel version). It's really unfortunate that DONTNEED kills things that were already in the cache, but that's life. I suspect that for most uses, using DONTNEED will be much better. But I suspect that some use cases might suffer, so providing a command line option to disable it seems reasonable (as @anarcat suggested). |
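A command line switch for this could look roughly like the sketch below. Everything here is hypothetical: the flag name `--no-fadvise`, the program name, and the helper `maybe_drop_cache` are invented for illustration and are not existing attic or borg options.

```python
import argparse
import os


def build_parser():
    parser = argparse.ArgumentParser(prog="backup-sketch")  # hypothetical tool name
    # Hypothetical flag: DONTNEED stays on by default, the user can opt out.
    parser.add_argument(
        "--no-fadvise",
        dest="fadvise",
        action="store_false",
        help="do not call posix_fadvise(POSIX_FADV_DONTNEED) on input files",
    )
    return parser


def maybe_drop_cache(fd, offset, length, enabled):
    """Issue DONTNEED only if enabled and the platform supports it."""
    if enabled and hasattr(os, "posix_fadvise"):
        os.posix_fadvise(fd, offset, length, os.POSIX_FADV_DONTNEED)
        return True
    return False
```

With `action="store_false"` the default for `args.fadvise` is `True`, so existing behaviour would be unchanged unless someone backing up e.g. live database files passes the flag.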
@sourcejedi i also tried using O_DIRECT, but that was a total pain. |
@ThomasWaldmann, @jdchristensen: My results, running it (Borg, 8cf0ead693) on a HDD via USB3, Intel Core M 5Y10 using Fedora 22: https://gist.github.com/pguth/481980bd67993984eda4 |
Testing this way the newest code ("call fadvise DONTNEED for the byterange we actually have read") I got these results (before/after): https://gist.github.com/pguth/4b436cf15c58549cbc4d/revisions |
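The "only the byte range we actually have read" idea can be sketched as follows. This is not the borg code, just an illustration of the technique; the function name `read_and_drop` is made up, and the returned list exists only to make the advised ranges visible.

```python
import os


def read_and_drop(path, chunk_size=1024 * 1024):
    """Read sequentially; after each chunk, advise DONTNEED for that range only."""
    advised = []
    fd = os.open(path, os.O_RDONLY)
    try:
        offset = 0
        while True:
            data = os.read(fd, chunk_size)
            if not data:
                break
            # A real backup would chunk/hash/store `data` here.
            if hasattr(os, "posix_fadvise"):
                os.posix_fadvise(fd, offset, len(data), os.POSIX_FADV_DONTNEED)
            advised.append((offset, len(data)))
            offset += len(data)
    finally:
        os.close(fd)
    return advised
```

Compared to advising the whole file after every chunk, this leaves the not-yet-read part of the file (and any read-ahead) alone. According to the man page, requests to discard partial pages are ignored, so a chunk size that is a multiple of the page size fits best.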
Right, so my concern in #158 wasn't an issue (<1%), and it was tested on a HDD over USB (so low-speed io). Another entry for the journal of negative results, good work everyone :). I'm surprised. I guess this HDD might actually do enough read-ahead internally, or the kernel code we're using doesn't work how I thought. (I still like not hammering DONTNEED multiple times if someone else is reading the file too :). |
@sourcejedi yep, just had the same idea. :) |
I think the "cleaning cache of db files when using DONTNEED" problem is because people are doing things the wrong way. Why do they back up a working database? If you need a consistent backup, please either:
You don't want to back up the live & working db because of possible inconsistencies. If you use the right way (dump or db halt), there will be no problems with DONTNEED. |
@neutrinus very true! I already thought the same, but did not document it yet. |
while attic is running, file system access is slowed down for the rest of the system. that can be expected, but its effects could be mitigated if attic used
posix_fadvise(POSIX_FADV_DONTNEED)
on the files it is backing up. this tells the operating system that the "data will not be accessed in the near future" (man 2 posix_fadvise). this should minimize the amount of disk cache contents dropped from ram to accommodate attic's reads, while not slowing down attic. (it won't read over the same file itself (will it?), but the kernel can't know that without being told, and might keep the files around just in case attic wants to look at them again).
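A minimal sketch of what is being asked for, assuming Python >= 3.3 on a platform that provides `os.posix_fadvise` (the function name `read_once` is made up, and this is not attic's actual reader):

```python
import os


def read_once(path, chunk_size=1024 * 1024):
    """Read a file sequentially once, then tell the kernel we are done with it."""
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            data = os.read(fd, chunk_size)
            if not data:
                break
            total += len(data)  # a real backup would process the data here
        if hasattr(os, "posix_fadvise"):
            # offset 0 with len 0 means "until the end of the file"
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)
    return total
```

The advice is not binding, so the worst case on a kernel that ignores it is that nothing changes.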