memory usage is too high #302

Open
ThomasWaldmann opened this issue May 2, 2015 · 10 comments

Comments

@ThomasWaldmann
Contributor

To accelerate operations, attic keeps some information in RAM:

  • repository index (if a remote repo is used, this is allocated on the remote side)
  • chunks cache (telling which chunks are already in the repo, so the same data is not stored twice)
  • files cache (telling which filenames, mtimes, etc. are already in the repo, so attic can just skip unchanged files)

In this section (and also the paragraph above it), there are some [not completely clear] numbers about memory usage:
https://github.com/attic/merge/blob/merge/docs/internals.rst#indexes-memory-usage

So, if I understand correctly, this would be an estimate for the RAM usage (for a local repo):

chunk_count        ~= total_file_size / 65536
repo_index_usage    = chunk_count * 40
chunks_cache_usage  = chunk_count * 44
files_cache_usage   = total_file_count * 240 + chunk_count * 80
mem_usage          ~= repo_index_usage + chunks_cache_usage + files_cache_usage
                    = total_file_count * 240 + total_file_size / 400
All units are bytes.
This assumes that every chunk is referenced exactly once and that the typical chunk size is 64 KiB.

E.g. backing up a total count of 1 Mi files with a total size of 1 TiB:

mem_usage = 1 * 2**20 * 240 + 1 * 2**40 / 400 ~= 2.8 GiB

So, this will need about 3 GiB of RAM just for attic. If you run attic on a NAS (or another device with limited RAM), this might already be beyond the RAM you have available and will lead to paging (assuming you have enough swap space) and slowdown. If you don't have enough RAM + swap, attic will run into "malloc failed" or get killed by the OOM killer.

For bigger servers, the problem will just appear a bit later:

  • 10TiB data in 10Mi files will eat 28GiB of RAM
  • 1TiB data in 100Mi files will eat ~26GiB of RAM
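
These numbers can be reproduced with a short script (a sketch using the per-entry sizes from the formula above, not attic code; the function name is just illustrative):

```python
# Rough RAM estimate for attic, using the per-entry sizes from the formula above.
# Assumes each chunk is referenced exactly once and a typical chunk size of 64 KiB.

CHUNK_SIZE = 64 * 1024          # assumed average chunk size in bytes
REPO_INDEX_ENTRY = 40           # bytes per chunk in the repository index
CHUNKS_CACHE_ENTRY = 44         # bytes per chunk in the chunks cache
FILES_CACHE_FILE_ENTRY = 240    # bytes per file in the files cache
FILES_CACHE_CHUNK_ENTRY = 80    # bytes per chunk referenced from the files cache

def estimate_mem(total_file_size, total_file_count, remote_repo=False):
    """Return an estimated RAM usage in bytes for one 'attic create' run."""
    chunk_count = total_file_size // CHUNK_SIZE
    repo_index = 0 if remote_repo else chunk_count * REPO_INDEX_ENTRY
    chunks_cache = chunk_count * CHUNKS_CACHE_ENTRY
    files_cache = (total_file_count * FILES_CACHE_FILE_ENTRY
                   + chunk_count * FILES_CACHE_CHUNK_ENTRY)
    return repo_index + chunks_cache + files_cache

GiB = 2 ** 30
print(estimate_mem(1 * 2**40, 1 * 2**20) / GiB)      # ~2.8  (1 TiB in 1 Mi files)
print(estimate_mem(10 * 2**40, 10 * 2**20) / GiB)    # ~28   (10 TiB in 10 Mi files)
print(estimate_mem(1 * 2**40, 100 * 2**20) / GiB)    # ~26   (1 TiB in 100 Mi files)
```
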
@anarcat

anarcat commented May 2, 2015

So, could these caches be turned into fixed-size LRU caches (sized relative to the available RAM, for example)? In other words, are they really caches (that we can discard) or indexes (that we can't discard)?

@ThomasWaldmann
Contributor Author

So, the question now is "what are the options to deal with bigger data amounts?".

Some ideas:

  • increase the chunk size, e.g. from 64 KiB to 1 MiB
    • reduces chunk-count-related memory usage to 1/16
    • also reduces other related resource usage (CPU, I/O, metadata size)
    • worse deduplication granularity
    • more efficient compression due to larger chunks
  • switch off the files cache (there's a PR open for this option)
    • worse speed for the 2nd and later backups
  • keep the chunks cache on local disk only (use special on-disk code instead of in-RAM code)
    • likely much worse speed
  • keep the chunks cache on local disk only (mmap this file into RAM; see the sketch after this list)
    • maybe not much different in speed compared to the previous option, but allows code reuse
    • the existing chunks cache load/save code might even get simpler
    • compared to the previous option, it would never run out of swap space, as paging happens directly from/to the chunks cache file
  • do not use a chunks cache, query repo for chunks presence
    • worst speed
  • buy (new machine with) more RAM
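
A minimal sketch of the mmap variant (purely illustrative: the file layout and helper names below are made up, not attic's actual cache format):

```python
# Illustrative only: a fixed-size on-disk hash table mapped into memory.
# The OS pages buckets in and out of the cache file itself, so a large chunks
# cache no longer competes for anonymous RAM + swap space.
import mmap
import os
import struct

BUCKET = struct.Struct("<32sII")   # hypothetical layout: chunk id, refcount, size

def open_chunks_cache(path, buckets):
    """Create or open the cache file and map it into the address space."""
    size = BUCKET.size * buckets
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    os.ftruncate(fd, size)          # sparse until buckets are actually written
    return mmap.mmap(fd, size)      # file-backed, shared mapping

def read_bucket(m, index):
    """Read one bucket; only the pages that are touched ever occupy RAM."""
    return BUCKET.unpack_from(m, index * BUCKET.size)
```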

@ThomasWaldmann
Contributor Author

@anarcat they are caches in the sense that they cache information from the (possibly remote) repository. So you could kill them and they could be rebuilt from the repo information (or from the filesystem when creating the next archive).

LRU won't help: for the files cache, every entry is accessed only once per "attic create". For the chunks cache, there are sometimes multiple accesses, but not in an access pattern where LRU would help.

@anarcat

anarcat commented May 2, 2015

ah right, so even if the caches were reused, it wouldn't help much, because it's only for "within a filesystem" deduplication...

okay, so another strategy is needed, and you seem to already have a few ideas for that.. i guess the next step is benchmarks, as there is fairly low-hanging fruit there (chunk size, for one..)

@level323

level323 commented May 3, 2015

My 2 cents is that the chunk size, and whether or not the cache should be maintained in RAM, will depend on the particular circumstances attic is being applied to, as there are many use cases, variables and trade-offs to consider.

Therefore, my present assessment is that it makes sense to:

  1. offer an option to specify chunk size at attic repo creation time, and
  2. gracefully and automatically fail-over to on-disk storage of the cache when a (preferably user-specifiable) RAM usage threshold is exceeded.

Regarding point 2, modern Linux kernels support per-cgroup resource limiting. So one way to get a seamless fallback from RAM to disk would be to put attic in a cgroup with whatever resource limits and swappiness suit the particular use case. However, this may be considered a bit of a hack and, of course, will not help Mac or Windows users.
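
As a rough sketch of that idea (assuming a cgroup-v1 memory controller mounted at /sys/fs/cgroup/memory and root privileges; the group name and wrapper function are made up for illustration):

```python
# Run attic inside a memory-limited cgroup so excess cache data is pushed to swap
# instead of exhausting RAM. cgroup-v1 interface; paths and names are examples only.
import os
import subprocess

CGROUP = "/sys/fs/cgroup/memory/attic"   # example group name

def run_attic_limited(argv, limit_bytes, swappiness=60):
    os.makedirs(CGROUP, exist_ok=True)   # creating the directory creates the cgroup
    with open(os.path.join(CGROUP, "memory.limit_in_bytes"), "w") as f:
        f.write(str(limit_bytes))        # hard RAM limit for the group
    with open(os.path.join(CGROUP, "memory.swappiness"), "w") as f:
        f.write(str(swappiness))
    proc = subprocess.Popen(argv)
    with open(os.path.join(CGROUP, "cgroup.procs"), "w") as f:
        f.write(str(proc.pid))           # move the attic process into the group
    return proc.wait()

# e.g. run_attic_limited(["attic", "create", "repo::archive", "/data"], 2 * 2**30)
```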

@mathbr

mathbr commented May 3, 2015

@ThomasWaldmann as requested in #300, here is a bit more data from my setup: my media weighs 2.8 TB and currently comprises 6109 files. Attic's memory usage was usually ~11%, but towards the end it was mostly ~50%. Right before Attic died, the usage went up to ~70%. Let me know if you need more details.

@ThomasWaldmann
Contributor Author

@mathbr ~70% of 8 GiB is ~5.6 GiB. The formula computes ~6.5 GiB (~4.9 GiB if the repo is remote) of RAM usage for your backup data. As the formula does not cover all of attic's memory needs, just the repo index and the files/chunks cache, that seems to fit. If you had some other stuff running besides attic and your swap space wasn't very large, that may have been all the memory you had.
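
Spelling that calculation out with the constants from above (taking 2.8 TB as 2.8 × 10^12 bytes; a rough estimate only):

```python
# Rough estimate for ~2.8 TB in 6109 files, same per-entry sizes as before.
total_size = 2.8e12                         # bytes
files = 6109
chunks = total_size / (64 * 1024)           # ~42.7 million chunks
repo_index = chunks * 40                    # ~1.7 GB (only needed for a local repo)
chunks_cache = chunks * 44                  # ~1.9 GB
files_cache = files * 240 + chunks * 80     # ~3.4 GB
local = repo_index + chunks_cache + files_cache
print(local / 2**30)                        # ~6.5 GiB with a local repo
print((local - repo_index) / 2**30)         # ~4.9 GiB (~5.3 GB) with a remote repo
```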

@mathbr

mathbr commented May 3, 2015

Well, there were indeed a few apps running in parallel, with most of the memory being claimed by Chromium and Plex Media Server; everything else is rather lightweight (running Xfce as desktop).

My swap is at 2 GB, which is not much, but with 8 GB of RAM I actually shouldn't need it at all. ;-)

@ThomasWaldmann
Contributor Author

about mmap: see 2f72b9f

@mathbr

mathbr commented May 19, 2015

Has anyone tried again with that latest change yet? I'd like to know in advance how this fares before giving it another try. ;-) Just noticed that this change was from July 2014, never mind.
