Memory Dump for very large RDB files (> 30 GBs) is Slow #23

Open
jsrawan-mobo opened this issue Mar 19, 2013 · 4 comments

@jsrawan-mobo

For very large RDB files, the memory dump can take upwards of 30 minutes. The "key" feature is even slower, since it requires a sequential scan over the whole file.

Finally, I wanted to further introspect a data structure like a hash, list, or set to find out which field is taking up the most memory. In my case I use Celery as a worker queue, and some tasks can be gigantic.

So I've made some enhancements, such as the following:
i) Reduce the dump time to about 5 minutes in quick mode
ii) Allow re-seeking to a key's contents in seconds, plus a limit mode (a rough sketch of the re-seek idea follows below)
iii) Allow verbose dumping of a hash/list/set to a file
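A rough sketch of the re-seek idea in (ii), assuming the offset and the compressed/uncompressed sizes were recorded during an earlier quick pass; the helper below is hypothetical and not code from the pull request:

    import os
    import lzf   # python-lzf; assumed to be installed

    def reseek_value(rdb_path, pos, compressed_len, uncompressed_len):
        # Jump straight to a previously recorded offset and decompress only
        # that single value, instead of re-scanning the whole RDB file.
        with open(rdb_path, 'rb') as f:
            f.seek(pos, os.SEEK_SET)
            payload = f.read(compressed_len)
        return lzf.decompress(payload, uncompressed_len)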

@sripathikrishnan
Owner

@jsrawan-mobo Thanks for taking the time to investigate this!

I am painfully aware of the sub-optimal performance. I have been tracking it under issue #1, but haven't really found the motivation to fix it yet.

It seems you have made some fixes/enhancements. Did I miss a pull request? Can you point me to where you have made these fixes?

@jsrawan-mobo
Author

See Pull Request #24.

It's not completely done, but you can try it and see the performance improvement from skipping past lzf_decompress() and storing the index so a deep dump can be done later.

If you like where it's headed, I can clean it up and do a proper pull request.
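To make the approach concrete, here is a minimal sketch of the skip-decompression idea, not the actual code in the pull request; read_length is a simplified stand-in for the real RDB length encoding, and python-lzf is assumed for the non-quick path:

    import os
    import struct

    def read_length(f):
        # Simplified assumption: real RDB files use a variable-length
        # encoding; here lengths are treated as 4-byte little-endian ints.
        return struct.unpack('<I', f.read(4))[0]

    def skim_compressed_string(f, quick_mode=True):
        # Returns (offset, compressed_size, value) for one LZF-compressed
        # string. In quick mode the payload is skipped rather than
        # decompressed, which is where the time savings come from.
        compressed_len = read_length(f)
        uncompressed_len = read_length(f)
        offset = f.tell()
        if quick_mode:
            f.seek(compressed_len, os.SEEK_CUR)   # skip the payload entirely
            return offset, compressed_len, None
        import lzf                                # python-lzf
        payload = f.read(compressed_len)
        return offset, compressed_len, lzf.decompress(payload, uncompressed_len)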

@amarlot commented Jul 19, 2016

Have you been able to improve it? Would it be possible to release it?
For a huge DB (about 50 GB / 1 million keys) on a very fast server it takes about half a day, since it's single-threaded.

Thanks,
Alex

@jsrawan-mobo
Author

I hadn't looked at this in a few years; it seems like this project went stale. The pull request I put up does work in quick mode; you can use it like this if you want to give it a try:

  1. Generate a quick memory dump and index. In quick mode, only compressed_size is valid.
    rdb.py -c memory -q --file redis_memory_quick.csv redis.rdb

  2. After viewing the results, dump a hash/list to view the contents of an offending key
    rdb.py -c memory --max 1 --pos 3568796958 -v --key mongow --file redis_memory_mongow.csv redis.rdb

I'd be willing to fix this up if someone finds a use for it, or feel free to fork the repo.
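As a hypothetical follow-up to step 1, a few lines of Python could pull the heaviest entry out of the quick-mode CSV and print the matching step-2 command. The column names (key, pos, compressed_size) are assumptions about what the patched rdb.py writes, not a documented format:

    import csv

    def biggest_key(csv_path):
        # Assumed columns: key, pos, compressed_size (from the quick dump).
        with open(csv_path, newline='') as fh:
            rows = list(csv.DictReader(fh))
        worst = max(rows, key=lambda r: int(r['compressed_size']))
        return worst['key'], worst['pos']

    if __name__ == '__main__':
        key, pos = biggest_key('redis_memory_quick.csv')
        print('rdb.py -c memory --max 1 --pos %s -v --key %s '
              '--file redis_memory_%s.csv redis.rdb' % (pos, key, key))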
