feat: fs-repo-11-to-12 special flatfs handling #152
Conversation
… doing it for the actual migration
Some preliminary testing on my local machine indicates this is much faster, so it'd be great if we could land this.
Did some local benchmarking on my HDD (running on Windows):
I suspect SSDs, which have better random access, will perform better here. There are probably also clever ways to group renames to maximize mechanical sympathy, but what we have now already seems like a pretty good improvement. Watching this migration run on my HDD (using a single worker), I'm actually a little concerned that users with lots of data might be annoyed by the disk usage during the migration and ask us to tune it down. Users with non-huge repos on SSDs (e.g. most people on laptops and desktops) should be fine, though. WDYT?
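For illustration only, the rename-based fast path plus the worker tuning discussed above could look roughly like the sketch below. The names (`renameBlocks`, `convertName`, the `.data` extension handling) are hypothetical stand-ins, not this PR's code, and a real implementation would also have to account for the destination shard directory when the key prefix changes.

```go
// Minimal sketch (not the migration's actual code): walk the flatfs shard
// directories and os.Rename each block file to its new name instead of
// reading and re-writing the block through the datastore, bounded by a
// configurable number of rename workers.
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"sync"
)

// renameBlocks walks root and renames every *.data file to the name produced
// by convertName, using at most nWorkers concurrent renames.
func renameBlocks(root string, nWorkers int, convertName func(oldBase string) (string, bool)) error {
	paths := make(chan string)
	var wg sync.WaitGroup

	for i := 0; i < nWorkers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for p := range paths {
				newBase, ok := convertName(filepath.Base(p))
				if !ok {
					continue // not a block file, or already converted
				}
				newPath := filepath.Join(filepath.Dir(p), newBase)
				if err := os.Rename(p, newPath); err != nil {
					fmt.Fprintf(os.Stderr, "rename %s: %v\n", p, err)
				}
			}
		}()
	}

	err := filepath.WalkDir(root, func(p string, d fs.DirEntry, walkErr error) error {
		if walkErr != nil {
			return walkErr
		}
		if !d.IsDir() && filepath.Ext(p) == ".data" {
			paths <- p
		}
		return nil
	})
	close(paths)
	wg.Wait()
	return err
}

func main() {
	// Hypothetical usage: rename blocks under ./blocks with 4 workers and a
	// placeholder conversion that skips everything.
	_ = renameBlocks("blocks", 4, func(string) (string, bool) { return "", false })
}
```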
:-) I am very proud of my tests
This looks ok to me.
We should test it.
My only question is whether this should be behind a flag to enable it (disabled by default) or a flag to disable it (enabled by default).
I guess the main danger is that this breaks on some weird mount configuration or something, but it seems to take these things into account.
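As a rough illustration of the enabled-by-default option being discussed, an opt-out toggle could look something like the sketch below. The `MIGRATION_DISABLE_FLATFS_FASTPATH` variable and the `flatfsFastPathEnabled` helper are hypothetical names, not what this PR actually ships.

```go
// Sketch of an enabled-by-default fast path that an env var can switch off.
package migration

import (
	"os"
	"strconv"
)

// flatfsFastPathEnabled returns true unless the (hypothetical)
// MIGRATION_DISABLE_FLATFS_FASTPATH env var is set to a truthy value.
func flatfsFastPathEnabled() bool {
	disabled, err := strconv.ParseBool(os.Getenv("MIGRATION_DISABLE_FLATFS_FASTPATH"))
	if err != nil {
		return true // unset or unparsable: keep the fast path on
	}
	return !disabled
}
```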
TBH, pushing this in at the last second makes me slightly uncomfortable because it hasn't been exercised by users w/ the RC.
The users who benefit from this are the ones with default datastore configs; are there a lot of such users with huge repos who would benefit from this optimization? What's the "typical" speedup a typical user would see?
Also log total swapped in flatfs specific migration
Hard to say. I think most people have the default configs, and some people with really big repos have started coming back to FlatFS after trying out Badger and running into problems, so this covers a fair number of users. The larger repo I was testing on locally (although I didn't finish the full migration) was about 80GB with 1.2M CIDv1s, and I was seeing a 10x speedup. I'd suspect typical users might see even bigger speedups if they have SSDs, but I haven't directly tested that yet.
True, and me too, although I'm not sure how much exercise v0.12.0-rc1 has actually gotten despite how long it's been out. In theory we could do an RC2, wait a week or so, and then do the final release, but I'm not sure how much value that brings us.
…cle so that they are all logged
…eneric migration worker
…ngleton anyway. close datastore after use in migration tests
Did some local testing of the env vars just to double check (Go tests aren't too happy about dealing with them) and everything looks good, so we're going to hit some buttons and do a release 🎉
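For anyone reproducing that kind of check: one common pattern (shown here against the hypothetical `flatfsFastPathEnabled` toggle sketched above) is `t.Setenv`, which restores the variable when the test finishes but panics if the test also calls `t.Parallel()`, which is part of why env vars and Go tests don't mix smoothly.

```go
package migration

import "testing"

// Exercises the (hypothetical) opt-out toggle sketched earlier.
func TestFastPathToggle(t *testing.T) {
	// t.Setenv undoes the change at the end of the test; it cannot be used
	// from a test that calls t.Parallel().
	t.Setenv("MIGRATION_DISABLE_FLATFS_FASTPATH", "true")
	if flatfsFastPathEnabled() {
		t.Fatal("expected the flatfs fast path to be disabled")
	}
}
```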
…ed for the flatfs fast path
Trying to add some special-cased fast-path code for flatfs.
Some things that are currently missing that may be problematic are: