feat: fs-repo-11-to-12 special flatfs handling #152
Conversation
… doing it for the actual migration
Some preliminary testing on my local machine indicates this is much faster, so it'd be great if we could land this.
Did some local benchmarking on my HDD (running on Windows):
I suspect SSDs, which have better random access, will perform better here. There are probably also clever ways to group renames to maximize mechanical sympathy, but what we have now already seems like a pretty good improvement. Watching this migration run on my HDD (using a single worker), I'm actually a little concerned that users with lots of data might be annoyed by the disk usage during the migration and ask us to tune it down. Users with non-huge repos on SSDs (e.g. most people on laptops and desktops) should be fine, though. WDYT?
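For illustration only, the rename-based fast path plus the worker tuning discussed above could look roughly like the sketch below. The names (`renameBlocks`, `convertName`, the `.data` extension handling) are hypothetical stand-ins, not this PR's code, and a real implementation would also have to account for the destination shard directory when the key prefix changes.

```go
// Minimal sketch (not the migration's actual code): walk the flatfs shard
// directories and os.Rename each block file to its new name instead of
// reading and re-writing the block through the datastore, bounded by a
// configurable number of rename workers.
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"sync"
)

// renameBlocks walks root and renames every *.data file to the name produced
// by convertName, using at most nWorkers concurrent renames.
func renameBlocks(root string, nWorkers int, convertName func(oldBase string) (string, bool)) error {
	paths := make(chan string)
	var wg sync.WaitGroup

	for i := 0; i < nWorkers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for p := range paths {
				newBase, ok := convertName(filepath.Base(p))
				if !ok {
					continue // not a block file, or already converted
				}
				newPath := filepath.Join(filepath.Dir(p), newBase)
				if err := os.Rename(p, newPath); err != nil {
					fmt.Fprintf(os.Stderr, "rename %s: %v\n", p, err)
				}
			}
		}()
	}

	err := filepath.WalkDir(root, func(p string, d fs.DirEntry, walkErr error) error {
		if walkErr != nil {
			return walkErr
		}
		if !d.IsDir() && filepath.Ext(p) == ".data" {
			paths <- p
		}
		return nil
	})
	close(paths)
	wg.Wait()
	return err
}

func main() {
	// Hypothetical usage: rename blocks under ./blocks with 4 workers and a
	// placeholder conversion that skips everything.
	_ = renameBlocks("blocks", 4, func(string) (string, bool) { return "", false })
}
```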
:-) I am very proud of my tests
This looks ok to me.
We should test it.
My only question is whether this should be behind a flag to enable it (disabled by default) or a flag to disable it (enabled by default).
I guess the main danger is that this breaks on some weird mount configuration or something, but it seems to take these things into account.
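As a rough illustration of the enabled-by-default option being discussed, an opt-out toggle could look something like the sketch below. The `MIGRATION_DISABLE_FLATFS_FASTPATH` variable and the `flatfsFastPathEnabled` helper are hypothetical names, not what this PR actually ships.

```go
// Sketch of an enabled-by-default fast path that an env var can switch off.
package migration

import (
	"os"
	"strconv"
)

// flatfsFastPathEnabled returns true unless the (hypothetical)
// MIGRATION_DISABLE_FLATFS_FASTPATH env var is set to a truthy value.
func flatfsFastPathEnabled() bool {
	disabled, err := strconv.ParseBool(os.Getenv("MIGRATION_DISABLE_FLATFS_FASTPATH"))
	if err != nil {
		return true // unset or unparsable: keep the fast path on
	}
	return !disabled
}
```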
TBH, pushing this in at the last second makes me slightly uncomfortable because it hasn't been exercised by users w/ the RC.
The users who benefit from this are the ones with default datastore configs; are there a lot of such users with huge repos who would benefit from this optimization? What's the "typical" speedup a typical user would see?
Also log total swapped in flatfs specific migration
Hard to say. I think most people have the default configs, and some people with really big repos have started coming back to FlatFS after trying out Badger and running into problems, so this covers a fair number of users. The larger repo I was testing on locally (although I didn't finish the full migration) was about 80GB with 1.2M CIDv1s, and I was seeing a 10x speedup. I'd suspect typical users might see even bigger speedups if they have SSDs, but I haven't directly tested that yet.
True, and me too, although I'm not sure how much exercise v0.12.0-rc1 has actually gotten despite how long it's been out. In theory we could do an RC2, wait a week or so, and then do the final release, but I'm not sure how much value that brings us.
…cle so that they are all logged
…eneric migration worker
…ngleton anyway. close datastore after use in migration tests
Did some local testing of the env vars just to double check (Go tests aren't too happy about dealing with them) and everything looks good, so we're going to hit some buttons and do a release 🎉
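For anyone reproducing that kind of check: one common pattern (shown here against the hypothetical `flatfsFastPathEnabled` toggle sketched above) is `t.Setenv`, which restores the variable when the test finishes but panics if the test also calls `t.Parallel()`, which is part of why env vars and Go tests don't mix smoothly.

```go
package migration

import "testing"

// Exercises the (hypothetical) opt-out toggle sketched earlier.
func TestFastPathToggle(t *testing.T) {
	// t.Setenv undoes the change at the end of the test; it cannot be used
	// from a test that calls t.Parallel().
	t.Setenv("MIGRATION_DISABLE_FLATFS_FASTPATH", "true")
	if flatfsFastPathEnabled() {
		t.Fatal("expected the flatfs fast path to be disabled")
	}
}
```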
…ed for the flatfs fast path
Trying to add some special-cased fast-path code for flatfs.
Some things that are currently missing that may be problematic are: