RFC: Partial storage locking to increase 'storing signature' performance #421
Comments
@nalind, WDYT? That's roughly related to a PR over at c/image (containers/image#611) which does per-layer locking to avoid redundantly downloading a given blob. That PR isn't merged, and I doubt it ever will be, but the conclusion regarding the layer locks is that they are fine as long as they reside on a tmpfs (the number of lock files could grow large over time). My 2 cents: fine-grained locking can be really tricky to implement (more critical sections to lock and get wrong) and can sometimes decrease performance (more locks need to be acquired, scheduling can hurt caches, etc.). But I'm convinced it would speed things up here.
Be careful about creating too many files on tmpfs; I think we found a cache leak in the kernel related to the number of files left on tmpfs. @mheon would know the specifics.
Ouch, yes, that'll hurt. @rhatdan, any other way to do that? We need some way to clean them up.
@vrothberg memory-based locking?
@rhatdan that sounds complex to do in shared memory, but possible. One thing the afternoon coffee made me think about: how could we migrate? Assuming we have fine-grained locking in place, how could we remain backward compatible? Other tools with older versions of c/storage might be running in parallel.
Well, they would still be taking the higher-level locks, so they could block the newer tools. The newer tools could not block the older ones.
New versions could take the big lock as readers; that way, old versions would block when trying to acquire it as writers.
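For illustration, a minimal sketch of that scheme, assuming the store-wide lock is an flock(2)-style lock file; the package, path, and helper names here are made up and not taken from c/storage's actual lockfile implementation:

```go
// Sketch of the "take the big lock as readers" idea (assumed flock(2)-style lock file).
package compat

import (
	"os"

	"golang.org/x/sys/unix"
)

// lockBigLockShared takes the legacy store-wide lock in shared mode.
// New-style processes can all hold it concurrently while they take their
// fine-grained per-layer locks; an old-style process asking for the lock
// exclusively (LOCK_EX) blocks until all of them are done.
func lockBigLockShared(path string) (*os.File, error) {
	f, err := os.OpenFile(path, os.O_CREATE|os.O_RDWR, 0o644)
	if err != nil {
		return nil, err
	}
	if err := unix.Flock(int(f.Fd()), unix.LOCK_SH); err != nil {
		f.Close()
		return nil, err
	}
	return f, nil
}

// unlockBigLock releases the shared lock and closes the file.
func unlockBigLock(f *os.File) error {
	if err := unix.Flock(int(f.Fd()), unix.LOCK_UN); err != nil {
		f.Close()
		return err
	}
	return f.Close()
}
```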
Sounds like a plan; I would still keep the locks close to the storage somehow. Would something like bbolt work for this use case?
@mheon will know. It's crucial that the locks are non-persistent (in contrast to the container, image, and layer databases, etc.). Is that possible with bolt?
As long as the DB lives on a tmpfs, sure - that'll wipe the DB on reboot. If you need something even less persistent than that, you can close your DB connection, remove the DB, and reopen it; that ought to resolve things.
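If bolt were used, that "close, remove, reopen" reset could look roughly like this sketch with bbolt (paths and function names are assumptions, not an existing API in c/storage):

```go
// Sketch only: a lock DB kept on a tmpfs path, wiped by closing the
// connection, removing the file, and reopening it.
package lockdb

import (
	"os"

	bolt "go.etcd.io/bbolt"
)

// openLockDB opens the lock DB under the storage run directory (tmpfs),
// so a reboot implicitly clears it.
func openLockDB(runRoot string) (*bolt.DB, error) {
	return bolt.Open(runRoot+"/locks.db", 0o600, nil)
}

// resetLockDB drops all state by deleting the DB file and reopening it.
func resetLockDB(db *bolt.DB) (*bolt.DB, error) {
	path := db.Path()
	if err := db.Close(); err != nil {
		return nil, err
	}
	if err := os.Remove(path); err != nil && !os.IsNotExist(err) {
		return nil, err
	}
	return bolt.Open(path, 0o600, nil) // fresh, empty DB
}
```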
I think we should still store the (let me call it) lock db on |
Yes, I believe so.
Hmmm. So in that case, you'd want to detect reboots and then "unlock" all the locks? Libpod does something similar (it detects a reboot via a file on tmpfs disappearing, and then "refreshes" the DB by clearing non-persistent state). That might work?
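A minimal sketch of that reboot-detection pattern, assuming a sentinel file kept on tmpfs (the sentinel path and the refresh callback are hypothetical):

```go
// Sketch: if the sentinel file on tmpfs is gone, the machine has rebooted
// since we last ran, so clear non-persistent state and recreate the sentinel.
package rebootcheck

import (
	"errors"
	"os"
)

func refreshIfRebooted(sentinel string, refresh func() error) error {
	if _, err := os.Stat(sentinel); err == nil {
		return nil // sentinel still present: no reboot since the last run
	} else if !errors.Is(err, os.ErrNotExist) {
		return err
	}
	// tmpfs was wiped, i.e. we rebooted: clear the non-persistent state.
	if err := refresh(); err != nil {
		return err
	}
	f, err := os.Create(sentinel)
	if err != nil {
		return err
	}
	return f.Close()
}
```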
@mheon, I'd just put the entire DB on a tmpfs.
I agree putting the DB on tmpfs would work. containers/storage already relies on tmpfs for cleaning up mount counts.
containers/image#611 also places the layer locks on the storage tmpfs.
So if we speak in terms of boltdb, the database would have three buckets for layers, containers, and images, where the entries indicate (per ID) whether it's a read or a write lock, right?
I am not actually sure how we could implement process locking with bolt. We still need file locks, which makes me believe we can follow the idea of containers/image#611 and put them on the storage's tmpfs, referenced by convention (e.g., layers in ./layer-locks, images in ./image-locks, etc.).
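A sketch of that convention: one lock file per ID under the storage run root (tmpfs), grouped into directories like layer-locks/ and image-locks/. The layout and names below are only the ones proposed above, nothing that exists today:

```go
// Sketch only: per-ID lock files kept on the storage's tmpfs run root.
// Because the files live on tmpfs, a reboot cleans them up automatically.
package locks

import (
	"os"
	"path/filepath"
)

// openLockFile returns the lock file for a given kind ("layer", "image",
// "container") and ID, creating it lazily on first use.
func openLockFile(runRoot, kind, id string) (*os.File, error) {
	dir := filepath.Join(runRoot, kind+"-locks")
	if err := os.MkdirAll(dir, 0o700); err != nil {
		return nil, err
	}
	// The file's content doesn't matter; only the lock taken on it does.
	return os.OpenFile(filepath.Join(dir, id), os.O_CREATE|os.O_RDWR, 0o600)
}
```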
Sounds good as well; some additional questions from my side:
I assume we would need one lock to serialize accesses to the database (which would turn into a bottleneck). I would not know how to implement a synchronization mechanism in a database without reimplementing a mutex. That's why I feel more attached to just using file locking. There is an open PR against storage to add a
In containers/image#611 we create file locks in a specific folder. The file names are the digests of the blobs. We can do the same for containers, images, and layers if we want to. If we want to acquire a layer as a reader, we would do an
I bet that file locking performs better, as we don't need to load the entire DB in each process but only use the lock we actually need. File locks are real synchronization primitives with proper scheduling, signaling, etc. If we tried to emulate them with a DB, we'd run into all kinds of issues regarding fairness, scheduling, and signaling.
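For reference, acquiring such a per-ID lock file as a reader or writer could look roughly like this with flock(2); the kernel then takes care of queueing and waking up waiters. This is a sketch, not an actual implementation:

```go
// Sketch: shared vs. exclusive acquisition of a per-layer lock file.
package locks

import (
	"os"

	"golang.org/x/sys/unix"
)

// lockShared blocks until the layer can be used alongside other readers.
func lockShared(f *os.File) error {
	return unix.Flock(int(f.Fd()), unix.LOCK_SH)
}

// lockExclusive blocks until the layer can be modified exclusively.
func lockExclusive(f *os.File) error {
	return unix.Flock(int(f.Fd()), unix.LOCK_EX)
}

// unlock releases either kind of lock.
func unlock(f *os.File) error {
	return unix.Flock(int(f.Fd()), unix.LOCK_UN)
}
```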
@saschagrunert @vrothberg Is this still something we want to pursue?
Yes, but we're still waiting for feedback from @nalind :)
Yes, definitely worth pursuing. Layer-locking is something we might need to generalize to cover other scenarios as well (e.g., pull-delete races, simultaneous pulls, etc.).
Also, this issue is still waiting on #473, correct?
This is more of a rough idea, and I'm wondering if you have any thoughts on it as well.
It relates somewhat to #420.
We're currently locking the whole containers, images, and layers directories when doing write operations on the store. I think this could be improved at the cost of a more complex locking mechanism.
The idea is not to lock the whole layer store, but only the layers which belong to a certain image. This would allow storing two images in parallel as long as they do not share any layers.
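A rough sketch of that idea: take one lock per layer of the image being stored instead of the store-wide lock, in a fixed order so that concurrent stores sharing layers cannot deadlock. The acquire callback is hypothetical (e.g., an flock on a per-layer lock file):

```go
// Sketch only: lock just the layers that belong to the image being stored.
package store

import "sort"

// lockImageLayers acquires a lock for each layer ID and returns a function
// that releases all of them (in reverse order). The acquire callback is a
// placeholder for whatever per-layer lock primitive ends up being used.
func lockImageLayers(layerIDs []string, acquire func(id string) (func(), error)) (func(), error) {
	// Acquire in sorted order so two stores sharing layers never deadlock.
	ids := append([]string(nil), layerIDs...)
	sort.Strings(ids)

	var releases []func()
	releaseAll := func() {
		for i := len(releases) - 1; i >= 0; i-- {
			releases[i]()
		}
	}
	for _, id := range ids {
		release, err := acquire(id)
		if err != nil {
			releaseAll()
			return nil, err
		}
		releases = append(releases, release)
	}
	return releaseAll, nil
}
```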
WDYT?