
GTFSCache eviction can close feeds in use #799

Open

abyrd opened this issue Apr 6, 2022 · 4 comments

abyrd (Member) commented Apr 6, 2022

We occasionally see errors like:

[staging] User [[email protected]]() of group conveyal accessing /api/gtfs/5a8deb3f32b98e146046c83a/metro_bus/stops: java.lang.NullPointerException
at org.mapdb.StoreDirect.get2(StoreDirect.java:451)
at com.conveyal.analysis.controllers.GtfsController.getAllStopsForOneFeed(GtfsController.java:157)

StoreDirect.java line 451 is `long indexVal = index.getLong(ioRecid);`. The only object on that line that can be null is the field `protected Volume index`, so it's likely this MapDB has simply been closed. We identified a race condition like this in the past; it was relatively rare at the time, so we erred on the side of avoiding MapDB file corruption, accepting that feeds could be prematurely closed under certain conditions.

The backend API is probably receiving many simultaneous requests for lists of stops from many feeds. For example, a bundle with 30 different feeds might hit this endpoint 30 times at once with heavily overlapping requests, or several users may simultaneously examine bundles that each contain multiple feeds. If these requests try to open more feeds than the maximum size of the cache, the cache will evict feeds while they're still in use and close them in its eviction handler.

This error has only appeared a few times in the last 6 months, but we will need to address it more thoroughly because it will almost certainly occur more often under increasing concurrent load.

The point of GTFSCache is to hold MapDBs open while they are being used, but to limit how many are held open at once so we don't accumulate an unbounded number of memory mappings or exhaust off-heap memory. So GTFSCache is configured to evict feeds and close them when too many are open or when they haven't been used in a while.

MapDBs must be manually closed when we are finished using them to allow cleanup. There is no single obvious place to close them: you can close them during eviction, but something might still hold a reference and try to use them; you could close them after each use, but then the cache serves no purpose, as it would be holding objects already closed by their first user. The documentation for LoadingCache at https://github.com/google/guava/wiki/CachesExplained#removal-listeners shows a database connection being closed in a removal listener, which is analogous to what we're doing. But it doesn't explain how the removal listener would know whether something else is holding a reference to that database connection and potentially using it when close() is called, nor how it would defer closure even if it had that information. Perhaps the approach illustrated there only works with time-based eviction, not size-based eviction.
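
For reference, the pattern in question looks roughly like the following minimal sketch (Guava, matching the linked documentation). The class name, `openFeed`, and the size limit are made up for illustration, and the close-on-eviction call stands for whatever the current eviction handler does:

```java
import com.conveyal.gtfs.GTFSFeed;
import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;
import com.google.common.cache.RemovalListener;

public class NaiveFeedCache {

    /** Close the MapDB-backed feed as soon as the cache evicts it. Nothing here knows whether
        another thread obtained the same feed via get() and is still reading from it, which is
        exactly the race described above. */
    private static final RemovalListener<String, GTFSFeed> CLOSE_ON_EVICTION = notification -> {
        GTFSFeed feed = notification.getValue();
        if (feed != null) {
            try {
                feed.close();
            } catch (Exception e) {
                // Sketch only: a real implementation would log this.
            }
        }
    };

    private final LoadingCache<String, GTFSFeed> cache = CacheBuilder.newBuilder()
            .maximumSize(20) // illustrative cap on simultaneously open feeds
            .removalListener(CLOSE_ON_EVICTION)
            .build(CacheLoader.from(this::openFeed));

    /** Hypothetical loader, standing in for however GTFSCache actually builds or reopens a feed. */
    private GTFSFeed openFeed (String feedId) {
        throw new UnsupportedOperationException("placeholder for the real feed-opening logic");
    }

    public GTFSFeed get (String feedId) {
        return cache.getUnchecked(feedId);
    }
}
```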

It is conceivable to defer cleanup until each block of code that called get() signals that it has finished using the value it retrieved. But by itself this would still allow the number of simultaneously open values to exceed the limit we impose, consuming memory in an unbounded way. In the case where 40 threads open 40 different databases at once, what we actually want is for most of them to block and wait their turn to open and close the databases.

A good option would be something like a Guava or Caffeine LoadingCache with some additional constraints:

  1. Evict and close only items whose user count is zero (or that haven't been used in t > T, to gracefully handle accidental non-release).
  2. Do not load and open any items until it's possible to do so without increasing the total number of open items above the limit.

Calling code would be required to manually release a key after calling get(key); doing this automatically seems nearly impossible.

A quick search does not reveal any obvious off-the-shelf choices for a caching, concurrent "heavy resource manager" like what I'm describing. It could be crafted with a bunch of ReentrantLocks and Conditions, but concurrent code is definitely an area where it's better to use heavily tested standard building blocks if possible.
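
For concreteness, a rough sketch of such a component built from a single ReentrantLock and Condition might look like this. It is illustrative only, not the GTFSCache code: the names are made up, loading happens while holding the lock, and the eviction choice is arbitrary rather than LRU.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Function;

/** Sketch: at most maxOpen values are open at once; get() blocks when the limit is reached until
    some open value has no users and can be closed; every get() must be paired with a release(). */
public class BlockingRefCountingCache<K, V extends AutoCloseable> {

    private static class Entry<V> {
        final V value;
        int users;
        Entry (V value) { this.value = value; }
    }

    private final int maxOpen;
    private final Function<K, V> loader;
    private final Map<K, Entry<V>> open = new HashMap<>();
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition slotFreed = lock.newCondition();

    public BlockingRefCountingCache (int maxOpen, Function<K, V> loader) {
        this.maxOpen = maxOpen;
        this.loader = loader;
    }

    /** Return the single open value for this key, opening it if needed, blocking while the cache
        is full of in-use values. The loader runs while holding the lock, which a real
        implementation would want to avoid. */
    public V get (K key) throws InterruptedException {
        lock.lock();
        try {
            while (true) {
                Entry<V> entry = open.get(key);
                if (entry != null) {
                    entry.users += 1;
                    return entry.value;
                }
                // Not open yet: open it only if we stay within the limit, possibly after
                // closing some other entry that currently has no users.
                if (open.size() < maxOpen || evictOneUnused()) {
                    entry = new Entry<>(loader.apply(key));
                    entry.users = 1;
                    open.put(key, entry);
                    return entry.value;
                }
                // Full and everything is in use: wait for a release(), then re-check.
                slotFreed.await();
            }
        } finally {
            lock.unlock();
        }
    }

    /** Callers must call this exactly once for every successful get(). */
    public void release (K key) {
        lock.lock();
        try {
            Entry<V> entry = open.get(key);
            if (entry == null || entry.users <= 0) throw new IllegalStateException("Unbalanced release.");
            entry.users -= 1;
            if (entry.users == 0) slotFreed.signalAll();
        } finally {
            lock.unlock();
        }
    }

    /** Close and remove one arbitrary entry with no users; return false if all are in use. */
    private boolean evictOneUnused () {
        Iterator<Map.Entry<K, Entry<V>>> iterator = open.entrySet().iterator();
        while (iterator.hasNext()) {
            Map.Entry<K, Entry<V>> candidate = iterator.next();
            if (candidate.getValue().users == 0) {
                try {
                    candidate.getValue().value.close();
                } catch (Exception e) {
                    // Sketch only: a real implementation would log or rethrow.
                }
                iterator.remove();
                return true;
            }
        }
        return false;
    }
}
```

Calling code would then pair get() and release() in try/finally, or go through an AutoCloseable handle as discussed further down.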

abyrd (Member, Author) commented Apr 6, 2022

An alternative solution is to open and close every MapDB every time it's used. I would expect this to generate a huge number of mmap system calls, so it doesn't seem like a great solution, but it might be worth prototyping and profiling. And it still doesn't solve the problem of limiting the total number of MapDBs accessed at once.

ben-manes commented

Other ideas to consider…

A common approach is to return a closeable proxy. For example, a transaction manager will return the same underlying connection until the transaction is fully committed, with the proxy only pretending that the connection was closed. This is explicit reference counting for resource management.

A phantom reference (e.g. using Cleaner) can be useful for implicit reference counting. This relies on the GC so it might be delayed, but is a modern alternative to finalizers.
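
As a concrete illustration of the Cleaner idea, a minimal sketch follows. FeedHandle and the way it wraps the MapDB are hypothetical, not classes in this repo:

```java
import java.lang.ref.Cleaner;
import org.mapdb.DB;

/** Hypothetical handle whose underlying MapDB is closed either explicitly or, as a safety net,
    some time after the handle becomes unreachable and is collected by the GC. */
public class FeedHandle implements AutoCloseable {

    private static final Cleaner CLEANER = Cleaner.create();

    /** The cleanup action must not reference the FeedHandle itself, only the resource to close,
        otherwise the handle could never become phantom reachable. */
    private static final class CloseDb implements Runnable {
        private final DB db;
        CloseDb (DB db) { this.db = db; }
        @Override public void run () { db.close(); }
    }

    private final DB db;
    private final Cleaner.Cleanable cleanable;

    public FeedHandle (DB db) {
        this.db = db;
        this.cleanable = CLEANER.register(this, new CloseDb(db));
    }

    /** Expose the underlying DB to callers for reads (illustrative). */
    public DB db () {
        return db;
    }

    /** Explicit close for well-behaved callers; clean() runs the action at most once. */
    @Override public void close () {
        cleanable.clean();
    }
}
```

As noted, the Cleaner path is GC-timed, so it acts more as a safety net behind explicit close() calls than as a way to keep off-heap usage tightly bounded.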

abyrd added a commit that referenced this issue Apr 6, 2022
Hard limit on total number of feeds open at once.
get() blocks when full until a feed can be evicted (has no users).
only one GTFSFeed can be open for each key at a time.

Mostly untested draft, needs serious audit for concurrency issues.

To simplify (while losing generality):
reference counting logic could be pushed down into GTFSFeed itself.
ReferenceCountingCache logic could be pulled up into GTFSCache.
abyrd (Member, Author) commented Apr 6, 2022

Thanks for the pointers @ben-manes. We appreciate all your work on Caffeine, which has found many uses in our system. I was leaning toward a manual reference-counting approach using an AutoCloseable wrapper around the values, which seems reasonably pleasant to work with. There's still a bit of a trick to pull off where the "last one out turns off the lights", but only once the cache has already evicted the value; I suppose this can be achieved by having the cache itself behave as one reference.

But since we want to enforce only one open value per unique key, I believe the cache also needs to be able to reverse its decision to evict if a new get() comes in after eviction but while we are still waiting for a previous get() caller to close() and bring the reference count down to zero. Or, similarly, it could defer the eviction decision until all get() callers have called close(). Even if we defer closing to the right time, imposing a hard limit on the number of open values (for predictable maximum resource usage) seems like a separate problem: I'm thinking we want get() to block until there's a free or evictable slot in the cache.

I'm hesitant about delayed GC approaches, e.g. involving Cleaner, because the resource usage we're trying to keep under tight control goes beyond JVM-managed heap memory (it includes off-heap memory and mmapped files).
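
A minimal sketch of that wrapper idea, with the cache counted as one reference. Names are illustrative only (this is not the code referenced below), and it deliberately ignores the one-open-value-per-key constraint and the hard limit discussed above:

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Sketch: the count starts at 1 for the cache's own reference. Each get() calls acquire() before
    handing out the value; callers close() their handle when done; the cache's removal listener
    also calls close() to drop its reference. Whoever brings the count to zero closes the feed. */
class RefCountedFeed implements AutoCloseable {

    private final AutoCloseable feed; // e.g. a GTFSFeed wrapping a MapDB
    private final AtomicInteger refCount = new AtomicInteger(1); // the cache's own reference

    RefCountedFeed (AutoCloseable feed) {
        this.feed = feed;
    }

    /** Called by the cache inside get(), before returning the value to a caller. */
    RefCountedFeed acquire () {
        refCount.incrementAndGet();
        return this;
    }

    /** Last one out turns off the lights. Note that nothing here prevents an acquire() after the
        count has already hit zero, which is exactly the post-eviction race described above. */
    @Override
    public void close () throws Exception {
        if (refCount.decrementAndGet() == 0) feed.close();
    }
}
```

Calling code could then use try-with-resources, e.g. `try (RefCountedFeed feed = cache.get(key)) { ... }`, so releases aren't forgotten, with the cache's removal listener calling close() once to drop its own reference.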

I have pushed a quick concept for how we might want this to behave in c1cb077 on branch blocking-gtfs-cache.

ben-manes commented

I like it! The logic seems correct when skimming.

The cache supports pinning to opt an entry out of eviction, and it will handle that removal atomically with any metadata changes to avoid races. However, the only feasible way to do that without violating the cache's contracts is to opt the entry out of the resource constraints, for example by setting the entry's weight to zero so that eviction skips over it. That means the cache grows to use the additional capacity (if not explicitly reduced) while eviction still finds a victim elsewhere. For resource pooling, a cache is not really a good fit, and your approach is much nicer.
