forked from ofiwg/libfabric
core/mr_cache: Do not hold lock when building new cache entry
When we build a new cache entry (via util_mr_cache_create), we allocate memory and register the region with the underlying provider. This can generate monitor notifications, for example, from intercepting the alloc calls. Because the notifications acquire the cache lock in order to flush unusable entries, we cannot hold that same lock while building the entry, or deadlock can occur. This has been seen by applications. See issue ofiwg#5687.

To handle this, we build new cache entries outside of the lock, and only acquire the lock when inserting them back into the cache. This opens a race condition where a conflicting entry can be inserted into the cache between the first find() call and the insert() call. We expect such occurrences to be rare, as they require a multi-threaded app to post transfers referencing the same region simultaneously from multiple threads.

In order to handle the race, we need to duplicate the find() check after building the new entry prior to inserting it. If a conflict is found, we abort the insertion and restart the entire higher-level search operation.

Signed-off-by: Sean Hefty <[email protected]>
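The locking pattern described in the message (find under the lock, build outside it, re-check under the lock, then insert or restart) can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `struct cache`, `struct entry`, `cache_find`, `entry_build`, and `cache_search` are hypothetical names, the lookup is simplified to an exact address/length match, and reference counting and eviction are omitted.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified cache entry; not libfabric's real structure. */
struct entry {
    uintptr_t addr;
    size_t len;
    struct entry *next;
};

/* Hypothetical cache: a mutex protecting a singly linked list. */
struct cache {
    pthread_mutex_t lock;
    struct entry *head;
};

/* Exact-match lookup. The caller must hold c->lock. */
static struct entry *cache_find(struct cache *c, uintptr_t addr, size_t len)
{
    for (struct entry *e = c->head; e; e = e->next)
        if (e->addr == addr && e->len == len)
            return e;
    return NULL;
}

/*
 * Stand-in for "allocate and register with the provider". It is called
 * WITHOUT the lock, because in the scenario described above this step
 * may trigger notifications that themselves take the cache lock.
 */
static struct entry *entry_build(uintptr_t addr, size_t len)
{
    struct entry *e = calloc(1, sizeof(*e));
    if (e) {
        e->addr = addr;
        e->len = len;
    }
    return e;
}

/* Returns 0 and sets *out on success, -1 if building the entry fails. */
static int cache_search(struct cache *c, uintptr_t addr, size_t len,
                        struct entry **out)
{
    assert(out != NULL);
    for (;;) {
        /* First find(): lock held only for the lookup. */
        pthread_mutex_lock(&c->lock);
        struct entry *e = cache_find(c, addr, len);
        pthread_mutex_unlock(&c->lock);
        if (e) {
            *out = e;
            return 0;
        }

        /* Build the new entry with the lock released. */
        e = entry_build(addr, len);
        if (!e)
            return -1;

        /* Second find(): another thread may have inserted meanwhile. */
        pthread_mutex_lock(&c->lock);
        if (cache_find(c, addr, len)) {
            pthread_mutex_unlock(&c->lock);
            free(e);   /* abort the insertion */
            continue;  /* restart the whole search */
        }
        e->next = c->head;
        c->head = e;
        pthread_mutex_unlock(&c->lock);
        *out = e;
        return 0;
    }
}
```

In this sketch a repeated search for the same region returns the entry inserted by the first call. The cost of the approach is that a thread that loses the race builds an entry and then discards it, which matches the expectation above that such conflicts are rare.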
Showing 1 changed file with 58 additions and 47 deletions.