Fix kernel panic due to tsd_exit in ZFS_EXIT(zsb) #3247
Closed
The following panic would occur under certain heavy load:
[ 4692.202686] Kernel panic - not syncing: thread ffff8800c4f5dd60 terminating with rrw lock ffff8800da1b9c40 held
[ 4692.228053] CPU: 1 PID: 6250 Comm: mmap_deadlock Tainted: P OE 3.18.10 #7
The culprit is that ZFS_EXIT(zsb) calls tsd_exit() every time, which purges
all tsd data for the thread. However, ZFS_ENTER is designed to be reentrant,
so ZFS_EXIT must not blindly purge tsd data while an outer ZFS_ENTER is still
in effect.
Instead, when rrw_exit removes the last rrn entry, we call
tsd_exit_key(rrw_tsd_key), which removes only the rrw_tsd_key tsd entry, plus
the PID_KEY tsd entry if that is the only entry left for this thread.
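
To illustrate the difference, here is a minimal user-space sketch, not the actual SPL/ZFS code. It models one thread's tsd table as a small array and the reentrant hold as a depth counter. All names (toy_tsd, toy_hold, toy_enter, toy_exit, KEY_RRW, KEY_FSYNCER) are hypothetical and exist only for this example; the real rrw lock tracks holds as rrn entries rather than a counter.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of one thread's tsd table. Hypothetical names throughout. */
enum { KEY_RRW = 0, KEY_FSYNCER = 1, KEY_COUNT = 2 };

struct toy_tsd {
	void *slot[KEY_COUNT];	/* NULL means "no entry for this key" */
};

/* Reentrant hold record, stored under KEY_RRW while the lock is held. */
struct toy_hold {
	int depth;
};

/* Old behaviour: purge every entry the thread owns. */
static void toy_tsd_exit(struct toy_tsd *t)
{
	for (int k = 0; k < KEY_COUNT; k++)
		t->slot[k] = NULL;
}

/* New behaviour: remove only the entry for a single key. */
static void toy_tsd_exit_key(struct toy_tsd *t, int key)
{
	t->slot[key] = NULL;
}

/* Reentrant enter: create the hold record on first entry, then count. */
static void toy_enter(struct toy_tsd *t, struct toy_hold *h)
{
	if (t->slot[KEY_RRW] == NULL) {
		h->depth = 0;
		t->slot[KEY_RRW] = h;
	}
	((struct toy_hold *)t->slot[KEY_RRW])->depth++;
}

/*
 * Exit one level. With purge_all set (old behaviour) the whole table is
 * purged on every exit; if the lock is still held at that point the
 * function returns -1, which stands in for the kernel panic. Otherwise
 * only KEY_RRW is removed, and only when the last hold is released.
 * Returns the remaining depth on success.
 */
static int toy_exit(struct toy_tsd *t, int purge_all)
{
	struct toy_hold *h = t->slot[KEY_RRW];

	if (h == NULL)
		return -1;
	assert(h->depth > 0);
	h->depth--;

	if (purge_all) {
		int still_held = (h->depth > 0);
		toy_tsd_exit(t);
		return still_held ? -1 : 0;
	}
	if (h->depth == 0)
		toy_tsd_exit_key(t, KEY_RRW);
	return h->depth;
}
```

In this model, a nested enter/enter/exit with purge_all set destroys the hold record while the outer hold is still active, and also drops the unrelated KEY_FSYNCER entry. With purge_all clear, the record survives until the last exit and other keys are left alone, which is why they then need their own explicit cleanup.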
The zfs_fsyncer_key tsds relied on ZFS_EXIT(zsb) calling tsd_exit() to clean
up. Now we need to call tsd_exit_key() on them explicitly. We also clean up
zfs_allow_log_key when it is no longer needed.
Signed-off-by: Chunwei Chen [email protected]