Add a diagnostic kstat for obtaining pool status #17076
Open
+1,570
−362
Motivation and Context
A hung pool process can be left holding the spa config lock or the spa namespace lock. If an admin wants to observe the status of a pool using the traditional zpool status, it could hang waiting for one of the locks held by the stuck process. It would be nice to observe pool status in this scenario without the risk of the inquiry hanging.
This PR is an aggregated and updated version of #16026 and #16484.
Description
This change adds /proc/spl/kstat/zfs/<poolname>/status_json. Unlike 'zpool status', reading this kstat does not require taking the spa_namespace lock. It can be used to investigate pools that are hung while holding the global locks that a traditional 'zpool status' needs in order to proceed.
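As a rough illustration of how the new kstat might be consumed from a script, here is a minimal Python sketch. It assumes that a read of status_json returns one complete JSON document with no extra header lines, which this description does not confirm. The function name read_pool_status and the root parameter are hypothetical and exist only for this example.

```python
import json
from pathlib import Path

# Kstat root as named in this PR description (Linux).
KSTAT_ROOT = Path("/proc/spl/kstat/zfs")


def read_pool_status(pool: str, root: Path = KSTAT_ROOT) -> dict:
    """Read <root>/<pool>/status_json and parse it as JSON.

    Assumption: the kstat emits a single, complete JSON document.
    Raises FileNotFoundError if the kstat does not exist and
    json.JSONDecodeError if the content is not valid JSON.
    """
    text = (root / pool / "status_json").read_text()
    return json.loads(text)
```

For a pool named tank, calling read_pool_status("tank") would read /proc/spl/kstat/zfs/tank/status_json. The root parameter is there only so the function can be pointed at a directory of saved kstat files when working away from a live system.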
The newly introduced zfs_lockless_read_enabled module parameter enables traversal of the related kernel structures in cases where the required config read locks cannot be taken. When zfs_lockless_read_enabled is set, this kstat is not safe to use while a pool's configuration is changing (for example, while devices are being added or removed).
The idea is to follow zpool status -jp output as much as possible.
How Has This Been Tested?
Added a new automated test, pool_status_json.ksh, which compares the kstat output with that of zpool status -jp.
Types of changes
Checklist:
Signed-off-by.