Remove scanner pool in favor of single-use scanners #765
The scanner pool concept was neat in theory, but yara-x leverages WASM which exhibits unpredictable behavior when a new scanner is created. Notably, each new scanner is allocated 8GB of virtual memory for the time being and this value is not configurable. Controlling this behavior with a separate value then adds another layer of complexity.
This PR reverts to the original pattern we used for go-yara. The main downside is that we cannot scan file descriptors with yara-x and must instead read the entire file into memory before scanning it. That said, we can now directly control how many scanners are created with `c.Concurrency` rather than needing to account for two values. Ultimately, not using the scanner pool has little impact on performance (though I'd much rather have stability right now anyway).

To allow yara-x to work in environments like Cloud Run, we'll still need a change that allows for WASM tuning to reduce the memory reservation, which will hopefully free up virtual memory overhead and prevent panics.