Using a modified version of stress-opa I was able to cause OPA to use multiple gigabytes of memory after a minute or two. Obviously this is a pathological case, but if we assume that users can pass regex patterns in input, leaks will begin to appear eventually. For workloads with lower request frequency (e.g., 100s of RPS), it could take hours or days for the problem to surface.
I verified the heap usage by enabling pprof, and confirmed that memory usage was stable when the cache insertion was removed.
I haven't tested the same issue with glob patterns, but presumably we have the same problem there, though I'm not sure what the growth curve looks like (the per-entry overhead may be higher or lower).
In terms of solutions... we should benchmark a few representative policies with and without caching. I'm not convinced that the cache actually saves much time for typical patterns. The cache was included in the original implementation of the `regex.match` function, and there is no accompanying benchmark that I am aware of. This seems like a case of premature optimization.
Here is the trivial policy that I tested with:
```rego
package x

import rego.v1

p if {
	regex.match(input.patterns[_], "x")
}
```
For context, topdown contains two global caches, one for compiled regex patterns and one for compiled glob patterns.
Here is the modified version of stress-opa: