Host instrumentation for available memory on Linux systems is less accurate than is tested for #425
Labels: area: instrumentation, area: testing, bug, instrumentation: host
The TestHostMemory test in the host instrumentation frequently fails because the relative error it measures is too high. For example:
opentelemetry-go-contrib/instrumentation/host/host_test.go, line 180 in a0dc004
or
opentelemetry-go-contrib/instrumentation/host/host_test.go, line 183 in a0dc004
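Both assertions check that the reported used and available memory reconcile with the total to within a small relative error. A hypothetical sketch of that kind of check is below; the helper name, variable names, and 0.05 tolerance are illustrative and do not reproduce the actual host_test.go code:

```go
// Hypothetical sketch of a relative-error reconciliation check; names and
// the tolerance are illustrative only, not the actual host_test.go assertions.
package host_test

import (
	"math"
	"testing"

	"github.com/stretchr/testify/require"
)

func requireMemoryReconciles(t *testing.T, hostUsed, hostAvailable, hostTotal float64) {
	// Relative error between (used + available) and the reported total.
	relativeError := math.Abs((hostUsed+hostAvailable)-hostTotal) / hostTotal
	require.LessOrEqual(t, relativeError, 0.05,
		"used + available should reconcile with total within tolerance")
}
```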
This was run in the following environment.
Based on this build, the package we use to measure host memory reads the available memory directly out of /proc/meminfo and calculates the used memory from the other fields it reads there. The available value in /proc/meminfo (the MemAvailable field) is a kernel estimate that does not necessarily equal the inverse of that used-memory calculation. This means that our test validating the relationship between the reported available and used memory is checking an estimate at best, one that is likely to fall outside the tolerance we are testing for.
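To illustrate the mismatch, here is a minimal sketch that compares the kernel's MemAvailable estimate with the available memory implied by Total - Used. It assumes the github.com/shirou/gopsutil/mem dependency (the import path may differ by gopsutil version):

```go
// Minimal sketch comparing the kernel's MemAvailable estimate with the
// available memory implied by Total - Used; for illustration only.
package main

import (
	"fmt"
	"math"

	"github.com/shirou/gopsutil/mem"
)

func main() {
	vm, err := mem.VirtualMemory()
	if err != nil {
		panic(err)
	}

	// Available is read from the MemAvailable field of /proc/meminfo (a
	// kernel heuristic), while Used is derived from other /proc/meminfo
	// fields, so the two need not reconcile exactly.
	impliedAvailable := vm.Total - vm.Used
	relativeError := math.Abs(float64(vm.Available)-float64(impliedAvailable)) / float64(vm.Total)

	fmt.Printf("MemAvailable=%d Total-Used=%d relative error=%.4f\n",
		vm.Available, impliedAvailable, relativeError)
}
```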
Proposal
We should remove the tolerance test between used and available memory, at least in the interim while we decide how to move forward. Having a flaky test that fails randomly on certain systems is not ideal.
For the actual measured values, we could replace our utilization metric measurement in
opentelemetry-go-contrib/instrumentation/host/host.go, line 212 in a0dc004
with a scaled version of the UsedPercent value our dependency reports (i.e. UsedPercent/100.0). For the available system memory we could report 1 - UsedPercent/100.0. This would not be an accurate guess of how much memory a theoretical application starting up would actually have, but it would match the expectations of users of this monitoring code: they will expect used + available to equal 1.0.
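A minimal sketch of the proposed calculation, again assuming gopsutil's VirtualMemory; the variable names are illustrative, not the actual host.go instrumentation callback:

```go
// Sketch of the proposed UsedPercent-based utilization values; illustrative
// only, not the actual host.go instrumentation callback.
package main

import (
	"fmt"

	"github.com/shirou/gopsutil/mem"
)

func main() {
	vm, err := mem.VirtualMemory()
	if err != nil {
		panic(err)
	}

	// Report utilization as a scaled UsedPercent instead of Used/Total.
	usedUtilization := vm.UsedPercent / 100.0
	// By construction the two utilizations always sum to exactly 1.0.
	availableUtilization := 1.0 - usedUtilization

	fmt.Printf("used=%.4f available=%.4f\n", usedUtilization, availableUtilization)
}
```

This keeps the reported used and available utilizations consistent by construction, which is the relationship the removed tolerance test was trying to guarantee.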