Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

process walker perfs: optimize readLimits and readStats #2461

Closed
alban opened this issue Apr 24, 2017 · 2 comments
Closed

process walker perfs: optimize readLimits and readStats #2461

alban opened this issue Apr 24, 2017 · 2 comments
Labels
chore Related to fix/refinement/improvement of end user or new/existing developer functionality performance Excessive resource usage and latency; usually a bug or chore

Comments

@alban
Copy link
Contributor

alban commented Apr 24, 2017

I am testing Scope with a load of 100 processes running my openfiles program, so that should generate 50.000 opened files.

I get the following pprof data:

scope-pprof-proc-walker

For the proc walker, that's:

There is room for improving the string parsing. For readStats, I wrote an alternative implementation of the parser: https://gist.github.com/alban/7bdf6ced1352d8efbc6c22a64eabe7e3 and I get the following improvements:

$ sudo ./read-stats-limits 
time (old): 8.345651064s
time (new): 5.330234374s

I initially tried with fmt.Sscanf but unfortunately, the performance was bad. My new implementation follows the pattern explained on http://stackoverflow.com/questions/31333353/faster-input-scanning

@alban
Copy link
Contributor Author

alban commented Apr 24, 2017

The /proc/<pid>/limits files are cached (by probe/process/cache.go, with a timeout of 30s) so they are not read from procfs each time. But only the unparsed version is cached, and it is parsed each time in readLimits(). Ditto for /proc/<pid>/cmdline and /proc/<pid>/comm.

We should try to cache the parsed version instead. The cache timeout could be increased too: processes should rarely change their "ulimit" and their "cmdline"/"comm" (if done at all, this is usually done only one time at startup).

The constant statsTimeout is declared but unused: the stats are not cached in master.

@alban
Copy link
Contributor Author

alban commented Apr 25, 2017

I compared:

for 4 minutes with the following commands:

$ sudo ./scope launch -probe.ebpf.connections --probe.http.listen :4041
$ go tool pprof http://localhost:4041/debug/pprof/profile?seconds=240
web

The scenario is:

  • 100 processes with 500 file descriptors each (50.000 opened files)
  • 7 idle Docker containers (nginx & busybox)
  • approx. 510 processes on my Fedora laptop

Branch master

pprof-4minutes


Branch perf-proc-walker

pprof-4minutes


The function github.com/weaveworks/scope/probe/process.(*walker).Walk takes respectively:

  • 20s
  • 10.24s (almost 2x faster)

Overall, the graphs' titles say respectively:

  • 48.82s
  • 28.60s (41% faster overall)

@rade rade added performance Excessive resource usage and latency; usually a bug or chore chore Related to fix/refinement/improvement of end user or new/existing developer functionality labels May 2, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
chore Related to fix/refinement/improvement of end user or new/existing developer functionality performance Excessive resource usage and latency; usually a bug or chore
Projects
None yet
Development

No branches or pull requests

2 participants