Add nginx metrics to prometheus #36
Conversation
aledbf commented Nov 29, 2016
func (n *NGINXController) setupMonitor(args []string) {
    pc, err := newProcessCollector(true, exeMatcher{"nginx", args})
why run this in the nginx controller vs in the generic controller and report Status through an interface method? The prometheus library seems pretty general purpose, and we could just ask the backend to return a map like:

map[string]string{
    "num_procs":  "10",
    "read_bytes": "123",
    ...
}

which the generic_controller converts into some export format (could be prometheus, could be writing directly to some database, etc.)
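A minimal sketch of that suggestion (the names here are hypothetical, not code from this PR):

```go
package metrics

import "log"

// StatsBackend is a hypothetical interface: each backend returns raw
// counters as strings and the generic controller picks the export format.
type StatsBackend interface {
	Metrics() (map[string]string, error)
}

// exportMetrics sketches the generic-controller side: fan the map out to
// whatever sink is configured (Prometheus, a database, plain logs, ...).
func exportMetrics(b StatsBackend) {
	m, err := b.Metrics()
	if err != nil {
		log.Printf("collecting backend metrics: %v", err)
		return
	}
	for name, value := range m {
		log.Printf("metric %s=%s", name, value) // placeholder sink
	}
}
```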
because the generic_controller already exposes prometheus metrics for the go process. Using this approach, each backend can decide what to export, the help text for each variable, and how to extract the information
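For context, the per-backend approach roughly means implementing prometheus.Collector from prometheus/client_golang; a sketch with assumed metric names (getNginxStatus is the PR's stub_status scraper, shown further down, stubbed here to keep the sketch self-contained):

```go
package metrics

import "github.com/prometheus/client_golang/prometheus"

// Assumed shape of the PR's status struct; only Active is used here.
type nginxStatus struct{ Active int }

// getNginxStatus stands in for the PR's stub_status scraper (see below).
func getNginxStatus() (*nginxStatus, error) { return &nginxStatus{}, nil }

// nginxStatusCollector sketches the per-backend approach: the backend owns
// the metric descriptions (help text included) and the scrape logic.
type nginxStatusCollector struct {
	active *prometheus.Desc
}

func newNginxStatusCollector() *nginxStatusCollector {
	return &nginxStatusCollector{
		active: prometheus.NewDesc("nginx_active_connections",
			"number of active client connections", nil, nil),
	}
}

func (c *nginxStatusCollector) Describe(ch chan<- *prometheus.Desc) {
	ch <- c.active
}

func (c *nginxStatusCollector) Collect(ch chan<- prometheus.Metric) {
	s, err := getNginxStatus()
	if err != nil {
		return
	}
	ch <- prometheus.MustNewConstMetric(c.active,
		prometheus.GaugeValue, float64(s.Active))
}
```

Registering it alongside the default go-process collectors is then just prometheus.MustRegister(newNginxStatusCollector()).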
Regarding "(could be prometheus, could be write directly to some database etc)": can we take this as an improvement and iterate?
it would be great if we could avoid the duplication for each metric; otherwise this mostly lgtm after nits are fixed
    glog.Warningf("unexpected error obtaining nginx status info: %v", err)
    return
}
there's a lot of repetition here, though I haven't spent time thinking about a better solution
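One common way to cut that repetition is a table-driven Collect: declare each metric once as a description/getter pair and loop. A sketch, with assumed field and metric names:

```go
package metrics

import "github.com/prometheus/client_golang/prometheus"

// Illustrative subset of the scraped nginx status (field names assumed).
type nginxStatus struct {
	Active, Reading, Writing, Waiting int
}

// metricDef pairs a metric description with its getter, so Collect can be
// a single loop instead of one hand-written block per metric.
type metricDef struct {
	desc *prometheus.Desc
	get  func(s *nginxStatus) float64
}

var defs = []metricDef{
	{prometheus.NewDesc("nginx_active_connections", "active client connections", nil, nil),
		func(s *nginxStatus) float64 { return float64(s.Active) }},
	{prometheus.NewDesc("nginx_reading_connections", "connections reading the request", nil, nil),
		func(s *nginxStatus) float64 { return float64(s.Reading) }},
	{prometheus.NewDesc("nginx_waiting_connections", "idle keep-alive connections", nil, nil),
		func(s *nginxStatus) float64 { return float64(s.Waiting) }},
}

// collect emits every defined metric from one status snapshot.
func collect(s *nginxStatus, ch chan<- prometheus.Metric) {
	for _, d := range defs {
		ch <- prometheus.MustNewConstMetric(d.desc, prometheus.GaugeValue, d.get(s))
	}
}
```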
this should be isolated to this backend.
In the case of caddy or haproxy there's already an exporter:
https://github.com/miekg/caddy-prometheus
https://github.com/prometheus/haproxy_exporter
n.start(cmd, done)
// block here: done receives the exit error of the nginx master process
select {
case err := <-done:
    if exitError, ok := err.(*exec.ExitError); ok {
please explain why this is needed (i.e. why is it important to wait for the master process in this way vs just running start and assuming the liveness check will fail if the master never comes up?)
I will add the comment in the code. Basically, if the nginx master process dies, the workers continue to process requests (so the checks keep passing), but updates to the Ingress resources will no longer be reflected in the nginx configuration
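A sketch of the pattern being explained (assumed shape; the PR wires this through n.start and a done channel instead):

```go
package main

import (
	"log"
	"os/exec"
)

// superviseNginx blocks on the nginx master process. If the master dies,
// the workers keep serving traffic (so health checks still pass), but
// config reloads stop working, so the controller has to notice the exit
// and restart the master.
func superviseNginx() {
	for {
		cmd := exec.Command("nginx", "-g", "daemon off;")
		if err := cmd.Start(); err != nil {
			log.Fatalf("starting nginx: %v", err)
		}
		err := cmd.Wait() // returns only when the master process exits
		log.Printf("nginx master exited (%v); restarting", err)
	}
}
```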
Regarding "please explain why this is needed": done
func getNginxStatus() (*nginxStatus, error) {
    resp, err := http.DefaultClient.Get("http://localhost:18080/internal_nginx_status")
please make this port a const, and the URL path as well
done
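Presumably the fix looks something like this (the const names here are assumptions, not copied from the diff):

```go
package metrics

import (
	"fmt"
	"net/http"
)

const (
	ngxHealthPort = 18080                    // assumed name for the port const
	ngxStatusPath = "/internal_nginx_status" // assumed name for the path const
)

// fetchStatus shows the shape after the fix: the status scraper builds the
// URL from the consts instead of a hard-coded literal.
func fetchStatus() (*http.Response, error) {
	url := fmt.Sprintf("http://localhost:%d%s", ngxHealthPort, ngxStatusPath)
	return http.DefaultClient.Get(url)
}
```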
aledbf force-pushed the prometheus-nginx branch from e19c4ff to a43d2b1

Coverage decreased (-1.0%) to 39.239% when pulling a43d2b1e66db64f9a4a21820b4ae144626155c17 on aledbf:prometheus-nginx into 666cbf5 on kubernetes:master.

aledbf force-pushed the prometheus-nginx branch from a43d2b1 to f7011d2
@bprashanth ping
LGTM. The simplest way to handle zombies is probably the shell, but that can come as a follow-up if testing turns out OK
@@ -23,4 +23,10 @@ RUN DEBIAN_FRONTEND=noninteractive apt-get update && apt-get install -y \

COPY . /

# https://blog.phusion.nl/2015/01/20/docker-and-the-pid-1-zombie-reaping-problem
can we just rely on the shell to manage zombies?
e.g. http://blog.dscpl.com.au/2015/12/issues-with-running-as-pid-1-in-docker.html
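For illustration, the shell-based approach from that post would look roughly like this in the Dockerfile (hypothetical, not this PR's actual change; note that bash as PID 1 reaps re-parented children but does not forward signals, which is why init helpers such as tini or dumb-init are often preferred):

```dockerfile
# Hypothetical sketch: keep bash as PID 1 so it reaps orphaned processes
# that the kernel re-parents to it; the controller runs as bash's child.
ENTRYPOINT ["/bin/bash", "-c", "/nginx-ingress-controller \"$@\"", "--"]
```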