Add histogram metrics for infoblox calls #805

Merged 1 commit on Dec 16, 2021

Conversation

@jkremser (Member) commented Dec 15, 2021

Fix issue #713

This PR introduces a new histogram metric, k8gb_infoblox_request_duration (exposed as k8gb_infoblox_request_duration_bucket series), with two labels:

  • request: name of the method that was called on the Infoblox client, i.e. the CRUD methods for TXT records and the zone delegation call
  • success (true/false): whether the call succeeded

Because it is a histogram, it splits the range of durations into buckets. In our case the upper bounds grow exponentially (0.2, 0.8, 3.2, 12.8, 51.2 and +Inf seconds). The buckets are cumulative: the value of each series is the number of calls that took at most le seconds, so the +Inf bucket always equals the total number of calls.

  • there are two additional series with the _sum and _count suffixes, from which the mean and other aggregates can be calculated
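As a rough illustration of the bucket layout described above, the sketch below uses only the Go standard library to generate the upper bounds (start 0.2, factor 4, five finite buckets) and to count observations cumulatively, the way `le` buckets behave. The helpers `exponentialBuckets` and `cumulativeCounts` and the sample durations are hypothetical, written for this example only; they are not code from this PR.

```go
package main

import (
	"fmt"
	"math"
)

// exponentialBuckets returns count upper bounds: start, start*factor, ...
// (hypothetical helper for illustration).
func exponentialBuckets(start, factor float64, count int) []float64 {
	bounds := make([]float64, 0, count)
	b := start
	for i := 0; i < count; i++ {
		bounds = append(bounds, b)
		b *= factor
	}
	return bounds
}

// cumulativeCounts returns, for each bound plus a trailing +Inf bound,
// how many observations are less than or equal to that bound.
func cumulativeCounts(bounds, observations []float64) []int {
	all := append(append([]float64{}, bounds...), math.Inf(1))
	counts := make([]int, len(all))
	for _, o := range observations {
		for i, upper := range all {
			if o <= upper {
				counts[i]++
			}
		}
	}
	return counts
}

func main() {
	bounds := exponentialBuckets(0.2, 4, 5)
	// Made-up durations in seconds, chosen for illustration only.
	observations := []float64{1.0, 20.0, 200.0}
	fmt.Println(bounds)
	fmt.Println(cumulativeCounts(bounds, observations))
}
```

With these three made-up durations the cumulative counts come out as 0, 0, 1, 1, 2, 3: each observation is counted in every bucket whose bound it does not exceed, and the last (+Inf) bucket holds the total.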

Note: I did not use the common labels we have on the gauge metrics (gslb name and namespace), because I don't think they are useful in this context, and the combinatorial explosion of all possible label values would be too large.

ibclient was renamed to ibcl because the lines were too long and the linter complained.

I did some manual testing and it looks like this:

bash-5.1# curl 10.42.0.7:8080/metrics | grep infoblo
# HELP k8gb_infoblox_request_duration How long it took for Infoblox requests to complete, partitioned by request type. Round-trip time of http communication is included.
# TYPE k8gb_infoblox_request_duration histogram
k8gb_infoblox_request_duration_bucket{request="TXTRecordCreate",success="false",le="0.2"} 0
k8gb_infoblox_request_duration_bucket{request="TXTRecordCreate",success="false",le="0.8"} 0
k8gb_infoblox_request_duration_bucket{request="TXTRecordCreate",success="false",le="3.2"} 1
k8gb_infoblox_request_duration_bucket{request="TXTRecordCreate",success="false",le="12.8"} 1
k8gb_infoblox_request_duration_bucket{request="TXTRecordCreate",success="false",le="51.2"} 2
k8gb_infoblox_request_duration_bucket{request="TXTRecordCreate",success="false",le="+Inf"} 3
k8gb_infoblox_request_duration_sum{request="TXTRecordCreate",success="false"} 221.0000727
k8gb_infoblox_request_duration_count{request="TXTRecordCreate",success="false"} 3
k8gb_infoblox_request_duration_bucket{request="TXTRecordDelete",success="false",le="0.2"} 0
k8gb_infoblox_request_duration_bucket{request="TXTRecordDelete",success="false",le="0.8"} 0
k8gb_infoblox_request_duration_bucket{request="TXTRecordDelete",success="false",le="3.2"} 1
k8gb_infoblox_request_duration_bucket{request="TXTRecordDelete",success="false",le="12.8"} 1
k8gb_infoblox_request_duration_bucket{request="TXTRecordDelete",success="false",le="51.2"} 2
k8gb_infoblox_request_duration_bucket{request="TXTRecordDelete",success="false",le="+Inf"} 3
k8gb_infoblox_request_duration_sum{request="TXTRecordDelete",success="false"} 221.00009020000002
k8gb_infoblox_request_duration_count{request="TXTRecordDelete",success="false"} 3
k8gb_infoblox_request_duration_bucket{request="TXTRecordUpdate",success="false",le="0.2"} 0
k8gb_infoblox_request_duration_bucket{request="TXTRecordUpdate",success="false",le="0.8"} 1
k8gb_infoblox_request_duration_bucket{request="TXTRecordUpdate",success="false",le="3.2"} 6
k8gb_infoblox_request_duration_bucket{request="TXTRecordUpdate",success="false",le="12.8"} 6
k8gb_infoblox_request_duration_bucket{request="TXTRecordUpdate",success="false",le="51.2"} 10
k8gb_infoblox_request_duration_bucket{request="TXTRecordUpdate",success="false",le="+Inf"} 13
k8gb_infoblox_request_duration_sum{request="TXTRecordUpdate",success="false"} 718.3002365
k8gb_infoblox_request_duration_count{request="TXTRecordUpdate",success="false"} 13
k8gb_infoblox_request_duration_bucket{request="TXTRecordUpdate",success="true",le="0.2"} 0
k8gb_infoblox_request_duration_bucket{request="TXTRecordUpdate",success="true",le="0.8"} 0
k8gb_infoblox_request_duration_bucket{request="TXTRecordUpdate",success="true",le="3.2"} 0
k8gb_infoblox_request_duration_bucket{request="TXTRecordUpdate",success="true",le="12.8"} 0
k8gb_infoblox_request_duration_bucket{request="TXTRecordUpdate",success="true",le="51.2"} 2
k8gb_infoblox_request_duration_bucket{request="TXTRecordUpdate",success="true",le="+Inf"} 3
k8gb_infoblox_request_duration_sum{request="TXTRecordUpdate",success="true"} 300.0000806
k8gb_infoblox_request_duration_count{request="TXTRecordUpdate",success="true"} 3
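
To read output like the above, it can help to turn the cumulative bucket values back into per-interval counts and to derive the mean from the _sum and _count series. The following sketch does both for the TXTRecordUpdate, success="false" series shown above; `perBucket` and `mean` are hypothetical helper names, and the numbers are copied from the sample output.

```go
package main

import "fmt"

// mean returns sum/count, or 0 when nothing has been observed yet.
func mean(sum float64, count uint64) float64 {
	if count == 0 {
		return 0
	}
	return sum / float64(count)
}

// perBucket converts cumulative bucket values into the number of
// observations that fell into each individual interval.
func perBucket(cumulative []uint64) []uint64 {
	out := make([]uint64, len(cumulative))
	var prev uint64
	for i, c := range cumulative {
		out[i] = c - prev
		prev = c
	}
	return out
}

func main() {
	// Values for le = 0.2, 0.8, 3.2, 12.8, 51.2, +Inf from the output above.
	cumulative := []uint64{0, 1, 6, 6, 10, 13}
	fmt.Println(perBucket(cumulative)) // [0 1 5 0 4 3]
	// _sum and _count of the same series.
	fmt.Printf("mean: %.2f s\n", mean(718.3002365, 13)) // mean: 55.25 s
}
```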

Signed-off-by: Jirka Kremser [email protected]

@jkremser force-pushed the infoblox-calltime-m branch from f509e71 to 07d480c on December 15, 2021 15:12
 	delegateTo = append(delegateTo, nameServer)
 }

-findZone, err := objMgr.GetZoneDelegated(p.config.DNSZone)
+findZone, err := p.getZoneDelegated(objMgr, p.config.DNSZone)
Member

Is this change related to ibclient version?

@jkremser (Member, Author) commented Dec 16, 2021

Nope. There is a new private function with almost the same signature that makes the actual call (objMgr.GetZoneDelegated), but it wraps the call in a time measurement so that the request duration is recorded in the histogram.

@k0da (Collaborator) left a comment

LGTM

@ytsarev (Member) left a comment

LGTM, assuming it was e2e tested (and according to the description it was).
