Feature request: offline CPU handling #873
Comments
Interesting, we get CPU metrics from |
Hi @SuperQ , you can check this so,
|
Neither of these makes sense semantically: the information for each CPU needs to be either always there or never there. |
Note that a broken CPU cache, for example, can also trigger the offlining of a CPU at runtime, so the ability to count online and offline CPUs would be useful for alerting in this case, too. Here's an example (x86_64!):
On CentOS the script that offlines a CPU in this case can be found here:
|
Yes, I think a separate bool metric is the right thing to do here.
Looking at some of my systems, none of them have and What do we want to do if the cpu is offline, should we stop exposing |
That can cause problems with rates, so they should stay exposed. Constant time series are cheap to store anyway. |
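To make the "separate bool metric" idea concrete, here is a minimal sketch, not the exporter's actual implementation. It assumes the CPU sets arrive in the kernel's CPU-list text format (plain ids and ranges such as `0-3,8-11`, as found in files like `/sys/devices/system/cpu/present` and `/sys/devices/system/cpu/online`); the thread does not confirm that source. The metric name `node_cpu_online` and both function names are hypothetical.

```python
def parse_cpu_list(text):
    """Parse a CPU list such as "0-3,8-11" into a set of CPU ids.

    Assumption: only plain ids and inclusive ranges appear in the input.
    """
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # an empty string means the list holds no CPUs
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def online_samples(present_text, online_text, metric="node_cpu_online"):
    """Return one 0/1 sample line per present CPU (hypothetical metric name).

    Every present CPU always gets a sample, so the series never disappears
    when a CPU goes offline; only its value changes.
    """
    present = parse_cpu_list(present_text)
    online = parse_cpu_list(online_text)
    return [
        f'{metric}{{cpu="{cpu}"}} {1 if cpu in online else 0}'
        for cpu in sorted(present)
    ]


# Example: four CPUs are present, CPU 2 has been taken offline.
samples = online_samples("0-3", "0-1,3")
offline_count = sum(1 for line in samples if line.endswith(" 0"))
```

Because each present CPU always has a sample, counting offline CPUs for alerting reduces to counting the zero-valued samples, and the existing per-CPU time series can stay exposed unchanged.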
Hi @SuperQ @brian-brazil , I did some research about this topic, and also read the Linux documentation here. See,
I think, we could iterate over You can see this more detailed here.
Another approach, with a view to saving metrics, would be something like this,
I will be making a PR on this soon if you agree. |
Agree that a separate |
Required for prometheus/node_exporter#873. Signed-off-by: Pranshu Srivastava <[email protected]>
Host operating system: output of uname -a
Linux xxxx 3.10.0-693.2.2.el7.ppc64le #1 SMP Sat Sep 9 03:58:38 EDT 2017 ppc64le ppc64le ppc64le GNU/Linux
node_exporter version: output of node_exporter --version
node_exporter command line flags
default
Are you running node_exporter in Docker?
no
What did you do that produced an error?
none
What did you expect to see?
This PPC server has SMT=2 (simultaneous multithreading), which can scale on the fly up to 8x. In the 'SMT=2' case there are 960 metrics we could ignore (4 sockets * 5 cores * 6 (8-2) threads * 8 modes).
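The 960 figure can be checked with a few lines of arithmetic. This is only a sketch of the calculation quoted above; the variable and function names are illustrative, and the numbers are the ones given in the report.

```python
def ignorable_series(sockets, cores_per_socket, max_threads, active_threads, modes):
    """Count per-CPU series that belong to hardware threads currently switched off."""
    inactive_threads = max_threads - active_threads
    return sockets * cores_per_socket * inactive_threads * modes


# Values from the report: 4 sockets, 5 cores each, SMT up to 8, SMT=2 active, 8 modes.
ignorable = ignorable_series(4, 5, 8, 2, 8)
total = 4 * 5 * 8 * 8        # series if all 8 threads per core are exposed
active = total - ignorable   # series for the threads that are actually online
```

With these inputs `ignorable` is 960 out of 1280 total series, so three quarters of the per-CPU series on this machine describe threads that are not running.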
My feature request is to reduce the amount of CPU metrics. There are 2 alternatives that come to mind,
What did you want to see instead?