Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Top CPU consumers missorted #203

Closed
mauved opened this issue Feb 18, 2019 · 7 comments
Closed

Top CPU consumers missorted #203

mauved opened this issue Feb 18, 2019 · 7 comments
Assignees
Labels
Milestone

Comments

@mauved
Copy link

mauved commented Feb 18, 2019

The top CPU consumers in the resources file are not being sorted correctly on both older and newer operating systems. The pidstat version check seems to result in the 7th column always being used as the sorting key.

recap/src/core/resources

Lines 112 to 128 in 4ceb5be

local pidstat_cpufield=0
#We need to know whether the version of pidstat is > 10.1.4, since that added a UID field
local version_strings="$( pidstat -V 2>&1 |
awk '{if(NR==1){print $NF}}' ) 10.1.4"
local sorted_versions="$( echo ${version_strings} | tr ' ' '\n' |
sort --version-sort | tr '\n' ' ' )"
if [[ "${version_strings}" == "${sorted_versions}" ]]; then
pidstat_cpufield=6
else
pidstat_cpufield=7
fi
echo "Top 10 cpu using processes" >> "${LOGFILE}"
# capture header
pidstat | sed -n '3p' >> "${LOGFILE}"
pidstat -l 2 2 | grep -v '%system' |
egrep ^Average: | sort -nr -k ${pidstat_cpufield} |
head -11 >> "${LOGFILE}"

I did some testing to be sure. I ran nice -n $(( RANDOM % 40 - 20 )) stress -c 1 -q & a few times to get a good list of processes of varying CPU loads, then manually executed recap after I had a sufficient number of stress processes running.

Test 1: Fedora 29

[root@localhost recap]# grep -A 8 "Top 10 cpu using processes" resources_20190218-205205.log
Top 10 cpu using processes
08:52:09 PM   UID       PID    %usr %system  %guest   %wait    %CPU   CPU  Command
Average:        0     16390    0.25    0.00    0.00  268.83    0.25     -  stress -c 1 -q
Average:        0     16388    3.74    0.25    0.00   96.76    3.99     -  stress -c 1 -q
Average:        0     17093    3.99    0.00    0.00   95.01    3.99     -  stress -c 1 -q
Average:        0     16392   11.97    0.50    0.00   87.28   12.47     -  stress -c 1 -q
Average:        0     16395   36.16    1.75    0.00   61.35   37.91     -  stress -c 1 -q
Average:        0     17090   36.66    1.25    0.00   60.85   37.91     -  stress -c 1 -q

[root@localhost recap]# pidstat -V
sysstat version 11.7.3
(C) Sebastien Godard (sysstat <at> orange.fr)
[root@localhost recap]# recap --version
2.0.1

Test 2: CentOS 6

[root@localhost recap]# grep -A 8 "Top 10 cpu using processes" resources_20190218-210442.log
Top 10 cpu using processes
09:04:57 PM       PID    %usr %system  %guest    %CPU   CPU  Command
Average:        27110    0.00    0.41    0.00    0.41     -  pidstat -l 2 2
Average:        27007    0.41    0.00    0.00    0.41     -  stress -c 1 -q
Average:        26841  100.00    0.00    0.00  100.00     -  stress -c 1 -q
Average:        26839   21.58    0.00    0.00   21.58     -  stress -c 1 -q
Average:        26837    1.66    0.00    0.00    1.66     -  stress -c 1 -q
Average:        26835   21.99    0.00    0.00   21.99     -  stress -c 1 -q
Average:        26833   17.43    0.00    0.00   17.43     -  stress -c 1 -q
[root@localhost recap]# pidstat -V
sysstat version 9.0.4
(C) Sebastien Godard (sysstat <at> orange.fr)
[root@localhost recap]# recap --version
1.4.0
@tonyskapunk
Copy link
Contributor

Hi @mauved thanks for taking the time in opening and documenting this issue.

Hmmm, seems like a newer version of pidstat introduced wait% column. I'll check all the distros supported to find where that one was added, since CentOS 7 is sorting it still right:

CentOS 7

# pidstat | sed -n '3p'; pidstat -l 2 2 | grep -v '%system' |egrep ^Average: | sort -nr -k7 | head -3
04:16:58 PM   UID       PID    %usr %system  %guest    %CPU   CPU  Command
Average:        0      1651   13.09   83.46    0.00   96.54     -  cat /dev/zero 
Average:        0      1658    0.00    0.74    0.00    0.74     -  pidstat -l 2 2 
Average:        0      1554    0.25    0.25    0.00    0.49     -  /usr/sbin/xe-daemon -p /var/run/xe-daemon.pid 

# pidstat -V 2>&1 | head -1
sysstat version 10.1.5

I'll get this fixed, thanks!!

@tonyskapunk tonyskapunk self-assigned this Feb 18, 2019
@tonyskapunk
Copy link
Contributor

The field was added in version 11.5.5: https://github.com/sysstat/sysstat/blob/master/CHANGES#L269

These are the pidstat versions per distro supported:

Red Hat based:

  • centos6: 9.0.4

  • centos7: 10.1.5

  • fedora28: 11.7.3*

  • fedora29: 11.7.3*

  • rawhide: 11.7.3*

Debian based:

  • debian8: 11.0.1

  • debian9: 11.4.3

  • debian10: 12.0.1*

  • ubuntu14: 10.2.0

  • ubuntu16: 11.2.0

  • ubuntu18: 11.6.1*

*Affected

@tonyskapunk
Copy link
Contributor

@mauved I got a fix in #204. Would you mind giving it a look and provide any feedback you may have? Thanks!

Tests of the new way to calculate the %CPU field:

u14:

# sed -n '/^Top 10 cpu/,/^Top 10 mem/p' /var/log/recap/resources_20190219-122001.log 
Top 10 cpu using processes
12:20:08      UID       PID    %usr %system  %guest    %CPU   CPU  Command
Average:        0     14027   14.14   85.11    0.00   99.26     -  cat /dev/zero 
Average:        0     18266    0.25    0.25    0.00    0.50     -  pidstat -l 2 2 
Average:        0        46    0.00    0.25    0.00    0.25     -  kworker/0:1
 
Top 10 memory using processes

u16:

# sed -n '/^Top 10 cpu/,/^Top 10 mem/p' /var/log/recap/resources_20190219-124816.log
Top 10 cpu using processes
12:48:23      UID       PID    %usr %system  %guest    %CPU   CPU  Command
Average:        0     16752    7.73   67.08    0.00   74.81     -  cat /dev/zero 
Average:        0     16751    1.50   22.94    0.00   24.44     -  cat /dev/zero 
Average:        0      6372    0.25    0.00    0.00    0.25     -  /usr/bin/python3 /usr/bin/fail2ban-server -s /var/run/fail2ban/fail2ban.sock -p /var/run/fail2ban/fail2ban.pid -x -b 
Average:        0     16897    0.25    0.00    0.00    0.25     -  pidstat -l 2 2 
 
Top 10 memory using processes

u18

# sed -n '/^Top 10 cpu/,/^Top 10 mem/p' /var/log/recap/resources_20190219-122009.log  
Top 10 cpu using processes
12:20:16      UID       PID    %usr %system  %guest   %wait    %CPU   CPU  Command
Average:        0      8286    7.69   91.32    0.00    0.74   99.01     -  cat /dev/zero 
Average:        0     11961    0.00    0.50    0.00    0.50    0.50     -  pidstat -l 2 2 
Average:        0       549    0.25    0.00    0.00    0.00    0.25     -  /usr/bin/python3 /usr/bin/nova-agent --no-fork -o /var/log/nova-agent.log -l info 

c6:

# sed -n '/^Top 10 cpu/,/^Top 10 mem/p'  /var/log/recap/resources_20190219-122814.log
Top 10 cpu using processes
12:28:20          PID    %usr %system  %guest    %CPU   CPU  Command
Average:        12362   19.50   79.25    0.00   98.75     -  cat /dev/zero 
Average:        14527    0.25    0.25    0.00    0.50     -  pidstat -l 2 2 
Average:         3148    0.25    0.00    0.00    0.25     -  /usr/local/bin/driveclient --daemon 
Average:         1144    0.25    0.00    0.00    0.25     -  /usr/bin/python2.6 /usr/bin/nova-agent -o /var/log/nova-agent.log -l info 
Average:            1    0.00    0.25    0.00    0.25     -  /sbin/init 
 
Top 10 memory using processes

c7:

# sed -n '/^Top 10 cpu/,/^Top 10 mem/p' /var/log/recap/resources_20190219-123937.log
Top 10 cpu using processes
12:39:44      UID       PID    %usr %system  %guest    %CPU   CPU  Command
Average:        0      5646    8.93   59.80    0.00   68.73     -  cat /dev/zero 
Average:        0      5482    2.48   20.10    0.00   22.58     -  cat /dev/zero 
Average:        0      2770    6.95    0.00    0.00    6.95     -  /usr/bin/rackspace-monitoring-agent -l /var/log/rackspace-monitoring-agent.log --production --exit-on-upgrade 
Average:        0      5755    0.00    0.50    0.00    0.50     -  pidstat -l 2 2 
Average:        0     25258    0.25    0.00    0.00    0.25     -  /usr/bin/docker-containerd-current -l unix:///var/run/docker/libcontainerd/docker-containerd.sock --metrics-interval=0 --start-
Average:        0     25253    0.25    0.00    0.00    0.25     -  /usr/bin/dockerd-current --add-runtime docker-runc=/usr/libexec/docker/docker-runc-current --default-runtime=docker-runc --exec
 
Top 10 memory using processes

d8:

# sed -n '/^Top 10 cpu/,/^Top 10 mem/p' /var/log/recap/resources_20190219-123601.log 
Top 10 cpu using processes
12:36:08      UID       PID    %usr %system  %guest    %CPU   CPU  Command
Average:        0     29604    9.98   89.78    0.00   99.75     -  cat /dev/zero 
Average:      106      4359    0.00    0.25    0.00    0.25     -  /usr/sbin/ntpd -p /var/run/ntpd.pid -g -u 106:111 
Average:        0       424    0.25    0.00    0.00    0.25     -  /usr/bin/python /usr/bin/nova-agent --no-fork True -o /var/log/nova-agent.log -l info 
Average:        0     29755    0.25    0.00    0.00    0.25     -  pidstat -l 2 2 
 
Top 10 memory using processes

d9:

# sed -n '/^Top 10 cpu/,/^Top 10 mem/p' /var/log/recap/resources_20190219-123834.log
Top 10 cpu using processes
12:38:41      UID       PID    %usr %system  %guest    %CPU   CPU  Command
Average:        0     14136    5.76   59.15    0.00   64.91     -  cat /dev/zero 
Average:        0     14138    3.01   23.56    0.00   26.57     -  cat /dev/zero 
Average:        0     14135    0.50    8.27    0.00    8.77     -  cat /dev/zero 
Average:        0     14236    0.25    0.00    0.00    0.25     -  pidstat -l 2 2 
 
Top 10 memory using processes

d10(buster):

# sed -n '/^Top 10 cpu/,/^Top 10 mem/p' /var/log/recap/resources_20190219-190911.log 
Top 10 cpu using processes
19:09:13      UID       PID    %usr %system  %guest   %wait    %CPU   CPU  Command
Average:        0      2302    3.25   96.75    0.00    0.00  100.00     -  cat /dev/zero 
Average:        0      1959    2.50   97.00    0.00    0.00   99.50     -  cat /dev/zero 
Average:        0      1958    3.00   96.50    0.00    0.25   99.50     -  cat /dev/zero 
 
Top 10 memory using processes

f29:

$ sudo sed -n '/^Top 10 cpu/,/^Top 10 mem/p' /var/log/recap/resources_20190219-123350.log
Top 10 cpu using processes
12:33:57      UID       PID    %usr %system  %guest   %wait    %CPU   CPU  Command
Average:     1000     12818    2.74   96.76    0.00    0.00   99.50     -  cat /dev/zero 
Average:     1000     20237    1.75    1.75    0.00    0.00    3.49     -  /usr/lib64/firefox/firefox -contentproc -childID 7 -isForBrowser -prefsLen 5964 -prefMapSize 279949 -schedulerPrefs 0001,2 -parentBuildID 20190204162333 -greomni /usr/lib64/firefox/omni.ja -appomni /usr/lib64/firefox/browser/omni.ja -appdir /usr/lib64/firefox/browser 10187 true tab 
Average:     1000      1790    1.00    1.75    0.00    0.25    2.74     -  /usr/bin/pulseaudio --daemonize=no 
Average:     1000     10187    1.50    1.25    0.00    0.00    2.74     -  /usr/lib64/firefox/firefox 
Average:     1000     20045    0.50    0.25    0.00    0.00    0.75     -  /usr/lib64/firefox/firefox -contntproc -childID 6 -isForBrowser -prefsLen 5964 -prefMapSize 279949 -schedulerPrefs 0001,2 -parentBuildID 20190204162333 -greomni /usr/lib64/firefox/omni.ja -appomni /usr/lib64/firefox/browser/omni.ja -appdir /usr/lib64/firefox/browser 10187 true tab 
Average:     1000     10628    0.75    0.00    0.00    0.00    0.75     -  /usr/lib64/firefox/firefox -contentproc -childID 2 -isForBrowser -prefsLen 122 -prefMapSize 279949 -schedulerPrefs 0001,2 -parentBuildID 20190204162333 -greomni /usr/lib64/firefox/omni.ja -appomni /usr/lib64/firefox/browser/omni.ja -appdir /usr/lib64/firefox/browser 10187 true tab 
Average:        0     14805    0.00    0.50    0.00    0.00    0.50     -  pidstat -l 2 2

Top 10 memory using processes

@mauved
Copy link
Author

mauved commented Feb 19, 2019

Test: CentOS 6

[root@40523e5bcf7f ~]# grep -A 8 "Top 10 cpu using processes" /var/log/recap/resources_*.log
Top 10 cpu using processes
19:55:40          PID    %usr %system  %guest    %CPU   CPU  Command
Average:          460   26.68    0.00    0.00   26.68     -  stress -c 1 -q
Average:          449   25.19    0.00    0.00   25.19     -  stress -c 1 -q
Average:          451   24.19    0.00    0.00   24.19     -  stress -c 1 -q
Average:          607   14.71    0.00    0.00   14.71     -  stress -c 1 -q
Average:          609    7.48    0.00    0.00    7.48     -  stress -c 1 -q
Average:          447    2.00    0.00    0.00    2.00     -  stress -c 1 -q

[root@40523e5bcf7f ~]# cat /etc/redhat-release ; pidstat -V
CentOS release 6.10 (Final)
sysstat version 9.0.4
(C) Sebastien Godard (sysstat <at> orange.fr)

Test: CentOS 7

[root@d2b80f7f24fd ~]# grep -A 8 "Top 10 cpu using processes" /var/log/recap/resources_*.log
Top 10 cpu using processes
20:27:49      UID       PID    %usr %system  %guest    %CPU   CPU  Command
Average:        0       351   26.59    0.00    0.00   26.59     -  stress -c 1 -q
Average:        0       359   25.12    0.00    0.00   25.12     -  stress -c 1 -q
Average:        0       346   20.00    0.00    0.00   20.00     -  stress -c 1 -q
Average:        0       344   18.54    0.24    0.00   18.78     -  stress -c 1 -q
Average:        0       355    6.10    0.00    0.00    6.10     -  stress -c 1 -q
Average:        0       357    3.17    0.00    0.00    3.17     -  stress -c 1 -q

[root@d2b80f7f24fd ~]# cat /etc/redhat-release ; pidstat -V
CentOS Linux release 7.6.1810 (Core)
sysstat version 10.1.5
(C) Sebastien Godard (sysstat <at> orange.fr)

Test: Fedora 29

[root@feea61d74bde ~]# grep -A 8 "Top 10 cpu using processes" /var/log/recap/resources_*.log
Top 10 cpu using processes
20:44:25      UID       PID    %usr %system  %guest   %wait    %CPU   CPU  Command
Average:        0       112   29.68    0.00    0.00   70.07   29.68     -  stress -c 1 -q
Average:        0       108   29.18    0.00    0.00   71.07   29.18     -  stress -c 1 -q
Average:        0       116   21.20    0.00    0.00   78.55   21.20     -  stress -c 1 -q
Average:        0       106   16.96    0.00    0.00   83.04   16.96     -  stress -c 1 -q
Average:        0       114    2.49    0.00    0.00   97.26    2.49     -  stress -c 1 -q
Average:        0       110    1.00    0.00    0.00  102.00    1.00     -  stress -c 1 -q

[root@feea61d74bde ~]# cat /etc/redhat-release ; pidstat -V
Fedora release 29 (Twenty Nine)
sysstat version 11.7.3
(C) Sebastien Godard (sysstat <at> orange.fr)

Test: Debian Stretch

root@8a8be88b43a1:~# grep -A 8 "Top 10 cpu using processes" /var/log/recap/resources_*.log
Top 10 cpu using processes
20:47:07      UID       PID    %usr %system  %guest    %CPU   CPU  Command
Average:        0      1477   25.69    0.00    0.00   25.69     -  stress -c 1 -q
Average:        0      1473   25.44    0.25    0.00   25.69     -  stress -c 1 -q
Average:        0      1479   22.44    0.00    0.00   22.44     -  stress -c 1 -q
Average:        0      1481   18.20    0.00    0.00   18.20     -  stress -c 1 -q
Average:        0      1471    5.99    0.00    0.00    5.99     -  stress -c 1 -q
Average:        0      1475    2.99    0.00    0.00    2.99     -  stress -c 1 -q

root@8a8be88b43a1:~# cat /etc/debian_version ; pidstat -V
9.7
sysstat version 11.4.3
(C) Sebastien Godard (sysstat <at> orange.fr)

Test: Debian Buster/Sid

root@0691556e0801:~# grep -A 8 "Top 10 cpu using processes" /var/log/recap/resources_*.log
Top 10 cpu using processes
20:50:10      UID       PID    %usr %system  %guest   %wait    %CPU   CPU  Command
Average:        0      1498   24.50    0.00    0.00   75.50   24.50     -  stress -c 1 -q
Average:        0      1488   24.25    0.00    0.00   75.75   24.25     -  stress -c 1 -q
Average:        0      1496   18.50    0.00    0.00   81.50   18.50     -  stress -c 1 -q
Average:        0      1494   16.50    0.00    0.00   83.50   16.50     -  stress -c 1 -q
Average:        0      1490    8.25    0.00    0.00   91.75    8.25     -  stress -c 1 -q
Average:        0      1492    7.50    0.00    0.00   90.25    7.50     -  stress -c 1 -q

root@0691556e0801:~# cat /etc/debian_version ; pidstat -V
buster/sid
sysstat version 12.0.1
(C) Sebastien Godard (sysstat <at> orange.fr)

Test: Ubuntu 16.04

root@6e6221ed5d50:~# grep -A 8 "Top 10 cpu using processes" /var/log/recap/resources_*.log
Top 10 cpu using processes
20:55:42      UID       PID    %usr %system  %guest    %CPU   CPU  Command
Average:        0       502   29.25    0.25    0.00   29.50     -  stress -c 1 -q
Average:        0       500   27.75    0.00    0.00   27.75     -  stress -c 1 -q
Average:        0       498   16.50    0.25    0.00   16.75     -  stress -c 1 -q
Average:        0       496   11.50    0.00    0.00   11.50     -  stress -c 1 -q
Average:        0       504   10.00    0.00    0.00   10.00     -  stress -c 1 -q
Average:        0       494    4.50    0.00    0.00    4.50     -  stress -c 1 -q

root@6e6221ed5d50:~# grep PRETTY_NAME /etc/os-release ; pidstat -V
PRETTY_NAME="Ubuntu 16.04.5 LTS"
sysstat version 11.2.0
(C) Sebastien Godard (sysstat <at> orange.fr)

I tested on a bunch of OSs and it looks good now! Test results posted below.

Test: Ubuntu 18.04

root@588c69821deb:~# grep -A 8 "Top 10 cpu using processes" /var/log/recap/resources_*.log
Top 10 cpu using processes
20:55:42      UID       PID    %usr %system  %guest   %wait    %CPU   CPU  Command
Average:        0       515   27.00    0.25    0.00   72.50   27.25     -  stress -c 1 -q
Average:        0       525   26.50    0.00    0.00   73.25   26.50     -  stress -c 1 -q
Average:        0       521   18.50    0.25    0.00   81.00   18.75     -  stress -c 1 -q
Average:        0       517   17.50    0.25    0.00   82.25   17.75     -  stress -c 1 -q
Average:        0       519    7.00    0.00    0.00   92.75    7.00     -  stress -c 1 -q
Average:        0       523    2.00    0.00    0.00  100.00    2.00     -  stress -c 1 -q

root@588c69821deb:~# grep PRETTY_NAME /etc/os-release ; pidstat -V
PRETTY_NAME="Ubuntu 18.04.1 LTS"
sysstat version 11.6.1
(C) Sebastien Godard (sysstat <at> orange.fr)

@mauved
Copy link
Author

mauved commented Feb 19, 2019

Whoops, goofed up my previous post a bit. I tested recap in a handful of OSs and the sorting looks good now.

@tonyskapunk
Copy link
Contributor

Yay!, thanks for checking!

@tonyskapunk
Copy link
Contributor

Fixed in #204

@tonyskapunk tonyskapunk added this to the 2.0.2 milestone Feb 26, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants