AIX support #1123

wiggin15 · 2017-09-05T08:27:58Z

Following #605, this is the proposed branch for AIX support.

The following methods and methods are not supported in AIX in this branch:

psutil.Process.memory_maps
psutil.Process.num_ctx_switches

Not all unit tests pass on AIX at this time (I didn't make any changes to unit tests yet).

giampaolo

Here's an initial review. Overall, really great work!
There are more issues but the ones I gave thus far are enough to keep you busy for quite a while.

giampaolo · 2017-09-05T09:24:48Z

psutil/TODO.aix

+Process.io_counters read count is always 0
+
+
+TestSystemAPIs.test_pid_exists_2           there are pids in /proc that don't really exist


You mean, say, a /proc/54 directory exists but there's no such PID?

Yes, AIX is weird like that.

That's really unfortunate. How about doing this then?

def pids(): """Returns a list of PIDs currently running on the system.""" return [int(x) for x in os.listdir('/proc') if x.isdigit() \ and _psposix.pid_exists(int(x))]

Does this test pass?

This fixes test_pid_exists_2 but causes test_pids to break because ps still returns the processes that "don't really exist" and then we get this:

File "/root/psutil/psutil/tests/test_posix.py", line 297, in test_pids self.fail("difference: " + str(difference)) AssertionError: difference: [131076, 196614, 1048608, 1114146, 1179684, 1376298, 1638568]

giampaolo · 2017-09-05T09:25:07Z

psutil/TODO.aix

+
+
+TestSystemAPIs.test_pid_exists_2           there are pids in /proc that don't really exist
+TestProcess.test_name                      isolated python calls execve which changes process name


What does this mean?

On our environment we're using a custom version of Python, which on AIX uses a trick utilizing execve. This means the test will work on regular environments but not on ours specifically. I'm not sure this file should really be included in the repo (I'm also not sure it's up-to-date), this file was created for note taking when debugging the tests.

I'm not sure what you mean by this then. Simply make sure name() returns the expected string (in case of a Python process it should be "python".

execve causes the name to change so sys.executable and name don't match.

AssertionError: ('python2.7', 'python2.7.bin')

This is implementation detail in our environment and can be ignored.
I found out that the set of tests that fail is very different from before (it's more stable now too, i.e. less "flaky" tests). This file will be updated in the next commit.

giampaolo · 2017-09-05T09:25:45Z

psutil/TODO.aix

+
+TestSystemAPIs.test_pid_exists_2           there are pids in /proc that don't really exist
+TestProcess.test_name                      isolated python calls execve which changes process name
+TestProcess.test_num_fds                   opening a socket doesn't create fd in /proc/pid/fd (until data is sent??)


Does this mean num_fds() is unreliable?

Hmm, I guess so

Apparently this doesn't fail any more

giampaolo · 2017-09-05T09:26:04Z

psutil/TODO.aix

+TestSystemAPIs.test_pid_exists_2           there are pids in /proc that don't really exist
+TestProcess.test_name                      isolated python calls execve which changes process name
+TestProcess.test_num_fds                   opening a socket doesn't create fd in /proc/pid/fd (until data is sent??)
+TestProcess.test_open_files                /dev/null shows in open_files but it isn't a file


Can't you just skip it?

Yes. There are probably many changes to the tests I can do to make them pass. I deliberately did not change the unit tests for now to see what breaks - but I should before this will be really ready to be merged.

giampaolo · 2017-09-05T09:26:51Z

psutil/TODO.aix

+TestProcess.test_name                      isolated python calls execve which changes process name
+TestProcess.test_num_fds                   opening a socket doesn't create fd in /proc/pid/fd (until data is sent??)
+TestProcess.test_open_files                /dev/null shows in open_files but it isn't a file
+TestProcess.test_pid_0                     pid 0 doesn't have a name on AIX


Does ps list it? If not is there some other task manager on AIX that does? If not I would advice to return "kernel".

"ps" calls PID 0 "swapper". I can return this name specifically for the case of pid=0.

Sounds good. Also make sure pids() return PID 0 and pid_exists(0) return True. If they don't by default then you can simply hard-code it. ps should be used as the leading reference and psutil should behave the same as ps.

giampaolo · 2017-09-05T09:54:01Z

psutil/_psutil_aix.c

+error:
+    Py_XDECREF(py_cputime);
+    Py_DECREF(py_retlist);
+    free(cpu);


I think you should free() it only if it hasn't been malloc()'ed before. You can set cpu = NULL; earlier and do if (cpu != NULL) {free(cpu);} here

We can't get to error without allocating cpu, so this is safe.

giampaolo · 2017-09-05T09:54:23Z

psutil/_psutil_aix.c

+    diskt = (perfstat_disk_t *)calloc(disk_count,
+        sizeof(perfstat_disk_t));
+    if (diskt == NULL) {
+        PyErr_SetFromErrno(PyExc_OSError);


PyErr_NoMemory

giampaolo · 2017-09-05T09:54:50Z

psutil/_psutil_aix.c

+        );
+        if (py_disk_info == NULL)
+            goto error;
+        if (PyDict_SetItemString(py_retdict, diskt[i].name,


Correct way to handle strings:
https://github.com/Infinidat/psutil/blob/c895a4aa1a08e7dac7213ab1c2458b0a28cea343/psutil/_psutil_linux.c#L469-L477

giampaolo · 2017-09-05T09:55:53Z

psutil/_psutil_aix.h

@@ -0,0 +1,25 @@
+/*
+ * Copyright (c) 2009, Giampaolo Rodola'. All rights reserved.


Please also add your name. Also do the same in other new files (copy license and add your name).

giampaolo · 2017-09-05T09:56:25Z

psutil/_psutil_posix.c

@@ -688,7 +692,7 @@ void init_psutil_posix(void)
    PyObject *module = Py_InitModule("_psutil_posix", PsutilMethods);
 #endif

-#if defined(PSUTIL_BSD) || defined(PSUTIL_OSX) || defined(PSUTIL_SUNOS)
+#if defined(PSUTIL_BSD) || defined(PSUTIL_OSX) || defined(PSUTIL_SUNOS) || defined(_AIX)


You should define PSUTIL_AIX in setup.py and use it in here

wiggin15 · 2017-09-07T14:37:13Z

I added a new commit. Still no AIX-specific tests and no unicode string handling, but I fixed some broken functions and started changing the unit tests a bit. I think this looks better.

giampaolo · 2017-09-08T03:21:52Z

psutil/_psaix.py

+        # an empty list means there were no connections for process or
+        # process is no longer active so we force NSP in case the PID
+        # is no longer there.
+        if not ret:


I suggest to remove the if not ret line. There may be some connections but the process may have disappeared beforehand so we basically want to always call os.stat

This is again the same as in _pssunos.py

giampaolo · 2017-09-08T03:23:42Z

psutil/_psaix.py

+    @wrap_exceptions
+    def terminal(self):
+        psinfo = self._proc_basic_info()
+        ttydev = psinfo[-1]


I think you better use a "fixed" integer (5, 6 or whatever) because if you add a new element to the returned list this will break

giampaolo · 2017-09-08T03:24:29Z

psutil/_psaix.py

+        if PY3:
+            stdout, stderr = [x.decode(sys.stdout.encoding)
+                              for x in (stdout, stderr)]
+        if "no such process" in stderr:


perhaps do stderr.lower()?

giampaolo · 2017-09-08T03:24:53Z

psutil/_psaix.py

+            path = path.strip()
+            if path.startswith("//"):
+                path = path[1:]
+            if path == "Cannot be retrieved":


perhaps do path.lower()?

giampaolo · 2017-09-08T03:33:32Z

psutil/_psutil_aix.c

+    Py_XDECREF(py_tuple);
+    Py_DECREF(py_retlist);
+    if (ut != NULL)
+        endutent();


You use endutxent(); on line 288, not this. Why?

Looks like you're right.
Again, this is the same as in the sunos implementation.

giampaolo · 2017-09-08T03:35:58Z

psutil/_psutil_aix.c

+        }
+    }
+    endutxent();
+    if (boot_time != 0.0) {


Are there cases where time can be 0.0? Perhaps we need a comment here?

Not sure. This too is copied from _psutil_sunos.c.

boot_time will be 0.0 if the loop breaks without finding BOOT_TIME. We can add a comment, but again this is common implementation with Solaris.

Oh yeah. I now realize that sunos has different "minor" issues like this one (and others I mentioned here). I will stop being picky and just focus on more important aspects. =)

There are definitely some comments that are relevant and should be fixed in Solaris too. There are also a number of tests that fail in Solaris as well as on AIX that can be fixed in both. I'm keeping track of your comments and may submit a separate PR to fix them after this one.

giampaolo · 2017-09-08T03:37:46Z

psutil/_psutil_aix.c

+        cpu.syscall,
+        cpu.devintrs,
+        cpu.softintrs,
+        cpu.traps


In TODO.aix you state all of these numbers are 0, correct?

Not all of them, only syscall and traps. I'm not sure why.

OK then. If it's an OS problem there's nothing we can do.

Please just add 2 XXX comments to signal that syscall and traps fields are always 0.

Ugh. My bad, there was a bug here (I don't like it when I have bugs :/). The values are ulonglong, so should be built with K instead of I. The values are not 0.

Perfect. Just update the doc/TODO.aix then

giampaolo · 2017-09-08T03:42:59Z

psutil/arch/aix/net_connections.c

+            inet_ntop(fam, raddr, raddr_str, sizeof(raddr_str));
+            return Py_BuildValue("(iii(si)(si)ii)", fd, fam,
+                                 s.so_type, laddr_str, lport, raddr_str,
+                                 rport, state, pid);


I suspect returning 2 tuples in a single shot like this ((si)) will leak memory. You need 3 tuples, 2 per addresses and 1 per connection. It's tedious work but you can pretty much copy & paste from here:

psutil/psutil/_psutil_osx.c

Lines 1332 to 1349 in f435c2b

py_laddr = Py_BuildValue("(si)", lip, lport);

if (!py_laddr)

goto error;

if (rport != 0)

py_raddr = Py_BuildValue("(si)", rip, rport);

else

py_raddr = Py_BuildValue("()");

if (!py_raddr)

goto error;

// construct the python list

py_tuple = Py_BuildValue(

"(iiiNNi)", fd, family, type, py_laddr, py_raddr, state);

if (!py_tuple)

goto error;

if (PyList_Append(py_retlist, py_tuple))

goto error;

Py_DECREF(py_tuple);

Note: you can use make test-memleaks to make sure of this.

What would be the result of make test-memleaks if there is a memory leak? Running this command returned 3 failed test:

test_memory_leaks.TestProcessObjectLeaks.test_num_ctx_switches test_memory_leaks.TestTerminatedProcessLeaks.test_io_counters test_memory_leaks.TestTerminatedProcessLeaks.test_num_ctx_switches

The failures don't seem to be due to memory leaks.

Let's think about memory leaks later, after you fixed unicode strings and /proc fs location.

giampaolo · 2017-09-08T03:45:40Z

psutil/arch/aix/net_connections.c

+    msz = (size_t)(PROCSIZE * PROCINFO_INCR);
+    processes = (struct procentry64 *)malloc(msz);
+    if (!processes) {
+        PyErr_SetFromErrno(PyExc_OSError);


PyErr_NoMemory()

giampaolo · 2017-09-08T03:46:02Z

psutil/arch/aix/net_connections.c

+            msz = (size_t)(PROCSIZE * (Np + PROCINFO_INCR));
+            processes = (struct procentry64 *)realloc((char *)processes, msz);
+            if (!processes) {
+                PyErr_SetFromErrno(PyExc_OSError);


PyErr_NoMemory()

giampaolo · 2017-09-08T03:46:24Z

psutil/arch/aix/net_connections.c

+        if (!fds) {
+            fds = (struct fdsinfo64 *)malloc((size_t)FDSINFOSIZE);
+            if (!fds) {
+                PyErr_SetFromErrno(PyExc_OSError);


PyErr_NoMemory()

giampaolo · 2017-09-08T03:48:15Z

setup.py

@@ -254,6 +264,8 @@ def get_ethtool_macro():
        if platform.release() == '5.10':
            posix_extension.sources.append('psutil/arch/solaris/v10/ifaddrs.c')
            posix_extension.define_macros.append(('PSUTIL_SUNOS10', 1))
+    if AIX:


elif is better

giampaolo · 2017-09-08T03:53:37Z

Good work on addressing the first set of comments! I'm gonna add another big request for change.
Right now "/proc" is hard coded both in the py and the C files. It should be possible to change that via psutil.PROCFS_PATH (it's already possible to do this on Linux and SunOS). Here's how to do that.

Copy this in _psaix.py:
https://github.com/Infinidat/psutil/blob/cac89d522d64571a53011cd0914a654a9c5e2869/psutil/_pslinux.py#L215
If you prefer (even better) you can move that function in _common.py so that you can share it between all the 3 platforms.

Then define a new attr in the Process class like this:
https://github.com/Infinidat/psutil/blob/cac89d522d64571a53011cd0914a654a9c5e2869/psutil/_pslinux.py#L1392
...and pass that string to the underlying C functions that need it.

On the C side of things you'll now have to accept 2 args, the PID and /proc path prefix, like this:
https://github.com/Infinidat/psutil/blob/cac89d522d64571a53011cd0914a654a9c5e2869/psutil/_psutil_sunos.c#L96-L101

Not sure if I've been clear (I hope =)).

wiggin15 · 2017-09-10T09:39:37Z

What's the reason for making /proc changeable? Is this configurable on the operating system?

giampaolo · 2017-09-10T11:29:41Z

What's the reason for making /proc changeable?

#558

wiggin15 · 2017-09-10T12:38:19Z

I think, instead of moving the get function to _common, we can move this function to the main __init__.py file and then we won't have to use the sys.modules trick. Something like this in __init__:

if LINUX or SOLARIS or AIX:
    # This is a public writable constant which is read from platform-specific code
    PROCFS_PATH = "/proc"

    def get_procfs_path():
        return PROCFS_PATH

What do you think?
In addition, I noticed that the BSD code is still using hard-coded /proc. Is it not relevant in that platform? Maybe we can change to if POSIX instead of Linux/Solaris/AIX if the BSD module is changed.

giampaolo · 2017-09-10T14:32:12Z

How can you use that from _ps*.py modules if it's defined in __init__.py? It's __init__ which imports those module not vice versa. Anyway it looks like something which can be dealt with in another PR so for the time being you can also define it in _psaix.py and then we can move it later.

In addition, I noticed that the BSD code is still using hard-coded /proc. Is it not relevant in that platform?

It's not, BSD systems do not rely on /proc.

wiggin15 · 2017-09-10T15:03:46Z

How can you use that from _ps*.py modules if it's defined in init.py?

It's a matter of import order. If you define it in __init__ before importing the platform-specific modules, they will succeed in importing from __init__. Maybe not the best way to do it though.

BSD systems do not rely on /proc.

_psbsd.py has a few readlinks and opens from /proc (the C extension doesn't, but still..)
EDIT: I see it's only on NetBSD

wiggin15 · 2017-09-10T16:27:12Z

Added more commits :)

giampaolo

Great work with unicode handling and /proc changes!
I added some comments about the C code.
When those are fixed I suppose we can start looking at the the output of make test and make test-memleaks.

giampaolo · 2017-09-11T05:02:57Z

psutil/TODO.aix

+    test_procinfo               missing API
+    test_cpu_stats              cpu_stats always returns ctx_switches=0
+    test_disk_usage             df returns "-" for all procfs fields but API returns something real
+    test_pid_exists_2           there are pids in /proc that don't really exist


This still worries me and should be discussed. I suppose I'll have a better idea of what this means once I look at the test results.

giampaolo · 2017-09-11T05:04:18Z

psutil/__init__.py

-        switches performed by this process.
-        """
-        return self._proc.num_ctx_switches()
+    if not AIX:


I suggest using if hasattr(_psplatform.Process, "num_ctx_switches"): instead

giampaolo · 2017-09-11T05:06:08Z

docs/index.rst

@@ -2110,7 +2120,7 @@ Constants
 .. _const-procfs_path:
 .. data:: PROCFS_PATH

-  The path of the /proc filesystem on Linux and Solaris (defaults to
+  The path of the /proc filesystem on Linux, Solaris and AIX (defaults to
  ``"/proc"``).
  You may want to re-set this constant right after importing psutil in case
  your /proc filesystem is mounted elsewhere or if you want to retrieve


Please also add the new AIX constant in the doc and mark it as .. versionadded:: 5.4.0.

The docs have a section called .. _const-oses:. Is it ok if I add AIX to this section with:

.. versionchanged:: 5.4.0 added AIX

in order to keep this const in the same group as the rest?

giampaolo · 2017-09-11T05:07:24Z

psutil/arch/aix/ifaddrs.c

+        fam = ifr->ifr_addr.sa_family;
+
+        if (fam == AF_INET || fam == AF_INET6) {
+            cifa = (struct ifaddrs *) calloc(1, sizeof(struct ifaddrs));


It seems you need to check calloc return value

giampaolo · 2017-09-11T05:13:12Z

psutil/arch/aix/net_connections.c

+    if (lseek64(Kd, (off64_t)addr, L_SET) == (off64_t)-1)
+        return(1);
+    br = read(Kd, buf, len);
+    return((br == len) ? 0 : 1);


If I understand this correctly you return 1 (meaning error) in case of len/size mismatch. Whereas the previous lseek and read calls set errno, this line (53) won't, so I think you need to do this:

if (br != len) { PyErr_SetString(PyExc_RuntimeError, "size mismatch when reading kernel memory fd"); return 1; } return 0;

Also, since you set the exception in here, you should do the same for the 2 calls above (lseek and read) and do PyErr_SetFromErrno(PyExc_OSError);.
Of course the callers of this function should just return NULL.

giampaolo · 2017-09-11T05:27:10Z

psutil/arch/aix/net_connections.c

+    if (fam == AF_INET || fam == AF_INET6) {
+        /* Read protocol control block */
+        if (!s.so_pcb
+        ||  psutil_kread(Kd, (KA_T) s.so_pcb, (char *) &inp, sizeof(inp))) {


giampaolo · 2017-09-11T05:30:01Z

psutil/arch/aix/net_connections.c

+            return NULL;
+        }
+        if ((KA_T) f.f_data != (KA_T) unp.unp_socket) {
+            PyErr_SetFromErrno(PyExc_OSError);


Mmmm I suppose errno will not be set here, so this will fail with an "empty" exception. Perhaps you want RuntimeError?

giampaolo · 2017-09-11T05:31:16Z

psutil/arch/aix/net_connections.c

+    size_t msz;
+    pid32_t requested_pid;
+    pid32_t pid;
+    int Np = 0;          /* number of processes */


Why not call it num_procs? It would be less confusing also because there's another np variable.

This is from lsof code, which is pretty terrible:
https://fossies.org/dox/lsof_4.89_src/aix_2dproc_8c_source.html
Interestingly both np and Np are for the number of processes in processes. Np is the number allocated and np is the number read. I know these are not good names but I prefer to stick to the original names for comparing to the original code (it helped just now).

I see. OK then. And yes, I remember lsof source code: I remember I gave up because I didn't understand it. =)

giampaolo · 2017-09-11T05:33:09Z

psutil/arch/aix/net_connections.c

+
+    Kd = open(KMEM, O_RDONLY, 0);
+    if (Kd < 0) {
+        PyErr_SetFromErrno(PyExc_OSError);


Looks like you can use PyErr_SetFromErrnoWithFilename instead

giampaolo · 2017-09-11T05:35:57Z

psutil/arch/aix/net_connections.c

+    }
+
+    if (i > 0)
+        np += i;


Is this on purpose?

Also from lsof code, line 476. This code is tricky but it's correct. We're reading the processes into p in PROCINFO_INCR increments. We realloc the next "block" for processes every time we read a full PROCINFO_INCR block. The last call can return less than PROCINFO_INCR processes, but we still read that block and need to go over it, so np must be increased by the number of processes in the last block - which is in i. See the docs for getprocs64
Confusing, I know.

wiggin15 · 2017-09-12T09:58:27Z

Added a new commit and uploaded the latest test results to https://gist.github.com/wiggin15/1d76d1dc1ad50e6665fe4dc5c6fa898a
please ignore 2 tests that fail with python2.7.bin, this is caused by a trick we're using on our environment.

giampaolo · 2017-09-13T12:26:25Z

  File "/root/psutil/scripts/procinfo.py", line 228, in run
    print_("ctx-switches", str_ntuple(pinfo['num_ctx_switches']))
KeyError: 'num_ctx_switches'

This is easy: just change scripts/procinfo.py so that it won't print this info on AIX

  File "/root/psutil/psutil/tests/test_posix.py", line 368, in df
    total = int(fields[1]) * 1024
ValueError: invalid literal for int() with base 10: '-'

This means df -k output is different on AIX compared to other POSIXes. You can simply skip thi test on AIX.

  File "/root/psutil/psutil/tests/test_contracts.py", line 541, in cwd
    self.assertIsInstance(ret, str)
AssertionError: None is not an instance of <type 'str'>

Just return an empty string in cwd() method of _psaix.py instead of None.

======================================================================
FAIL: psutil.tests.test_misc.TestMisc.test_setup_script
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/root/psutil/psutil/tests/test_misc.py", line 369, in test_setup_script
    self.assertRaises(SystemExit, module.setup)
AssertionError: SystemExit not raised

Just skip this one.

======================================================================
FAIL: psutil.tests.test_misc.TestProcessUtils.test_create_zombie_proc
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/root/psutil/psutil/tests/test_misc.py", line 950, in test_create_zombie_proc
    zpid = create_zombie_proc()
  File "/root/psutil/psutil/tests/__init__.py", line 366, in create_zombie_proc
    call_until(lambda: zproc.status(), "ret == psutil.STATUS_ZOMBIE")
  File "/root/psutil/psutil/tests/__init__.py", line 593, in wrapper
    return fun(*args, **kwargs)
  File "/root/psutil/psutil/tests/__init__.py", line 643, in call_until
    assert eval(expr)
AssertionError

OK, you said create_zombie_proc() doesn't work on AIX. Have you checked whether ps is able to detect it though? I just want to be sure it's not status() method which is broken. If we make sure it's not and we don't find a way to correctly create a zombie process via unit tests then let' just skip this.


  File "/root/psutil/psutil/tests/test_process.py", line 720, in test_cmdline
    ' '.join(cmdline))
AssertionError: '/root/python/bin/python2.7 -c' != '/root/python/bin/python2.7 -c import time; time.sleep(60)'

I would just change the test and do a comparison of the first 30 chars if you're on AIX, else just do what we're currently doing.

  File "/root/psutil/psutil/tests/test_process.py", line 322, in test_io_counters
    self.assertGreater(io2.read_count, io1.read_count)
AssertionError: 0L not greater than 0L

Same here. If we know io_counters()'s read_count is always zero just skip this specific test line when on AIX.

======================================================================
FAIL: psutil.tests.test_process.TestProcess.test_prog_w_funky_name
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/root/psutil/psutil/tests/test_process.py", line 755, in test_prog_w_funky_name
    self.assertEqual(p.cmdline(), cmdline)
AssertionError: Lists differ: [''] != ['/root/psutil/$testfnfoo bar ...

This is bad and should be fixed. The test may also not pass but we should never have [''] returned.


======================================================================
FAIL: psutil.tests.test_system.TestSystemAPIs.test_cpu_stats
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/root/psutil/psutil/tests/test_system.py", line 746, in test_cpu_stats
    self.assertGreater(value, 0)
AssertionError: 0 not greater than 0

Again, if we know 'ctx_switches' and 'interrupts' are always 0 just skip this test on solaris (just add a comment / note)

======================================================================
FAIL: psutil.tests.test_system.TestSystemAPIs.test_pid_exists_2
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/root/psutil/psutil/tests/test_system.py", line 238, in test_pid_exists_2
    self.fail(pid)
AssertionError: 260

This is bad. It means pid_exists() is broken. Does AIX have /proc/{pid}/status? Linux uses a specific logic for pid_exists() as it takes the Tid (thread ID) into account. I wonder if AIX does the same.

What about memory leak tests?

giampaolo · 2017-09-13T12:32:50Z

Note: it may also be pids() returning a PID which does not exist, not necessarily pid_exists which is broken. Anyway, it's either one or the other. ps is your friend and should tell you who's wrong.

wiggin15 · 2017-09-14T15:13:17Z

Added a new commit

Just return an empty string in cwd() method of _psaix.py instead of None.

I noticed that the test for cwd already checks whether the return value is None (which can also happen on BSD and Solaris), but the test checks if it's None after making sure it's a string, which means it's a bug in the test. I fixed the test instead. I also fixed some other failures in test_fetch_all

test_setup_script

I realized why it failed. It's because I used python setup.py test instead of make test to run the unit tests. Running the former makes argv ok for setup.setup() so it doesn't raise SystemExit with usage. I'll use make test from now on.

OK, you said create_zombie_proc() doesn't work on AIX. Have you checked whether ps is able to detect it though? I just want to be sure it's not status()

OK, so it turns out you can create zombie processes and there was indeed a bug in status. It's not simple to determine actual process state from /proc, but I think I added the code required.

I would just change the test and do a comparison of the first 30 chars if you're on AIX

The problem isn't that we cut to 30 characters (according to the struct in /proc the max should be 80), but that there is a single argument import time; time.sleep(60) which is too long so it's somehow truncated from psinfo. If I add another argument to argv after this argument, it will show in psinfo while "skipping" the long argument.

It means pid_exists() is broken

The PIDs in question do exist, in ps and in /proc. They have /proc/<pid>/psinfo, and even psutil.Process(pid) works on them. Yet pid_exists returns False, because the "send a signal" check, even when trying with Python's os.kill, raises OSError: no such process. I noticed these processes are all named wait and it appears that there is 1 per logical CPU. It's an AIX thing...

The test may also not pass but we should never have [''] returned.

I traced this back to an issue with our environment, again. create_exe used to compile C code for the test executable but now it's using sys.executable instead (also, there's a bug in that function where the default code is never used). Apparently sys.executable is not copy-able on our environment (again the execve python2.7.bin trick) so when running it it immediately exits, and that's why we get the empty cmdline. When changing create_exe (not committed, of course), those tests pass except test_prog_w_funky_name which, again, doesn't pass on Solaris either.

What about memory leak tests?

Those tests pass now after a few small changes.

New test results in: https://gist.github.com/wiggin15/d1054f0a4a172a572dd610aad649dfa7

giampaolo · 2017-09-14T16:22:41Z

The PIDs in question do exist, in ps and in /proc. They have /proc//psinfo, and even psutil.Process(pid) works on them. Yet pid_exists returns False, because the "send a signal" check, even when trying with Python's os.kill, raises OSError: no such process. I noticed these processes are all named wait and it appears that there is 1 per logical CPU. It's an AIX thing...

OK, so I suppose the os.kill trick is not reliable. What about return os.path.exitsts("/proc/{pid}")? Does it work? Last possibility is doing return pid in pids() but that's slow.... :-
I will reply to other comments later (need to sleep =)).

giampaolo

Some more comments. We're almost there.

giampaolo · 2017-09-15T04:19:26Z

psutil/TODO.aix

@@ -0,0 +1,14 @@
+AIX support is experimental and incomplete at this time.


I would just say it's experimental at this point, but not incomplete.

giampaolo · 2017-09-15T04:22:37Z

psutil/_psaix.py

+    avail = free * PAGE_SIZE
+    used = inuse * PAGE_SIZE
+    percent = usage_percent((total - avail), total, _round=1)
+    return svmem(total, avail, percent, used, free)


I want to stress this a bit more because I really think it's important. Why can't we check for "close" values? How much do the two values differ? Note, the idiom current used in tests is this one (taken from BSD):

@unittest.skipIf(not MUSE_AVAILABLE, "muse not installed") @retry_before_failing() def test_muse_vmem_active(self): num = muse('Active') self.assertAlmostEqual(psutil.virtual_memory().active, num, delta=MEMORY_TOLERANCE)

I wuld like to have something like this at least for total memory (that value should NOT change).

giampaolo · 2017-09-15T04:25:02Z

psutil/_psaix.py

+    total, free, sin, sout = cext.swap_mem()
+    used = total - free
+    percent = usage_percent(used, total, _round=1)
+    return _common.sswap(total, used, free, percent, sin, sout)


Same here. It would be good to check this against a cmdline tool, at least the total swap.

giampaolo · 2017-09-15T04:26:39Z

psutil/_psaix.py

+    if p.returncode != 0:
+        raise RuntimeError("%r command error\n%s" % (cmd, stderr))
+    processors = stdout.strip().splitlines()
+    return len(processors)


OK (for the C part). Just do return len(processors) or None instead here.

giampaolo · 2017-09-15T04:30:45Z

psutil/_psaix.py

+    ctx_switches, interrupts, soft_interrupts, syscalls, traps = \
+        cext.cpu_stats()
+    return _common.scpustats(
+        ctx_switches, interrupts, soft_interrupts, syscalls)


Wonder if there's some cmdline tool to check one of these against.
I bumped into this:
https://www.ibm.com/developerworks/community/blogs/aixpert/entry/mpstat_d_and_the_undocumented_stats133?lang=en
Not sure if it's easily parsable. That aside, it looks like you can count the number of rows which gives you the number of CPUs. You may want to use that to test cpu_count() function.

giampaolo · 2017-09-15T05:08:01Z

psutil/_psutil_aix.c

+        cpu.syscall,
+        cpu.devintrs,
+        cpu.softintrs,
+        cpu.traps


Please just add 2 XXX comments to signal that syscall and traps fields are always 0.

giampaolo · 2017-09-15T05:11:21Z

psutil/_psutil_aix.c

+            diskt[i].rblks * diskt[i].bsize,
+            diskt[i].wblks * diskt[i].bsize,
+            diskt[i].rserv / 1000 / 1000,  // from nano to milli secs
+            diskt[i].wserv / 1000 / 1000   // from nano to milli secs


Please add a XXX comment telling the field which is always set to 0

These values (disk_io_counters) are ok, it's proc_io_counters that's returning 0s. I will add XXX comments there.

giampaolo · 2017-09-15T05:13:21Z

psutil/arch/aix/ifaddrs.h

+  unsigned int     ifa_flags;
+  struct sockaddr *ifa_addr;
+  struct sockaddr *ifa_netmask;
+  struct sockaddr *ifa_dstaddr;


Please indent by 4

giampaolo · 2017-09-15T05:21:33Z

psutil/tests/test_contracts.py

@@ -282,6 +284,9 @@ def test_fetch_all(self):
            'send_signal', 'suspend', 'resume', 'terminate', 'kill', 'wait',
            'as_dict', 'parent', 'children', 'memory_info_ex', 'oneshot',
        ])
+        if AIX:
+            # "<exiting>" processes really don't have names
+            excluded_names.add('name')


I wouldn't do this. Instead I would change the name method (defined later) like this:

def name(self, ret, proc): self.assertIsInstance(ret, str) if not AIX: assert ret # ...else on AIX "<exiting>" processes (zombies) don't have names

giampaolo · 2017-09-15T05:27:57Z

psutil/tests/test_memory_leaks.py

@@ -288,6 +289,7 @@ def test_num_fds(self):
        self.execute(self.proc.num_fds)

    @skip_if_linux()
+    @unittest.skipIf(AIX, "num_ctx_switches not available on AIX")


Please instead define a HAS_NUM_CTX_SWITCHES global in tests/__init__.py and use it here. I prefer that way so I can keep track of supported things in a single place, in case their support is added later.

giampaolo · 2017-09-15T07:36:58Z

Also, can you please make sure all C lines are less than 80 chars long?

wiggin15 · 2017-09-19T12:35:53Z

Added a new commit with AIX-specific tests

giampaolo · 2017-09-25T16:19:03Z

As far as I'm concerned this looks good enough to be merged at this point so just tell me when you want me to do it.

wiggin15 · 2017-09-26T11:48:22Z

I think that this is ready now, except I'm not sure TODO.aix should be where it is (or stay at all), and maybe you want to update the main docs about the state of support for AIX (experimental?)

giampaolo · 2017-09-26T11:52:37Z

It's OK, I will update doc / README, etc. before the next release, once I'm back from China. In the meantime, thanks a lot. You really did an amazing work. At first I was skeptical about adding support for a platform I can't test against, but the quality of the code you provided is excellent and you were great at addressing all my remarks. I hope you will keep hacking on psutil as a main contributor.

giampaolo requested changes Sep 5, 2017

View reviewed changes

wiggin15 force-pushed the aix branch from 1e0aa8f to cac89d5 Compare September 7, 2017 14:36

giampaolo reviewed Sep 8, 2017

View reviewed changes

Arnon Yaari added 5 commits September 10, 2017 19:25

AIX support

6d733e0

AIX support

ba8bfa7

AIX support

fe5cde3

AIX support - use get_procfs_path() instead of /proc

db52909

AIX support - group sections like in other modules

52816c9

wiggin15 force-pushed the aix branch from cac89d5 to 52816c9 Compare September 10, 2017 16:26

giampaolo requested changes Sep 11, 2017

View reviewed changes

AIX support

e2e6600

AIX support

a4b4387

giampaolo requested changes Sep 15, 2017

View reviewed changes

AIX support

fbb3d62

AIX support - remove unnecessary dict copy

cdbb9d3

giampaolo merged commit 1ebe625 into giampaolo:master Sep 26, 2017

giampaolo mentioned this pull request Oct 12, 2017

support AIX #605

Closed

giampaolo added aix enhancement labels Nov 15, 2020

		Process.io_counters read count is always 0


		TestSystemAPIs.test_pid_exists_2 there are pids in /proc that don't really exist



		TestSystemAPIs.test_pid_exists_2 there are pids in /proc that don't really exist
		TestProcess.test_name isolated python calls execve which changes process name

		@@ -0,0 +1,25 @@
		/*
		* Copyright (c) 2009, Giampaolo Rodola'. All rights reserved.

	py_laddr = Py_BuildValue("(si)", lip, lport);
	if (!py_laddr)
	goto error;
	if (rport != 0)
	py_raddr = Py_BuildValue("(si)", rip, rport);
	else
	py_raddr = Py_BuildValue("()");
	if (!py_raddr)
	goto error;

	// construct the python list
	py_tuple = Py_BuildValue(
	"(iiiNNi)", fd, family, type, py_laddr, py_raddr, state);
	if (!py_tuple)
	goto error;
	if (PyList_Append(py_retlist, py_tuple))
	goto error;
	Py_DECREF(py_tuple);

		@@ -0,0 +1,14 @@
		AIX support is experimental and incomplete at this time.

AIX support #1123

AIX support #1123

Conversation

wiggin15 commented Sep 5, 2017

giampaolo left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

wiggin15 commented Sep 7, 2017

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

giampaolo commented Sep 8, 2017 • edited Loading

wiggin15 commented Sep 10, 2017

giampaolo commented Sep 10, 2017

wiggin15 commented Sep 10, 2017

giampaolo commented Sep 10, 2017

wiggin15 commented Sep 10, 2017 • edited Loading

wiggin15 commented Sep 10, 2017

giampaolo left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

wiggin15 commented Sep 12, 2017

giampaolo commented Sep 13, 2017

giampaolo commented Sep 8, 2017 •

edited

Loading

wiggin15 commented Sep 10, 2017 •

edited

Loading