AIX support #1123
Here's an initial review. Overall, really great work!
There are more issues but the ones I gave thus far are enough to keep you busy for quite a while.
psutil/TODO.aix
Outdated
Process.io_counters read count is always 0
TestSystemAPIs.test_pid_exists_2 there are pids in /proc that don't really exist
You mean, say, a /proc/54 directory exists but there's no such PID?
Yes, AIX is weird like that.
That's really unfortunate. How about doing this then?
def pids():
    """Returns a list of PIDs currently running on the system."""
    return [int(x) for x in os.listdir('/proc') if x.isdigit()
            and _psposix.pid_exists(int(x))]
Does this test pass?
This fixes test_pid_exists_2 but causes test_pids to break because ps still returns the processes that "don't really exist" and then we get this:
  File "/root/psutil/psutil/tests/test_posix.py", line 297, in test_pids
    self.fail("difference: " + str(difference))
AssertionError: difference: [131076, 196614, 1048608, 1114146, 1179684, 1376298, 1638568]
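The failure above comes from comparing two PID sources. A minimal sketch of that kind of comparison, assuming hypothetical inputs `ps_pids` and `psutil_pids` (plain lists of integers, not real calls into ps or psutil, and not the actual test code):

```python
def pid_difference(ps_pids, psutil_pids):
    """Return the sorted PIDs reported by only one of the two sources."""
    return sorted(set(ps_pids) ^ set(psutil_pids))


# Hypothetical data: ps still lists two PIDs that the filtered
# /proc listing no longer returns.
ps_pids = [0, 1, 131076, 196614]
psutil_pids = [0, 1]
difference = pid_difference(ps_pids, psutil_pids)
```

Any non-empty result means the two sources disagree, which is what makes the test fail once pids() filters out entries that ps keeps reporting.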
psutil/TODO.aix
Outdated
TestSystemAPIs.test_pid_exists_2 there are pids in /proc that don't really exist
TestProcess.test_name isolated python calls execve which changes process name
What does this mean?
In our environment we're using a custom version of Python, which on AIX uses a trick utilizing execve. This means the test will work in regular environments but not in ours specifically. I'm not sure this file should really be included in the repo (I'm also not sure it's up-to-date); it was created for note-taking while debugging the tests.
I'm not sure what you mean by this then. Simply make sure name() returns the expected string (in case of a Python process it should be "python").
execve causes the name to change so sys.executable and name don't match.
AssertionError: ('python2.7', 'python2.7.bin')
This is implementation detail in our environment and can be ignored.
I found out that the set of tests that fail is very different from before (it's more stable now too, i.e. less "flaky" tests). This file will be updated in the next commit.
psutil/TODO.aix
Outdated
TestSystemAPIs.test_pid_exists_2 there are pids in /proc that don't really exist
TestProcess.test_name isolated python calls execve which changes process name
TestProcess.test_num_fds opening a socket doesn't create fd in /proc/pid/fd (until data is sent??)
Does this mean num_fds() is unreliable?
Hmm, I guess so
Apparently this doesn't fail any more
psutil/TODO.aix
Outdated
TestSystemAPIs.test_pid_exists_2 there are pids in /proc that don't really exist
TestProcess.test_name isolated python calls execve which changes process name
TestProcess.test_num_fds opening a socket doesn't create fd in /proc/pid/fd (until data is sent??)
TestProcess.test_open_files /dev/null shows in open_files but it isn't a file
Can't you just skip it?
Yes. There are probably many changes I can make to the tests to make them pass. I deliberately did not change the unit tests for now to see what breaks, but I should before this is really ready to be merged.
psutil/TODO.aix
Outdated
TestProcess.test_name isolated python calls execve which changes process name
TestProcess.test_num_fds opening a socket doesn't create fd in /proc/pid/fd (until data is sent??)
TestProcess.test_open_files /dev/null shows in open_files but it isn't a file
TestProcess.test_pid_0 pid 0 doesn't have a name on AIX
Does ps list it? If not, is there some other task manager on AIX that does? If not I would advise returning "kernel".
"ps" calls PID 0 "swapper". I can return this name specifically for the case of pid=0.
Sounds good. Also make sure pids() returns PID 0 and pid_exists(0) returns True. If they don't by default then you can simply hard-code it. ps should be used as the leading reference and psutil should behave the same as ps.
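A minimal sketch of the special-casing agreed on above, assuming a hypothetical `read_name` callable that stands in for the platform lookup (it is not a psutil function):

```python
def process_name(pid, read_name):
    """Return the process name, hard-coding PID 0 the way ps reports it."""
    if pid == 0:
        # ps on AIX calls PID 0 "swapper" (per the thread above).
        return "swapper"
    return read_name(pid)


def fake_read_name(pid):
    # Hypothetical stand-in for the real per-platform lookup.
    return {1: "init"}.get(pid, "")
```

The same pattern would apply to pids() and pid_exists(0): special-case PID 0 first, then fall through to the generic code path.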
psutil/_psutil_aix.c
Outdated
error:
Py_XDECREF(py_cputime);
Py_DECREF(py_retlist);
free(cpu);
I think you should free() it only if it has been malloc()'ed before. You can set cpu = NULL; earlier and do if (cpu != NULL) {free(cpu);} here.
We can't get to error without allocating cpu, so this is safe.
psutil/_psutil_aix.c
Outdated
diskt = (perfstat_disk_t *)calloc(disk_count,
                                  sizeof(perfstat_disk_t));
if (diskt == NULL) {
    PyErr_SetFromErrno(PyExc_OSError);
PyErr_NoMemory()
);
if (py_disk_info == NULL)
    goto error;
if (PyDict_SetItemString(py_retdict, diskt[i].name,
Correct way to handle strings:
https://github.com/Infinidat/psutil/blob/c895a4aa1a08e7dac7213ab1c2458b0a28cea343/psutil/_psutil_linux.c#L469-L477
psutil/_psutil_aix.h
Outdated
@@ -0,0 +1,25 @@
/*
 * Copyright (c) 2009, Giampaolo Rodola'. All rights reserved.
Please also add your name. Also do the same in other new files (copy license and add your name).
psutil/_psutil_posix.c
Outdated
@@ -688,7 +692,7 @@ void init_psutil_posix(void)
PyObject *module = Py_InitModule("_psutil_posix", PsutilMethods);
#endif

-#if defined(PSUTIL_BSD) || defined(PSUTIL_OSX) || defined(PSUTIL_SUNOS)
+#if defined(PSUTIL_BSD) || defined(PSUTIL_OSX) || defined(PSUTIL_SUNOS) || defined(_AIX)
You should define PSUTIL_AIX in setup.py and use it in here
I added a new commit. Still no AIX-specific tests and no unicode string handling, but I fixed some broken functions and started changing the unit tests a bit. I think this looks better.
# an empty list means there were no connections for process or
# process is no longer active so we force NSP in case the PID
# is no longer there.
if not ret:
I suggest removing the if not ret line. There may be some connections but the process may have disappeared beforehand, so we basically want to always call os.stat.
This is again the same as in _pssunos.py
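A rough sketch of the "always stat" suggestion, under stated assumptions: `NoSuchProcess` here is a local stand-in for the psutil exception, and `fetch` and `procfs_path` are hypothetical parameters introduced only for this example.

```python
import os


class NoSuchProcess(Exception):
    """Hypothetical stand-in for psutil.NoSuchProcess."""


def connections(pid, procfs_path, fetch):
    ret = fetch(pid)
    # Stat the /proc entry unconditionally: the process may have
    # disappeared even though some connections were collected.
    try:
        os.stat(os.path.join(procfs_path, str(pid)))
    except FileNotFoundError:
        raise NoSuchProcess(pid)
    return ret
```

Dropping the `if not ret` guard costs one extra stat call per invocation but removes the window where stale connections are returned for a process that is already gone.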
psutil/_psaix.py
Outdated
@wrap_exceptions
def terminal(self):
    psinfo = self._proc_basic_info()
    ttydev = psinfo[-1]
I think you'd better use a "fixed" integer (5, 6 or whatever) because if you add a new element to the returned list this will break
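One way to follow this advice is to name the fixed positions once and index through that map. The sketch below assumes a made-up field layout; the real tuple returned by `_proc_basic_info()` is not shown in the thread.

```python
# Hypothetical field layout, for illustration only.
proc_info_map = dict(ppid=0, rss=1, vms=2, create_time=3, ttynr=4)


def terminal_device(psinfo):
    """Look up the tty device number by a fixed, named index."""
    return psinfo[proc_info_map["ttynr"]]


basic = (1, 4096, 8192, 1500000000.0, 7)
extended = basic + ("new-field",)  # a field appended later
```

With `psinfo[-1]` the lookup on `extended` would silently return the new field; the fixed index keeps returning the tty number.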
psutil/_psaix.py
Outdated
if PY3:
    stdout, stderr = [x.decode(sys.stdout.encoding)
                      for x in (stdout, stderr)]
if "no such process" in stderr:
perhaps do stderr.lower()?
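A small sketch of the case-insensitive check being suggested. The helper name is hypothetical, and the exact wording the AIX tool prints is not given in the thread.

```python
def is_no_such_process(stderr):
    """Case-insensitive check for the 'no such process' marker."""
    return "no such process" in stderr.lower()
```

Lower-casing the captured text keeps the match working if the tool capitalizes the message differently across releases.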
psutil/_psaix.py
Outdated
path = path.strip()
if path.startswith("//"):
    path = path[1:]
if path == "Cannot be retrieved":
perhaps do path.lower()?
psutil/_psutil_aix.c
Outdated
Py_XDECREF(py_tuple);
Py_DECREF(py_retlist);
if (ut != NULL)
    endutent();
You use endutxent(); on line 288, not this. Why?
Looks like you're right.
Again, this is the same as in the sunos implementation.
psutil/_psutil_aix.c
Outdated
}
}
endutxent();
if (boot_time != 0.0) {
Are there cases where time can be 0.0? Perhaps we need a comment here?
Not sure. This too is copied from _psutil_sunos.c.
boot_time will be 0.0 if the loop breaks without finding BOOT_TIME. We can add a comment, but again this is common implementation with Solaris.
Oh yeah. I now realize that sunos has different "minor" issues like this one (and others I mentioned here). I will stop being picky and just focus on more important aspects. =)
There are definitely some comments that are relevant and should be fixed in Solaris too. There are also a number of tests that fail in Solaris as well as on AIX that can be fixed in both. I'm keeping track of your comments and may submit a separate PR to fix them after this one.
psutil/_psutil_aix.c
Outdated
cpu.syscall,
cpu.devintrs,
cpu.softintrs,
cpu.traps
In TODO.aix you state all of these numbers are 0, correct?
Not all of them, only syscall and traps. I'm not sure why.
OK then. If it's an OS problem there's nothing we can do.
Please just add 2 XXX comments to signal that syscall and traps fields are always 0.
Ugh. My bad, there was a bug here (I don't like it when I have bugs :/). The values are ulonglong, so should be built with K instead of I. The values are not 0.
Perfect. Just update the doc/TODO.aix then
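As a pure illustration of why a width mismatch can surface as zeros, here is a sketch using `struct`. It is an assumption, not a description of what Py_BuildValue did in this case (the thread does not say, and a mismatched format is undefined behaviour in C): it only shows that the first four bytes of a small 64-bit big-endian value are all zero.

```python
import struct

counter = 12345  # hypothetical 64-bit counter value

# Lay the value out as an 8-byte big-endian unsigned long long ...
raw = struct.pack(">Q", counter)

# ... and read only the first 4 bytes as a 32-bit unsigned int.
(as_uint,) = struct.unpack(">I", raw[:4])

# Reading all 8 bytes with the matching width recovers the value.
(as_ulonglong,) = struct.unpack(">Q", raw)
```

Here `as_uint` is 0 while `as_ulonglong` is 12345, which is consistent with counters that looked like "always 0" until the format code matched the C type.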
inet_ntop(fam, raddr, raddr_str, sizeof(raddr_str));
return Py_BuildValue("(iii(si)(si)ii)", fd, fam,
                     s.so_type, laddr_str, lport, raddr_str,
                     rport, state, pid);
I suspect returning 2 tuples in a single shot like this ((si)) will leak memory. You need 3 tuples, 2 for the addresses and 1 for the connection. It's tedious work but you can pretty much copy & paste from here:
Lines 1332 to 1349 in f435c2b
py_laddr = Py_BuildValue("(si)", lip, lport);
if (!py_laddr)
    goto error;
if (rport != 0)
    py_raddr = Py_BuildValue("(si)", rip, rport);
else
    py_raddr = Py_BuildValue("()");
if (!py_raddr)
    goto error;
// construct the python list
py_tuple = Py_BuildValue(
    "(iiiNNi)", fd, family, type, py_laddr, py_raddr, state);
if (!py_tuple)
    goto error;
if (PyList_Append(py_retlist, py_tuple))
    goto error;
Py_DECREF(py_tuple);
Note: you can use make test-memleaks to make sure of this.
What would be the result of make test-memleaks if there is a memory leak? Running this command returned 3 failed tests:
test_memory_leaks.TestProcessObjectLeaks.test_num_ctx_switches
test_memory_leaks.TestTerminatedProcessLeaks.test_io_counters
test_memory_leaks.TestTerminatedProcessLeaks.test_num_ctx_switches
The failures don't seem to be due to memory leaks.
Let's think about memory leaks later, after you fixed unicode strings and /proc fs location.
psutil/arch/aix/net_connections.c
Outdated
msz = (size_t)(PROCSIZE * PROCINFO_INCR);
processes = (struct procentry64 *)malloc(msz);
if (!processes) {
    PyErr_SetFromErrno(PyExc_OSError);
PyErr_NoMemory()
psutil/arch/aix/net_connections.c
Outdated
msz = (size_t)(PROCSIZE * (Np + PROCINFO_INCR));
processes = (struct procentry64 *)realloc((char *)processes, msz);
if (!processes) {
    PyErr_SetFromErrno(PyExc_OSError);
PyErr_NoMemory()
psutil/arch/aix/net_connections.c
Outdated
if (!fds) {
    fds = (struct fdsinfo64 *)malloc((size_t)FDSINFOSIZE);
    if (!fds) {
        PyErr_SetFromErrno(PyExc_OSError);
PyErr_NoMemory()
setup.py
Outdated
@@ -254,6 +264,8 @@ def get_ethtool_macro():
if platform.release() == '5.10':
    posix_extension.sources.append('psutil/arch/solaris/v10/ifaddrs.c')
    posix_extension.define_macros.append(('PSUTIL_SUNOS10', 1))
if AIX:
elif is better
Good work on addressing the first set of comments! I'm gonna add another big request for change. Copy this in Then define a new attr in the On the C side of things you'll now have to accept 2 args, the PID and /proc path prefix, like this: Not sure if I've been clear (I hope =)).
What's the reason for making
I think, instead of moving the
What do you think?
How can you use that from
It's not, BSD systems do not rely on /proc.
It's a matter of import order. If you define it in
Added more commits :)
Great work with unicode handling and /proc changes!
I added some comments about the C code.
When those are fixed I suppose we can start looking at the output of make test and make test-memleaks.
psutil/TODO.aix
Outdated
test_procinfo missing API
test_cpu_stats cpu_stats always returns ctx_switches=0
test_disk_usage df returns "-" for all procfs fields but API returns something real
test_pid_exists_2 there are pids in /proc that don't really exist
This still worries me and should be discussed. I suppose I'll have a better idea of what this means once I look at the test results.
psutil/__init__.py
Outdated
switches performed by this process.
"""
return self._proc.num_ctx_switches()
if not AIX:
I suggest using if hasattr(_psplatform.Process, "num_ctx_switches"): instead
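The hasattr suggestion above amounts to feature detection instead of a platform check. A self-contained sketch, where `FakePlatformProcess` is a hypothetical stand-in for `_psplatform.Process` on a platform that lacks the method:

```python
class FakePlatformProcess:
    """Hypothetical platform class that lacks num_ctx_switches()."""

    def name(self):
        return "python"


class Process:
    def __init__(self):
        self._proc = FakePlatformProcess()

    def name(self):
        return self._proc.name()

    # Expose the method only when the platform implementation has it.
    if hasattr(FakePlatformProcess, "num_ctx_switches"):

        def num_ctx_switches(self):
            return self._proc.num_ctx_switches()
```

The benefit over `if not AIX:` is that the public class picks the method up automatically if a platform module gains it later.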
@@ -2110,7 +2120,7 @@ Constants
.. _const-procfs_path:
.. data:: PROCFS_PATH

-The path of the /proc filesystem on Linux and Solaris (defaults to
+The path of the /proc filesystem on Linux, Solaris and AIX (defaults to
``"/proc"``).
You may want to re-set this constant right after importing psutil in case
your /proc filesystem is mounted elsewhere or if you want to retrieve
Please also add the new AIX constant in the doc and mark it as .. versionadded:: 5.4.0.
The docs have a section called .. _const-oses:. Is it OK if I add AIX to this section with:
.. versionchanged:: 5.4.0 added AIX
in order to keep this const in the same group as the rest?
fam = ifr->ifr_addr.sa_family;

if (fam == AF_INET || fam == AF_INET6) {
    cifa = (struct ifaddrs *) calloc(1, sizeof(struct ifaddrs));
It seems you need to check the calloc return value
psutil/arch/aix/net_connections.c
Outdated
if (lseek64(Kd, (off64_t)addr, L_SET) == (off64_t)-1)
    return(1);
br = read(Kd, buf, len);
return((br == len) ? 0 : 1);
If I understand this correctly you return 1 (meaning error) in case of len/size mismatch. Whereas the previous lseek and read calls set errno, this line (53) won't, so I think you need to do this:
if (br != len) {
    PyErr_SetString(PyExc_RuntimeError,
                    "size mismatch when reading kernel memory fd");
    return 1;
}
return 0;
Also, since you set the exception in here, you should do the same for the 2 calls above (lseek and read) and do PyErr_SetFromErrno(PyExc_OSError);.
Of course the callers of this function should just return NULL.
psutil/arch/aix/net_connections.c
Outdated
if (fam == AF_INET || fam == AF_INET6) {
    /* Read protocol control block */
    if (!s.so_pcb
        || psutil_kread(Kd, (KA_T) s.so_pcb, (char *) &inp, sizeof(inp))) {
And here
psutil/arch/aix/net_connections.c
Outdated
return NULL;
}
if ((KA_T) f.f_data != (KA_T) unp.unp_socket) {
    PyErr_SetFromErrno(PyExc_OSError);
Mmmm I suppose errno will not be set here, so this will fail with an "empty" exception. Perhaps you want RuntimeError?
size_t msz;
pid32_t requested_pid;
pid32_t pid;
int Np = 0; /* number of processes */
Why not call it num_procs? It would be less confusing also because there's another np variable.
This is from lsof code, which is pretty terrible:
https://fossies.org/dox/lsof_4.89_src/aix_2dproc_8c_source.html
Interestingly both np and Np are for the number of processes in processes. Np is the number allocated and np is the number read. I know these are not good names but I prefer to stick to the original names for comparing to the original code (it helped just now).
I see. OK then. And yes, I remember lsof source code: I remember I gave up because I didn't understand it. =)
psutil/arch/aix/net_connections.c
Outdated
Kd = open(KMEM, O_RDONLY, 0);
if (Kd < 0) {
    PyErr_SetFromErrno(PyExc_OSError);
Looks like you can use PyErr_SetFromErrnoWithFilename instead
}

if (i > 0)
    np += i;
Is this on purpose?
Also from lsof code, line 476. This code is tricky but it's correct. We're reading the processes into p in PROCINFO_INCR increments. We realloc the next "block" for processes every time we read a full PROCINFO_INCR block. The last call can return fewer than PROCINFO_INCR processes, but we still read that block and need to go over it, so np must be increased by the number of processes in the last block, which is in i. See the docs for getprocs64.
Confusing, I know.
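The block-wise reading described above can be sketched in Python. This is not a binding to getprocs64: `get_batch` is a hypothetical callable that returns up to `count` entries starting at `start`, and the block size is shrunk for the demo.

```python
PROCINFO_INCR = 4  # hypothetical block size for the demo


def read_all(get_batch):
    """Collect entries in PROCINFO_INCR blocks until a short block."""
    processes = []
    np_read = 0  # plays the role of `np`: entries actually read
    while True:
        batch = get_batch(np_read, PROCINFO_INCR)
        i = len(batch)
        processes.extend(batch)  # stands in for filling the buffer
        if i >= PROCINFO_INCR:
            np_read += PROCINFO_INCR
            continue  # full block: grow and read again
        if i > 0:
            np_read += i  # the last, partial block still counts
        break
    return processes, np_read


def fake_get_batch(start, count):
    # Hypothetical data source holding 10 entries.
    table = list(range(100, 110))
    return table[start:start + count]
```

With 10 entries and a block size of 4, the loop sees two full blocks and one partial block of 2; skipping the final `np_read += i` would drop those last two entries.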
Added a new commit and uploaded the latest test results to https://gist.github.com/wiggin15/1d76d1dc1ad50e6665fe4dc5c6fa898a
This is easy: just change scripts/procinfo.py so that it won't print this info on AIX
This means
Just return an empty string in cwd() method of _psaix.py instead of None.
Just skip this one.
OK, you said
I would just change the test and do a comparison of the first 30 chars if you're on AIX, else just do what we're currently doing.
Same here. If we know io_counters()'s read_count is always zero just skip this specific test line when on AIX.
This is bad and should be fixed. The test may also not pass but we should never have
Again, if we know 'ctx_switches' and 'interrupts' are always 0 just skip this test on solaris (just add a comment / note)
This is bad. It means pid_exists() is broken. Does AIX have /proc/{pid}/status? Linux uses a specific logic for pid_exists() as it takes the Tid (thread ID) into account. I wonder if AIX does the same. What about memory leak tests?
Note: it may also be
Added a new commit
I noticed that the test for
I realized why it failed. It's because I used
OK, so it turns out you can create zombie processes and there was indeed a bug in
The problem isn't that we cut to 30 characters (according to the struct in
The PIDs in question do exist, in
I traced this back to an issue with our environment, again.
Those tests pass now after a few small changes. New test results in: https://gist.github.com/wiggin15/d1054f0a4a172a572dd610aad649dfa7
OK, so I suppose the
Some more comments. We're almost there.
psutil/TODO.aix
Outdated
@@ -0,0 +1,14 @@
AIX support is experimental and incomplete at this time.
I would just say it's experimental at this point, but not incomplete.
psutil/_psaix.py
Outdated
avail = free * PAGE_SIZE
used = inuse * PAGE_SIZE
percent = usage_percent((total - avail), total, _round=1)
return svmem(total, avail, percent, used, free)
I want to stress this a bit more because I really think it's important. Why can't we check for "close" values? How much do the two values differ? Note, the idiom currently used in tests is this one (taken from BSD):
@unittest.skipIf(not MUSE_AVAILABLE, "muse not installed")
@retry_before_failing()
def test_muse_vmem_active(self):
    num = muse('Active')
    self.assertAlmostEqual(psutil.virtual_memory().active, num,
                           delta=MEMORY_TOLERANCE)
I would like to have something like this at least for total memory (that value should NOT change).
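A runnable sketch of the tolerance-based comparison requested above. Both data sources are hypothetical stand-ins (no command-line tool is parsed and psutil is not imported), and the `MEMORY_TOLERANCE` value is made up for the demo.

```python
import unittest

MEMORY_TOLERANCE = 500 * 1024  # hypothetical tolerance, in bytes


def fake_tool_total():
    # Hypothetical stand-in for parsing a command-line tool's output.
    return 8 * 1024 ** 3


def fake_psutil_total():
    # Hypothetical stand-in for psutil.virtual_memory().total.
    return 8 * 1024 ** 3 + 4096


class TestTotalMemory(unittest.TestCase):
    def test_total(self):
        self.assertAlmostEqual(
            fake_psutil_total(), fake_tool_total(), delta=MEMORY_TOLERANCE)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestTotalMemory)
result = unittest.TestResult()
suite.run(result)
```

For a value such as total memory, which should not change between the two reads, the delta could arguably be set to 0; the tolerance matters more for fields like free or active memory.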
total, free, sin, sout = cext.swap_mem()
used = total - free
percent = usage_percent(used, total, _round=1)
return _common.sswap(total, used, free, percent, sin, sout)
Same here. It would be good to check this against a cmdline tool, at least the total swap.
psutil/_psaix.py
Outdated
if p.returncode != 0:
    raise RuntimeError("%r command error\n%s" % (cmd, stderr))
processors = stdout.strip().splitlines()
return len(processors)
OK (for the C part). Just do return len(processors) or None instead here.
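A tiny sketch of the `or None` idiom suggested here, with a hypothetical helper name and made-up sample output (the real command and its output format are not shown in the thread):

```python
def logical_cpu_count(stdout):
    """Count one processor per output line, or None if there are none."""
    processors = stdout.strip().splitlines()
    return len(processors) or None
```

Because `0 or None` evaluates to `None`, an empty listing is reported as "undetermined" rather than as zero CPUs.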
psutil/_psaix.py
Outdated
ctx_switches, interrupts, soft_interrupts, syscalls, traps = \
    cext.cpu_stats()
return _common.scpustats(
    ctx_switches, interrupts, soft_interrupts, syscalls)
Wonder if there's some cmdline tool to check one of these against.
I bumped into this:
https://www.ibm.com/developerworks/community/blogs/aixpert/entry/mpstat_d_and_the_undocumented_stats133?lang=en
Not sure if it's easily parsable. That aside, it looks like you can count the number of rows which gives you the number of CPUs. You may want to use that to test the cpu_count() function.
psutil/_psutil_aix.c
Outdated
cpu.syscall,
cpu.devintrs,
cpu.softintrs,
cpu.traps
Please just add 2 XXX comments to signal that syscall and traps fields are always 0.
diskt[i].rblks * diskt[i].bsize,
diskt[i].wblks * diskt[i].bsize,
diskt[i].rserv / 1000 / 1000, // from nano to milli secs
diskt[i].wserv / 1000 / 1000 // from nano to milli secs
Please add an XXX comment stating which field is always set to 0
These values (disk_io_counters) are ok, it's proc_io_counters that's returning 0s. I will add XXX comments there.
psutil/arch/aix/ifaddrs.h
Outdated
unsigned int ifa_flags;
struct sockaddr *ifa_addr;
struct sockaddr *ifa_netmask;
struct sockaddr *ifa_dstaddr;
Please indent by 4
psutil/tests/test_contracts.py
Outdated
@@ -282,6 +284,9 @@ def test_fetch_all(self):
    'send_signal', 'suspend', 'resume', 'terminate', 'kill', 'wait',
    'as_dict', 'parent', 'children', 'memory_info_ex', 'oneshot',
])
if AIX:
    # "<exiting>" processes really don't have names
    excluded_names.add('name')
I wouldn't do this. Instead I would change the name method (defined later) like this:
def name(self, ret, proc):
    self.assertIsInstance(ret, str)
    if not AIX:
        assert ret
    # ...else on AIX "<exiting>" processes (zombies) don't have names
psutil/tests/test_memory_leaks.py
Outdated
@@ -288,6 +289,7 @@ def test_num_fds(self):
self.execute(self.proc.num_fds)

@skip_if_linux()
@unittest.skipIf(AIX, "num_ctx_switches not available on AIX")
Please instead define a HAS_NUM_CTX_SWITCHES global in tests/__init__.py and use it here. I prefer that way so I can keep track of supported things in a single place, in case their support is added later.
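A sketch of what such a capability flag could look like when paired with `unittest.skipIf`. `FakePlatformProcess` is a hypothetical stand-in for the class being probed, and the flag definition is an assumption about how it might be written, not the code that was merged.

```python
import unittest


class FakePlatformProcess:
    """Hypothetical platform class without num_ctx_switches()."""


# Single place to record whether the feature exists.
HAS_NUM_CTX_SWITCHES = hasattr(FakePlatformProcess, "num_ctx_switches")


class TestLeaks(unittest.TestCase):
    @unittest.skipIf(not HAS_NUM_CTX_SWITCHES, "not supported")
    def test_num_ctx_switches(self):
        self.fail("should have been skipped in this demo")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestLeaks)
result = unittest.TestResult()
suite.run(result)
```

Keying the skip on a capability flag rather than on `AIX` means the test starts running on its own once the platform gains the method.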
Also, can you please make sure all C lines are less than 80 chars long?
Added a new commit with AIX-specific tests
As far as I'm concerned this looks good enough to be merged at this point so just tell me when you want me to do it.
I think that this is ready now, except I'm not sure TODO.aix should be where it is (or stay at all), and maybe you want to update the main docs about the state of support for AIX (experimental?)
It's OK, I will update doc / README, etc. before the next release, once I'm back from China. In the meantime, thanks a lot. You really did amazing work. At first I was skeptical about adding support for a platform I can't test against, but the quality of the code you provided is excellent and you were great at addressing all my remarks. I hope you will keep hacking on psutil as a main contributor.
Following #605, this is the proposed branch for AIX support.
The following functions and methods are not supported on AIX in this branch:
Not all unit tests pass on AIX at this time (I didn't make any changes to unit tests yet).