CI/BUG?: test_pct_max_many_rows crashes on travis-27. #23726

Closed
h-vetinari opened this issue Nov 15, 2018 · 8 comments · Fixed by #23799
Labels
Algos (non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff); Needs Tests (unit test(s) needed to prevent regressions)
Comments

@h-vetinari
Contributor

Following Ian Fleming ("Once is happenstance. Twice is coincidence. Three times is enemy action"), here are the three most recent builds from #23582 -- which only changes tests -- wherein there is a single failure/crash in the travis-27.yaml build for pandas/tests/series/test_rank.py::test_pct_max_many_rows, which looks like this:

[...]
........................................................................ [ 87%]
..........................................................[gw0] node down: Not properly terminated
fReplacing crashed worker gw0
gw2 [39907] / gw1 [39907]. [ 87%]
........................................................................ [ 87%]
........................................................................ [ 87%]
[...]
........................................................................ [ 99%]
.....
=================================== FAILURES ===================================
_______________________ pandas/tests/series/test_rank.py _______________________
[gw0] linux2 -- Python 2.7.15 /home/travis/miniconda3/envs/pandas/bin/python
Worker 'gw0' crashed while running 'pandas/tests/series/test_rank.py::test_pct_max_many_rows'

https://travis-ci.org/pandas-dev/pandas/jobs/455175414
https://travis-ci.org/pandas-dev/pandas/jobs/455259318
https://travis-ci.org/pandas-dev/pandas/jobs/455642129

Interestingly, the Azure 2.7 builds pass for both Windows and Linux. Any ideas?

@jbrockmendel
Member

I’ve noticed this too. I’m guessing related to #23688. @jschendel any ideas?

@h-vetinari
Contributor Author

@jbrockmendel
Definitely something weird going on.

>>> import numpy as np
>>> import pandas as pd
>>> s = pd.Series(np.arange(2**24 + 1))
>>>
>>> s.astype(int).rank(pct=True).max()
1.0
>>> s.astype(np.int32).rank(pct=True).max()
1.0
>>> s.astype(np.int64).rank(pct=True).max()
1.0
>>> s.astype(np.uint32).rank(pct=True).max()
1.0
>>> s.astype(np.uint64).rank(pct=True).max()
1.0
>>> s.astype(float).rank(pct=True).max()
1.0
>>> s.astype(np.float64).rank(pct=True).max()
1.0
>>> s.astype(np.float32).rank(pct=True).max()
MemoryError
>>> # after this, all following calls to s.astype(<whatever>).rank(pct=True) fail too, e.g.
>>> s.astype(int).rank(pct=True).max()
MemoryError
>>> # Interestingly, even after the MemoryError, pct=False works
>>> s.astype(int).rank().max()
16777217.0

I can't really see how anything in that test should cast to float32, but maybe there's a resource leak somewhere?
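For context on why float32 may be the odd one out here: the row count in the reproduction, 2**24 + 1, is the first positive integer that IEEE 754 single precision cannot store exactly. The sketch below checks that with the standard library only (the helper name `to_float32` is made up for illustration; `np.float32` should round the same way). It is an observation about the number format, not a diagnosis of the MemoryError.

```python
import struct


def to_float32(x):
    """Round-trip x through IEEE 754 single precision (hypothetical helper)."""
    return struct.unpack("f", struct.pack("f", x))[0]


n = 2**24 + 1  # number of rows in the reproduction above

print(to_float32(n - 1) == n - 1)  # 2**24 survives the round trip
print(to_float32(n) == n)          # 2**24 + 1 does not
print(to_float32(n))               # rounded to the neighbouring value 2**24
```

So a row counter or rank held in float32 cannot tell 2**24 and 2**24 + 1 apart, which is presumably why the test uses exactly this size.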

@TomAugspurger
Contributor

Is this crash 2.7 only, or have we seen it on python 3 builds?

@h-vetinari
Contributor Author

I've only ever seen it in the travis-27 build.

@TomAugspurger
Contributor

Unfortunately I haven't been able to reproduce the segfault locally on my Mac.

@gfyoung added the 2/3 Compat, Needs Tests, and Algos labels on Nov 16, 2018

@h-vetinari
Contributor Author

I can reproduce the failure locally (on Windows), but I only get a MemoryError, not a segfault. If it is a MemoryError, that could explain why it only fails sometimes: it would depend on how busy the worker is with other code at that moment.

@jschendel
Member

jschendel commented Nov 19, 2018

Yeah, my suspicion is that this is happening due to high memory usage combined with pytest-xdist running things in parallel, so that a different set of tests can end up running simultaneously on each run. At least my Google searches seem to indicate that this type of error occurs primarily with pytest-xdist.
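To put a rough number on "high memory usage", here is a back-of-the-envelope sketch for the 2**24 + 1 rows used in the reproduction. The element size (8 bytes for int64/float64) is standard; the number of same-sized arrays alive at the peak of a rank call is purely an assumption for illustration, as are the names `array_mib` and `assumed_live_copies`.

```python
def array_mib(n_elements, itemsize=8):
    """Size in MiB of a dense array with n_elements items of itemsize bytes."""
    return n_elements * itemsize / float(2**20)


n_rows = 2**24 + 1  # row count from the reproduction

# int64 and float64 both use 8 bytes per element.
one_array = array_mib(n_rows)

# ASSUMPTION: pretend a rank call keeps this many input-sized arrays alive
# at its peak (input, sort order, result, ...). The real count depends on
# the pandas code path and is not measured here.
assumed_live_copies = 4

print("one array:    %.1f MiB" % one_array)
print("assumed peak: %.1f MiB" % (assumed_live_copies * one_array))
```

A single array is about 128 MiB, so a handful of temporaries in one worker, alongside whatever the other xdist workers happen to be running, could plausibly exhaust a small CI machine.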

I'd be somewhat surprised if this was due to a segfault, as different components of the failing test always appear to be passing in other tests. For example, the 1d version in test_algos does the exact same thing as the failing test, with the data just not being wrapped in a Series:

@pytest.mark.parametrize('values', [
    np.arange(2**24 + 1),
    np.arange(2**25 + 2).reshape(2**24 + 1, 2)],
    ids=['1d', '2d'])
def test_pct_max_many_rows(self, values):
    # GH 18271
    result = algos.rank(values, pct=True).max()
    assert result == 1

Additionally, the DataFrame version of this test is essentially testing the same thing on two columns of similar data, and calls the same code in generic.py that the Series version does:

def test_pct_max_many_rows(self):
    # GH 18271
    df = DataFrame({'A': np.arange(2**24 + 1),
                    'B': np.arange(2**24 + 1, 0, -1)})
    result = df.rank(pct=True).max()
    assert (result == 1).all()

So if there is a segfault, it seems like it has to be something tied to the Series constructor, otherwise we'd see failures in the other tests? And on the Azure 2.7 builds too, presumably?

To test this theory, I've made a commit where I marked these tests with pytest.mark.single, which I believe marks a subset of tests to be run in non-distributed mode (correct me if I'm wrong). I have Travis hooked up to my own repo, and it looks to be passing with these changes: https://travis-ci.org/jschendel/pandas/builds/457135619

Will open a PR with the changes above soon.
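
The marking described in this comment can be sketched roughly as follows. The test body is inferred from the reproduction and the sibling tests quoted above, not copied from the actual commit, and `single` is a custom marker that is assumed to be registered in the project's pytest configuration.

```python
import numpy as np
import pytest
from pandas import Series


# `single` is a project-defined marker (assumed registered in the pytest
# config); pytest itself attaches no special meaning to the name.
@pytest.mark.single
def test_pct_max_many_rows():
    # GH 18271
    s = Series(np.arange(2**24 + 1))
    result = s.rank(pct=True).max()
    assert result == 1
```

The marker only tags the test. Keeping it out of the parallel run relies on the CI invoking pytest twice, for example once selecting the marked tests with pytest's `-m` option and no xdist workers, and once with the rest; that split is an assumption about the CI scripts here.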

@jreback
Contributor

jreback commented Nov 19, 2018

yeah marking for single is the best

6 participants