CI/BUG?: test_pct_max_many_rows crashes on travis-27. #23726

h-vetinari · 2018-11-15T21:09:21Z

Following Ian Fleming ("Once is happenstance. Twice is coincidence. Three times is enemy action"), here are the three most recent builds from #23582 -- which only changes tests -- wherein there is a single failure/crash in the travis-27.yaml build for pandas/tests/series/test_rank.py::test_pct_max_many_rows, which looks like this:

[...]
........................................................................ [ 87%]
..........................................................[gw0] node down: Not properly terminated
fReplacing crashed worker gw0
gw2 [39907] / gw1 [39907]. [ 87%]
........................................................................ [ 87%]
........................................................................ [ 87%]
[...]
........................................................................ [ 99%]
.....
=================================== FAILURES ===================================
_______________________ pandas/tests/series/test_rank.py _______________________
[gw0] linux2 -- Python 2.7.15 /home/travis/miniconda3/envs/pandas/bin/python
Worker 'gw0' crashed while running 'pandas/tests/series/test_rank.py::test_pct_max_many_rows'

https://travis-ci.org/pandas-dev/pandas/jobs/455175414
https://travis-ci.org/pandas-dev/pandas/jobs/455259318
https://travis-ci.org/pandas-dev/pandas/jobs/455642129

Interestingly, the Azure 2.7 builds pass on both Windows and Linux. Any ideas?


jbrockmendel · 2018-11-15T22:04:17Z

I’ve noticed this too. I’m guessing it’s related to #23688. @jschendel any ideas?

h-vetinari · 2018-11-15T23:22:24Z

@jbrockmendel
Definitely something weird going on.

>>> import numpy as np
>>> import pandas as pd
>>> s = pd.Series(np.arange(2**24 + 1))
>>>
>>> s.astype(int).rank(pct=True).max()
1.0
>>> s.astype(np.int32).rank(pct=True).max()
1.0
>>> s.astype(np.int64).rank(pct=True).max()
1.0
>>> s.astype(np.uint32).rank(pct=True).max()
1.0
>>> s.astype(np.uint64).rank(pct=True).max()
1.0
>>> s.astype(float).rank(pct=True).max()
1.0
>>> s.astype(np.float64).rank(pct=True).max()
1.0
>>> s.astype(np.float32).rank(pct=True).max()
MemoryError
>>> # after this, every following call to s.astype(<whatever>).rank(pct=True) also fails, e.g.
>>> s.astype(int).rank(pct=True).max()
MemoryError
>>> # Interestingly, even after the MemoryError, pct=False works
>>> s.astype(int).rank().max()
16777217.0

I can't really see how anything in that test should cast to float32, but maybe there's a resource leak somewhere?
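For context on the numbers in the session above, here is a small sketch (not from the thread) of two facts about n = 2**24 + 1: it is the smallest positive integer that float32 cannot represent exactly, and a single array of that length already takes 64 to 128 MiB depending on dtype. Whether rank uses float32 anywhere internally is not established here, and the footprint figures cover only one array, ignoring any temporaries rank may allocate.

```python
import numpy as np

n = 2**24 + 1

# float32 has a 24-bit significand, so n rounds to 2**24 when cast;
# float64 holds it exactly.
as_f32 = np.float32(n)
print(int(as_f32) == n)          # False
print(int(np.float64(n)) == n)   # True

# Footprint in MiB of ONE array of n elements per dtype.
# Assumption: this ignores every intermediate array that rank creates.
footprint_mib = {}
for dtype in (np.int64, np.float64, np.float32):
    mib = n * np.dtype(dtype).itemsize / 2**20
    footprint_mib[dtype.__name__] = round(mib, 1)
print(footprint_mib)
```

If several such arrays are alive at once inside one worker, a few hundred MiB of transient usage seems plausible, which would fit an intermittent MemoryError better than a logic bug.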

TomAugspurger · 2018-11-16T15:29:34Z

Is this crash 2.7 only, or have we seen it on python 3 builds?

h-vetinari · 2018-11-16T16:44:12Z

I've only ever seen it in the travis-27 build.

TomAugspurger · 2018-11-16T17:24:43Z

Unfortunately I haven't been able to reproduce the segfault locally on my Mac.

h-vetinari · 2018-11-19T22:33:59Z

I can reproduce the failure locally (on Windows), but I only get a MemoryError, not a segfault. If the CI crash is really a MemoryError, that would explain why it only fails sometimes: it depends on how busy the worker is with other code at that moment.
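One way to put a number on the memory pressure is to record the peak traced allocation around the rank call. The sketch below is an illustration only: `peak_mib` is a hypothetical helper, `tracemalloc` exists only on Python 3 (so it cannot run on the failing 2.7 build), and the series is deliberately much smaller than the one in the test so that the snippet is cheap to run.

```python
import tracemalloc

import numpy as np
import pandas as pd


def peak_mib(func):
    """Run func() and return the peak traced allocation in MiB.

    Hypothetical helper for illustration; it only sees allocations
    that are reported to tracemalloc.
    """
    tracemalloc.start()
    try:
        func()
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak / 2**20


# Assumption: a small n keeps this cheap; the failing test uses 2**24 + 1.
n = 2**16 + 1
s = pd.Series(np.arange(n))

peak = peak_mib(lambda: s.rank(pct=True).max())
print("peak traced MiB for n=%d: %.2f" % (n, peak))
```

Scaling n up towards 2**24 + 1 on a machine with enough RAM should give a rough idea of how much headroom a worker needs while this test runs.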

jschendel · 2018-11-19T23:12:26Z

Yeah, my suspicion is that this is happening due to high memory usage combined with pytest-xdist running things in parallel, which could cause different tests to be run simultaneously on different runs. At least my google searches seem to indicate that this type of error primarily occurs with pytest-xdist specifically.

I'd be somewhat surprised if this was due to a segfault, as different components of the failing test always appear to be passing in other tests. For example, the 1d version in test_algos does the exact same thing as the failing test, with the data just not being wrapped in a Series:

pandas/pandas/tests/test_algos.py

Lines 1465 to 1472 in deb7b4d

    
    @pytest.mark.parametrize('values', [
        np.arange(2**24 + 1),
        np.arange(2**25 + 2).reshape(2**24 + 1, 2)],
        ids=['1d', '2d'])
    def test_pct_max_many_rows(self, values):
        # GH 18271
        result = algos.rank(values, pct=True).max()
        assert result == 1

Additionally, the DataFrame version of this test is essentially testing the same thing on two columns of similar data, and calls the same code in generic.py that the Series version does:

pandas/pandas/tests/frame/test_rank.py

Lines 313 to 318 in deb7b4d

    
    def test_pct_max_many_rows(self):
        # GH 18271
        df = DataFrame({'A': np.arange(2**24 + 1),
                        'B': np.arange(2**24 + 1, 0, -1)})
        result = df.rank(pct=True).max()
        assert (result == 1).all()

So if there is a segfault, it seems like it has to be something tied to the Series constructor, otherwise we'd see failures in the other tests? And on the Azure 2.7 builds too, presumably?

To test this theory, I've made a commit where I marked these tests with pytest.mark.single, which I believe marks a subset of tests to be run in non-distributed mode (correct me if I'm wrong). I have travis hooked up to my local repo, and it looks to be passing with these changes: https://travis-ci.org/jschendel/pandas/builds/457135619

Will open a PR with the changes above soon.
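For readers unfamiliar with the marker, this is roughly what marking the Series test looks like. It is a sketch reconstructed from the description above, not the actual diff from the PR; the test body is modelled on the DataFrame version quoted earlier.

```python
import numpy as np
import pandas as pd
import pytest


@pytest.mark.single
def test_pct_max_many_rows():
    # GH 18271
    # Sketch of the Series variant; the real test lives in
    # pandas/tests/series/test_rank.py and may differ in detail.
    s = pd.Series(np.arange(2**24 + 1))
    result = s.rank(pct=True).max()
    assert result == 1
```

The marker does nothing by itself; it only matters if CI selects on it. Assuming the build runs the marked tests in a separate, non-distributed pass (something like `pytest -m single` without `-n`, and `pytest -m "not single" -n 2` for the rest; the exact commands are an assumption here), this test no longer shares memory with whatever another xdist worker happens to be running.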

jreback · 2018-11-19T23:16:23Z

yeah marking for single is the best

h-vetinari mentioned this issue Nov 15, 2018

TST: add method/dtype coverage to str-accessor; precursor to #23167 #23582

Merged

3 tasks

This was referenced Nov 16, 2018

TST/CLN: Fix/clean pytables test #23732

Merged

DOC: more consistent flake8-commands in contributing.rst #23724

Merged

dcreekp mentioned this issue Nov 16, 2018

TST: clean up to change setup.cfg to xfail_strict = True (GH23057) #23721

Merged

4 tasks

gfyoung added the 2/3 Compat, Needs Tests, and Algos labels Nov 16, 2018

simonjayhawkins mentioned this issue Nov 17, 2018

TST/CLN: refactor tests\io\formats\test_to_html.py #23747

Merged

2 tasks

h-vetinari mentioned this issue Nov 17, 2018

CI for boto: fix errors; add coverage; add skip for uncatchable ResourceWarning #23731

Merged

2 tasks

jschendel mentioned this issue Nov 20, 2018

TST: Mark test_pct_max_many_rows with pytest.mark.single #23799

Merged

2 tasks

jreback added this to the 0.24.0 milestone Nov 20, 2018

jreback closed this as completed in #23799 Nov 20, 2018
