BUG: (iloc) Comparisons of Series have different indexing semantics from DataFrame comparisons #4581

Closed
mairas opened this issue Aug 16, 2013 · 3 comments · Fixed by #13894
Labels: Bug; Error Reporting (incorrect or improved errors from pandas); Indexing (related to indexing on series/frames, not to indexes themselves)
Comments


mairas commented Aug 16, 2013

Consider the following:

In [1]: import numpy as np

In [2]: import pandas as pd

In [3]: df = pd.DataFrame(np.arange(5), columns=["foo"])

In [4]: df["foo"].iloc[:-1] != df["foo"].iloc[1:]
Out[4]:
0    True
1    True
2    True
3    True
dtype: bool

In [5]: df["foo"].iloc[1:] != df["foo"].iloc[:-1]
Out[5]:
1    True
2    True
3    True
4    True
dtype: bool

Apparently, the != operator compares the values by position and applies the index of the first operand to the result. I don't think this is the expected behaviour.

The semantics are different again when performing addition, which aligns on the index labels:

In [6]: df["foo"].iloc[:-1] + df["foo"].iloc[1:]
Out[6]:
0   NaN
1     2
2     4
3     6
4   NaN
dtype: float64

In [7]: df["foo"].values[:-1] + df["foo"].values[1:]
Out[7]: array([1, 3, 5, 7])

Here, everything works as expected.
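
To keep the two interpretations apart regardless of how the operators behave, each one can be spelled out explicitly. This is only a minimal sketch, assuming a pandas version that has Series.to_numpy() and Series.align(); the variable names are made up for illustration:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(5), columns=["foo"])
left = df["foo"].iloc[:-1]   # labels 0..3
right = df["foo"].iloc[1:]   # labels 1..4

# Positional comparison: drop the labels and compare element by element.
positional = left.to_numpy() != right.to_numpy()

# Label-based addition: align on the union of labels first, then add.
# Labels present on only one side become NaN in the aligned copies.
left_aligned, right_aligned = left.align(right, join="outer")
aligned_sum = left_aligned + right_aligned
```

The positional result is a plain NumPy array with no index, so there is no ambiguity about which labels it carries.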

I also tried the dataframe ne method:

In [8]: df.iloc[:-1].ne(df.iloc[1:])
Out[8]:
     foo
0   True
1  False
2  False
3  False
4   True

This works as expected, and differently from Series comparisons. The respective method doesn't exist for Series objects, so I couldn't test that.
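
For anyone reading this later: more recent pandas releases do have flex comparison methods on Series (Series.ne, Series.eq and so on), and they align on labels before comparing. A sketch of the Series counterpart of the DataFrame call above, assuming such a version is installed:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(5), columns=["foo"])
left = df["foo"].iloc[:-1]   # labels 0..3
right = df["foo"].iloc[1:]   # labels 1..4

# Series.ne aligns both operands on the union of their labels.
# A label missing on one side is compared as NaN, and NaN != x is True.
result = left.ne(right)
```

With label alignment, only the labels present on a single side (0 and 4) come out as True, matching the DataFrame result.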

Contributor

hayd commented Aug 16, 2013

Also, weirdly, if you use != rather than ne on DataFrame you get:

df.iloc[:-1] != df.iloc[1:]
Exception: Can only compare identically-labeled DataFrame objects

This behaviour may change after #3482 is merged. cc #4324 @jreback
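
One way to sidestep the exception is to align the two frames explicitly before using the operator. A rough sketch, assuming DataFrame.align with its default outer join; it is not meant as the fix for the underlying inconsistency:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(5), columns=["foo"])

# After align(), both frames share the same index and columns,
# so the != operator no longer sees differently-labeled operands.
left, right = df.iloc[:-1].align(df.iloc[1:])
result = left != right
```

Rows that exist in only one of the inputs are filled with NaN by align(), so they compare as not equal.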

Contributor

jreback commented Aug 16, 2013

I think this is a bug in the iloc/Series implementation (they are separate for efficiency), so this case is not tested.

In [12]: df["foo"].iloc[1:] != df["foo"].iloc[:-1]
Out[12]: 
1    True
2    True
3    True
4    True
dtype: bool

In [13]: df["foo"].iloc[1:] == df["foo"].iloc[:-1]
Out[13]: 
1    False
2    False
3    False
4    False
dtype: bool

In [14]: df['foo'].iloc[1:]
Out[14]: 
1    1
2    2
3    3
4    4
dtype: int64

In [15]: df['foo'].iloc[-1:]
Out[15]: 
4    4
dtype: int64

It works for frames, though:

In [17]: df.iloc[1:] == df.iloc[4]
Out[17]: 
     foo
1  False
2  False
3  False
4   True

Contributor

jreback commented Feb 26, 2014

Actually, I think this is a bug in ne; it should raise (and Series should raise as well).
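
The "should raise" behaviour can be sketched with a small helper. Note that strict_ne is a hypothetical name used only for illustration, not a pandas function; it relies on nothing beyond Index.equals and Series.to_numpy:

```python
import numpy as np
import pandas as pd


def strict_ne(left: pd.Series, right: pd.Series) -> pd.Series:
    """Element-wise != that refuses operands with different labels."""
    if not left.index.equals(right.index):
        raise ValueError("Can only compare identically-labeled Series objects")
    # Labels are identical, so a positional comparison is unambiguous.
    return pd.Series(left.to_numpy() != right.to_numpy(), index=left.index)


s = pd.Series(np.arange(5))

# Differently-labeled slices are rejected instead of silently compared.
try:
    strict_ne(s.iloc[:-1], s.iloc[1:])
    raised = False
except ValueError:
    raised = True

# Identically-labeled operands compare as usual.
same_labels = strict_ne(s, s + 1)
```

Raising keeps the caller from getting a result whose index silently depends on operand order.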


3 participants