Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG/CLN: datetimelike Index.equals may return True with non-Index #13986

Closed
wants to merge 1 commit into from

Conversation

sinhrks
Copy link
Member

@sinhrks sinhrks commented Aug 13, 2016

@codecov-io
Copy link

codecov-io commented Aug 13, 2016

Current coverage is 85.26% (diff: 87.50%)

Merging #13986 into master will decrease coverage by <.01%

@@             master     #13986   diff @@
==========================================
  Files           139        139          
  Lines         50562      50557     -5   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
- Hits          43116      43108     -8   
- Misses         7446       7449     +3   
  Partials          0          0          

Powered by Codecov. Last update 59524af...580151a

@sinhrks sinhrks changed the title BUG/CLN: datetimelike Index.equals may return True other than Index BUG/CLN: datetimelike Index.equals may return True with non-Index Aug 13, 2016
self.assertTrue(idx.equals(idx))
self.assertTrue(idx.equals(idx.copy()))
self.assertTrue(idx.equals(idx.astype(object)))
self.assertFalse(idx.equals(list(idx)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currenlty I get

In [2]: idx = pd.Index([1,2])

In [3]: idx.equals(list(idx))
Out[3]: True

But from the test I understand that this should return False. What am I missing?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jorisvandenbossche thx for the catch. Found numerics uses different impl, I misunderstood it has been tested...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I don't understand how your tests passed?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah we can astype on the type, but it must be an Index for equal to compare.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it must be an Index for equal to compare.

@jreback that is not the current behaviour (at least for numeric indexes, see example above)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What "doesn't" ?
The examples I show are from master, so at the moment the dtype does not matter (as long as the values are equal).

I understand that it is not because it is the current behaviour, that it should be like that.
However, I think it does make some sense:

  • Index.identical -> strict version, also checks dtype and attributes (and therefore, other has to be an Index, as series/list/array does not have the same metadata)
  • Index.equal -> only looks at the values (and therefore, other does not need to be Index as it is the content that matters)

Note that the current PR of @sinhrks also does not check dtype in equals (as we try to astype before checking equivalence)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as base Index has the logic to check input class, i understand it is a original intension.

that is true. But the reality is that all other subclasses except base index do not have this check. Restricting those would be a back incompat API change, while we also have the option to relax it for the base class (which is also an API change, but IMO less likely to break code)

Copy link
Member Author

@sinhrks sinhrks Aug 13, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jorisvandenbossche astype is introduced by myself, it intends to keep DatetimeIndex.equals(object Index with same content) result as True.

But I feel it shouldn't cast at all, because following result is bit surprising.

# on current master
i1 = pd.Index([np.datetime64('2011-01-01')], dtype=object)
i2 = pd.Index([pd.Timestamp('2011-01-01')], dtype=object)
i1.equals(i2)
# False

pd.DatetimeIndex(i1).equals(i2)
# True

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed ok with coercing an Index but not anything else (ndarray, list, Series)

so much can go wrong with auto coercion in this kind of test

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

though have to be careful about
and Index of datetime strings being compared equal these need to be strict for that reason

@sinhrks sinhrks force-pushed the dti_equals branch 3 times, most recently from f7588f5 to 430e624 Compare August 14, 2016 13:26
@sinhrks
Copy link
Member Author

sinhrks commented Aug 14, 2016

Fixed impl and tests (added tests are not properly executed because it was overriden.

I'm still +1 to .equals should be False for non-Index, but there seems to be some tests which accepts list and returns True.

So now I think we should keep current behavior for a while. Needs to decide whether to change it in future.

@@ -400,7 +400,7 @@ def test_astype(self):
casted = self.intIndex.astype('i8')
self.assertEqual(casted.name, 'foobar')

def test_equals(self):
def test_equals_object(self):
# same
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this tests doesn't propogate to the other tests classes. So maybe need a more generic one?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nvm. I see we override this in the sub-classes.

@jreback
Copy link
Contributor

jreback commented Aug 26, 2016

lgtm. very minor comments. ping on green.

@jorisvandenbossche
Copy link
Member

jorisvandenbossche commented Aug 27, 2016

@sinhrks Can you add some tests that explicitly test the behaviour for indexes with other attributes? (see my comment here #13986 (comment)) If it is not there yet of course.

I would also clarify this behaviour in the documentation.

@jorisvandenbossche
Copy link
Member

Further, I would say that changing Index.equals to return False with non-Index objects is an API change, and not a bug fix, so would add a note about that in the whatsnew.

@jreback
Copy link
Contributor

jreback commented Sep 2, 2016

@sinhrks can you rebase

@jreback jreback closed this in 4488f18 Sep 3, 2016
@jreback
Copy link
Contributor

jreback commented Sep 3, 2016

thanks @sinhrks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG/API: Index.equals should check input class?
4 participants