-
-
Notifications
You must be signed in to change notification settings - Fork 18.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUG/CLN: datetimelike Index.equals may return True with non-Index #13986
Conversation
Current coverage is 85.26% (diff: 87.50%)@@ master #13986 diff @@
==========================================
Files 139 139
Lines 50562 50557 -5
Methods 0 0
Messages 0 0
Branches 0 0
==========================================
- Hits 43116 43108 -8
- Misses 7446 7449 +3
Partials 0 0
|
self.assertTrue(idx.equals(idx)) | ||
self.assertTrue(idx.equals(idx.copy())) | ||
self.assertTrue(idx.equals(idx.astype(object))) | ||
self.assertFalse(idx.equals(list(idx))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currenlty I get
In [2]: idx = pd.Index([1,2])
In [3]: idx.equals(list(idx))
Out[3]: True
But from the test I understand that this should return False. What am I missing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jorisvandenbossche thx for the catch. Found numerics uses different impl, I misunderstood it has been tested...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But I don't understand how your tests passed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah we can astype on the type, but it must be an Index for equal to compare.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it must be an Index for equal to compare.
@jreback that is not the current behaviour (at least for numeric indexes, see example above)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What "doesn't" ?
The examples I show are from master, so at the moment the dtype does not matter (as long as the values are equal).
I understand that it is not because it is the current behaviour, that it should be like that.
However, I think it does make some sense:
Index.identical
-> strict version, also checks dtype and attributes (and therefore, other has to be an Index, as series/list/array does not have the same metadata)Index.equal
-> only looks at the values (and therefore, other does not need to be Index as it is the content that matters)
Note that the current PR of @sinhrks also does not check dtype in equals
(as we try to astype before checking equivalence)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as base Index has the logic to check input class, i understand it is a original intension.
that is true. But the reality is that all other subclasses except base index do not have this check. Restricting those would be a back incompat API change, while we also have the option to relax it for the base class (which is also an API change, but IMO less likely to break code)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jorisvandenbossche astype
is introduced by myself, it intends to keep DatetimeIndex.equals(object Index with same content)
result as True
.
But I feel it shouldn't cast at all, because following result is bit surprising.
# on current master
i1 = pd.Index([np.datetime64('2011-01-01')], dtype=object)
i2 = pd.Index([pd.Timestamp('2011-01-01')], dtype=object)
i1.equals(i2)
# False
pd.DatetimeIndex(i1).equals(i2)
# True
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed ok with coercing an Index but not anything else (ndarray, list, Series)
so much can go wrong with auto coercion in this kind of test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
though have to be careful about
and Index of datetime strings being compared equal these need to be strict for that reason
f7588f5
to
430e624
Compare
Fixed impl and tests (added tests are not properly executed because it was overriden. I'm still +1 to So now I think we should keep current behavior for a while. Needs to decide whether to change it in future. |
@@ -400,7 +400,7 @@ def test_astype(self): | |||
casted = self.intIndex.astype('i8') | |||
self.assertEqual(casted.name, 'foobar') | |||
|
|||
def test_equals(self): | |||
def test_equals_object(self): | |||
# same |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this tests doesn't propogate to the other tests classes. So maybe need a more generic one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nvm. I see we override this in the sub-classes.
lgtm. very minor comments. ping on green. |
@sinhrks Can you add some tests that explicitly test the behaviour for indexes with other attributes? (see my comment here #13986 (comment)) If it is not there yet of course. I would also clarify this behaviour in the documentation. |
Further, I would say that changing Index.equals to return False with non-Index objects is an API change, and not a bug fix, so would add a note about that in the whatsnew. |
@sinhrks can you rebase |
thanks @sinhrks |
git diff upstream/master | flake8 --diff