Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clear cache hash on de-serialization #489

Merged
merged 15 commits into from
Feb 2, 2019

Conversation

gabbard
Copy link
Member

@gabbard gabbard commented Jan 22, 2019

This fixes a bug ( #482 ) where deserialized objects could have stale cached hash values. There is some remaining work for a rare case which I have shifted to ( #494 ) because it is out-of-scope for my needs and tricky to fix.

Pull Request Check List

This is just a reminder about the most common mistakes. Please make sure that you tick all appropriate boxes. But please read our contribution guide at least once, it will save you unnecessary review cycles!

If an item doesn't apply to your pull request, check it anyway to make it apparent that there's nothing to do.

  • Added tests for changed code.
  • New features have been added to our Hypothesis testing strategy. (n/a)
  • Changes or additions to public APIs are reflected in our type stubs (files ending in .pyi). n/a
    • ...and used in the stub test file tests/typing_example.py.
  • Updated documentation for changed code. n/a
    • New functions/classes have to be added to docs/api.rst by hand.
    • Changes to the signature of @attr.s() have to be added by hand too.
    • Changed/added classes/methods/functions have appropriate versionadded, versionchanged, or deprecated directives.
  • Documentation in .rst files is written using semantic newlines. n/a
  • Changes (and possible deprecations) have news fragments in changelog.d.

If you have any questions to any of the points above, just submit and ask! This checklist is here to help you, not to deter you from contributing!

Ryan Gabbard added 3 commits October 31, 2018 17:04
 Because the hash code cache field gets serialized and deserialized by
 Pickle, previously when you deserialize a cache_hash=True attrs object, the
 hashcode will be the hashcode the object had at serialization-time.
 However, if your object had fields with hash codes which were not
 deterministic between interpreter runs, then on a new interpreter run
 your deserialized object would have a hash code which differs from a
 newly created identical object.

 This commit fixes that by clearing the cache on deserialization. It
 needs to override the __setstate__ method to do so, so this commit
 also forbids using a custom __setstate__ on a cache_hash=True object.

Closes python-attrs#482 .
@gabbard gabbard force-pushed the 482-deserialized-cached-hash branch from 4ea79f3 to 42e8653 Compare January 22, 2019 20:04
@gabbard gabbard changed the title [WIP] Clear cache hash on de-serialization Clear cache hash on de-serialization Jan 22, 2019
Copy link
Member

@hynek hynek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a pretty good first shot! Needs some more polish, but we should be able to merge it soon.

I'm not happy with the way we patch __setstate__ but what can we do…

@@ -0,0 +1,2 @@
Fixes a bug where deserialized objects with ``cache_hash=True`` could
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changelog entry updated

src/attr/_make.py Outdated Show resolved Hide resolved
# Clearing the hash code cache on deserialization is needed
# because hash codes can change from run to run. See issue
# https://github.com/python-attrs/attrs/issues/482 .
# Note that this code only handles setstate for slots classes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same about nomenclature, also in the next line.

# Note that this code only handles setstate for non-slots classes.
# For slots classes, see similar code in _create_slots_class .
if self._cache_hash:
if hasattr(cls, "__setstate__"):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please no hasattr in my codebases. :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh, thanks for enlightening me on the awfulness of hasattr :-)

# For slots classes, see similar code in _create_slots_class .
if self._cache_hash:
if hasattr(cls, "__setstate__"):
raise NotImplementedError(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this should be marked as a breaking change? 🤔 It should be mentioned that this isn't possible anymore in the newsfragment.


# these are for use in TestAddHash.test_cache_hash_serialization
# they need to be out here so they can be un-pickled
HashCacheSerializationTestUncached = make_class(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any particular reason why you use make_class here instead of defining the classes the usual way?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No good reason - I've switched it to the usual way

https://github.com/python-attrs/attrs/issues/482 .
"""

# first, check that our fix didn't break serialization without
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love the elaborate explanations here because this problem is indeed rather obscure and easy to forget about. But – and I do realize this is a bit petty 😔 – could you use proper capitalization throughout it please? 😇

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem - I've fixed it.

@gabbard
Copy link
Member Author

gabbard commented Jan 26, 2019

@hynek : Thanks for the review! I'll get to these Monday or Tuesday.

@gabbard
Copy link
Member Author

gabbard commented Jan 28, 2019

@hynek : ready for a second review pass. I believe I've addressed all your comments.

@hynek
Copy link
Member

hynek commented Feb 2, 2019

Thanks!

@hynek hynek merged commit d2be76c into python-attrs:master Feb 2, 2019
@gabbard
Copy link
Member Author

gabbard commented Feb 13, 2019

@hynek : Do you know when the next release is expected? This bug is causing us trouble in several places, so we'd be very appreciative if there were an updated release which had the fix. :-)

@hynek
Copy link
Member

hynek commented Feb 14, 2019

Are you complaining that my software is inconveniencing you with a bug you put there? ;P

I'd like to fix #500 and #503 first and then release. Last release before the “import attrs storm” 😱

@gabbard
Copy link
Member Author

gabbard commented Feb 14, 2019

It's a nice change of pace from the usual situation of my own software inconveniencing me with the bugs I put there. :-)

Good motivation for me to address #503 quickly!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants