-
Notifications
You must be signed in to change notification settings - Fork 179
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make serde pickle version configurable #160
Changes from 13 commits
93d4d39
d46ab71
33be5f1
e4443b8
7cfd6a4
5316ee3
9a4a2e7
cae7bcb
ab81cf4
61bb368
837401e
696df15
02ea2cc
09ed5c5
dc000c7
aec88e4
11a042d
e0c9cf0
681f48e
62e347e
a84d0a9
edceb08
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,170 @@ | ||
# Copyright 2012 Pinterest.com | ||
# | ||
# Licensed under the Apache License, Version 2.0 (the "License"); | ||
# you may not use this file except in compliance with the License. | ||
# You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
|
||
import abc | ||
import logging | ||
from io import BytesIO | ||
import six | ||
from six.moves import cPickle as pickle | ||
|
||
log = logging.getLogger(__name__) | ||
|
||
try: | ||
long_type = long # noqa | ||
except NameError: | ||
long_type = None | ||
|
||
|
||
@six.add_metaclass(abc.ABCMeta) | ||
class ICodec(object): | ||
""" | ||
Interface for serializers. | ||
""" | ||
|
||
@abc.abstractmethod | ||
def from_python(self, key, value): | ||
""" | ||
Serialize a python object. | ||
|
||
:param str|unicode key: Key | ||
:param str|unicode value: Value | ||
:return tuple[str, int]: tuple(value, flags) | ||
""" | ||
raise NotImplementedError() | ||
|
||
@abc.abstractmethod | ||
def to_python(self, key, value, flags): | ||
""" | ||
Deserialize a value into a python object. | ||
|
||
:param str|unicode key: Key | ||
:param str|unicode value: Value | ||
:param int flags: Bitflag containing flags used to specify how to | ||
deserialize this object. | ||
:return object: Deserialized python object. | ||
""" | ||
raise NotImplementedError() | ||
|
||
|
||
class Serde(ICodec): | ||
""" | ||
Serialization handler. | ||
|
||
Meant to be compatible with `python-memcached`. | ||
""" | ||
|
||
FLAG_BYTES = 0 | ||
FLAG_PICKLE = 1 << 0 | ||
FLAG_INTEGER = 1 << 1 | ||
FLAG_LONG = 1 << 2 | ||
FLAG_COMPRESSED = 1 << 3 # unused, to main compatibility with python-memcached | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'll conform to VT100 in an update There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. thanks! There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. done |
||
FLAG_TEXT = 1 << 4 | ||
|
||
pickle_version = 0 | ||
|
||
def __init__(self, pickle_version=0): | ||
""" | ||
Init | ||
|
||
:param int pickle_version: Pickle version to use (from python). | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's not the pickle version but rather the pickle protocol version https://docs.python.org/3/library/pickle.html#data-stream-format There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks, will update. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. done |
||
Use `-1` to use the highest supported at runtime. | ||
Deserialization is not affected by this parameter. | ||
|
||
A forewarning with `0` (the default): If somewhere in your value lies | ||
a slotted object, ie defines `__slots__`, even if you do not include | ||
it in your pickleable state via `__getstate__`, python will raise: | ||
``` | ||
TypeError: a class that defines __slots__ without defining | ||
__getstate__ cannot be pickled | ||
``` | ||
""" | ||
if pickle_version is not None: | ||
self.pickle_version = pickle_version | ||
|
||
def from_python(self, key, value): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. how about call this function There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I changed it because in my experience serialize and deserialize are not the best names. Using a common anchor it seems to stick with more juniors easier and avoids headache. Still, I can update if you want, but I'd recommend not. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. done |
||
""" | ||
Serialize a python object. | ||
|
||
:param str|unicode key: Key | ||
:param str|unicode value: Value | ||
:return tuple[str, int]: tuple(value, flags) | ||
""" | ||
flags = 0 | ||
value_type = type(value) | ||
|
||
# Check against exact types so that subclasses of native types will be | ||
# restored as their native type | ||
if value_type is bytes: | ||
pass | ||
|
||
elif value_type is six.text_type: | ||
flags |= self.FLAG_TEXT | ||
value = value.encode('utf8') | ||
|
||
elif value_type is int: | ||
flags |= self.FLAG_INTEGER | ||
value = "%d" % value | ||
|
||
elif six.PY2 and value_type is long_type: | ||
flags |= self.FLAG_LONG | ||
value = "%d" % value | ||
|
||
else: | ||
flags |= self.FLAG_PICKLE | ||
|
||
output = BytesIO() | ||
pickler = pickle.Pickler(output, self.pickle_version) | ||
pickler.dump(value) | ||
value = output.getvalue() | ||
|
||
return value, flags | ||
|
||
def to_python(self, key, value, flags): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Same here with There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. done |
||
""" | ||
Deserialize a value into a python object. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks for the doc There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Nicochar, anytime for you bud. |
||
|
||
:param str|unicode key: Key | ||
:param str|unicode value: Value | ||
:param int flags: Bitflag containing flags used to specify how to | ||
deserialize this object. | ||
:return object: Deserialized python object. | ||
""" | ||
if flags == 0: | ||
return value | ||
|
||
elif flags & self.FLAG_TEXT: | ||
return value.decode('utf8') | ||
|
||
elif flags & self.FLAG_INTEGER: | ||
return int(value) | ||
|
||
elif flags & self.FLAG_LONG: | ||
if six.PY3: | ||
return int(value) | ||
else: | ||
return long_type(value) | ||
|
||
elif flags & self.FLAG_PICKLE: | ||
try: | ||
buf = BytesIO(value) | ||
unpickler = pickle.Unpickler(buf) | ||
return unpickler.load() | ||
except Exception as exc: | ||
# This includes exc as a string for troubleshooting as well as | ||
# providing a trace. | ||
log.exception('Could not depickle value (len=%d): %s', | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. unclear why this is better: import logging
try:
raise Exception("raised")
except Exception as exc:
logging.exception("TEST: %s", exc)
try:
raise Exception("raised")
except Exception:
logging.exception("TEST")
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Because from the initial line, you know what the exc was that caused it. This helps when an exception storm hits and you have N of them in your log interspersed. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Logging entries are usually configured with a better format than the default that you used, which contain things such as the process and thread ID. |
||
len(value), exc) | ||
return None | ||
|
||
return value |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,98 +1,10 @@ | ||
# Copyright 2012 Pinterest.com | ||
# | ||
# Licensed under the Apache License, Version 2.0 (the "License"); | ||
# you may not use this file except in compliance with the License. | ||
# You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
""" | ||
Backwards compatibility with the older serialization api previously | ||
provided by this module. | ||
""" | ||
from . import codecs | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I prefer to use absolute imports here. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. PEP8 allows relative imports. Is there a reason to not use them and opt for the less featureful alternative? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Let's keep it consistent with the rest of the project, which uses absolute imports. Both positions are defendable but I think there's a strong argument for consistency There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. done, consistency > opinions |
||
|
||
import logging | ||
from io import BytesIO | ||
import six | ||
from six.moves import cPickle as pickle | ||
_SERDE = codecs.Serde() | ||
|
||
try: | ||
long_type = long # noqa | ||
except NameError: | ||
long_type = None | ||
|
||
|
||
FLAG_BYTES = 0 | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Although I don't think removing these from this file will be a problem it is technically a backwards incompatible change and don't want to break backwards compatibility. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. IMO These should have been private to begin with, plus can't you just bump the major? I'll import them into serde.py if you want though. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't understand why this is backward incompatible? I think this is OK. Am I missing something @jogo? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I already imported the API methods, and I don't think anyone should really be using constants in an arbitrary location in a library. |
||
FLAG_PICKLE = 1 << 0 | ||
FLAG_INTEGER = 1 << 1 | ||
FLAG_LONG = 1 << 2 | ||
FLAG_COMPRESSED = 1 << 3 # unused, to main compatability with python-memcached | ||
FLAG_TEXT = 1 << 4 | ||
|
||
# Pickle protocol version (-1 for highest available to runtime) | ||
# Warning with `0`: If somewhere in your value lies a slotted object, | ||
# ie defines `__slots__`, even if you do not include it in your pickleable | ||
# state via `__getstate__`, python will complain with something like: | ||
# TypeError: a class that defines __slots__ without defining __getstate__ | ||
# cannot be pickled | ||
PICKLE_VERSION = -1 | ||
|
||
|
||
def python_memcache_serializer(key, value): | ||
flags = 0 | ||
value_type = type(value) | ||
|
||
# Check against exact types so that subclasses of native types will be | ||
# restored as their native type | ||
if value_type is bytes: | ||
pass | ||
|
||
elif value_type is six.text_type: | ||
flags |= FLAG_TEXT | ||
value = value.encode('utf8') | ||
|
||
elif value_type is int: | ||
flags |= FLAG_INTEGER | ||
value = "%d" % value | ||
|
||
elif six.PY2 and value_type is long_type: | ||
flags |= FLAG_LONG | ||
value = "%d" % value | ||
|
||
else: | ||
flags |= FLAG_PICKLE | ||
output = BytesIO() | ||
pickler = pickle.Pickler(output, PICKLE_VERSION) | ||
pickler.dump(value) | ||
value = output.getvalue() | ||
|
||
return value, flags | ||
|
||
|
||
def python_memcache_deserializer(key, value, flags): | ||
if flags == 0: | ||
return value | ||
|
||
elif flags & FLAG_TEXT: | ||
return value.decode('utf8') | ||
|
||
elif flags & FLAG_INTEGER: | ||
return int(value) | ||
|
||
elif flags & FLAG_LONG: | ||
if six.PY3: | ||
return int(value) | ||
else: | ||
return long_type(value) | ||
|
||
elif flags & FLAG_PICKLE: | ||
try: | ||
buf = BytesIO(value) | ||
unpickler = pickle.Unpickler(buf) | ||
return unpickler.load() | ||
except Exception: | ||
logging.info('Pickle error', exc_info=True) | ||
return None | ||
|
||
return value | ||
python_memcache_serializer = _SERDE.from_python | ||
python_memcache_deserializer = _SERDE.to_python |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,75 @@ | ||
# -*- coding: utf-8 -*- | ||
from unittest import TestCase | ||
|
||
from pymemcache import codecs | ||
import pytest | ||
import six | ||
|
||
try: | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We should import this the same way we do elsewhere in this code: There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. While I can do that, it breaks tab completion. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. done |
||
import cPickle as pickle | ||
except ImportError: | ||
import pickle | ||
|
||
|
||
class CustomInt(int): | ||
""" | ||
Custom integer type for testing. | ||
|
||
Entirely useless, but used to show that built in types get serialized and | ||
deserialized back as the same type of object. | ||
""" | ||
pass | ||
|
||
|
||
@pytest.mark.unit() | ||
class TestSerde(TestCase): | ||
Serde = codecs.Serde | ||
|
||
def check(self, value, expected_flags, pickle_version=None): | ||
s = self.Serde(pickle_version=pickle_version) | ||
|
||
serialized, flags = s.from_python(b'key', value) | ||
assert flags == expected_flags | ||
|
||
# pymemcache stores values as byte strings, so we immediately the value | ||
# if needed so deserialized works as it would with a real server | ||
if not isinstance(serialized, six.binary_type): | ||
serialized = six.text_type(serialized).encode('ascii') | ||
|
||
deserialized = s.to_python(b'key', serialized, flags) | ||
assert deserialized == value | ||
|
||
def test_bytes(self): | ||
self.check(b'value', self.Serde.FLAG_BYTES) | ||
self.check(b'\xc2\xa3 $ \xe2\x82\xac', self.Serde.FLAG_BYTES) # £ $ € | ||
|
||
def test_unicode(self): | ||
self.check(u'value', self.Serde.FLAG_TEXT) | ||
self.check(u'£ $ €', self.Serde.FLAG_TEXT) | ||
|
||
def test_int(self): | ||
self.check(1, self.Serde.FLAG_INTEGER) | ||
|
||
def test_long(self): | ||
# long only exists with Python 2, so we're just testing for another | ||
# integer with Python 3 | ||
if six.PY2: | ||
expected_flags = self.Serde.FLAG_LONG | ||
else: | ||
expected_flags = self.Serde.FLAG_INTEGER | ||
self.check(123123123123123123123, expected_flags) | ||
|
||
def test_pickleable(self): | ||
self.check({'a': 'dict'}, self.Serde.FLAG_PICKLE) | ||
|
||
def test_subtype(self): | ||
# Subclass of a native type will be restored as the same type | ||
self.check(CustomInt(123123), self.Serde.FLAG_PICKLE) | ||
|
||
def test_pickle_version(self): | ||
for pickle_version in range(-1, pickle.HIGHEST_PROTOCOL): | ||
self.check( | ||
dict(whoa='nelly', humans=u'horrid', answer=42), | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I would prefer to use a different key value pair for humans. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Please assist me with what you would like. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Oh, you mean humans aren't horrid? But we are! Lol, I'll update it! |
||
self.Serde.FLAG_PICKLE, | ||
pickle_version=pickle_version, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why rename the file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To provide backwards compatibility
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Without polluting the primary namespace)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's also a better name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is fine.