Unicode support #135
Interestingly the failing test doesn't fail for me locally on py27... Will push a commit shortly.
cf_units/__init__.py
@@ -809,6 +809,10 @@ def __init__(self, unit, calendar=None):
        else:
            unit = str(unit).strip()

        # For the sake of python 2, ensure that the string is a unicode.
        if not isinstance(unit, six.text_type):
Tempting to put a six.PY2 test here.
Have done that @bjlittle
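For anyone following along, the guard discussed above can be sketched without six. This is only an illustration under stated assumptions: `ensure_text` and its UTF-8 default are hypothetical and not taken from this PR, and `PY2` / `text_type` are hand-rolled stand-ins for `six.PY2` and `six.text_type`.

```python
import sys

# Hand-rolled stand-ins for six.PY2 and six.text_type (assumption: six itself
# is not imported here so that the sketch stays self-contained).
PY2 = sys.version_info[0] == 2
text_type = str if not PY2 else unicode  # noqa: F821 (name only exists on py2)


def ensure_text(unit, encoding="utf-8"):
    """Return ``unit`` as a text (unicode) string.

    Hypothetical helper: the UTF-8 default is an assumption, not the
    behaviour of cf_units.
    """
    # Only python 2 needs the conversion; the explicit PY2 test makes it
    # obvious that this branch can be deleted once the code is py3 only.
    if PY2 and not isinstance(unit, text_type):
        unit = unit.decode(encoding)
    return unit


print(repr(ensure_text("m")))
print(repr(ensure_text(u"\u03c0 m\u00b2")))
```

On python 3 the function is a no-op for `str` input, which is the point of gating the branch on the interpreter version.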
Also, allow a file encoding in the coding standards test, so that we can have some literal unicode characters for testing with.
…o see that the code can be deleted when the codebase becomes py3 only.
# Not all unicode characters are allowed.
msg = '[UT_UNKNOWN] Failed to parse unit "ø"'
with self.assertRaises(ValueError, msg=msg):
    Unit('ø')
Nice...
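One thing worth knowing about this style of test: in `unittest`, the `msg` keyword of `assertRaises` is not compared against the exception text (on python 3 it is only used as the failure message if nothing is raised). A rough sketch of the difference follows, assuming python 3 (python 2.7 spells the method `assertRaisesRegexp`). `parse_unit` is a hypothetical stand-in for `Unit`, not cf_units code.

```python
import unittest


def parse_unit(text):
    # Hypothetical stand-in for cf_units.Unit: always rejects its input.
    raise ValueError('[UT_UNKNOWN] Failed to parse unit "{}"'.format(text))


class TestInvalidUnit(unittest.TestCase):
    def test_message_not_checked(self):
        # Passes even though msg differs from the real exception text,
        # because msg is never matched against the raised exception.
        with self.assertRaises(ValueError, msg='some unrelated text'):
            parse_unit('ø')

    def test_message_checked(self):
        # assertRaisesRegex does search the exception text for the pattern.
        pattern = r'\[UT_UNKNOWN\] Failed to parse unit'
        with self.assertRaisesRegex(ValueError, pattern):
            parse_unit('ø')


def run_suite():
    """Run both tests and return the unittest result object."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(TestInvalidUnit)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == '__main__':
    run_suite()
```

If the exact wording of the error matters to the test, the regex form is the one that would notice a change.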
ba5fbcd to 1645170
@pelson Looks good to me... I'll let travis do its thing, then merge 👍
@pelson No worries... I was expecting the rebase 😉
OK, so something isn't quite right yet for py2k...
LICENSE_RE_PATTERN = r'(\#\!.*\n)?' + LICENSE_RE_PATTERN
LICENSE_RE = re.compile(LICENSE_RE_PATTERN, re.MULTILINE)
SHEBANG = r'(\#\!.*\n)?'
ENCODING = r'(\# \-\*\- coding\: .* \-\*\-\n)?'
@pelson You might want to account for white space after the trailing -*- ...?
Meh. Happy if you want me to, but why be generous on this? Do it right, or fix it is my opinion 😉
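To make the trade-off concrete, here is a small sketch of the strict pattern from the diff next to a more forgiving variant. `SHEBANG` and `ENCODING` are copied from the diff; `LICENSE`, `TOLERANT_ENCODING` and the sample headers are hypothetical, since the real licence pattern is not shown in this thread.

```python
import re

# Copied from the diff under review.
SHEBANG = r'(\#\!.*\n)?'
ENCODING = r'(\# \-\*\- coding\: .* \-\*\-\n)?'

# Hypothetical variant that tolerates spaces or tabs after the trailing -*-.
TOLERANT_ENCODING = r'(\# \-\*\- coding\: .* \-\*\-[ \t]*\n)?'

# Hypothetical stand-in for the licence header pattern (not shown in the diff).
LICENSE = r'\# \(C\) .*\n'

HEADER_RE = re.compile(SHEBANG + ENCODING + LICENSE, re.MULTILINE)
TOLERANT_RE = re.compile(SHEBANG + TOLERANT_ENCODING + LICENSE, re.MULTILINE)

# Sample file headers: one exact, one with a stray space after the -*-.
exact = "# -*- coding: utf-8 -*-\n# (C) Example\n"
trailing_space = "# -*- coding: utf-8 -*- \n# (C) Example\n"

print(bool(HEADER_RE.match(exact)))
print(bool(HEADER_RE.match(trailing_space)))
print(bool(TOLERANT_RE.match(exact)))
print(bool(TOLERANT_RE.match(trailing_space)))
```

With the strict pattern a stray trailing space means the optional encoding group is skipped and the licence check then fails on the first line, which is the "do it right, or fix it" behaviour argued for above.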
@pelson I'll let you chase down the py2k unicode barf... ping me when you want me to re-review 😉
…eturns a non-unicode for __str__ (unless sys.getdefaultencoding says otherwise).
Nice, thanks @pelson 👍
What a right old shambles py2's unicode handling is. I think I've now got this right. From the start, this was a trivial change for py3k - all of the work (and there have now been several hours' worth) has been dealing with the py2 fallout. With the current setup, the following will occur for py2 users:
I think I can do better than that though, so there is another PR incoming.
As promised #137.
Follows up from #134 (so that needs merging first) to allow unicode units, such as π m², as is supported by udunits2. In addition to this, I've extended the coding standard test to handle an encoding preamble.
Closes #133.