Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Testing fails with LC_ALL=C|POSIX #20

Closed
ghisvail opened this issue Mar 30, 2017 · 6 comments
Closed

Testing fails with LC_ALL=C|POSIX #20

ghisvail opened this issue Mar 30, 2017 · 6 comments

Comments

@ghisvail
Copy link
Contributor

Hi Remi,

The test suite of the package runs fine in a standard Linux install where a Unicode locale would be typically setup. However, the tests fail if the system is setup or overridden with LC_ALL=C or LC_ALL=POSIX.

> python3 setup.py build
> cp -r tests build/lib
> cd build/lib
> python3 -m unittest discover
..............................
----------------------------------------------------------------------
Ran 30 tests in 0.011s

OK
> LC_ALL=C python3 -m unittest discover

......................E.....
======================================================================
ERROR: setUpClass (tests.test_concrete.TestLists)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/gvaillan/debian/packages/python-rpaths/build/lib/tests/test_concrete.py", line 93, in setUpClass
    cls.tmp.open('w', 'r\xE9mi\'s thing').close()
  File "/home/gvaillan/debian/packages/python-rpaths/build/lib/rpaths.py", line 950, in open
    return io.open((self / name).path, mode=mode, **kwargs)
  File "/home/gvaillan/debian/packages/python-rpaths/build/lib/rpaths.py", line 122, in __div__
    return self.__class__(self, other)
  File "/home/gvaillan/debian/packages/python-rpaths/build/lib/rpaths.py", line 105, in __init__
    self._lib.join(*[self._to_backend(p) for p in parts]))
  File "/home/gvaillan/debian/packages/python-rpaths/build/lib/rpaths.py", line 105, in <listcomp>
    self._lib.join(*[self._to_backend(p) for p in parts]))
  File "/home/gvaillan/debian/packages/python-rpaths/build/lib/rpaths.py", line 87, in _to_backend
    'surrogateescape' if PY3 else 'strict')
UnicodeEncodeError: 'ascii' codec can't encode character '\xe9' in position 1: ordinal not in range(128)

----------------------------------------------------------------------
Ran 27 tests in 0.009s

FAILED (errors=1)
@remram44
Copy link
Owner

Of course... god I hate PY3. Will fix ASAP. Thanks for the report!

@remram44
Copy link
Owner

remram44 commented Mar 30, 2017

Alright, so what happens is that Path uses sys.getfilesystemencoding() to decide what encoding to use to convert filenames between bytes and unicode, and that returns 'ascii' if LC_ALL='C'. I'm not sure this is wrong. I'm not sure I should use something else than getfilesystemencoding() (though I suppose I could default to utf-8).

Possible actions:

  • Trust getfilesystemencoding(). Keep current behavior. With C locale, people feeding non-ASCII see an encoding exception like you did, which really is the right behavior (your locale doesn't support those paths); just skip this in test suite.
  • Use UTF-8 if getfilesystemencoding() returns 'ascii'. Keeps compatibility. People setting their locale to insane values get their insane paths (not UTF-8), which doesn't match abstract path behavior (but what can you do). People running with C locale get expected behavior except if they feed non-ASCII paths, then they get UTF-8 instead of exception.
  • Not use getfilesystemencoding() for concrete paths either. It is already not used when using abstract paths (they might not be meant for the current machine): POSIX always uses UTF-8 and Windows always uses windows-1252. Problem: people setting their locale to insane values get UTF-8 instead, which breaks things. But at least that matches abstract path behavior.

Bonus:

  • Make it easier to override the encoding of paths, especially abstract paths.

(insane encoding = everything not UTF-8)

@ghisvail
Copy link
Contributor Author

I don't mind which behavior is implemented amongst the options you listed, but the tests should pass regardless of the encoding of the build system. If the current behavior is kept, then you would have to decorate the tests requiring a UTF-8 locale with a unittest.skipIf.

@remram44
Copy link
Owner

remram44 commented Apr 12, 2017

I'm going with option 1. For the tests, I use a Path subclass that forces UTF-8 (d7c6516).

Please tell me if that solves your issue and I'll do a release (build passes with LC_ALL=C: https://travis-ci.org/remram44/rpaths/builds/221525718)

@ghisvail
Copy link
Contributor Author

Looks good to me. Please go ahead.

@remram44
Copy link
Owner

Fixed in 0.13.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants