Skip to content

Commit

Permalink
Force environment variables to str with Python2 on Windows (#101)
Browse files Browse the repository at this point in the history
* Force environment variables to str

With Python2 on Windows, force environment variables to str to avoid "TypeError: environment can only contain strings " in Python's subprocess.py.

* PEP8 fix

* Add Python3 compatible

* Fix flake8 fails with F821

* Fix flake8 fails with F821
  • Loading branch information
greyli authored and theskumar committed Oct 31, 2018
1 parent 27afc40 commit 43af2c5
Show file tree
Hide file tree
Showing 2 changed files with 12 additions and 1 deletion.
5 changes: 5 additions & 0 deletions dotenv/compat.py
Original file line number Diff line number Diff line change
@@ -1,4 +1,9 @@
import sys
try:
from StringIO import StringIO # noqa
except ImportError:
from io import StringIO # noqa

PY2 = sys.version_info[0] == 2
WIN = sys.platform.startswith('win')
text_type = unicode if PY2 else str # noqa
8 changes: 7 additions & 1 deletion dotenv/main.py
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,7 @@
import warnings
from collections import OrderedDict

from .compat import StringIO
from .compat import StringIO, PY2, WIN, text_type

__escape_decoder = codecs.getdecoder('unicode_escape')
__posix_variable = re.compile('\$\{[^\}]*\}') # noqa
Expand Down Expand Up @@ -95,6 +95,12 @@ def set_as_environment_variables(self, override=False):
for k, v in self.dict().items():
if k in os.environ and not override:
continue
# With Python2 on Windows, force environment variables to str to avoid
# "TypeError: environment can only contain strings" in Python's subprocess.py.
if PY2 and WIN:
if isinstance(k, text_type) or isinstance(v, text_type):
k = k.encode('ascii')
v = v.encode('ascii')

This comment has been minimized.

Copy link
@SpotlightKid

SpotlightKid Oct 31, 2018

Is ASCII really the right encoding to use here? Environment variables often contain filesystem paths and on Windows these often contain usernames, which may have non-ASCII chars in them.

I think the default encoding to use here should be mbcs, but really load_dotenv and set_as_environment_variables should allow to pass an encoding.

This comment has been minimized.

Copy link
@greyli

greyli Nov 1, 2018

Author Contributor

Yes, you are right. It will fail with non-ASCII words. I found Werkzeug use iso-8859-1(Latin-1), maybe it's a better choice? I don't quite familiar with Python encoding...

This comment has been minimized.

Copy link
@SpotlightKid

SpotlightKid Nov 4, 2018

Can you give a link to which piece of code in Werkzeug you are referring to? latin-1 seems like a strange choice for Windows and I'd like to check why and how and Werkzeug does this.

This comment has been minimized.

Copy link
@greyli

greyli Nov 5, 2018

Author Contributor

This comment has been minimized.

Copy link
@SpotlightKid

SpotlightKid Nov 7, 2018

Unfortunately neither the PR which added this in werkzeug nor the commit for the previous version of the code, which only encoded the environment variable values give any indication, why latin-1 was chosen as the encoding.

Since the encoding only happens on Windows under Python 2, IMHO - in analogy to the behaviour of os.getenv on Unix described below - mbcs would be the correct encoding to use, because that is what sys.getfilesystemencoding() will return there. Relevant quotes from the Python docs:

Note that on Windows, there is an encoding known as “mbcs”, which uses an encoding specific to your current locale. In many cases, and particularly when working with COM, this may be an appropriate default encoding to use.

https://docs.python.org/2/faq/programming.html?highlight=mbcs#what-does-unicodeerror-ascii-decoding-encoding-error-ordinal-not-in-range-128-mean

sys.getfilesystemencoding()
Return the name of the encoding used to convert between Unicode filenames and bytes filenames.
[...]
In the UTF-8 mode, the encoding is utf-8 on any platform.
On Mac OS X, the encoding is 'utf-8'.
On Unix, the encoding is the locale encoding.
On Windows, the encoding may be 'utf-8' or 'mbcs', depending on user configuration.
[...]
Changed in version 3.6: Windows is no longer guaranteed to return 'mbcs'. See PEP 529 and _enablelegacywindowsfsencoding() for more information.
Changed in version 3.7: Return ‘utf-8’ in the UTF-8 mode.

https://docs.python.org/3/library/sys.html?highlight=mbcs#sys.getfilesystemencoding

Most of the operating systems in common use today support filenames that contain arbitrary Unicode characters.
Usually this is implemented by converting the Unicode string into some encoding that varies depending on the system.
For example, Mac OS X uses UTF-8 while Windows uses a configurable encoding; on Windows, Python uses the name
“mbcs” to refer to whatever the currently configured encoding is. [...]

https://docs.python.org/3/howto/unicode.html?highlight=mbcs#unicode-filenames

Return the value of the environment variable key if it exists, or default if it doesn’t. key, default and the result are str.
On Unix, keys and values are decoded with sys.getfilesystemencoding() and 'surrogateescape' error handler. [...]

https://docs.python.org/3.7/library/os.html#os.getenv

This comment has been minimized.

Copy link
@greyli

greyli Nov 8, 2018

Author Contributor

It's quite reasonable, just make a PR or issue for it!

os.environ[k] = v

return True
Expand Down

0 comments on commit 43af2c5

Please sign in to comment.