-
-
Notifications
You must be signed in to change notification settings - Fork 652
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Properly handle unicode and byte streams with pantsd for Python 3 #7130
Conversation
We use bytes for local pants runner, so should also with pantsd.
In Py2, we had to add code to decode all files from a watchman's `event["files"]` (see pantsbuild#3951). This was necessary to fix a unicode bug. However, in Py3, it appears that this value is already unicode, as we get an error saying `str` does not have an attribute `decode()`. So, we conditionally check if we need to decode or not based off of the interpreter.
This was causing an issue with daemon_pants_runner when it tried to use sys.stdout.buffer. Now the two correspond. Note this could completely be wrong..
…ts w/ py3 constraints Curious to see if @pants_daemon is still failing
It looks like this PR will fix a couple failing targets! (Which I wasn't expecting because they timeout locally...) Multiple targets are still failing for other reasons, which we leave off.
I've fixed the unicode issues, but it leads to a new issue. Any test with To reproduce, pull this PR and run I haven't been able to find yet what might be causing this. Note that the daemon does work as intended with Python 2. |
Going to take a stab at this one. |
After investigation, it seems that these are fixed. They fail on osx due to non fork safe osx lib (see https://bugs.python.org/issue28342) But this bug also affects py2k.
Question for reviewers: can you please pull down this PR and run the command On macOS with Python 3.7, the third test never terminates for me, even when removing the timeout functionality with: diff --git a/src/python/pants/pantsd/process_manager.py b/src/python/pants/pantsd/process_manager.py
index e42cbfa6e..e01629420 100644
--- a/src/python/pants/pantsd/process_manager.py
+++ b/src/python/pants/pantsd/process_manager.py
@@ -131,8 +131,8 @@ class ProcessMetadataManager(object):
return True
now = time.time()
- if now > deadline:
- raise cls.Timeout('exceeded timeout of {} seconds while waiting for {}'.format(timeout, action_msg))
+ # if now > deadline:
+ # raise cls.Timeout('exceeded timeout of {} seconds while waiting for {}'.format(timeout, action_msg))
if now > info_deadline:
logger.info('waiting for {}...'.format(action_msg)) However, Mathieu cannot reproduce this with either 3.6 on macOS or 3.6 on Ubuntu, I also can't reproduce with 3.6 on Ubuntu, and CI works (which runs 3.6 and Ubuntu). I'm trying to figure out if this is just weirdness with my mac's setup or a 3.7 issue. |
Update: we discovered the timeout issue is indeed a problem with Python 3.7... Opened #7160 to track. No need to run the tests locally. I think we should not let this Python 3.7 issue block this PR. It does fix 3.6 behavior, which is the primary interpreter we're targeting, and the fixes are also likely useful for 3.7. Before we announce "Full Python 3 support", I think we need to close #7160, but this gets us closer in the meantime to "Experimental Python 3 support". |
Contrib tests were failing because they did not include the recent commit that modifies contrib/go.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for digging in here!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The implementation of .buffer
that slyly changes things to unicode feels surprising to me -- what if we wanted to write something with an unknown encoding to the logger stream? Making a subclass as suggested feels easy. I'm warming up to the thought, but making sys.stdout.buffer
(the way this stream is accessed) the same as not using .buffer
feels suprising to me.
"""A sys.{stdout,stderr} replacement that pipes output to a logger.""" | ||
"""A sys.std{out,err} replacement that pipes output to a logger. | ||
|
||
N.B. the Logger object expects unicode. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't figure out how to insert a multi-line suggestion, but I wouldn't leave this line alone from the rest, and would fill it to just:
N.B. the Logger object expects unicode. However, most of our outstream logic, such as in
`Exiter.py`, will use `sys.std{out,err}.buffer` and thus a bytes interface when running with
Python 3. So, we must provide a `buffer` property, and change the semantics of the buffer to
always convert the message to unicode.
@@ -135,7 +136,7 @@ def _process_event_queue(self): | |||
try: | |||
subscription, is_initial_event, files = (event['subscription'], | |||
event['is_fresh_instance'], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if others would find this appropriate, but fixing the indentation of this line sounds like a good idea and a small change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After slack convo, this is fantastic if the bit about logging.Logger
requiring unicode is added to the docstring!
Thanks Danny for running through a couple iterations!
Enormous thank you to @OniOni for spending two days diving into this and figuring it all out! Down to six targets 🎉 |
Problem
Any test with
@ensure_daemon
fails when ran with./pants3
. This is due to unicode issues.For example,
./pants3 test tests/python/pants_test/base:exiter_integration
will fail with:Solution
Use
sys.std{out,err}.buffer
with Py3.We reaffirmed in #7073 a prior decision that the
Exiter
related code should be using a bytes interface. However, we did not fix the pantsd related code because it was causing regressions. We now fix these usages.Note that in
pants_daemon.py
, we overridesys.stdout
to our own custom_LoggerStream
object. To ensure Python 3 support, we add abuffer
property.Issues running on macOS
tests/python/pants_test/base:exiter_integration
will fail on macOS still, due to an upstream non fork safe osx lib (see https://bugs.python.org/issue28342). But this bug also affects Python 2.