Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow Pants to run with Python 3 via ./pants3 script #6959

Merged
merged 25 commits into from
Jan 14, 2019

Conversation

Eric-Arellano
Copy link
Contributor

@Eric-Arellano Eric-Arellano commented Dec 18, 2018

Problem

We want to be able to use Pants using either Python 2 or Python 3 under-the-hood. (Note --setup-python-interpreter-constraints only constrains subprocesses, not the actual main Pants process.)

Further, the solution must still support Python 2.

An ideal solution requires:

  • Allow easily running using Python 3.
  • Allow quickly falling back to Python 2 if Python 3 version is not working.
  • Be transparent to the user which interpreter they are using.

Solution

Add a new script ./pants3, that functions identically to ./pants beyond using Python 3 under-the-hood.

The user can first attempt the command with ./pants3, and if they're having an issue they can simply revert to ./pants (along with filing an issue).

This solution has two coexisting venv folders, to avoid redownloading the virtual environment every time.

Other solutions considered

  • Command line flag like --py3.
    • Too much typing for user every time. Is not easy to use Python 3.
    • Adds substantial boilerplate to ./pants code.
  • Environment variable
    • Not transparent which interpreter is being used.
    • Incorporated already into the ./pants3 solution. If the user does want to use an environment variable, they can as an alternative to ./pants3.

Once ./pants3 works better

./pants3 will be renamed to ./pants, which will always use Python 3. ./pants will be renamed to ./pants2.

We'll make this swap once ./pants3 passes all CI.

Now that we allow both interpreters, we need to specify Py2 shards should use Py2, rather than just relying on the default defined in pants.ini
@Eric-Arellano Eric-Arellano changed the title WIP - Allow Pants to run with Python 3 Allow Pants to run with Python 3 via ./pants3 script Dec 22, 2018
Copy link
Contributor

@jsirois jsirois left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why offer a fall back at all? This script is only used by pants devs and CI. If you altered the existing pants script in this PR and CI was green, it would seem better to force subsequent failures, if any, to be fixed then and there.

@Eric-Arellano
Copy link
Contributor Author

I didn't realize this script is not consumed externally! That does make the fallback idea less relevant, although it is still helpful when doing Pants development to quickly compare Py2 behavior vs Py3 behavior.

--

Regardless, for CI we need to be able to test running Pants both using Python 2 and using Python 3, for at least the next two releases until we kill Py2. This solution should help.

The next step I was going to do in the migration is to change ci.sh to use ./pants3 (or set the env variable PANTS_USE_PYTHON3) to identify which commands fail.

Once I get that green and merged, we would by default use ./pants3 for any shard, and use ./pants when we specify the flag in ci.sh to use py2. So, you're right that we would no longer be using ./pants3 for the argument of having a fall back, but it would help to simplify ci.sh.

@Eric-Arellano
Copy link
Contributor Author

Eric-Arellano commented Dec 22, 2018

Edited (removed initial hypothesis about Travis)
Edit again (think I found a solution)

The Linux engine is failing with the following error:

Exception message: caught OSError(2, "No such file or directory: u'/travis/workdir/build-support/pants_dev_deps.venv/bin/python2.7'")

I originally thought this was the result of the Travis image being out-of-date, but found this can be reproduced locally by running:

./pants lint.python-eval src/python/pants/bin:pants_local_binary

The solution to this locally was to nuke ~/.cache/pants/python_cache/interpreters, because the symlink to pants_dev_deps.venv was saved there.

--

I think this is why the Travis job for Linux Build Engine is failing, particularly this note we have to explicitly use the ~/.cache in CI:

# Mount ${HOME} to cache the ${HOME}/.cache/pants/rust-toolchain
        - docker run --rm -t
          -v "${HOME}:/travis/home"
          -v "${TRAVIS_BUILD_DIR}:/travis/workdir"
          travis_ci:latest
          sh -c "git clean -xfd && ./pants --version && ./build-support/bin/release.sh -n"

Thus, we need to wipe the Travis cache when making this change to prevent the CI failure. I'm waiting to see what happens after deleting the cache for this PR.

Thanks Yi!
The Linux Build Engine shard was failing due to using the Python bin path stored in ~/.cache/pants before renaming pants_dev_deps.venv to pants_dev_deps.py{2,3}.venv.

I realized we can get around this by deleting the cache through Travis's UI, but we need to also change what we cache for the future.
Now that we're using the `.py2.venv` and `.py3.venv` folders, it's confusing to keep this original one. Simply remove it the first time the user sets up the new folders.
Failing contrib/python/../test_checkstyle.py tests raised the question of if we want to set an upper limit for compatibility, which is required for those tests to function properly.

I realized an upper limit makes sense. Even though it's not yet released, Pants would not support Python 3.8 at the moment, due to several major deprecations become removals. We should only turn on Python 3.8 support when we are ready to do so (in 1 year at release, hopefully).
Now that we by default allow Python 3.6, the former whitelist being tested of allowing Python 3.6 became irrelevant. Instead, we must whitelist an interpreter that we do not allow by default, e.g. Python 3.8.
Copy link
Contributor

@illicitonion illicitonion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Woo!

@Eric-Arellano
Copy link
Contributor Author

Even though CI is green, I'm not going to merge this until #6985 gets resolved and possibly until we can get #6981 green.

Any help much appreciated! No time rush though, enjoy the holidays!

@benjyw
Copy link
Contributor

benjyw commented Dec 24, 2018

I would argue that very soon we want to rename ./pants -> ./pants2 and ./pants3 -> ./pants. Maybe even in this change. We want all pants devs to be kicking the tires on this as much as possible, as soon as possible, and ./pants is so idiomatic that most people will continue doing that by default.

@Eric-Arellano
Copy link
Contributor Author

I support that renaming, Benjy. Especially because we're going to wait to merge this until it's a bit more robust (possibly all CI is green in #6981), I would feel much more confident in doing this.

--

Daniel and I also discovered that there's a matrix of what you use to build Pants x interpreter constraints. For example, currently with this PR, ./pants3 builds Pants with Python 3, yet subprocesses like Pytest use Python 2.

It would be too complex & too long in CI for us to test all 4 members of that matrix, so I think in CI we will want to test Python 3 build w/ Python 3 interpreter constraints + for the Cron job and Py2 unit test, use Python 2 build w/ default Python 2 interpreter constraints.

This fixes the _Py_Dealloc issue when changing between `./pants` and `./pants3`, as the native engine does not work with Python 2 when it was built using Python 3.
There is a matrix of what interpreter we use for under-the-hood and what we use for subprocesses. Presumably, when calling `./pants3`, you would expect to also run your tests and subprocesses with Py3.

This change is not only nice-to-have but also required because the native engine does not work when bootstrapped with Python 3 and running a Python 2 subprocess *if* that subprocess uses `native_engine.so`, as several of our tests do.

This may result in conflicts when trying to run Py2-only targets, e.g. python_antlr_library.py. If this is the case, we can resort to pantsbuild#7048 as an alternative solution.
Leaving off the `cargo:` part meant this line was ignored. Now we can consistently switch between `./pants` and `./pants3`, and we won't get the _Py_Dealloc issue. Note that it does not always rebuild, as we seem to cache the interpreter-specific native_engine.so after the first time it's built (in ~/.cache/pants/bin/native-engine), which is neat! So we can safely switch between `./pants` and `./pants3`, without having to wait for it to always recompile!

I also checked to make sure it's idempotent, i.e. it doesn't recompile when you keep doing `./pants3` or keep doing `./pants`.

This was tested a few times on macOS and Ubuntu.
@Eric-Arellano
Copy link
Contributor Author

This is ready to be merged. I've been consistently switching between ./pants and ./pants3 the past week with no problems on macOS (beyond one just addressed), and stress tested just now on Ubuntu.

The major issue we had before was building the engine with Py3 would make running with Py2 fail—Daniel helped me fix this by forcing a recompile of the engine when changing versions. (If you're testing locally, this is something that would be great to try out: switch between ./pants and ./pants3.)

#6981 is now also green, which will allow us to use ./pants3 in CI once this one gets merged.

--

There are 3 major quality of life frustrations to be aware of, however.

  1. Some DeprecationWarnings still print when simply running ./pants3. I fixed 90% of them, and we just have left Twitter commons (need to publish) and Future (someone submitted PR last month, not yet merged). The only intermediate solution is to silence DeprecationWarnings from 3rd parties. I think we can punt on this decision.
  2. Error message stacktrace is not pretty printed, so \n is rendered as a literal, e.g. https://travis-ci.org/pantsbuild/pants/jobs/478918458#L3180. I've been getting around this by opening up iPython, copying the message, and running print("""the_stack_trace"""). This will make our CI more difficult to deal with, once we merge Use ./pants3 for majority of CI #6981. I spent about an hour trying to find the source of this, and could not figure out what's causing us to output the error stack trace as a byte string (went down rabbit hole into Pantsd).
  3. Certain goals result in the below message. It does not actually impact anything, just is extra noise. I couldn't find how to fix this yet.
                     Exception ignored in: <function _after_fork at 0x10bb1fb70>
Traceback (most recent call last):
  File "/usr/local/Cellar/python/3.7.2/Frameworks/Python.framework/Versions/3.7/lib/python3.7/threading.py", line 1330, in _after_fork
    thread._stop()
TypeError: 'Event' object is not callable

I think these can be addressed later, so that you all can start using ./pants3 and we can switch over quicker to using ./pants3 in CI. But if you all deem any of these blockers, let me know and I can fix them first.

--

I'll wait to merge to give you all some time to review the new updates and/or pull it down locally to try it out.

Thanks!

@cosmicexplorer cosmicexplorer mentioned this pull request Jan 13, 2019
3 tasks
Copy link
Contributor

@benjyw benjyw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes look great. I'm on the fence about whether #2 and #3 of the "quality of life" issues you mention should prevent checking this in. We absolutely should fix those properly before switching CI to ./pants3, and certainly before renaming ./pants3 to ./pants. What bothers me is less the symptoms than the fact that we don't understand why they occur. Would like to hear some other opinions.

@Eric-Arellano
Copy link
Contributor Author

I think I solved #2 via #7073. It fixes the issue locally, but I'm not yet convinced it solves 100% of the cases so I'll run it on the Py3 CI pr. I agree that we must solve #2 before swapping CI, or everyone will go crazy trying to make sense of stacktraces.

#3 I'm still not sure what's going on, as there's no stacktrace. Haven't spent much time on it though.

@cosmicexplorer
Copy link
Contributor

I've checked out this branch and have had issues that I've discussed with Eric that aren't reproing in anyone else's environment, so I would appreciate it if someone else could take a look at this PR while I figure this out.

Copy link
Contributor

@illicitonion illicitonion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worked for me locally, and looks good :)

Any idea where this warning is coming from? I couldn't find an import imp in our code anywhere...

$ ./pants3 goals >/dev/null
/Users/dwagnerhall/src/github.com/pantsbuild/pants/build-support/pants_dev_deps.py3.venv/lib/python3.6/site-packages/future/standard_library/__init__.py:65: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
  import imp

@Eric-Arellano
Copy link
Contributor Author

@illicitonion yes, this is issue #1, the deprecation warnings from libraries we use :/ I've fixed them all for our codebase and upgraded every library possible, but we have twitter.commons and future still not working.

We could apply a small patch to (either temporarily or permanently) ignore DeprecationWarnings with Py3. I'm not 100% sure the implications, particularly what this means for end users of Pants who arguably do want deprecation warnings for their own code. So, I've been punting on the discussion until you all have ./pants3 locally and can decide how much the warnings drive you crazy or not.

--

@benjyw, issue #2 of \n not rendering in stacktraces does appear completely fixed by
#7073!

To recap, the remaining issues associated with ./pants3 are:

  1. DeprecationWarnings
  2. The TypeError: 'Event' object is not callable error that gets ignored and seems to have no noticeable impact.
  3. Not everything works for ./pants3 of course. See the blacklist + py3 pex with contrib fails + py3 lint & jvm tests fails.

--

I would nonetheless advocate for merging this PR today, particularly given that ./pants works 100% normally and it's so quick to switch between ./pants and ./pants3 (the first time you do it, engine will recompile, but only the first time). Merging will make it much easier for you all to test how your code works on both Py3 and Py2, and will make it easier for me to collaborate with you all.

But that's also reasonable if any of the above issues are blockers, and I can try to fix them first.

I'll hold off until tonight to merge, unless there's further followup.

Thanks!

@Eric-Arellano Eric-Arellano merged commit 6b9ac97 into pantsbuild:master Jan 14, 2019
@Eric-Arellano Eric-Arellano deleted the py3 branch January 14, 2019 18:28
Eric-Arellano added a commit that referenced this pull request Jan 16, 2019
### Problem
Right now our CI exclusively uses Python 2 under-the-hood. It tries to test Python 3 support by setting`--setup-python-interpreter-constraints` to Python 3, but this only constrains subprocesses like tests and is not the same thing as fully supporting Python 3. 

We want our CI to test using both Python 2 under-the-hood and Python 3 under-the-hood, because for the next two releases are promising users that either interpreter is fully supported. 

Daily CI should run primarily with Python 3, and the nightly cron job should check for Python 2 regressions.

### Solution
Use the new `./pants3` script added in #6959.

#### Installing Python 3 on each shard
* OSX: git clone pyenv to consistently install 3.6.8. Necessary because some older images come with an outdated version of pyenv. We don't use Travis's `language` feature because several shards are supposed to be `language: general`.
* Linux build engine, temporarily modify `travis_ci/Dockerfile` to download pyenv and install Python 3. This change is being pulled out the `centos6` image in #7064 to avoid rebuilding Py3 every CI run, but it must be published to Docker hub so will be addressed in a followup PR.
* Normal Linux shards, use Travis's `language: python` feature

#### Add integration test blacklist
23 integration test targets are failing when ran with Py3 under-the-hood, so we restore support for an integration test blacklist. This allows us to track Py3 fixes we make and to collaborate more easily.

Further, the contrib tests, linter, and osx rust shards when ran with Py3. So, we run these 3 shards and the blacklist with a Py2 pex but Py3 subprocesses, which is what we currently do in master.

#### Expand Cron job
We add non-integration test jobs to the nightly cron run to check for any regressions in shards like `Linux build wheels`, now that we run daily with a Py3 PEX and need to check that they still work with a Py2 PEX. 

#### Default to Python 3 in CI
Defaulting to Py3 reflects that we are transitioning towards exclusive Py3 use.

This is not user-facing, and only impacts the `.travis.yml` code, so this change is safe to make, whereas we want to wait to rename `./pants3` -> `./pants` until it is a bit more robust.

### FYI - new .travis.yml idiom
This required a major rewrite of `.travis.yml` to avoid duplication. The new idiom we use when defining a shard is to have a `base_my_new_shard` which defines the minimum amount of config necessary for that job, and then to have a `py2_my_new_shard` and `py3_my_new_shard` that extend that base along with their corresponding OS + python version config.
Eric-Arellano added a commit to Eric-Arellano/pants that referenced this pull request Jan 24, 2019
I'm not sure why it was added to begin with. Even back in pantsbuild#6959, which first introduced `./pants3`, the tests were passing.

Maybe it was a flaky test? But I can't find a ticket for that. So maybe it was a typo?
Eric-Arellano added a commit that referenced this pull request Jan 24, 2019
I'm not sure why it was added to begin with. Even back in #6959, which first introduced `./pants3`, the tests were passing.

Maybe it was a flaky test? But I can't find a ticket for that. So maybe it was a typo?

--

This also redistributes CI shards by one in favor of Python 3, as this and the surrounding PRs remove multiple targets from the blacklist.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants