
rules_python failing with Bazel@HEAD in Bazel Downstream pipeline #856

Closed
meteorcloudy opened this issue Oct 13, 2022 · 13 comments · Fixed by bazelbuild/continuous-integration#1535 or #1062

@meteorcloudy
Member

https://buildkite.com/bazel/bazel-at-head-plus-downstream/builds/2675#0183d0a7-84bd-4541-b543-01d162c313ed

======================================================================
FAIL: test_match_toolchain (__main__.TestPythonVersion)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/var/lib/buildkite-agent/.cache/bazel/_bazel_buildkite-agent/eb882ad14e6111536f596011b6d53a9e/sandbox/linux-sandbox/48/execroot/rules_python/bazel-out/k8-fastbuild/bin/python/tests/toolchains/run_test_3.8.13.sh.runfiles/rules_python/python/tests/toolchains/3.8.13/run_acceptance_test.py", line 52, in test_match_toolchain
    self.assertEqual(output, "Python 3.8.13")
AssertionError: '' != 'Python 3.8.13'
+ Python 3.8.13
==================== Test output for //:test_vendored:
4c4
< from //:requirements.txt
---
> from @//:requirements.txt
FAIL: files "requirements.bzl" and "requirements.clean.bzl" differ. Please run:  bazel run //:vendor_requirements

Possible culprits are: #855 and #851

@meteorcloudy
Member Author

/cc @alexeagle @f0rmiga

@meteorcloudy
Member Author

Gentle ping~

@f0rmiga
Collaborator

f0rmiga commented Oct 17, 2022

Sorry, my son was born the day after you posted this, so I haven't had time to dig into it yet. From a first pass, the failure comes from the fact that I expanded the CI jobs; previously we had been restricting the Bazel versions we tested against in the integration tests.

I think we should fix this forward, since CI is more correct now.

@f0rmiga
Collaborator

f0rmiga commented Oct 17, 2022

If this is blocking anything, we can at least disable the failing tests until the proper fix lands.

@jvolkman
Contributor

congrats @f0rmiga!

@f0rmiga
Collaborator

f0rmiga commented Oct 17, 2022

> congrats @f0rmiga!

Thanks @jvolkman!

@meteorcloudy
Member Author

Oh, congratulations @f0rmiga!

Maybe someone else can look into this in the meantime? /cc @rickeylev It's not blocking anything, but it keeps the Bazel downstream pipeline red, which could mask any other breakage we might otherwise detect for rules_python at Bazel@HEAD.

@fweikert
Member

Congratulations and a friendly ping :)

@f0rmiga
Collaborator

f0rmiga commented Oct 21, 2022

> Congratulations and a friendly ping :)

A friendly pong.

@groodt groodt added Can Close? Will close in 30 days if there is no new activity and removed Can Close? Will close in 30 days if there is no new activity labels Jan 10, 2023
@rickeylev rickeylev self-assigned this Jan 20, 2023
@rickeylev
Collaborator

I think this is fixed, though I don't know what fixed it. Running the following succeeds:

USE_BAZEL_VERSION=last_green bazelisk test //python/tests/toolchains/...

I'll send a PR to re-enable our tests in the Bazel CI config.

rickeylev added a commit to rickeylev/continuous-integration that referenced this issue Jan 20, 2023
@rickeylev
Collaborator

OK, it turns out it's not fixed, as far as the CI is concerned:

exec ${PAGER:-/usr/bin/less} "$0" || exit 1
Executing tests from //python/tests/toolchains:python_3_8_13_x86_64-unknown-linux-gnu_test
-----------------------------------------------------------------------------
2023/01/23 06:08:24 could not link local Bazel: could not copy file from /tmp/tmp6e5y4k72/bazel to /var/lib/buildkite-agent/.cache/bazelisk/local/-tmp-tmp6e5y4k72-bazel/bin/bazel: open /tmp/tmp6e5y4k72/bazel: no such file or directory
F/var/lib/buildkite-agent/.cache/bazel/_bazel_buildkite-agent/0bbb1d249f832c379464d3efbbd58cab/execroot/rules_python/external/python_3_11_1_x86_64-unknown-linux-gnu/lib/python3.11/subprocess.py:1125: ResourceWarning: subprocess 18 is still running
  _warn("subprocess %s is still running" % self.pid,
ResourceWarning: Enable tracemalloc to get the object allocation traceback
/var/lib/buildkite-agent/.cache/bazel/_bazel_buildkite-agent/0bbb1d249f832c379464d3efbbd58cab/execroot/rules_python/external/python_3_11_1_x86_64-unknown-linux-gnu/lib/python3.11/unittest/case.py:622: ResourceWarning: unclosed file <_io.TextIOWrapper name=3 encoding='utf-8'>
  with outcome.testPartExecutor(self):
ResourceWarning: Enable tracemalloc to get the object allocation traceback

======================================================================
FAIL: test_match_toolchain (__main__.TestPythonVersion.test_match_toolchain)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/var/lib/buildkite-agent/.cache/bazel/_bazel_buildkite-agent/0bbb1d249f832c379464d3efbbd58cab/sandbox/linux-sandbox/48/execroot/rules_python/bazel-out/k8-fastbuild/bin/python/tests/toolchains/run_test_3.8.13.sh.runfiles/rules_python/python/tests/toolchains/3.8.13/run_acceptance_test.py", line 52, in test_match_toolchain
    self.assertEqual(output, "Python 3.8.13")
AssertionError: '' != 'Python 3.8.13'
+ Python 3.8.13

----------------------------------------------------------------------
Ran 1 test in 0.084s

FAILED (failures=1)

The only clue in there is something about not being able to copy bazel from one temp location to another? I'll have to dig into it a bit.

@rickeylev
Collaborator

OK, after digging a bit, I found the cause: the Bazel CI sets USE_BAZEL_VERSION=/tmp/random/bazel, but then also sets --sandbox_tmpfs_path=/tmp, which essentially hides /tmp from the tests. I don't think there's any way for our tests to invoke bazel as a subprocess when they're run like that.

We might be able to rewrite these tests to not use a subprocess, though. I'm thinking we could use the multi-version Python rules to do the same thing -- all these "acceptance tests" seem to do is run Python and check the version output.
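For example, the check could run inside the test's own interpreter under a version-pinned toolchain instead of shelling out to a nested Bazel. A minimal sketch of that idea (the class layout and the hard-coded expected version are illustrative, not the actual run_acceptance_test.py; in practice the expected version would be supplied per toolchain by the multi-version rules):

import platform
import unittest


class TestPythonVersion(unittest.TestCase):
    # Hypothetical: the expected version would be injected per toolchain,
    # e.g. via an env var or a generated module, rather than hard-coded.
    EXPECTED_VERSION = "3.8.13"

    def test_match_toolchain(self):
        # Check the interpreter this test is already running under, so no
        # nested `bazel run` subprocess (and no dependency on /tmp) is needed.
        self.assertEqual(platform.python_version(), self.EXPECTED_VERSION)


if __name__ == "__main__":
    unittest.main()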

@rickeylev
Collaborator

Note to self: another idea to investigate: a genrule that sets toolchains=, then runs $(PYTHON) --version and checks the output (or some equivalent of this).

rickeylev added a commit that referenced this issue Feb 11, 2023
…ine.

The latest Bazel build continuous integration testing pipeline sets
several flags and environment variables that end up interfering with
each other:
 * `--sandbox_tmpfs_path=/tmp`
 * `--test_env=USE_BAZEL_VERSION`
 * `USE_BAZEL_VERSION=/tmp/<something>`
 * And Bazelisk is used to run Bazel

What happens is that `USE_BAZEL_VERSION` points to a Bazel binary in /tmp,
but the `--sandbox_tmpfs_path` flag prevents it from being readable. Later,
when a test wants to run Bazel, Bazelisk is invoked. It can see that it
should use a custom Bazel binary because of `--test_env`, but it can't
read the file because of `--sandbox_tmpfs_path`, and so it fails.

To fix, make the test runner that will run `bazel` unset
`USE_BAZEL_VERSION` so Bazelisk doesn't try to use it.

This also exposed an issue with Bazelisk demanding a cache directory be
specified, so set that environment variable to the test's temp dir to
keep Bazelisk happy.

Fixes #856
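A minimal sketch of the environment handling that commit message describes, as I understand it (the helper name, the use of BAZELISK_HOME as the Bazelisk cache variable, and the subprocess settings are assumptions, not the actual patch):

import os
import subprocess
import tempfile


def run_bazel(args):
    # Sketch of the described fix: drop USE_BAZEL_VERSION so Bazelisk does not
    # try to use the /tmp binary that --sandbox_tmpfs_path has hidden, and give
    # Bazelisk a cache directory inside the test's own temp dir.
    env = dict(os.environ)
    env.pop("USE_BAZEL_VERSION", None)
    env["BAZELISK_HOME"] = os.environ.get("TEST_TMPDIR", tempfile.gettempdir())
    return subprocess.run(["bazel"] + args, env=env, capture_output=True, text=True)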
fweikert pushed a commit to bazelbuild/continuous-integration that referenced this issue Feb 15, 2023
fmeum pushed a commit to fmeum/continuous-integration that referenced this issue Dec 10, 2023