Check stderr first before stdout on VCS Install #9234

Mikuana · 2020-12-06T22:26:13Z

When working with large amounts of data, Git reports on stderr instead of stdout. For some reason, on Git for Windows (I have not been able to reproduce this on Linux), this can cause the subprocess to stall completely while waiting for output from stdout. In the context of a `pip install git+https://`, this results in the Clone step freezing without any error or context about what is happening (or what has gone wrong).

This fix circumvents that by first checking stderr for output, and then checking stdout (if none is found).
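
For readers unfamiliar with this kind of stall, here is a minimal sketch of one plausible mechanism. This is an assumption for illustration, not a confirmed diagnosis of the Git for Windows behaviour: when stderr is piped but never read, a child that writes more than the OS pipe buffer holds will block, and a parent waiting on stdout then waits forever. The child script and the helper name `stdout_read_stalls` are hypothetical stand-ins, not pip code.

```python
import subprocess
import sys
import threading

# Hypothetical stand-in for a chatty `git clone`: the child writes ~4 MiB to
# stderr (more than a typical OS pipe buffer holds) before printing to stdout.
CHILD = (
    "import sys\n"
    "sys.stderr.write('x' * (1 << 22))\n"
    "sys.stderr.flush()\n"
    "print('done')\n"
)


def stdout_read_stalls(timeout=1.0):
    """Return True if reading stdout is still blocked after `timeout` seconds."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,  # piped, but nothing ever reads it
    )
    # Read stdout in a helper thread so this demo cannot hang the caller.
    reader = threading.Thread(target=proc.stdout.readline, daemon=True)
    reader.start()
    reader.join(timeout)
    stalled = reader.is_alive()
    # Clean up: killing the child closes its pipe ends and unblocks the reader.
    proc.kill()
    proc.wait()
    reader.join()
    proc.stdout.close()
    proc.stderr.close()
    return stalled


if __name__ == "__main__":
    print("stdout read stalled:", stdout_read_stalls())
```

The helper thread and timeout exist only so the demonstration terminates; a real caller that reads stdout directly would simply hang at that point.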

uranusjr · 2020-12-07T03:34:08Z

Can this be done simpler with Popen(..., stderr=subprocess.STDOUT)?
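
As a rough illustration of this suggestion (a sketch only, not pip's `call_subprocess`), folding stderr into stdout leaves a single pipe to drain, so a child that is noisy on stderr cannot fill an unread pipe. The child script and the name `run_merged` are hypothetical.

```python
import subprocess
import sys

# Hypothetical stand-in for git: lots of progress on stderr, one line on stdout.
CHILD = (
    "import sys\n"
    "for i in range(20000):\n"
    "    sys.stderr.write('progress %d\\n' % i)\n"
    "print('done')\n"
)


def run_merged():
    """Run the child with stderr folded into stdout and collect every line."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # a single pipe carries both streams
    )
    all_output = []
    # Draining the only pipe means the child never blocks on a full buffer.
    for raw in proc.stdout:
        all_output.append(raw.decode("utf-8", "replace").rstrip())
    proc.stdout.close()
    returncode = proc.wait()
    return returncode, all_output


if __name__ == "__main__":
    code, lines = run_merged()
    print(code, len(lines))
```

Note that the collected lines now contain everything the child wrote to either stream, which matters for any caller that parses the captured output.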

Mikuana · 2020-12-08T03:54:03Z

> Can this be done simpler with Popen(..., stderr=subprocess.STDOUT)?

I'm not sure if that would address the Git issue, as it may effectively resolve to the same thing as the current solution. I'll try it though, as it would definitely make for a cleaner change.

Mikuana · 2020-12-08T04:25:47Z

> Can this be done simpler with Popen(..., stderr=subprocess.STDOUT)?

That was a good suggestion. It works on my Windows environment with a Git install. Updated pull request with the change.

Mikuana · 2020-12-08T04:26:19Z

FYI: the automated checks are failing, but that also happens when I pull down master and run them locally.

uranusjr · 2020-12-08T05:53:00Z

Hmm, reading the code again, stderr=subprocess.STDOUT may not be a good idea after all, since it affects showing the subprocess error afterwards (content of the all_output variable).

Mikuana · 2020-12-09T21:57:48Z

@uranusjr I have to admit that I don't have enough experience with subprocess to know whether or not this is a good idea. If we reverted to my first suggestion, and used readlines on stderr and then stdout for each printed line, would this avoid the later error display?

uranusjr · 2020-12-10T06:12:07Z

So in the original implementation, only contents from stdout would be added to all_output, but in both the initial and current implementation in this PR, both stdout and stderr are added to all_output.

But reading the original implementation yet again, stderr was piped but never read in any way before being closed. So I’m going to assume it’s actually a bug, and this reflects the original author’s intention. We can always fix it if we’re wrong 🙂

Mikuana · 2020-12-10T23:00:59Z

> So in the original implementation, only contents from stdout would be added to all_output, but in both the initial and current implementation in this PR, both stdout and stderr are added to all_output.
>
> But reading the original implementation yet again, stderr was piped but never read in any way before being closed. So I’m going to assume it’s actually a bug, and this reflects the original author’s intention. We can always fix it if we’re wrong.

Oh gotcha. So even if we capture details that we don't intend to from stderr, it won't go anywhere anyways, so there's no practical effect. In that case, I'll keep the current PR version since it is a cleaner change.

sbidoul · 2020-12-11T22:54:06Z

src/pip/_internal/vcs/versioncontrol.py

@@ -121,7 +121,7 @@ def call_subprocess(
            # Convert HiddenText objects to the underlying str.
            reveal_command_args(cmd),
            stdout=subprocess.PIPE,
-            stderr=subprocess.PIPE,
+            stderr=subprocess.STDOUT,


I'm not sure. This bug was introduced in #7969, which was intended to not merge stdout and stderr, because some VCS commands log warnings on stderr and we don't want to capture them in the command output. So I'd say we should leave stderr alone to be printed on the console for the user to see.

Actually I was wondering this week why I was seeing pip failing on git exit codes while not showing the error details. That is probably it.

sbidoul · 2020-12-15

@Mikuana could you check if simply removing this `stderr=subprocess.STDOUT,` line works?

Mikuana · 2020-12-17

@sbidoul that worked as well. I've updated this PR to remove that line instead of piping it to stdout.

sbidoul · 2020-12-18

Unfortunately tests are red. It's because some calls (one actually, via get_repository_root) need to capture stderr.

Mikuana · 2020-12-18

What would your recommendation be? Should we modify the tests, or should we trust the test and instead redirect the stderr to stdout as I had it previously?

Sorry if that seems like a silly question.

sbidoul · 2020-12-18

Redirecting stderr to stdout won't work because that would reopen #7545 and #7968, where the VCS logs warnings which get mixed with the stdout we want to extract and parse.
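
To make the concern concrete, here is a small sketch with a hypothetical VCS-like child command (not an actual git invocation, and `capture` is not a pip function). With the streams merged, the warning ends up in the text a caller would try to parse; with separate pipes, stdout stays clean.

```python
import subprocess
import sys

# Hypothetical VCS-like command: a warning on stderr, the real answer on stdout.
CHILD = (
    "import sys\n"
    "sys.stderr.write('warning: something harmless\\n')\n"
    "print('abc123')\n"
)


def capture(merge_stderr):
    """Return the text a caller would parse from the command's stdout pipe."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT if merge_stderr else subprocess.PIPE,
        check=True,
    )
    return result.stdout.decode("utf-8").strip()


if __name__ == "__main__":
    print(repr(capture(merge_stderr=False)))
    print(repr(capture(merge_stderr=True)))
```

With `merge_stderr=False` the captured text is just the revision-like value; with `merge_stderr=True` it also carries the warning line, so a parser expecting a single value would be handed extra text.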

If we let stderr go to the console, this will create unwanted noise on the console (which is why the tests fail), and bypass the pip logging and verbosity control mechanisms.

So what can we do? Not a silly question indeed.

I see two approaches.

1/ The easy one is to use Popen.communicate(), which has a safe (multithreaded) mechanism to capture stderr and stdout separately. There are two logging-related drawbacks to this: a) in debug mode it would not display the process output until it has terminated; b) stdout and stderr could only be shown one after the other instead of in the natural line order produced by the subprocess.

2/ The hard one is to reimplement a variant of communicate to both debug log and capture (a kind of tee)...
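
Approach 2/ could look roughly like the following sketch, which is an assumption about the general shape of such a "tee" and not code from pip. One reader thread per pipe logs each line at debug level as it arrives and also stores it, so both pipes are drained continuously. The names `_pump` and `run_tee` are hypothetical.

```python
import logging
import subprocess
import sys
import threading

logger = logging.getLogger(__name__)


def _pump(stream, sink, label):
    """Log each line as it arrives and also keep it in `sink`."""
    for raw in stream:
        line = raw.decode("utf-8", "replace").rstrip()
        logger.debug("%s: %s", label, line)
        sink.append(line)
    stream.close()


def run_tee(cmd):
    """Run `cmd`, logging output live while capturing both streams separately."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out_lines, err_lines = [], []
    # One reader thread per pipe, so neither pipe can fill up and block the child.
    threads = [
        threading.Thread(target=_pump, args=(proc.stdout, out_lines, "stdout")),
        threading.Thread(target=_pump, args=(proc.stderr, err_lines, "stderr")),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return proc.wait(), out_lines, err_lines


if __name__ == "__main__":
    child = "import sys\nsys.stderr.write('warn\\n')\nprint('done')\n"
    print(run_tee([sys.executable, "-c", child]))
```

Even with this design, the relative order of lines across the two streams is not guaranteed to match what the subprocess produced, since each thread logs independently.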

sbidoul · 2020-12-20

So I went for approach 1/ in #9327.
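
For reference, approach 1/ can be sketched as below. This is a minimal illustration of `Popen.communicate()` and not the code from #9327; the helper name `run_captured` is hypothetical.

```python
import subprocess
import sys


def run_captured(cmd):
    """Run `cmd` to completion and return (returncode, stdout, stderr) as text."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    # communicate() drains both pipes concurrently, so a full stderr pipe
    # cannot stall the child while we wait on stdout.
    stdout, stderr = proc.communicate()
    return (
        proc.returncode,
        stdout.decode("utf-8", "replace"),
        stderr.decode("utf-8", "replace"),
    )


if __name__ == "__main__":
    child = "import sys\nsys.stderr.write('warn\\n')\nprint('done')\n"
    print(run_captured([sys.executable, "-c", child]))
```

As noted in the drawbacks above, nothing is available to log until the process has exited, and the two streams come back as two separate blobs.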

Mikuana · 2020-12-20T17:58:15Z

Closing in favor of #9327

Mikuana mentioned this pull request Dec 6, 2020

pip install [git repo] hangs on clone step on Windows with large repositories #8876

Closed

uranusjr added the "C: vcs" (pip's interaction with version control systems like git, svn and bzr) and "type: bugfix" labels Dec 7, 2020

Mikuana and others added 2 commits December 7, 2020 20:00

Merge pull request #1 from Mikuana/large-git-install-bugfix-alt

56397ea

Pipe stderr to stdout

uranusjr reviewed Dec 10, 2020

Update news/8876.bugfix.rst

f8aefd1

Co-authored-by: Tzu-ping Chung <[email protected]>

uranusjr approved these changes Dec 11, 2020

sbidoul reviewed Dec 11, 2020

Mikuana and others added 2 commits December 17, 2020 12:52

Remove stderr subprocess capture in VCS install

d6b06a5

This feature may have been the root cause in the introduction of hanging git installs in 20.2.

Merge pull request #2 from Mikuana/large-git-install-bugfix-rm-stderr

cf436e6

Remove stderr subprocess

This was referenced Dec 18, 2020

Exposed credentials #7841

Closed

Fix VCS subprocess output capture #9327

Closed

Mikuana closed this Dec 20, 2020

Mikuana deleted the large-git-install-bugfix branch December 20, 2020 17:58

sbidoul mentioned this pull request Dec 21, 2020

Revert #7969 and fix VCS stdout/stderr capture #9331

Merged

github-actions bot locked as resolved and limited conversation to collaborators Oct 5, 2021
