-
Notifications
You must be signed in to change notification settings - Fork 2.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Incorrect message "Auto packing the repository in background for optimum performance" #2221
Comments
I agree! And I would much prefer the true background operation. Sadly, this seems to be not so trivial, as we cannot So what to do about it? My best idea is to take a step back and reflect what the
That second part is important: on Linux/Unix/macOS, this makes sure that if you terminate the calling process, it won't terminate the spawned process with it. On Windows, the behavior is completely different. If you call Note: there is a way to terminate all console processes that are attached to the same console, which is somewhat similar in spirit, but it does not help us here, as Git often spawns GUI processes (e.g. Git GUI), and the caller of Git is typically attached to the same console. To work around Git's expectations to that end, we have specific code in In particular, the process ID and the parent process ID of each process is looked up, and when we want to terminate a process, the process tree starting at that given process is enumerated and terminated. Now, what happens if a process spawns a child process which in turn spawns a grand child process, and then the child process exits without terminating the spawned grand child process? Then the process tree is broken, and that grand child process is sort of "daemonized". So this is my idea how we could simulate In other words,
An alternative would be to construct a Yet another, completely different alternative would be to change things on the caller's side. But that would be slightly ugly: all callers that run auto-gc would have to be touched, and we would rely on the different behavior of Another downside of that alternative would be that it would work only for auto-gc, not for Speaking of int i, j;
for (i = j = 0; i < saved.argc; i++)
if (strcmp(saved.argv[i], "--detach"))
saved.argv[j++] = saved.argv[i];
saved.argc = j; Gladly, it is that easy because Note: I suspect that even what I wrote above is not yet enough, we probably have to teach I will be honest: this is a moderately challenging project even for me, and I do not currently have the resources to tackle it. Are you up for it, @YngveNPettersen? If so, I would strongly recommend to
to get started, then adding some code to Speaking of CMD: if you want to test things in-place (i.e. without running |
Hi,
Why don't you consider Windows Job API to achieve a similar effect?
What I read was that you can kill all processes of a particular job. Or
that they terminate together.
See
https://docs.microsoft.com/en-us/windows/win32/api/jobapi2/nf-jobapi2-createjobobjectw?redirectedfrom=MSDN
A similar topic was discussed here:
https://stackoverflow.com/questions/604522/performing-equivalent-of-kill-process-tree-in-c-on-windows
This job api seems to be a much easier approach, and closer to what's done
on linux.
Best regards, Mike
…On Wed, Oct 16, 2019, 4:32 AM Johannes Schindelin ***@***.***> wrote:
Considering that the current operation is blocked by this operation, and
no new commands can be run, I think the phrase "in background" is incorrect
and should be removed. Alternatively, the operation should be spawned into
a true background operation.
I agree! And I would much prefer the true background operation.
Sadly, this seems to be not so trivial, as we cannot daemonize()
<https://github.com/git/git/blob/v2.23.0/builtin/gc.c#L620> on Windows:
the daemonize() function
<https://github.com/git/git/blob/v2.23.0/setup.c#L1269-L1290> uses the
fork() syscall <https://en.wikipedia.org/wiki/Fork_%28system_call%29> for
which there is no equivalent in the Win32 API.
So what to do about it?
My best idea is to take a step back and reflect what the daemonize()
thing accomplishes (see also this StackOverflow post
<https://stackoverflow.com/questions/2613104/why-fork-before-setsid>):
1. it spawns a new process
2. crucially, it starts a new process group via setsid()
<https://linux.die.net/man/2/setsid>
That second part is important: on Linux/Unix/macOS, this makes sure that
if you terminate the calling process, it won't terminate the spawned
process with it.
On Windows, the behavior is completely different. If you call
TerminateProcess() on a process that spawned a child process, that child
process will not be terminated.
Note: there is a way to terminate all console processes that are attached
to the same console, which is somewhat similar in spirit, but it does not
help us here, as Git often spawns GUI processes (e.g. Git GUI), *and* the
caller of Git is typically attached to *the same* console.
To work around Git's expectations to that end, we have specific code in
exit-process.h (which has not made it into git.git yet, it exists only in
Git for Windows)
<https://github.com/git-for-windows/git/blob/v2.23.0.windows.1/compat/win32/exit-process.h>
to traverse the process tree so as to simulate the same process group idea
as on Linux.
In particular, the process ID and the parent process ID of each process is
looked up, and when we want to terminate a process, the process tree
starting at that given process is enumerated and terminated.
Now, what happens if a process spawns a child process which in turn spawns
a grand child process, and then the child process exits without terminating
the spawned grand child process? Then the process tree is broken, and that
grand child process is sort of "daemonized".
So this is my idea how we could simulate daemonize() on Windows: by
spawning a new process *with the same exact command-line parameters,
except that the --detach is removed*, then exiting with success.
In other words,
1. A copy of the original argc/argv needs to be preserved, for use by
daemonize(). This can be easily accomplished by using a struct
argv_array saved = ARGV_ARRAY_INIT; and then calling argv_array_pushl(&saved,
"git", "-c", "gc.autodetach=0"); argv_array_pushv(&saved, argv);
before parsing the arguments in cmd_gc(). (Note the gc.autodetach=0 to
force the process *not* to detach.)
2. In daemonize(), in a clause guarded by #ifdef GIT_WINDOWS_NATIVE,
we have to call mingw_spawnvpe(), taking pains to imitate
sanitize_stdfds()
<https://github.com/git/git/blob/v2.23.0/setup.c#L1257-L1267> by
passing appropriate file descriptors for stdin, stdout and stderr: fdin
= open("/dev/null", O_RDONLY, 0); fdout = open("/dev/null", O_WRONLY, 0);
fdin = open("/dev/null", O_WRONLY, 0);. You really need to have three
different ones, as the spawned process might close, say, stdin, but
still might write to stderr.
3. Call exit() after spawning, using an exit code that reflects
whether the child process was spawned successfully.
An alternative would be to construct a child_process with those new
arguments and then calling start_command(). However, we would then have
to force the cleanup_children() function
<https://github.com/git/git/blob/v2.23.0/run-command.c#L32-L72> into *not*
waiting for *that* process. Which boils down to using the clean_on_exit/
RUN_CLEAN_ON_EXIT flag.
Yet another, completely different alternative would be to change things on
the caller's side. But that would be slightly ugly: all callers that run
auto-gc would have to be touched, *and* we would rely on the different
behavior of exit() on Windows vs Linux, I think, where it would wait for
child processes to exit on Linux but not on Windows.
Another downside of that alternative would be that it would work only for
auto-gc, not for git daemon.
Speaking of git daemon: it would need the exact same treatment, as it
also calls daemonize()
<https://github.com/git/git/blob/v2.23.0/daemon.c#L1471>. In this case,
however, we have to edit the parameters to prevent the spawned process from
trying to daemonize, rather than defining a config setting. In other words,
just before passing the saved arguments, any --detach needs to be
removed. There is no convenient API function yet, it would need to be done
like this:
int i, j;
for (i = j = 0; i < saved.argc; i++)
if (strcmp(saved.argv[i], "--detach"))
saved.argv[j++] = saved.argv[i];
saved.argc = j;
Gladly, it is that easy because git daemon does all its command-line
parsing itself (not using parse-options), and all parameters that take
values are of the form --name=value (i.e. *not* --name value), therefore
it is not possible that a parameter's value --detach would be mistaken
for a parameter (think git grep -e --detach: in this case, --detach is
not an parameter, but the value of the -e parameter).
Note: I suspect that even what I wrote above is not yet enough, we
probably have to teach mingw_spawnvpe() a trick where it can optionally
use the flag CREATE_NEW_PROCESS_GROUP and/or DETACHED_PROCESS to prevent
a Ctrl+C in a CMD from terminating the spawned process (see
https://docs.microsoft.com/en-us/windows/win32/procthread/process-creation-flags
for a thorough explanation).
I will be honest: this is a moderately challenging project even for me,
and I do not currently have the resources to tackle it. Are you up for it,
@YngveNPettersen <https://github.com/YngveNPettersen>?
If so, I would strongly recommend to
1. install Git for Windows' SDK
<https://gitforwindows.org/#download-sdk>,
2. sdk cd git,
3. build Git via make -j$(nproc)
to get started, then adding some code to git gc for testing this better.
I could imagine, for example, that a warning("about to sleep");
sleep(10); warning("yawn!"); in the auto-gc, *non*-detached case would
help with testing Ctrl+C in a CMD. You also might want to add another warning("spawned
child"); sleep(10); in daemonize(), to have enough time to press Ctrl+C 😄
Speaking of CMD: if you want to test things in-place (i.e. without running make
install and possibly messing up your SDK), just after building, you can
use .\git --exec-path=C:\git-sdk-64\usr\src\git gc --auto.
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#2221?email_source=notifications&email_token=ABZH5SGG7GE2YTY3UOYB4KLQO3GTPA5CNFSM4HWJM5JKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEBLUFFQ#issuecomment-542589590>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ABZH5SDO7UZCTB4AIAIKXMLQO3GTPANCNFSM4HWJM5JA>
.
|
There was a most crucial line in there: "You can enforce children processes to stay within the Job, by setting appropriate attribute." In other words, you would have to change pretty much every caller. The way the Windows port of Git works is that it tries to be minimally intrusive, often emulating POSIX functions using Win32 API functions. I don't think that would work here, in particular because we cannot control how |
You use a git wrapper executable anyway, right? I think you have control
right there.
Best regards, Mike
…On Wed, Oct 16, 2019, 7:58 AM Johannes Schindelin ***@***.***> wrote:
A similar topic was discussed here:
https://stackoverflow.com/questions/604522/performing-equivalent-of-kill-process-tree-in-c-on-windows
There was a most crucial line in there: "You can enforce children
processes to stay within the Job, by setting appropriate attribute."
In other words, you would have to change pretty much every caller.
The way the Windows port of Git works is that it tries to be minimally
intrusive, often emulating POSIX functions using Win32 API functions. I
don't think that would work here, in particular because we cannot control
how git.exe itself is called. So that puts a pretty hard stop into the
idea of switching to the Jobs API, I would think.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#2221?email_source=notifications&email_token=ABZH5SGK2QF47XSNEWD25S3QO36XFA5CNFSM4HWJM5JKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEBMGU2Q#issuecomment-542665322>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ABZH5SGVX627R75VW3RZE6DQO36XFANCNFSM4HWJM5JA>
.
|
The idea i had was to create a job at the central entry point (there aren't multiple, correct?) of the git wrapper. Then in daemonize() you break off from the job and create a new one. (I do not know much about how git source is designed. I just wanted to contribute an idea. I posted this alternative idea because the above mentioned parent PID handling worried me, as i know that is not reliable. PIDs can be reused by windows once a process is terminated and all process handles closed. I think to guarantee a PID belongs to the the desired process you must hold a valid process handle to it. But... maybe the git code considers all this already. So, please ignore me if thats the case.) |
You are right about all those PID problems, but I am afraid of the amount of complexity the jobs idea would bring. And I am certainly not willing to implement that. The idea I presented here is much more contained, and changes Git's source code only minimally. It therefore has the best chance of giving us the solution we so desire. |
Closing this stale ticket. |
Setup
Win 7 and Win10, both 64-bit
Editor Option: VIM
Custom Editor Path:
Path Option: BashOnly
Plink Path: C:\Program Files\PuTTY\plink.exe
SSH Option: Plink
CURL Option: OpenSSL
CRLF Option: CRLFAlways
Bash Terminal Option: MinTTY
Performance Tweaks FSCache: Enabled
Use Credential Manager: Disabled
Enable Symlinks: Disabled```
to the issue you're seeing?
Big repos
Details
Git Bash
Minimal, Complete, and Verifiable example
this will help us understand the issue.
Complete successfully
Occasionally, especially after a big rebase, the command shell lock up for a while with the message "Auto packing the repository in background for optimum performance"
Considering that the current operation is blocked by this operation, and no new commands can be run, I think the phrase "in background" is incorrect and should be removed. Alternatively, the operation should be spawned into a true background operation.
URL to that repository to help us with testing?
see this on most repos with any kind of local commit activity
The text was updated successfully, but these errors were encountered: