Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Manually scan $PATH on POSIX systems #717

Merged
merged 2 commits into from
Jun 1, 2022

Conversation

mjcheetham
Copy link
Collaborator

Because the which command is not always located in /usr/bin on all POSIX systems, we cannot directly call /usr/bin/which to do program location for us. Oh the irony..

We might be able to use env to locate which, but again the env utility isn't always found in the same place, so we're stuck.

Instead we now manually scan the $PATH variable directories looking for the program we need to find, in the same way we do on Windows (for different reasons).

This should also be slightly faster as you'd hope .NET's String.Split method and lstat calls are faster than fork/exec-ing another executable?

Fixes #671

Because the `which` command is not always located in `/usr/bin` on all
POSIX systems, we cannot directly call `/usr/bin/which` to do program
location for us. Oh the irony..

We might be able to use `env` to locate `which`, but again the `env`
utility isn't always found in the same place, so we're stuck.

Instead we now manually scan the $PATH variable directories looking for
the program we need to find, in the same way we do on Windows (for
different reasons).

This should also be slightly faster as you'd hope .NET's String.Split
method and lstat calls are faster than fork/exec-ing another executable?
@mjcheetham mjcheetham requested a review from ldennington May 31, 2022 13:56
Copy link
Contributor

@ldennington ldennington left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is awesome! 🚀 Added a few nitpicky comments, but overall great work!

Comment on lines +109 to +120
// On UNIX-like systems we would normally use the "which" utility to locate a program,
// but since distributions don't always place "which" in a consistent location we cannot
// find it! Oh the irony..
// We could also try using "env" to then locate "which", but the same problem exists in
// that "env" isn't always in a standard location.
//
// On Windows we should avoid using the equivalent utility "where.exe" because this will
// include the current working directory in the search, and we don't want this.
//
// The upshot of the above means we cannot use either of "which" or "where.exe" and must
// instead manually scan the PATH variable looking for the program.
// At least both Windows and UNIX use the same name for the $PATH or %PATH% variable!
Copy link
Contributor

@ldennington ldennington May 31, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: we seem to be using POSIX in some places (e.g. the commit message) and UNIX in others (e.g. this comment). Is it ok to use them interchangeably in this way?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I've not been very strict in the past about POSIX vs UNIX. POSIX is a subset of UNIX-like behaviour/implementation.

Perhaps this should probably be "Unix" and use class names like UnixEnvironment etc because, especially in this case, POSIX doesn't strictly define the file system path formats beyond this.

VMS paths can be pretty different from UNIX paths. I know @dscho has more opinions on UNIX vs POSIX.

In GCM there are various mapped C Standard Library APIs that are under src/Core/Interop/Posix/Native because they're in the POSIX standard, but then there are implementation-specific-isms that live in Interop/Linux/Native and Interop/MacOS/Native.

@mjcheetham
Copy link
Collaborator Author

mjcheetham commented Jun 1, 2022

@dscho I'd like your opinion on this change.

We're moving from /usr/bin/which to manually scanning $PATH (split on : and search for <path>/<program> left to right until we find a file).

My original assumption here was that /usr/bin/env was also not standardised (like the location of which is not), but seems like it may be?

In either case, do you think it's better to stick with this change (scanning $PATH manually) or change to using /usr/bin/env to help locate things by doing /usr/bin/env which <program>? (Bearing in mind we're having to manually scan %PATH% on Windows anyway.)

@dscho
Copy link
Collaborator

dscho commented Jun 1, 2022

We're moving from /usr/bin/which to manually scanning $PATH (split on : and search for <path>/<program> left to right until we find a file).

Makes sense to me. It's not only better-controlled code, it's also substantially quicker than spawning a program.

My original assumption here was that /usr/bin/env was also not standardised (like the location of which is not), but seems like it may be?

The location of /usr/bin/env is pretty much guaranteed by the billions of scripts out there that have #!/usr/bin/env <blablabla> as first line, where <blablabla> is anything from python to perl to ruby.

If anybody would install env anywhere else, all those scripts would fail to work.

In either case, do you think it's better to stick with this change (scanning $PATH manually) or change to using /usr/bin/env to help locate things by doing /usr/bin/env which <program>? (Bearing in mind we're having to manually scan %PATH% on Windows anyway.)

Honestly, I'd rather scan $PATH manually. That gives us much better control (for example, we can choose not to respect PATHEXT and decide that we're only interested in .exe files).

@dscho
Copy link
Collaborator

dscho commented Jun 1, 2022

In either case, do you think it's better to stick with this change (scanning $PATH manually) or change to using /usr/bin/env to help locate things by doing /usr/bin/env which <program>? (Bearing in mind we're having to manually scan %PATH% on Windows anyway.)

Honestly, I'd rather scan $PATH manually. That gives us much better control (for example, we can choose not to respect PATHEXT and decide that we're only interested in .exe files).

Obviously this comment does not quite apply here because this PR is about POSIX systems, but I still like the manual approach better.

@mjcheetham mjcheetham merged commit 8009749 into git-ecosystem:main Jun 1, 2022
@mjcheetham mjcheetham deleted the which-location branch June 1, 2022 18:06
@ldennington ldennington mentioned this pull request Jun 15, 2022
echuber2 added a commit to echuber2/git-credential-manager that referenced this pull request Jun 23, 2022
It may cause trouble to hard-code `/usr/bin/which` anywhere, so for now, why not `/usr/bin/env` to resolve which? See git-ecosystem#717 for example.

(In the long run it may cause trouble to use `which` at all since it's not POSIX standard and Debian is thinking of removing it. But then you could probably just ensure another package is installed first.)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

which isn't always located in /usr/bin
3 participants