- Feature Name: cargo_target_dir_templates
- Start Date: 2023-01-12
- RFC PR: rust-lang/rfcs#3371
Introduce templating to CARGO_TARGET_DIR, allowing cargo to adapt its target directory dynamically depending on (at least) the manifest's path with the {manifest-path-hash} key.
The original motivating issue can be found here: rust-lang/cargo#11156.
- Not having to find and clean all target/ dirs everywhere while not having all projects collide (which is the effect of setting CARGO_TARGET_DIR globally)
- Being able to easily exclude a directory from backups (Apple's Time Machine, ZFS and btrfs snapshots, ...)
- Allows easily having separate directories for Rust-Analyzer and Cargo itself, allowing concurrent builds (technically already doable with arguments/env vars, but CARGO_TARGET_DIR collides all projects into one big target dir, leading to frequent recompilation because of conflicting features and to builds locking each other out)
- Allows using a different disk, partition or mount point for cargo artifacts
- Avoids having to set CARGO_TARGET_DIR for every project to get the same effect as proposed here
For a single project, it is possible to use the CARGO_TARGET_DIR environment variable (or the target-dir TOML config option or the --target-dir command-line flag) with either an absolute or relative path to change the position of the target/ directory used for build artifacts during compilation with Cargo.
While this option is useful for single-project environments (simple CI builds, builds through other build systems like Meson or Bazel), in multi-project environments, like personal machines or repos with multiple workspaces, it conflates every build directory under the configured path: CARGO_TARGET_DIR directly replaces the <workspace>/target/ directory.
Templating introduces one new templating key for CARGO_TARGET_DIR, in the same spirit as the index configuration format:
- {manifest-path-hash}: a hash of the manifest's absolute path, rendered as a relative path (one or more nested directories). This is not an absolute path.
It can be used like this: CARGO_TARGET_DIR="$HOME/.cache/cargo-target-dirs/{manifest-path-hash}".
When compiling /home/ferris/src/cargo/ with user ferris, manifest-path-hash would be something like ab/cd/<rest of hash> and the artifacts would be found in /home/ferris/.cache/cargo-target-dirs/ab/cd/<rest of hash>/...
Note that the hash used and the path derived from it for {manifest-path-hash} are implementation details and the values here are just an example.
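As a rough illustration of the substitution described above, the following sketch hashes a manifest path and splices the result into a template. The function names are hypothetical, and the standard library's DefaultHasher is only a stand-in: the RFC deliberately leaves the actual hash, and how it is turned into directories, unspecified. The hash is kept flat here for brevity.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::path::{Path, PathBuf};

/// Hypothetical helper: hashes the manifest path into a fixed-length hex string.
/// Assumption: DefaultHasher stands in for whatever hash cargo would really use.
fn manifest_path_hash(manifest: &Path) -> String {
    let mut hasher = DefaultHasher::new();
    manifest.hash(&mut hasher);
    // 16 hex digits, zero-padded, so the length never depends on the input.
    format!("{:016x}", hasher.finish())
}

/// Hypothetical helper: replaces the `{manifest-path-hash}` key in a template.
/// Templates without the key are returned unchanged.
fn expand_target_dir(template: &str, manifest: &Path) -> PathBuf {
    let hash = manifest_path_hash(manifest);
    PathBuf::from(template.replace("{manifest-path-hash}", &hash))
}
```

Two different manifests yield two different directories under the same prefix, while an untemplated value behaves exactly as it does today.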
Below is an example of the behavior with untemplated versus templated forms:
Consider this directory tree:
/Users/
├─ poliorcetics/
│ ├─ work/
│ │ ├─ work-project/
│ │ │ ├─ Cargo.toml
│ │ │ ├─ crate-1/
│ │ │ │ ├─ Cargo.toml
│ │ │ ├─ crate-2/
│ │ │ │ ├─ Cargo.toml
│ ├─ perso/
│ │ ├─ perso-1/
│ │ │ ├─ Cargo.toml
│ │ ├─ perso-2/
│ │ │ ├─ Cargo.toml
/cargo-cache/
With the untemplated CARGO_TARGET_DIR=/cargo-cache:
cd /Users/poliorcetics/work/work-project && cargo build produces artifacts directly in /cargo-cache/debug/...
A subsequent cargo build in perso-1 works with the same artifacts, potentially with conflicting features for dependencies, for example.
A cargo clean deletes the entire /cargo-cache directory, for all projects at once.
It's possible to produce invalid state in the target dir by having unrelated projects writing in the same place.
It's not possible to have two projects building at once because Cargo locks its target directory during builds.
With the templated CARGO_TARGET_DIR=/cargo-cache/{manifest-path-hash}:
cd /Users/poliorcetics/work/work-project && cargo build produces artifacts in /cargo-cache/<manifest-path-hash>/debug/... (where manifest-path-hash is a directory or several chained directories unique to the workspace, with an unspecified naming scheme).
A cargo build in perso-1 produces new artifacts in its own /cargo-cache/<manifest-path-hash>/debug/... directory.
A cargo clean only removes the /cargo-cache/<manifest-path-hash>/ subdirectory, not the artifacts of all the other projects that are also in the cache.
In this situation, it's much less likely for Cargo to produce invalid state without a build.rs deliberately writing outside its target directory.
Two projects can be built in parallel without trouble.
CARGO_TARGET_DIR
can be either a relative or absolute path, which makes sense since it's mostly intended for a single project, which can then work from its own position to configure the target directory, and that stays the case with templates.
Templating does not interfere with the resolution order of CARGO_TARGET_DIR. From least to most specific:
- Through the config.toml: [build] target-dir = "/absolute/path/to/cache/{manifest-path-hash}"
- Through the environment variable: CARGO_TARGET_DIR="/absolute/path/to/cache/{manifest-path-hash}" cargo build
- Through the command line flag: cargo build --target-dir "/absolute/path/to/cache/{manifest-path-hash}"
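The precedence above amounts to a simple fallback chain. A minimal sketch, with a hypothetical function name and plain strings standing in for cargo's real configuration types:

```rust
/// Hypothetical helper: picks the most specific target-dir setting.
/// The command-line flag beats the environment variable, which beats config.toml.
/// Template expansion would happen afterwards, on whichever value wins.
fn effective_target_dir<'a>(
    config: Option<&'a str>,
    env: Option<&'a str>,
    cli: Option<&'a str>,
) -> Option<&'a str> {
    cli.or(env).or(config)
}
```

Returning None here corresponds to the current default of <workspace>/target.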
In the example in the previous section, {manifest-path-hash}
was replaced with a relative path. This relative path is computed from the full and canonicalized path to the manifest for the workspace Cargo.toml
(or the script.rs
file directly for cargo-scripts).
By being canonicalized, including resolving of symlinks, symlinked projects will share the same target directory. This is following the prior art from bazel
and I have not found any complaints about this.
The hashing and the way the hash is turned into nested directories are not considered stable: the method will probably not change often but cargo offers no guarantee and may change it in any release. Tools that need to interact with cargo's target directory should not rely on its value for more than a single invocation: they should instead query cargo metadata for the actual value each time they are invoked.
To prevent collisions through crafted paths, the <manifest-path-hash> directory will be computed from a hash of the workspace manifest's full path (and possibly other data, for example bazel uses its version and the current user too).
In the following situation
/Users/
├─ poliorcetics/
│ ├─ projects/
│ │ ├─ actual-crate/
│ │ │ ├─ Cargo.toml
│ │ ├─ symlink-to-crate/ -> actual-crate/
When calling cargo metadata
in the symlink-to-crate
path, the result contains "manifest_path": "/Users/poliorcetics/projects/actual-crate/Cargo.toml"
and "workspace_root":"/Users/poliorcetics/projects/actual-crate"
. This behaviour means that symlinks won't change the final directory used inside {manifest-path-hash}
, or in other words: symbolic links are resolved.
While a single dev machine is unlikely to have enough projects that the naming scheme of <manifest-path-hash> will produce enough directories to slow down working in $CARGO_TARGET_DIR/, it could still happen, notably in private CI setups, which are often less compartmentalized than public ones. Simple cruft over time (i.e., never calling cargo clean over years) could also make it happen, albeit much more slowly.
To prevent this, cargo splits the hash into something like $CARGO_TARGET_DIR/hash[:2]/hash[2:4]/hash[4:]/... Since the naming scheme is considered an implementation detail, if this proves insufficient it could be changed in a subsequent version of cargo.
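The hash[:2]/hash[2:4]/hash[4:] split can be sketched as follows. The function name is hypothetical and the input is assumed to be an ASCII hex string; since the naming scheme is explicitly unstable, this only illustrates the idea.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: turns a flat hash into nested directories so that no
/// single directory ends up with one entry per project.
/// Assumption: `hash` is ASCII; anything too short to split is returned as-is.
fn nested_hash_dirs(hash: &str) -> PathBuf {
    if !hash.is_ascii() || hash.len() <= 4 {
        return PathBuf::from(hash);
    }
    // Two levels of two characters each, then the remainder of the hash.
    Path::new(&hash[..2]).join(&hash[2..4]).join(&hash[4..])
}
```

With hex digits, two levels of two characters give at most 256 entries per level before reaching the per-project directory.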
targo provides a forward link (it links from <workspace>/target to its own target directory) as a way for existing tools to continue working despite there being no explicit CARGO_TARGET_DIR set for them to find the real target directory.
cargo currently does not provide them for regular (untemplated) CARGO_TARGET_DIR. This is not a limitation when the environment variable is set globally, since all processes can read it, but it is one when this config is only set on specific calls or via target-dir in the config, meaning other tools cannot easily pick it up (and most external tools don't use cargo-metadata, which makes them all broken by default, but fixing this situation is not this RFC's purpose).
After this RFC, when CARGO_TARGET_DIR is templated, cargo will provide the option of creating a forward link, configurable via a new configuration option, target-dir-link (see below for details).
When creating a forward link, cargo will first attempt to create a symbolic link (regardless of the platform). If that fails, it will attempt zero or more platform-specific solutions, like junction points on NTFS. If that fails too, a warning or note will be emitted (or an error, see the configuration option below); once warned, users can either resolve the problem themselves or ignore it, depending on their own use case and domain-specific knowledge.
A config option (CLI, config.toml and env var), target-dir-link, controls this behaviour; it is "auto" by default.
Its possible values would be:
- true: create the symlink and produce an error if it fails
- "auto": create the symlink, produce a warning (or note) if it fails but do not fail the command
- false: don't create the symlink at all (don't touch it if it exists already though)
It is possible to call cargo build
for the same project twice with two different target directories to avoid build locks (common when building with different features or to have Rust-Analyzer work in a different target directory for example), which poses a problem for forward links: if target-dir-link
is active, cargo
could be replacing the ./target
symlink constantly.
Cargo, when trying to create the forward link (so for true and "auto"), will handle the situation predictably in the following way:
- If no target link or directory is present: create it as expected by target-dir-link
- If a target link is present: update the link
- If a target directory is present: consider it a failure, respond accordingly to true or "auto"
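The rules above form a small decision table. The sketch below encodes it with hypothetical type and function names; it only decides what to do and leaves the actual filesystem work (symlinks, junction points) aside.

```rust
/// Value of the proposed `target-dir-link` option.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum LinkMode {
    True,
    Auto,
    False,
}

/// What currently sits at `<workspace>/target`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Existing {
    Nothing,
    Link,
    Directory,
}

/// Outcome of the decision (hypothetical names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Action {
    Skip,
    Create,
    Update,
    Warn,
    Error,
}

fn forward_link_action(mode: LinkMode, existing: Existing) -> Action {
    match (mode, existing) {
        // `false`: never touch the link, even if one already exists.
        (LinkMode::False, _) => Action::Skip,
        (_, Existing::Nothing) => Action::Create,
        (_, Existing::Link) => Action::Update,
        // A real directory is in the way: failure, reported according to the mode.
        (LinkMode::True, Existing::Directory) => Action::Error,
        (LinkMode::Auto, Existing::Directory) => Action::Warn,
    }
}
```

Keeping the decision separate from the I/O makes the "replace the link on every invocation" behaviour explicit: an existing link is always updated unless the option is false.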
Callers of cargo will be able to use --config KEY=VALUE
to override it, for example a Rust-Analyzer config could use cargo.extraArgs = ["--config", "target-dir-link=false"]
to ensure R-A never touches forward links.
When calling cargo
with a builtin call (e.g., build
, check
or test
) where a templated CARGO_TARGET_DIR
is active, cargo
will first resolve the effective CARGO_TARGET_DIR
and then proceed with the command as if CARGO_TARGET_DIR
had been set directly. For third party tools (cargo-*
), where cargo does not know about the relevant Cargo.toml
, the tool will have to use cargo_metadata
, as is already expected today, to learn about the effective target directory.
In the same vein, cargo metadata fills the target directory information with the absolute path and makes no mention of the template in CARGO_TARGET_DIR since it can only be used with a single workspace at once.
Currently, if CARGO_TARGET_DIR is set to anything but target for a project, cargo clean does not delete the target/ directory if it exists, instead deleting the directory pointed to by CARGO_TARGET_DIR. The same behavior is used for the templated version: if it is set, cargo clean deletes /path/to/<manifest-path-hash>/ and not target/.
During the transition period, any CARGO_TARGET_DIR that was defined as containing {manifest-path-hash} will change meaning. cargo, for at least one stable version of Rust, will provide errors about this and point to either this RFC or its documentation to explain why the incompatibility arose and how to fix it.
"How to fix it" will have two solutions: change the configured target directory to not use the new key or use a newer version of cargo (which will not be available at the beginning since it won't exist).
In practice, paths containing { or } are unlikely, even more so with the exact key used by cargo here, so maybe no one will ever see the error, but it's better than silently breaking workflows.
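The check a transitional cargo release would need is small. A sketch, assuming a hypothetical list of reserved keys and a hypothetical function name; where and how cargo would actually emit the error is not specified by this RFC.

```rust
/// Keys reserved by the templating proposal (only one is proposed so far).
const RESERVED_KEYS: &[&str] = &["{manifest-path-hash}"];

/// Hypothetical helper: returns the first reserved key found in a configured
/// target directory, so the caller can report it in an error message.
fn reserved_key_in(target_dir: &str) -> Option<&'static str> {
    RESERVED_KEYS
        .iter()
        .copied()
        .find(|key| target_dir.contains(*key))
}
```

Paths that merely contain braces, without the exact key, are left alone.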
- Breaking change for CARGO_TARGET_DIR since previously valid settings could become invalid (see "Transition period" section).
This introduces one more option to look at to find the target directory, which may complicate the life of external tools.
This is mitigated by the forward link provided by default by cargo when using the templated form of CARGO_TARGET_DIR.
Depending on what naming scheme is used (e.g., a very long hash), we could hit the Windows path length limits if not careful.
A mitigation for this is recommending a short prefix (in CARGO_TARGET_DIR) and using a hash that doesn't include that many characters, but those are only mitigations and do not fully fix the underlying problem.
Bash has brace expansion, other shells too. By using {manifest-path-hash} we risk users getting bitten by that behaviour. Brace expansion is only activated when there are , or .. inside the {} so cargo should be fine. Since brace expansion is done at the shell level, cargo won't be able to detect it if it happens.
Escaping, using single quotes (') or even double quotes (") will work to disable brace expansion, making it even easier to work around it if needed.
It is already possible today to use CARGO_TARGET_DIR to remap workspaces and projects but this has a few problems:
- If done globally, the CARGO_TARGET_DIR becomes a hodge-podge of every project, which is not often the goal.
- If done per-project, it is very cumbersome to maintain.
- targo by @sunshowers
- rust-lang/cargo#11156
- The upcoming cargo script command needs someplace to put its cache and having a dedicated directory for that would be nice.
targo and the cargo issue express a need for either remapping or a global target directory that is not shared between different Cargo workspaces.
For those reasons, this option has not been retained and the targo tool is discussed in more detail below.
There are already lots of discussions about .cargo and .rustup being home to both cache and config files and why this is annoying for lots of users. What's more, it would not be as helpful to external build tools: they don't care about bringing the registry cache into their build directory, for example.
This requires a hard-to-break naming scheme (a recent hash algorithm should be good enough in 99% of the cases but collisions are always possible), which is something the cargo team probably does not want to offer guarantees about. Instead, explicitly stating that the naming scheme is not to be considered stable allows more invested people to experiment with the feature and find something solid if stability proves necessary.
What's more, by explicitly not stabilizing it (and maybe voluntarily changing it between versions sometimes, since a version change recompiles everything anyway?) cargo can reroute people and tools towards untemplated CARGO_TARGET_DIR / cargo metadata, which are much more likely to be suited to their use case if they need the path to the target directory.
While a very nice tool, targo is not integrated with cargo and has a few shortcomings:
- It uses symlinks, which are not always handled well by other tools. Specifically, since it's not integrated inside cargo, it uses a target symlink to avoid having to remap cargo's execution using CARGO_TARGET_DIR and such, making it less useful for external build tools that would use this functionality. Using such a symlink without setting the CARGO_TARGET_DIR env var also means cargo clean does not work: it just removes the symlink and not the data.
- It completely ignores CARGO_TARGET_DIR-related options, which again may break workflows.
- It needs more metadata to work well, which means an external tool using it would have to understand that metadata too.
- It uses $CARGO_HOME/targo to place its cache, making it less useful for external build tools and people wanting to separate caches and configuration.
- It needs to intercept cargo's arguments, making it more brittle than an integrated solution.
- Its naming scheme is a base58-encoded blake3 hash of the workspace directory (source), not taking into account the use case of thousands of target directories within $CARGO_HOME/targo.
- It uses the workspace root dir and not the manifest, which means a targo script would share cache between all the scripts (cargo script) in a directory, which may not be the desired effect.
Some of those could be fixed of course, and I don't expect cargo's --target-dir and --manifest-path to change or disappear anytime soon, but still, it could happen. An external tool like targo will never be able to solve some of these or ensure forward compatibility as well as the solution proposed in this RFC.
On the other hand, targo is already here and working for at least one person, making it the most viable alternative for now.
rust-lang/cargo#11156 was originally about remapping the target directory, not about having a central one, but reading the issue, there seems to be no need for more than the simple redefinition of the target directory proposed in this document. In the future, if CARGO_TARGET_DIR_REMAP is introduced, it could be used as the prefix to the target directory like so:
- Set CARGO_TARGET_DIR_REMAP=/home/user/projects=/tmp/cargo-build
- Compile the crate under /home/user/projects/foo/ without CARGO_TARGET_DIR set
- The resulting target directory will be at /tmp/cargo-build/foo/target
By making the priority order CARGO_TARGET_DIR > CARGO_TARGET_DIR_REMAP (when both are absolute paths) we would keep backward compatibility. Or we could disallow having the two set at once, so that they're alternatives and not ordered.
When CARGO_TARGET_DIR is relative, the result could be /tmp/cargo-build/foo/$CARGO_TARGET_DIR.
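CARGO_TARGET_DIR_REMAP does not exist today; purely to make the rules above concrete, here is a sketch with a hypothetical function name and an already-parsed from/to pair. It follows the first option (an absolute CARGO_TARGET_DIR wins) and treats a relative one as a suffix.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: resolves the target directory under a `from=to` remap.
/// Returns None when the remap does not apply to this workspace.
fn remapped_target_dir(
    workspace: &Path,
    remap_from: &Path,
    remap_to: &Path,
    target_dir: Option<&Path>,
) -> Option<PathBuf> {
    // An absolute CARGO_TARGET_DIR takes priority over the remap.
    if let Some(dir) = target_dir {
        if dir.is_absolute() {
            return Some(dir.to_path_buf());
        }
    }
    // The remap only applies to workspaces located under the remapped prefix.
    let relative = workspace.strip_prefix(remap_from).ok()?;
    let base = remap_to.join(relative);
    // A relative CARGO_TARGET_DIR is appended; otherwise use the default name.
    Some(base.join(target_dir.unwrap_or(Path::new("target"))))
}
```

With the values from the list above, /home/user/projects/foo maps to /tmp/cargo-build/foo/target.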
Overall, I feel remapping is much harder to implement well and can be added later without interfering with templates in CARGO_TARGET_DIR (and without this RFC interfering with remapping), though the design space is probably bigger than the one for this RFC.
It's possible to achieve most of the system proposed here by setting a value like CARGO_TARGET_DIR="/base/dir/{manifest-path-dirs}", where a manifest in /tmp/test1/test2/Cargo.toml would resolve the build directory to /base/dir/tmp/test1/test2/, but most is not all:
- Hashes have a fixed length while manifest-path-dirs is dependent on the context, making it a hazard for cross-platform compatibility. Say on Windows the target-dir is rooted in a user tmp dir and the manifest path is inside of the user documents. Especially combined with corporate policies on names, those base paths alone can take up a good amount of the character budget without getting into project names, etc.
- Encourage interactions through cargo-metadata: by using paths computed through cargo and not easily derivable from the file tree, future tools will be incentivized to work through cargo-metadata to find the target directory, widening adoption and making it easier for the cargo team to ensure nothing breaks in subsequent cargo updates
- Avoid introducing a strong dependency on only the path: by using a hash, cargo can add elements to it to help differentiate builds: for example we could use more parameters in the hash, see the relevant section in Future Possibilities
While unlikely, the transition period may break workflows by introducing errors for previously valid target directories. Several alternatives exist for this:
- Do the transition immediately and silently: any CARGO_TARGET_DIR previously using the exact new key will change meaning, though as noted before, it is unlikely to have been set to that in practice.
  - Rust and Cargo both tend to shy away from making silent changes when they can affect observable behaviour.
- Introduce a config to deactivate it fully, something like CARGO_TARGET_DIR_USE_TEMPLATING=false.
  - We want to avoid config proliferation, especially a config that is intended to disappear once the transition period is over.
- Ignore the templating if a directory exists at /path/to/{manifest-path-hash} (without interpreting the key) and do the transition immediately.
  - The newer cargo could print a note or warning noting the change occurred.
  - This could work well, though any existing workflow relying on the target dir path would break if not using cargo metadata (again, it is very unlikely for this exact key to be present in a target directory).
The bazel build system has a similar feature called the outputRoot, which is always active and has default directories on all major build/development platforms (Linux, macOS, Windows).
The naming scheme is as follows: <outputRoot>/_bazel_$USER/ is the outputUserRoot, used for all builds done by $USER. Below that, projects are identified by the MD5 hash of the path name of the workspace directory (computed after resolving symlinks).
The outputRoot can be overridden using --output_base=... (this is the untemplated $CARGO_TARGET_DIR when it is used with a template) and the outputUserRoot with --output_user_root=... (this is close to using $CARGO_TARGET_DIR, already possible in today's cargo).
It should be noted that bazel
is integrated with remote caching and has different needs from cargo
, the latter only working locally.
Conclusion: bazel
shows that a hash-based workflow seems to work well enough, making an argument for the use of it in cargo
too. It also uses the current user, to prevent attacks by having compiled a program as root and making the directory accessible to other users later on by also compiling there for them. cargo
could also do this, though it is not clear what happens when --output_user_root
is set to the same path for two different users.
Note: Bazel 5.4.0, the latest stable version as of this writing, was used as the reference; things may change in the future or be different for older versions.
- Do we want to differentiate according to users? bazel is a generic build tool, whereas cargo is not, so maybe differentiating on users is not necessary for the latter?
This works in concert with the subsection "Introducing more data to the hash" later below: the currently proposed key name introduces a dependency in its name; should we prepare for possible changes by renaming it?
- Introduce remapping into the concept in some way.
- Introduce a form of garbage collection for these target directories so we don't leak them when projects are deleted, see rust-lang/cargo#13136 (essentially, backlinks)
OS-native cache directories are discussed in more detail in rust-lang/cargo#1734; there are semantic and naming issues to resolve before moving forward with them in cargo (and so as a template for CARGO_TARGET_DIR).
As a workaround, it is possible to use CARGO_TARGET_DIR="${XDG_CACHE_HOME:-~/.cache}/cargo-target-directories/{manifest-path-hash}".
It won't work in the config.toml but it will work with the environment variable and the command line option, both of which override the TOML config.
It is certainly possible to add at least {home}
, {cargo-home}
and something like {cargo-default-target-dir}
but it can be done in the future without interfering at all with {manifest-path-hash}
, making it a good option for a future addition without blocking.
Other configuration items in cargo
could eventually benefit from templating, like [build.rustflags]
or [env]
.
While not the goal of this RFC, the design is such that it should easily extend: {...} is not a valid format anywhere currently except in places with paths or where cargo directly forwards to either a shell or rustc:
- rustc can be made to work with cargo and is also very unlikely to introduce {...} in flags because of the large number of sh-like shells that use this format to resolve globs already: calling rustc with {...} as an argument would become very error prone.
- paths, as already discussed earlier, are also highly unlikely to contain {...}, even less the exact keys that cargo will expect.
- runners could expect such arguments, but that would only become a problem if cargo introduced a conflicting key and it would need a runner using that pattern, which is again unlikely.
- What default value do we use? This discussion is already happening in rust-lang/cargo#1734 and does not block this RFC (it also came up with cargo script and has been discussed at length there; we would use the same default as chosen for it)
- This would probably break backward compatibility and lots of tools? We could heavily advertise the option in the Rust book and Cargo's documentation but making it the default is probably not something we will be able (or even willing) to do any time soon. Note that having forward links active by default (see relevant section earlier in the RFC) will help offset a lot of the problems here.
This RFC uses only the manifest's full path to produce the hash but we could conceivably introduce more data into it to make it more unique for more use cases:
- By considering the user, we could avoid sharing caches between sudo cargo build and cargo build for example (and more). It could be especially useful for shared artifacts storage. Bazel already does this, but Bazel was built to be a distributed build system, whereas Cargo was not.
- By considering features, build flags and a host of other parameters we could share builds of crates that use the same set of features between various projects. This is already discussed in rust-lang/cargo#5931.