-
-
Notifications
You must be signed in to change notification settings - Fork 443
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make the dependency on zerocopy optional #1574
Comments
Looking quickly at the commit that introduced this, it may seem like zerocopy (which contains lots of unsafe code) was introduced to remove a very small amount of unsafe code from rand. Was that the only motivation? If so, then this change is a huge negative for many users, as the amount of unsafe code in total in our projects has now suddenly increased by a very significant number. If this was the only motivation then please consider reverting this change. |
Using zerocopy should make things safer and not more unsafe. It replaces potentially unsound unsafe code with safe semantics which are carefully checked and relay more on the compiler than on humans to verify. I think it is a bit misguided to only focus on the amount of unsafe code. zerocopy reduces the possibility to write unsound code and that is a good thing in my opinion. You could argue that we can leave some performance and remove unsafe code and zerocopy, but this is a different story. |
I saw there was a discussion on reddit about this, including a number of comments supporting this move. I'm not fully aware of how this affects different groups, but yes, See also this comment by @joshlf in #1349 and also #1502. |
I co-maintain zerocopy and lead Rust's Project Safe Transmute — happy to answer questions about either project. I apologize for any inconvenience this has caused you, but I do believe it's a one-time cost. Zerocopy is a dependency of many major Rust crates, and thus already is a transitive dependency of many non-trivial Rust projects. If zerocopy's use results in more We strive to make this vetting process as easy as possible. Zerocopy has extensive internal documentation, and most of our SAFETY comments directly quote from Rust's stdlib or reference documentation. If there's anything we can do to make the vetting process easier, we'd love to hear it. Stepping back, the use of crates like bytemuck and zerocopy in the Rust ecosystem is incredibly important to Rust's Project Safe Transmute. Functionality requests get funneled to these crates and that, in turn, helps inform the design of the Rust's future support for safer transmutation. |
Let me assure you that I have no doubt that zerocopy is well maintained and of very high quality. Our situation is that we need to vet our dependencies. This is not a one time cost. Every time we upgrade to a new version of a crate, it needs to be vetted again (a delta review at least). This includes (but is not limited to) doing a code audit/review of all new or modified unsafe code. The fact that zerocopy is well documented and well maintained does not actually help that much in this regard. The proliferation of dependencies is a challenge for us. rand was a "good crate" in this regard as it has a limited and well defined purpose, and used no dependencies to achieve this. This ended with version 0.9, when zerocopy was introduced. We will now have to do a code audit of a new version zerocopy whenever we upgrade our dependencies. This will have a significant cost for us. It is probably much cheaper for us long term to replace or fork rand somehow to get rid of the dependency on zerocopy. I suppose the majority of users of rand will be fine with this change. Most/many care little about dependencies at all. However, for the users that are trying to keep the number of dependencies low, or that have to vet their dependencies, the introduction of this dependency on zerocopy will cause pain and higher costs. If a fundamental and extensively used crate like rand does not care about the collateral damage from this, then who will? If this was done only to remove unsafe code, by delegating this to zerocopy, the please consider fixing/reverting this. Alternatively, if this is only necessary for performance optimization, then please consider putting this performance optimization behind an optional feature so that users that do not need it can opt out. |
That does sound like it could be painful. However, |
I understand that reddit is very much in support of this, and seemingly so is a large part of the Rust community. That is disappointing. Unfortunately it means that for plenty of commercial contexts it's very likely that this will create a growing rift in the ecosystem. It's becoming harder and harder to audit Rust projects because of changes like this. I understand the goal of In practical terms this will cause a lot of people to move to hand written RNGs or |
I would say in summary the change to zerocopy was good for most users, but for some it is quite a pain. I think it is feasible to have zerocopy behind a feature-flag and have safe but possible slow code without it (or fast and unsafe code as alternative). There are only a handful of places which would need to change and be maintained (I found 4, in rand and rand_core). Not sure if the added complexity is worth it, but it is also not too bad if enough people feel strongly about it. There is also hope this might become unnecessary with better compiler support. If this is deemed worth it, I can do a PR for it. |
I think if minimal/no dependencies and zero And if you are in the position where this is untenable, forking/vendoring is the answer. But I think this is a wrong move to undo efforts to make |
There are only two places in The dependency on zerocopy was added during a flurry of PRs that proposed adding zerocopy as a dependency to many crates, including one of mine. In my case, I was able to refactor my code to avoid Accordingly, I think spending effort on PRs might be a more useful use of time than the meta-discussion about dependency policies. |
I don't want to drag this out excessively, but I do want to respond to this:
In a world where I control all my dependencies, maybe. But that's just unrealistic. For instance |
I'm sure that is all correct, but that does not help an audit process. You see as far as auditing is concerned, this is just hearsay and would have to be verified every time the a new version is audited. It is impractical to do an analysis of the call graph of rand when auditing zerocopy for a project. It is also possible that other crates have started to depend on it since your last audit. In the end, if you depend on it you do not have much choice other than doing a full audit. |
I think an "almost std" crate like rand should have absolutely minimal (preferably zero) dependencies. This should be a top priority, to avoid forcing users of rand to depend on crates that they would otherwise need. Rand is the de facto standard for random. That success comes with responsibilities, and you should not take those lightly. That should mean avoiding taking any dependencies unless they are absolutely necessary, and zerocopy does not seem necessary. |
One final response and then I'll leave things here:
And |
You are not wrong, but that is a very recent change. It's very disappointing and frustrating to see that crate show up in critical parts of the ecosystem at the moment for what was previously rather lightweight crates. I'm not sure where the push towards zerocopy is coming from but it's disappointing. What is however particularly disappointing in the case of Up until the change in |
Hm, could you say more about this? Both
...is something else in the dependency tree enabling this feature? |
@hpenne Can you say more about who "we" is and what your vendoring processes look like? I work on the Fuchsia team at Google, and we have similar process, so I understand where you're coming from regarding being dependency-skeptical. For this specific issue (leaving |
I did a quick sketch of replacing one of the uses of zerocopy with safe Rust code: PR #1575 The other use of zerocopy is very similar. |
I don't know how to properly evaluate performance of the functions in I would also say that, at least as far as the code in PR #1575 is concerned, if that code isn't being optimized well by the compiler, it's a pretty bad indictment of the compiler; if I were on the compiler team, I would consider it to be a bug. |
Optimisers are amazing, and perform better than most developers can imagine. It's starting to seem like the dependency on zerocopy might be unnecessary from a performance point of view. Perhaps that is worth exploring further? Now on to something more sinister. I'm sure many of you are familiar with the OpenSSH and XZ/liblzma incident, which came very close to compromising a huge number of Linux systems. The community nallowly escaped this, but mostly due to a lucky discovery and the extraordinary efforts of one single person. There was a lesson in this for the offensive state actors/hackers of the world. This lesson is that a small amount of money invested long term in supply chain attacks can have a huge return and allow you to compromise a vast number of systems. Any such state actor that did not already fund long term supply chain attack programs will now be doing so, after the XZ incident. The incident proves that it is stupid not to. The Rust community will inevitably also be the target of such long term supply chain attack efforts. I suspect Google will be an interesting target due to the use of Rust in Android etc., and among Google's crates, zerocopy would be an interesting target indeed. Making rand depend on zerocopy puts rand and all its dependents in harm's way for such a supply chain attack. Is that really a risk that you are willing to take on behalf of your users?
This kept me awake in the early hours of the morning. If you are invested in a long term effort to do a supply chain attack through a crate, one of the things you would like to do while you are trying to get into a position to compromise the target crate is to try to make as may other crates depend on it as possible, in order to widen the impact of your efforts should you succeed. I know that this will be considered paranoia by many, and that I risk offending members of the zerocopy community. That is is not my intention, and for that I apologize. My previous paragraph is likely to be wrong (and I certainly hope so), but Rust certainly will be the target of future supply chain attacks and zerocopy is a very attractive target, and even more so if it is introduced in more widely used crates, like rand. I do hope that you can find the time to consider removing the dependency on zerocopy, especially if there is merit to the ideas I see that some/all of the unsafe might be unnecessary. I'd be happy to help out if I can, and I am reasonably certain that my employer will allow me to spend a little time on that. You should not trust a stranger like me to write unsafe code, but I can probably be trusted to remove some. |
Could we not get into conspiracy theory level thinking about dependencies, please? Let's not go assuming malice out of other open source contributors and maintainers simply because you don't like a dependency. |
Lets... just side-step the political / conspiracy theories and focus on a more technical question: can we drop the I thank @briansmith for getting started here with #1575. On a slightly different but related tack, we already have proposals #934 and #1545 which would replace/drop the dependency on |
It was never my intention to imply that anyone here added this dependency to rand out of malice. I was trying to make the point that there will be malicious actors out there with the intent of doing supply chain attacks, which might be relevant to the discussion. I believe that the push to depend on zerocopy which I quoted is in all likelihood perfectly harmless, but that there is always a small non-zero probability that something like that might not be. However, taking the speculation that far was too much. Please accept my apologies for that. It makes me happy to see that there is work in progress to try to do this without zerocopy. I hope that succeeds. |
If we can find ways to implement things in a way that is safe, readable, and performant (on relevant benchmarks), I'm fine changing the implementation. But if we can't do that (i.e. things would be an unreadable/unsafe mess otherwise), I would want to stick with zerocopy. It's a very well-tested and well-maintained crate, and at this point I would trust it more than unsafe code I would write myself. |
@josephlr the point motivating this issue remains, however: |
Just to clarify what I said above, I think the zerocopy developers were doing a good thing when they contirbuted their PRs to projects. I think in many situations adding the zerocopy dependency wasn't the best solution to the problem, but when they had dozens (IIRC) of crates to contribute to, it wouldn't have been reasonable to create an optimal solution for every instance, especially when there are trade-offs to be made. (I, myself have, multiple times, contributed to dozens of crates in a single day to update them to use a new API.) |
@dhardy Would you consider making zerocopy optional and on-by-default to address everyone's concerns here? It lets people who are concerned about the dependency act on that concern without affecting performance for most users. It's also probably an easier change to make than trying to develop a safe alternative to the current code that achieves performance parity. On Fuchsia, we occasionally have issues with specific dependencies similar to what @hpenne is describing. What we usually do is just patch or fork in our vendor directory so that we don't have to review a particular crate. If rand had zerocopy as an optional feature, then anyone with a concern could just carry a small patch in their vendor directory to disable that feature for all of their dependencies which depend on rand. |
Perhaps it is better to see how the active PR turns out. It is not unlikely that the speed difference will be too small to matter, and keeping zerocopy as a dependency will make zero sense in such a scenario. |
This seems like self-inflicted pain if you review more than what is used by the dependent crate. If you applied some form of cross-crate-dead-code-elimination before review the burden would become much smaller.
Perhaps this is a matter of having a process without adequate tooling? Should |
I'd prefer only having a single implementation of functionality, assuming a reasonable solution can be found. (And, considering that this crate does not have a policy against usage of It looks like we have a solution to the usage of
Edit: the simplest solution to the above two usages would be to make the using Finally, |
@dhardy |
Background
Projects that need to vet their dependencies will suffer greatly now that you have introduced a dependency on zerocopy, which is a big crate with a huge amount of unsafe code. Since just about any major project uses rand, this now makes "every" major Rust project depend on zerocopy as well.
This is really problematic. Please make this optional and off by default.
I understand that some users want the absolute best performance and that you want to support this, but most applications do not need this and introducing a dependency like zerocopy will have a significant negative impact on many users, one that it is currently almost impossible to get out of.
Feature request
Please make the performance optimizations that require zerocopy optional and off by default.
The text was updated successfully, but these errors were encountered: