Using a user-local cache for URL imports makes reproducible builds hard #213
Comments
What if the cache stored a hash of the downloaded package alongside each URL? Then some external tool could be written to generate a "snapshot" of the cache for reproducibility. The same tool could tell you if the packages have changed since the snapshot, by comparing hashes of downloads. Edit: Actually, the external tool could do this hashing part as well. So it could all be implemented by a third-party tool separate from deno.
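A minimal sketch of the comparison step such a third-party tool might perform. The snapshot format (a map from URL to hex digest) and the names Snapshot, SnapshotDiff, diffSnapshot, and isReproducible are assumptions for illustration, not part of deno; computing the digests is left out.

```typescript
// Hypothetical snapshot format: import URL -> hex digest of the downloaded file.
type Snapshot = Record<string, string>;

interface SnapshotDiff {
  changed: string[]; // URL is in both, but the digest differs
  missing: string[]; // URL is in the snapshot but not in the current cache
  added: string[]; // URL is in the current cache but not in the snapshot
}

function has(map: Snapshot, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(map, key);
}

// Compare a stored snapshot against digests computed from the current cache.
function diffSnapshot(snapshot: Snapshot, current: Snapshot): SnapshotDiff {
  const diff: SnapshotDiff = { changed: [], missing: [], added: [] };
  for (const url of Object.keys(snapshot).sort()) {
    if (!has(current, url)) diff.missing.push(url);
    else if (current[url] !== snapshot[url]) diff.changed.push(url);
  }
  for (const url of Object.keys(current).sort()) {
    if (!has(snapshot, url)) diff.added.push(url);
  }
  return diff;
}

// The cache matches the snapshot only if nothing changed, went missing, or appeared.
function isReproducible(diff: SnapshotDiff): boolean {
  return diff.changed.length + diff.missing.length + diff.added.length === 0;
}
```

A tool along these lines could refuse to run a build when isReproducible returns false, which would surface a changed download instead of silently using it.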
Also, @ry mentioned bundling npm with node as a regret, since it made npm the de facto project host and created an unnecessary centralization. Thus, I believe that deno shouldn't be bundled with a package manager. The community should write one (or many), as you mentioned will inevitably happen. Then multiple package managers or package hosts can compete fairly with one another.
Regardless, this could all be done without removing URL imports. Imagine you want to get started on some project quickly... you import the URL and get going. Later on, you can use the third-party tool to declare dependencies and ensure integrity, and potentially remove your URL imports. I'd like to point out a parallel between TS and this strategy: both allow you to get up and running quickly, since you can just write regular JS, and analogously throw in a URL import. Later, perhaps simultaneously, you could add typing and other TS features, and get rid of the URL imports. The key point is that time(idea to code) is minimized at first.
This is one of the clearest issue titles I've seen in a while. That said, I disagree with the premise that deno should decide how to solve reproducible builds. Reproducible builds are a tooling and process feature that is appropriate for some projects but not all. It seems to me that LFRC does not conflict with any of the proposals you mentioned in the 'Vendoring is a better solution' section. I see deno as just a runtime that can defer to the OS on how to resolve a URI. If you do require reproducible builds, then LFRC does not preclude doing anything like...
Baking in any tools or processes will make it harder to change in the future when we come up with better solutions than More importantly, if you care about specificity, I think allowing for proposals like #200 captures the spirit of proposals to help without restricting future decisions.
I personally really like the idea of a "deno get" tool which creates a vendor/ directory, and having The "deno get" tool could be external / third party, but if deno doesn't have the "load URLs only from vendor/ if vendor/ is present" behavior built into it, then one of the following would have to be true:
If you want people to use your vendored files, whether they were installed via dvendor/deno get or anything else, you have your imports refer directly to the file you distribute in the repo. No need for deno to treat any directory as special.
Subresource integrity can help with this.
#200 does not solve the name clashing problem. This seems to be a point of confusion, so here's a concrete example:
Alice writes some program For some reason, Later (after the change), Bob writes some program Now, somebody who wants to use both
This problem arises because dependency names may not be unique across projects. If your dependency cache is global across all of your projects (i.e. user-local), then it requires that all dependencies of those projects are uniquely named. As in the case above, this is not always guaranteed to be true.
Yes, this is arguably a pathological edge case. But this kind of edge case arises all the time in building real-world programs, and vendoring is the simplest way to address it. Other solutions, like SRI or lockfiles with hashes, pay the same complexity cost (your project has to contain something specifying your dependencies) without reaping all the benefits (SRI/hashes do not solve name clashes or availability).
Yes, you can vendor without requiring runtime support, and LFRC does not preclude vendoring. But LFRC presents a significant footgun to new users. I think applying the law of least surprise for developer experience suggests that:
LFRC does not guarantee either of these. Reproducible builds are very difficult to solve with tooling bolted on as an afterthought. Go has been down this path -- it requires dependency name rewriting or dependency cache versioning. Vendoring is a simple solution to a real problem.
I was about to say Go is doing this with no problem but saw this blog post complaining about similar issues!
This is the purpose of the
I think having a global download cache as default is reasonable as long as it can be circumvented. How do you feel about this, @ilikebits?
While I think it is inevitable that some sort of loader/resolver configuration will become necessary, it feels a bit premature. Actually, the lack of a loader configuration is one of the things that always frustrated me with Node.js, while AMD loaders were configurable. SystemJS provided a similar but incompatible configuration that was sort of on track with the WHATWG, but that has totally stalled IIRC. One of the biggest challenges with ES modules at the moment is this sort of meta problem of how you deal with resolution and loading of modules. My honest opinion is just to track that as closely as possible, though it is still likely to go nowhere fast, given the lack of agreement, because it is a complex, horrible topic. The only other option, outside of dealing with the configuration of the cache directory, is to make the module resolution "pluggable" from within deno. Of course, anything you expose, people will use and become dependent upon, and then blame the person who created it for the 👣🔫. So even saying all that, it just feels like the best thing to do is hobble along, with some expressed semantics and patterns around hosting "semver" modules on the web, and find out what problems are really encountered, and solve those problems when they occur. We are always great at trying to solve the problems we think we have, only to find out we didn't.
@ry that sounds like a reasonable compromise if you're really intent on maintaining the current implementation. I can easily see a hypothetical That said, I really think LFRC is a hidden footgun. It's an edge case that people probably won't run into a lot, but it'll definitely be somebody's surprised and mildly irritated blog post somewhere down the line. I think the trade-offs between LFRC and vendoring are:
I think vendoring sits at a sweet spot of having simple semantics and implementation, being open to extensibility, and being very easy for new users to understand. (In fact, I wouldn't be surprised if beginners are just as confused, or more so, by implicit downloads as by an extra download step.)
@ilikebits I'm pretty much in agreement. I just want to make sure the module resolution scheme is as dumb as possible. If other tooling wants to hack it - that's up to them - but the base runtime should be very simple. One use-case I'd like to support is:
By having the caching happen globally by default, and having default security, this allows people to distribute and run complex utility programs from any location. I'm not necessarily against
Sure, trying out
TL;DR: Go already tried ~/.deno/src with $GOPATH, and it sucks when working with multiple projects. They switched to vendoring because it's a lot simpler, and Deno should use vendoring too.
This is a continuation of #195. I think the current URL imports implementation has a serious flaw with reproducible builds when working with multiple projects. Even if the implementation isn't changed, I'd like to know what the idiomatic Deno way of handling these use cases should/would be.
To clarify, I think that URL imports are a good idea. I disagree with the implementation detail of how they work ("load on first run, then cache", which I'll call LFRC), because I think this makes reproducible builds with multiple projects unnecessarily difficult.
This issue contains some example use cases where this implementation makes reproducible builds difficult and a proposal for an alternative implementation that keeps the same URL import syntax, maintains the spirit of URL imports (i.e. simple module resolution and no central registry), and makes reproducible builds easy.
Drawbacks of LFRC
I already wrote a bit about difficulties with the single-project workflow in https://github.com/ry/deno/issues/195#issuecomment-395565575, which I'll summarise here:
~/.deno/src in order to get a reproducible build, because any URLs I'm importing might have changed or become unavailable since the time I first ran my project vs. the time the destination machine first runs the project.
These are already serious (albeit surmountable) issues, but the bigger problems arise when working with multiple projects:
~/.deno/src or do vendoring.
~/.deno/src with the project's provided ~/.deno/src (or at least overwrite the parts of my local ~/.deno/src that specify the downloaded project's dependencies). What if I was working on another project?
This model is exactly equivalent to Go's before 1.5 and vendoring (just substitute $GOPATH with ~/.deno/src).
In order to solve these issues, we'd need to resort to the same heavy-handed, painful, and unintuitive solutions: either rewrite URL import paths (so projects each get a unique URL import so they never clash) or do versioning on ~/.deno/src (so when we work on different projects, we can switch to the project's own copy of ~/.deno/src).
Given a user-local (instead of project-local) dependency cache, it is fundamentally impossible to solve the problem that a single user may have different projects that may have different dependencies that use the same name:
O(n) different NPMs (where n is the number of different dependency authors).
~/.deno/src cache, the location of the project can change (it might move around in the same machine if the user has to move folders, and different users will have different directory structures when they clone the same Git repository) because the concept of a "project" is not tied to any particular filesystem path.
In an ideal world with ideal humans and ideal infrastructure, these problems would not exist. Unfortunately, we don't live in that world, so we shouldn't use a dependency model that forces users to rely on dependency providers behaving ideally.
Vendoring is a better solution
Instead of LFRC, URL imports should be vendored (in Go 1.5+ style). This means that projects should have some designated folder (perhaps vendor/) that contains all of their dependencies.
This fundamentally solves the problem that different projects may have different dependencies that have the same name. It's also easy to implement, easy to understand, and easy to use. Here's how dependency resolution with vendoring could work:
deno get) or the community builds its own tool.
Foo (where Foo is a URL or other string supported by vendoring tool).
deno get, which downloads Foo to vendor/Foo.
Foo, it tries to load vendor/Foo. If vendor/Foo does not exist, the program crashes.
This behaviour is extremely intuitive: the tool's operation is simple, the failure modes are simple, and writing your own tool is simple. Users who desire complex dependency resolution logic (e.g. for doing version resolution) can easily write their own tool, and that complexity is not baked into the runtime.
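The resolution step described above could be sketched roughly as follows. This is an illustration only: vendorPathFor and resolveVendored are hypothetical names, the vendor/<host>/<path> layout is an assumption (the proposal only says vendor/Foo), and the exists callback stands in for a filesystem check.

```typescript
// Map an import specifier to its assumed location under vendor/.
// Assumed layout: the URL scheme is dropped and the rest is used as the path.
function vendorPathFor(specifier: string): string {
  return "vendor/" + specifier.replace(/^https?:\/\//, "");
}

// Resolve an import against vendor/ only. There is no network fallback:
// if the file was never vendored, resolution fails loudly.
function resolveVendored(
  specifier: string,
  exists: (path: string) => boolean,
): string {
  const path = vendorPathFor(specifier);
  if (!exists(path)) {
    throw new Error(
      `Cannot import "${specifier}": ${path} not found (run the vendoring tool first)`,
    );
  }
  return path;
}
```

The single failure mode (a missing file under vendor/) is the point of the design: whatever the vendoring tool does to fetch or pick versions, the runtime side stays a plain path lookup.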
I am almost certain that if Deno does not support this natively and becomes popular, then the community will be forced to write this tool. Go has already been down this path. Using a user-local dependency cache provides a very poor developer workflow and is not feasible for users who need to reproducibly build multiple projects.