Feature request: node --install #11835
I'm definitely +1 on having this conversation but if we decide to move forward we need to scope this very carefully and specifically. Some questions:
It's likely easier to ask it this way: what aspects of npm's scope (and possibly yarn's) would this not include? The pitfalls are much more straightforward: registry clients are complex, matching user expectations even more so. What additional amount of work would be required to develop and maintain a reasonably feature-complete install client that tracks well with what npm and yarn currently do? If either of those clients moves in a different direction on things like dedupe, version selection, etc., which is the "source of truth" with regard to whether one included in core should change or not? We need to be certain about what amount of work would be expected here. |
Thinking about this from an "optimized for production" use case you might be able to cut a lot of corners here. For instance, you can expect that install is only ever run once, and you can blow away the prior install's deps as a result. Does dedupe still make sense in that context? |
If we are optimizing for production, we may not need local. How do people feel about git based installs in production? |
@iarna should probably comment on this. This is a bad idea. Not an obviously bad idea, but one that very quickly spirals out into a rather large pile of complexity once you start pulling on the thread. Here's a non-exhaustive shortlist of things that you'll likely need to support in order to handle the enterprise use cases y'all probably care about:
Additionally, you need to be aware of security implications of pulling code down from the internet (checking the checksums, etc.) and any cache has to be resilient against being run in multiple parallel processes (which is a real thing that real people do a surprising amount of the time). If installs are going to be kept in a reasonable time frame, you probably also want to build up the dep tree, then dedupe it (without causing any package to get the wrong version of any of its dependencies), and then lay it out on disk. Is this really a thing that y'all wanna take on? You'll end up re-implementing the majority of the hard parts of npm. (Publishing is, by comparison, extremely simple.) The worst case scenario will be developing a package installer that is not compatible with npm, effectively splitting the community with rival authoritative package installers. If you want something different than npm, and can articulate what differences you'd like to see, we're all ears. Maybe it makes sense to have yet another package installer tool, but I doubt it. If you can't articulate the differences that you'd like to see, then this is an even worse idea. |
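As a rough illustration of just the "checking the checksums" point above, here is a minimal sketch in Node.js. The helper names `verifyIntegrity` and `integrityFor` are hypothetical, and the `<algorithm>-<base64 digest>` string format is an assumption chosen for the example; this is not a description of what any real client does, and it covers none of the caching or parallel-process concerns.

```javascript
const crypto = require('node:crypto');

// Hypothetical helper: build an integrity string "<algorithm>-<base64 digest>"
// for a buffer (for example, the bytes of a downloaded tarball).
function integrityFor(buffer, algorithm = 'sha512') {
  const digest = crypto.createHash(algorithm).update(buffer).digest('base64');
  return `${algorithm}-${digest}`;
}

// Hypothetical helper: recompute the digest and compare it to the expected one.
// Returns true only when the bytes match the recorded integrity string.
function verifyIntegrity(buffer, integrity) {
  const dash = integrity.indexOf('-');
  if (dash === -1) {
    throw new Error(`malformed integrity string: ${integrity}`);
  }
  const algorithm = integrity.slice(0, dash);
  return integrityFor(buffer, algorithm) === integrity;
}
```

A caller would refuse to unpack anything for which `verifyIntegrity` returns false. Even this small piece leaves open where the expected value comes from and how much that source is trusted.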
Dedupe primarily makes sense in a production environment, because most people who use node and npm use it to build assets for the front-end, and shipping more than one copy of something in your webpack bundle is actually costly. |
These are .npmrc files in the project root, which brings up another question: will this feature support NPM configuration paths and environment variables or will it respond to its own? |
Just everything that @isaacs said, plus all of these in combination. Bundled dependencies and shrinkwraps in the same module. Bundled dependencies with versions that don't match those in a shrinkwrap. Deep-in-the-tree shrinkwraps. Shrinkwraps are not advisory. They're a production artifact, so you can't skip them. (Well, you can, but will what you produce work? ¯\_(ツ)_/¯ Skipping them kind of defies the point of having them in the first place.)
You get 95% of deduping for free just by producing a flat tree and flat trees are necessary for Windows installs, so I'd say yes.
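To make the "flat tree" remark concrete, here is a simplified sketch, not npm's actual algorithm. It assumes a hypothetical input shape (`{ name, version, dependencies }` with exact versions already chosen and no cycles) and ignores shrinkwraps, bundled dependencies, semver ranges and peer dependencies. Each dependency is hoisted to the top-level node_modules unless a different version is already visible from the requiring package, in which case it is nested under that package.

```javascript
// Hypothetical input shape: { name, version, dependencies: [ ...same shape ] }
// Assumes exact versions and an acyclic tree.
function layOutFlat(root) {
  // Key: install location ('' is the project root, 'a>b' is a/node_modules/b).
  // Value: Map of name -> version inside that location's node_modules.
  const nodeModules = new Map([['', new Map()]]);
  const queue = [{ pkg: root, path: [] }];

  // Walk up from `path` the way module lookup would; return the first version found.
  const resolve = (path, name) => {
    for (let depth = path.length; depth >= 0; depth--) {
      const mods = nodeModules.get(path.slice(0, depth).join('>'));
      if (mods && mods.has(name)) return mods.get(name);
    }
    return undefined;
  };

  while (queue.length > 0) {
    const { pkg, path } = queue.shift();
    for (const dep of pkg.dependencies ?? []) {
      const found = resolve(path, dep.name);
      if (found === dep.version) continue; // already satisfied: deduped for free
      // Nothing visible: hoist to the root. Conflict: nest under the requirer.
      const target = found === undefined ? [] : path;
      const key = target.join('>');
      if (!nodeModules.has(key)) nodeModules.set(key, new Map());
      nodeModules.get(key).set(dep.name, dep.version);
      queue.push({ pkg: dep, path: [...target, dep.name] });
    }
  }

  // Convert to plain objects for easy inspection.
  return Object.fromEntries(
    [...nodeModules].map(([key, mods]) => [key, Object.fromEntries(mods)])
  );
}
```

For a project where `a` and `b` both need `c@1.0.0` and `d` needs `c@2.0.0`, this places one `c@1.0.0` at the top level and nests `c@2.0.0` under `d`. The cases listed in the comment above (shrinkwraps, bundled dependencies, and their combinations) are exactly what this sketch leaves out.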
👋 It's been an ongoing discussion at least on my team about how to best support stuff like this so it's definitely an interesting suggestion. As far as how node could implement this, my main inclination right now is to mention that it's probably not-a-good-idea to write your own installer, because of how fragile compatibility across the registry can be for stuff like this. It sounds like you're still intending to just call out to npm proper for this, though. For the sake of mentioning near-future things that have the potential to change this discussion, though, here's some stuff about the current projects at npm:
The biggest thing I'd call out is that compatibility is hard, and like @jasnell says, writing an npm-compatible client is a lot of work to get right. In the end, npm-proper (or tools that npm-proper itself uses) is the only way to ensure maximum ecosystem compatibility. Other tools often do a great job, but they necessarily require users to deal with compatibility issues they simply didn't have to before. You can't have "minimal npm install" without all the things rebecca and isaac mentioned. But you can have an npm that doesn't include the capability to publish, login, manage permissions, manually manage the cache, check outdated deps, or update dependencies. I assumed these are the "extra bits" you're talking about. Ignoring shrinkwrap, bundleDeps, etc, are a good way of breaking random packages and are both heavily used in non-registry "production" apps. |
Looking at |
p.s. if this means that I can get the node project to help us out with building these things I would be super thrilled about that. These projects are meant to be easier for the community to participate in, because the npm CLI is so bloody huge it takes a year for anyone to onboard with it. I want to fix that too :< |
To clarify: Is the suggestion, then, that this would be in an alternative build (maybe called something like "minimal"/"production"/"infrastructure"), with the standard build, including npm, being the default? Or would both installers be shipped in the standard build? |
@bengl correct, this would open up the option of producing a build that didn't include all of npm. However, all builds would include Also, we should keep in mind that in the near future we'll probably have a lot more build types than we have now as we start to support additional VMs. |
@mikeal would this build extricate parts of node core itself, too? It's hard to think of what you'd gain from that (or, frankly, what you'd gain from not including the full CLI sources as-is) |
@zkat it opens it up as an option, but it may not end up being something we want to support. In the wild I've heard of people removing the npm binary from their docker images after preparing them, so something like this is already happening whether we produce a build like this or not. |
A minimal production install of Node.js could likely benefit from not having any publish capabilities (because it's unnecessary code in production) and as @mikeal indicates there are scenarios where the npm client is removed. I wouldn't say that's a majority of cases by any stretch of the imagination. |
WOW RUDE Though that makes sense. One thing about this sort of thing is that it's often better to understand the usecase-in-general before tackling a single solution like this. It sounds like "better embedded support" is the one in this case? @jasnell the thing about that is that literally everything other than the installer is a fairly small chunk of npm. I don't know how much smaller right now, but most dependencies and code in npm itself are tied up with installation itself. Yanking those secondary tools out isn't going to help. Publish is a good example: probably the biggest part of the But wait there's more! We've been talking about running lifecycle scripts inside git dependencies, so people can rely on artifact builds from git dependencies (this means that if someone is using a registry dep, then forks the dep's repo with their own mods, and points their local dep at their own fork, they'll be able to rely on the regular build that would be done, rather than having to publish under a different name). If we do that, that means we would be installing a dependency's devDeps, which this proposal, as-is, does not at all take into account. What I'm getting at is you basically have a choice between keeping the bulk of npm (with all the necessary bells and whistles the installer itself needs), or having to put a ton of your own effort into building Yet Another Installer™ that will only be a greenspunned version of the existing installer, and potentially make life harder for users that expect the command to actually work. If deployment is the concern, perhaps it's a better approach to provide users (either through npm, or through node), with a way to bundle/treeshake a single |
Node.js continues to be used in ways we never anticipated. I don't know how long we can continue to produce a single artifact that works for all of these use cases given the expansion we've seen in the last few years. What I would hate to see happen is for these ever widening use cases to start to impact the default experience, which I believe is already happening indirectly. I strongly believe that the most important constituency for Node.js installs is the developer community that builds applications and publishes modules. Catering to this constituency and continuing to grow it is the most important thing we do because these are the people that continue to build and mature the Node.js ecosystem. Thinking about how we might be able to produce builds specific to other use cases is a good way to reduce the pressure we put on the default build and give us more of an opportunity to grow the list of developer niceties that Node.js ships with by default. As we've grown I've been in more and more conversations with people who don't see why we can't just ship, by default, without a publish command, or without a debugger. There are a lot of people out there that think developers should just jump through some extra hoops in order to participate at that level. As we grow this sentiment will also grow if we don't produce something that addresses the concerns of these other use cases. It's encouraging to see that npm is already breaking off the components that would allow us to cater to this "install only" production use case without actually writing the logic all over again. I don't think it's very valuable for us to re-produce that logic and there's a big advantage to standardizing it. But make no mistake, developers need npm. If we produce a build without it, that's not a build for developers to use directly. |
Isn't this an awful large amount of logic? Also, regardless of whether we say we don't want to write a replacement for npm there will undoubtedly be plenty of people who would want it to become that and so the scope creep here is immense. Edit: Mikeal does make a very good point above, though. |
Perhaps we can start with a smaller, more general problem set. (Smaller, not small.) To draft With that baseline established, we can much better assess Node.js' options and the registry clients also can concisely indicate which versions of Node.js (and perhaps even JavaScript) they are compatible with. |
What would be the value of having the installer in the binary (i.e. |
@dshaw The absolute minimal and smallest that you can make this general problem, to satisfy the installer contract, is this: Given a folder with a Since shrinkwraps can be encountered at any point along the package traversal, and dependencies can be bundled in packages, satisfying the user expectation is extremely non-trivial. Also, All, Can we drive towards what y'all would actually want to see from this? Because I just have so many questions.
I've already said I think it's a bad idea (or at least, extremely wasteful and expensive), but that's largely because I know very intimately the significant cost involved in doing this, and I anticipate that it's a bad use of Node.js project resources. Maybe I'm wrong! But until someone can clearly answer at least the first two questions in that list, there's really nothing to discuss here. Most of the conversation in this thread so far is jumping to try to answer (7). That is very premature. |
@isaacs don't forget both ends of the It also leaves the question open of what sorts of errors this would provide? Does it check for invalid deps? missing peerDeps? Private packages? Private git repos? Like, do y'all actually get what you're asking folks implementing this to get into? This is literally my full time job and I barely have time for it, and I'm working primarily on stuff that would be directly used by this proposal. I am also terrified of breaking compatibility with 450k+ packages. Most people who use our stuff, we never even talk to. We'll just ruin their day. Because we wanted to save a few kb of data (yes, it's probably less than a couple hundred kb that would be saved by yanking out everything not installer-related). You're better off making |
For one thing, it doesn't take an argument for the package name :) I think the primary motivator is simple: reduce the surface area to what is needed for a production use case. I think the size of all of npm is a bit of a red herring; it's mostly about reducing what can be done to what needs to be done. In terms of security this is a good practice and not one that I can easily argue against. It's a pretty natural inclination for people building large production systems to reduce what is available in those systems to the necessities. Is it better? No, it's just less.
I think we're slightly over-estimating the impact of this feature. It's new behavior and has much less impact than existing behavior millions of people are dependent on already like |
@mikeal What I'm getting at is I think you're far underestimating the "additional maintenance burden", and the ongoing cost of potentially spreading registry incompatibilities if the main clients don't sync up well enough or introduce separate bugs. And doing all this just because of a small fraction of the download of node. There's not even any guarantee that your end-product of a full re-implementation would actually be smaller, code-wise. And your sugar syntax is trivially achievable right now. I'm trying to understand what the difference between "several kloc of installer code" and "several kloc of installer code with a couple hundred unused loc" really is, to you. And if it's the interface you care about, why a bash script that's literally just Like, you now have two teams of multiple people being paid full time and receiving further community support who have actual expertise in doing it, who are each maintaining two different registry clients, and there's already a bunch of excellent effort that's been put into making those two alone be compatible. Where is this third team coming from? And what's even the point at that point? |
By most estimates infrastructure accounts for far more of our downloads than developers do. |
@mikeal I literally just deleted everything not having to do with the installer from the published version of [email protected]:
The latter includes removal of all docs, AUTHORS, manpages, build scripts, all dependencies and subcommands that, off the top of my head, are not critical for the installer only to work. Only about half of that drop came from removing code. The installer is the vast majority of the code for the CLI as-is, and that's what you're talking about rewriting. npm isn't some fat tool that has literally everything you could ever want. It is, for the most part, just an installer with a couple of allowances. You'd get a bigger boost by taking the current codebase and minifying it tbh. Have you considered that? I mean, are we trying to conserve disk space, or achieve some standard of purity "untouched" by "unneeded" things?
I meant the total distribution size. Full-fat npm currently uses 3.2M tarred. The |
Let's keep in mind that the original intent of this discussion is to discuss a hypothetical option we could take, what the scope of that action would need to be, and whether it would make sense to keep exploring it. So far we've had a great discussion from a small group of people whose points of view are pretty well established in this space. Let's make sure we don't rabbit hole too much right out the gate and end up discouraging others from weighing in also. I'd really like to get input from the larger group of @nodejs/collaborators on this. |
I'm just going to point out that if I wanted a stripped-down "production only" node, I wouldn't want install stuff either because I would be supplying the tree from an outside build process. imo if you're running an install on a prod cloud box, you're doing it wrong. I would consider this cruft. |
@jasnell It's worth noting that three of the people with the most experience in this particular aspect have given their informed technical decisions here. I feel "point of view" is a bit dismissive of how familiar some of us are with the scope and technical concerns of the solution that was proposed. I'm not speaking here from some political leaning about the purity and supremacy of https://github.com/npm/npm. I'm speaking as someone who has internalized just how big the scope of this really is. I am super interested in hearing more from folks like @jfhbrook about what their deployment processes and woes are. There is, in my opinion, a really interesting problem space there that hasn't quite been tackled yet. And as I said above, I'm all for looking for (and helping with!) solutions that solve the distribution issues users have run into. And that does include having a standalone installer. I'm just trying to make a strong point that, from my perspective, the benefits for the stated use-case are minimal, and the cost is basically astronomical, if the solution is "write our own thing again". |
I mean that while we are locked to using a dependency, due to the sheer complexity of attempting to support their backwards compat, like
How clients place things from |
To an extent, yes. But documented better and more completely so that other implementations can produce a valid result on disk that node can use successfully without having to reverse engineer what the npm client is doing.
This entire paragraph about things that annoy you is off topic and is not constructive to the conversation.
Ok. Good to hear that it's on the radar. That said, this is definitely something that should not be done unilaterally then thrown over the wall at some point. There are many approaches to this and several paths need to be explored collaboratively and openly within the ecosystem to determine which is best. I'll be very happy when something practical emerges here. (I will also say that I share @bmeck's concerns around using GPG, but that's a different conversation for a different thread.)
No one has suggested such a thing. What has been suggested is a spec that more completely and strictly describes the minimal footprint of an installed module so that tools can be implemented to meet those requirements without being required to reverse engineer the npm client. That's quite different than nailing "down exactly what npm et al do today".
Assuming you mean |
Yeah, but that paragraph also made what I think is a fair criticism of the node foundation, which I'll also quote because I think it's important that you at least meditate on this offline:
It would be a mistake to ignore that merely because it's "off topic".
Can we please at least call this something else, then? OP was very clearly talking about a very different feature. |
@jasnell as @jfhbrook says, this thread is deviating for a very different feature. I think you're pretty much right that this feature request, as stated, can be safely considered dead in the water. Here's some other stuff that got brought up in the thread, some of which is good stuff to keep thinking about:
To summarize, I think this feature request should be closed because:
I take an approach where the standard needs to be made and implementation feedback needs to come in at the same time; that's why I have been spending time doing some work on webpackage. Moving all work into the ecosystem doesn't allow people to express core use cases. Often, when we create implementations of features, we are only thinking about our own internal use cases. That is why developing a specification and an implementation together is important, instead of retrofitting a design to work for a use case.
I don't fully agree on this conclusion that it has fallen flat. I do agree that if it is seen as an
Having been through the nightmare of talking about ESM, I can fairly safely say that when developing pretty much anything involving modules, we need to have a central place to talk. As stated in the original issue text by @mikeal :
This issue is open to change scope and attempt to reach an agreement about what should be done.
As I have stated earlier, I personally have no desire for this to do package management. I am still hoping to talk about the scope and future concerns as listed in above comments. |
This thread is full of "solutions" without any problem statements. Many of which aren't obviously related to the original text. For example, package signing and deployment both seem way off topic if we're just discussing @Mikael's feature request. As best as I can tell they came up because people were trying to backfill what problem the feature request was supposed to be solving and then proposed additional new solutions. While I'm sure the intent wasn't to derail the issue, that's all this discussion is doing. This thread would do well to be started over with a problem summary, rather than a feature request, and then proposals that address that problem summary can actually be weighed on how well they solve that problem. Without that, I'm unlikely to involve myself any more, it being a poor use of any of our time. |
to throw my hat in the ring i'd like to take a few steps back. it sounds like the problem here is defining a problem. @mikeal tries to define the problem like this:
in order to even think about what the solution should be, we should be asking the community to respond with statements that say "npm is not the ideal tool for the job of running on my infrastructure because X, it would be great if it could do Y, or not do Z. maybe a new thing A that just does B would also work. here's an example of my workflow where this is a pain point:" i would like to suggest that we open up an issue that asks "what are your pain points" and see what we get. like many open source issues, this one is distracted by the suggestion of a potential solution and an insistence on the owner of such a solution (Node Core). this is a tricky question to ask because it is such a large and shared community but we can only serve the community's interests if we talk to them. the data that @mikeal is basing this thread on is completely elided. i have no idea where this idea is coming from! i've been around for a second and no one has come close to asking for something like this, so i'm definitely intrigued about this idea's origin. let's do some research and talk to our users before we have a massive thread of basically only npm and node core developers. without that data we'll fail to build anything that helps anyone. talking to users will help us know what problems we are trying to solve so that we can best decide on what a solution should be and who should own it. shameless plug: we have a Community Committee who might be an AMAZING resource for soliciting and guiding community feedback. |
OMG JINX @iarna 😆 😅 |
Maybe we can just move this to https://github.com/nodejs/node-eps 😏 |
@zkat @ashleygwilliams @iarna as stated in the original issue text.
EPs are supposed to have a fully fleshed out feature in mind. This issue was created with the intention to discuss scope. |
So, yes, let's do that. I'm reasonably sure that by now we know what is out of scope, let's talk about what is. Given:
I have a few answers of my own in mind for these but I'd like to see what direction your mind is heading on it. |
(fwiw my thumbs up of the eps move was a bit of sarcasm, which i am sorry for) to reiterate: we can't scope a feature if we don't know the problem it solves and we don't have any evidence that it is something that users actually want. worse: if we make the feature we have no way to tell if it is successful since we don't know what failure looks like. EDIT: also if we don't know what problem this solves how do we explain it to users so they can make informed decisions about using it? |
Without registry resolution, why would anyone bother with this versus a simple wget + untar in a build/deploy script? What I think might actually have some value for deployment is a As several of the npm folks have expressed though: the problem trying to be solved here seems rather undefined so far. Can anyone share how they'd actually use this feature and not just what it should look like? |
fwiw, if node were going to have a "help pull down prod artifacts" command (please don't call this --install) that would currently look like But if it's literally the same as a curl command, I'd just as soon use curl, and I think that's true of most people running node in production. I'd like to +1 @ashleygwilliams's suggestion that you do user research on this. I'm basically volunteering to fill out surveys about deploying node right now! Uhh, but I honestly don't know what problem @jasnell's proposal is trying to solve. I don't mean that as a dig. I just never had a problem that could be solved with a glorified |
Lol.. I haven't made any proposal. I've asked questions so I can understand what others are thinking. |
Also: I'd find run-scripts support in core a little more useful. A lot of PaaS's use 'npm start' as the command for starting the node app--or at least nodejitsu did. I think heroku as well? So there's some precedent for using run-scripts as the command for kicking off their services. I've usually seen people implement this directly in systemd or upstart in practice, since it kinda sucks when the process your init system is controlling isn't actually the node app. There are some parallels here with foreman--or at least, a way you can use foreman. I've been doing python for my day job lately, and we use this tool called 'honcho' (a foreman port) to source an env file before running some command. I feel like npm fills a similar niche, in that it sets up an environment. A tool that could reliably source an env file (good luck parsing it, both docker and honcho--edit: AND systemd-- behave differently than bash, and you can't use bash because the export keyword isn't in there) and set up npm-like environment variables (NODE_ENV, add ./node_modules/.bin to path) and then exec a command... would be useful. But also a pretty big project that's going to be hilariously broken in subtle ways, not one I'm sure I want integrated with my runtime. Again, in practice people use templates to generate systemd configs based on the env file and some configured start command (which may or may not be inside the package.json). I feel like this isn't worth trying to implement (in core) because it's either integrated with chef/puppet/ansible/etc, the person doing it has very specific opinions about how this thing should look, or both. |
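For readers who want to picture the "source an env file, add ./node_modules/.bin to the path, then run a command" tool described above, here is a deliberately naive sketch. All function names (`parseEnvFile`, `buildEnv`, `run`) are hypothetical. The parser only handles plain KEY=VALUE lines and `#` comments, which is exactly the part the comment warns differs between bash, docker, honcho and systemd. It also spawns a child process rather than replacing the current one, so it does not address the init-system concern, and it treats PATH as case-sensitive, which is an assumption that does not hold on Windows.

```javascript
const path = require('node:path');
const { spawnSync } = require('node:child_process');

// Naive env-file parser: KEY=VALUE per line, blank lines and '#' comments skipped.
// No quoting, escaping, "export" keyword or variable expansion is handled.
function parseEnvFile(text) {
  const env = {};
  for (const line of text.split(/\r?\n/)) {
    const trimmed = line.trim();
    if (trimmed === '' || trimmed.startsWith('#')) continue;
    const eq = trimmed.indexOf('=');
    if (eq <= 0) continue; // no key, or no '=' at all
    env[trimmed.slice(0, eq).trim()] = trimmed.slice(eq + 1).trim();
  }
  return env;
}

// Merge the file's variables over a base environment and put the project's
// node_modules/.bin directory first on PATH.
function buildEnv(projectDir, fileEnv, baseEnv = process.env) {
  const env = { ...baseEnv, ...fileEnv };
  const bin = path.join(projectDir, 'node_modules', '.bin');
  env.PATH = [bin, env.PATH].filter(Boolean).join(path.delimiter);
  return env;
}

// Run a command in the project directory with the prepared environment.
function run(command, args, projectDir, envText) {
  const env = buildEnv(projectDir, parseEnvFile(envText));
  return spawnSync(command, args, { cwd: projectDir, env, stdio: 'inherit' });
}
```

Something like `run('node', ['server.js'], '/srv/app', envFileContents)` would be the intended use, where the paths are placeholders. The gaps listed in the lead-in are where the "hilariously broken in subtle ways" risk comes from.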
Maybe "proposal" was the wrong word. you can call them "things I imagined" if you like. But what I'm referring to is things like:
I see this as one of the more concrete suggestions in this thread. So, I guess I was charitable? But my understanding is that this suggests a command that maybe has a subset of curl's capabilities and, in today's use cases, some of tar's capabilities and basically just dumps the contents of the module (which I hope has all its deps bundled--this almost never happens nobody afaik does this) into node_modules. Which again, is honestly just weird. I would never want to do that, and I don't see who this caters to. |
No need to be charitable at all. At this point I'm wanting to tease out more details of what @bmeck may have in mind because if it's limited to just what I am imagining, I'm not seeing a lot of value with having it built in to core. Others have much better imaginations than I, however. |
That is not what I noted. What I noted was that @bmeck has an understanding of this feature that seems to be very different from everyone else's. I don't see "several individuals" suggesting this. What I do see is clear consensus that "npm lite in core" is a bad idea, and the degree of badness that each person sees in that idea seems closely correlated to their degree of understanding of the problem. What I also see is a lot of assumption that That has led me (and a few others) to conclude, (a) "npm lite in core" is an implementation detail no one wants, addressing a problem that is not well specified, and (b) that's what most people seem to assume So, it is clear that the OP feature here is either ill-conceived or spelled wrong. I suggest the following:
Node's module loading behavior is well documented, and that specification is properly owned by the CTC. npm's dependency contract could certainly be more concisely articulated (right now it exists, but is spread out across a few different places). |
Could not agree more. (My not-so-secret agenda in driving towards userland experimentation is that this usually means core doesn't have to add anything ;) |
I don't think what @bmeck is talking about is necessarily the same but I do think that it ends up covering some of the package install points here that would be more suited for us to solve, on top of likely solving other use-cases. (That is to say, I don't think he is the only one.) |
I'd really love to have a discussion around self-extracting and/or egg-like archives (that can be run with node via a shebang or @Fishrock123 what package install points here do you think are suited for node core to solve? |
Is it worth revisiting this discussion considering the current state of https://mobile.twitter.com/palmerj3/status/1141797296004325376
I feel it would be positive step forward to have a |
You file issues on the discourse, which has been the case for awhile. Separately, a lack of commits doesn't necessarily indicate any problem. |
I agree that a lack of changes to a codebase is an insufficient metric to warrant a conversation of change, but the relationship between an application and transmission is continuously worthy of discussion. |
I'm going to try, as best I can, to distill more than a year of conversations about this topic.
This is a touchy subject. There have been numerous threads with various people advocating significant changes to npm or to replace it entirely. This is neither of those things.
npm by default.
With that out of the way, and the frame of debate set within those boundaries, I think we can have a productive conversation.
As it has matured, npm has become a large and significant tool for software development. It includes features for multiple development workflows and optimizes itself for developer ergonomics. I don't think we could ask for a better tool for developers.
The problem is, not every Node.js install is used by a developer. Many installs happen in infrastructure. These installs run an application and are never touched by anything but infrastructure automation. Yet, these installs still include npm and, in fact, require npm in many cases because it is the best mechanism we have for installing the dependencies the application needs.
node --install
npm install.
npm install --production.
Because the use cases for this are much more narrow than npm, you can see a future in which additional features are added that are in high demand by production users but make the developer ergonomics more difficult (multiple registry endpoints, for instance).
I'd like to use this thread to reach a consensus about the scope of this feature, potential pitfalls, and whether or not this is something we agree should be added. From there I can work on a proper Enhancement Proposal.