Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Severe performance degradation after upgrading to 2.2.0 #63

Closed
nemeth-z opened this issue Jan 21, 2020 · 29 comments
Closed

Severe performance degradation after upgrading to 2.2.0 #63

nemeth-z opened this issue Jan 21, 2020 · 29 comments

Comments

@nemeth-z
Copy link

I am in the process of upgrading a project to Gradle 6 and I was happy to find this maintained fork. After upgrading and switching the plugin, our Angular based module build time is increased by ~10 minutes. After analyzing what is going on, it turned out 95% of the build time is spent on task output overlap detection (scanning files in node_modules). There is a recent open Gradle issue about it (gradle/gradle#11792). However this detection was not changed (afaik) with the new release.

After started checking the changes in this fork I noticed the node_modules directory is handled differently in the npmInstall task. It went from an output directory to output files, which could be the reason why the overlap detection went haywire. Additionally a file tree is used which according to the docs also disables caching.

I am not sure yet if this the only or even the real issue though.

@rkrisztian
Copy link

rkrisztian commented Jan 21, 2020

I'm noticing +5 mins of extra time during different NpmTasks like the ones executing Karma tests and ng lint.

@bsautel
Copy link
Contributor

bsautel commented Jan 21, 2020

Are you a Windows user?

The up-to-date detection is slow on Windows, mainly the first time it runs.

You are right, in this fork of the original plugin, the inputs and outputs of the tasks were declared so that they run only when necessary and can be cached if the build cache is enabled.

The npmInstall task creates or adds some files and directory to the node_modules directory. For this reason, it is legitimate that this directory and its files and subdirectories are declared as output. We already discussed about that in issue #50 (we did not take any decision for now). This is the same question regarding the yarn task.

As a workaround, you can ignore the content of the node_module directory, this will solve your issue. But if for some reason, something in the node_modules directory is modified or removed, the npmInstall task will not run whereas it should.

npmInstall {
  nodeModulesOutputFilter = { it.exclude("**") }
}

I am not sure about the ** pattern, I do not have enough time right now to test it. Note that the it variable contains a Gradle ConfigurableFileTree whose root directory is the node_modules directory.

Hope my message will help you!

@rkrisztian
Copy link

rkrisztian commented Jan 21, 2020

In my case, I was on Linux, and the slowness was perceived almost every time.

I did not try your workaround yet, as for me this issue became less important. However, it looks to me that the suggested workaround would break correctness. :(

@bsautel
Copy link
Contributor

bsautel commented Jan 21, 2020

@rkrisztian, do you know which tasks have been slowed down since you use this fork?

I was talking about the fact the node_modules directory is now an @OutputFiles in the npmInstall task. You may not be affected by the same issue.

I assume you do not use the npmInstall task to run commands such as ng lint. If you use NodeTask, NpmTask or NpxTask to do that (NpxTask is the recommended way), the node_modules is not declared as an input or an output by default.

@bsautel
Copy link
Contributor

bsautel commented Jan 21, 2020

@nemeth-z I did not think about that but you are right. Using the @OutputFiles annotation with a FileTree disables caching. That's logical, in a FilleTree some files can be ignored and will not be cached. Using cache would not lead to the exact same output directory.

We changed this recently to fix issue #38.

We do not have any build cache tests for now, this is the reason why we did not notice that the cache no longer works. We should add some tests to ensure the cache works as expected.

I will try to find an issue to get both issue #38 and caching work.

@rkrisztian
Copy link

rkrisztian commented Jan 21, 2020

In my case, my NpmTask calling ng lint depends on npmInstall. Once npmInstall is finished, the build gets stuck for ~5 mins. So I assume the problem is with npmInstall, but I didn't debug it yet... (My test task works similarly.)

@deepy
Copy link
Member

deepy commented Jan 21, 2020

@rkrisztian are you able to share a gradle buildscan or the result of --profile?

@rkrisztian
Copy link

@deepy I'll sketch up a test project for you, please give me some time.

@rkrisztian
Copy link

See test-node-perf.zip.

$ gw npmRunLint
Using gradle at '/home/rkrisztian/projects/testProjects/test-node-perf/gradlew' to run buildfile '/home/rkrisztian/projects/testProjects/test-node-perf/build.gradle':

Configuration on demand is an incubating feature.
> Task :test1:nodeSetup UP-TO-DATE
> Task :test2:nodeSetup UP-TO-DATE
> Task :test1:npmSetup UP-TO-DATE
> Task :test2:npmSetup UP-TO-DATE
> Task :test1:npmInstall UP-TO-DATE
> Task :test2:npmInstall UP-TO-DATE

> Task :test1:npmRunLint

> [email protected] lint /home/rkrisztian/projects/testProjects/test-node-perf/test1
> ng lint

Linting "test1"...
All files pass linting.

> Task :test2:npmRunLint

> [email protected] lint /home/rkrisztian/projects/testProjects/test-node-perf/test2
> ng lint

Linting "test2"...
All files pass linting.

BUILD SUCCESSFUL in 1m 34s
8 actionable tasks: 2 executed, 6 up-to-date

$ gw npmRunLint --no-parallel
Using gradle at '/home/rkrisztian/projects/testProjects/test-node-perf/gradlew' to run buildfile '/home/rkrisztian/projects/testProjects/test-node-perf/build.gradle':

Configuration on demand is an incubating feature.
> Task :test1:nodeSetup UP-TO-DATE
> Task :test1:npmSetup UP-TO-DATE
> Task :test1:npmInstall UP-TO-DATE

> Task :test1:npmRunLint

> [email protected] lint /home/rkrisztian/projects/testProjects/test-node-perf/test1
> ng lint

Linting "test1"...
All files pass linting.

> Task :test2:nodeSetup UP-TO-DATE
> Task :test2:npmSetup UP-TO-DATE
> Task :test2:npmInstall UP-TO-DATE

> Task :test2:npmRunLint

> [email protected] lint /home/rkrisztian/projects/testProjects/test-node-perf/test2
> ng lint

Linting "test2"...
All files pass linting.

BUILD SUCCESSFUL in 15s
8 actionable tasks: 2 executed, 6 up-to-date

This is exactly what I'm experiencing in a real project but with the mentioned much higher build times when building the project in parallel. However, I am not sure if my problem is related to what was originally reported.

@bsautel
Copy link
Contributor

bsautel commented Jan 22, 2020

Thanks @rkrisztian for this project that reproduces this issue. I now understand what's happening.

The main project I am working on contains two Angular applications (similar but much bigger than these ones) and I have this severe performance degradation.

In fact I am not using the plugin's npmInstall task, I use npm ci via a custom task in which the node_modules directory is declared as @OutputDirectory.

The performance drop you are facing (and probably @nemeth-z also) seems to be due to the #38 fix. In 8f4d0bb I replaced the @OutputDirectory annotation by the @OutputFiles. It sounds like this is the cause of this performance drop. It also broke caching.

I tried to build your project with the version 2.1.1 of this plugin and it is fast (at least on Linux). I did not test it on Windows and I assume it will not be fast because it sounds like using @OutputDirectory is slow on Windows.

As a workaround, for now, you can either use the 2.1.1 version of this plugin or use a custom task to replace the npmInstall one. On my side I am going to look for a solution which is fast and fixes #38.

Thanks for your help!

@nemeth-z
Copy link
Author

I tried both suggestion (2.1.1 and filetree filter) and both solved the performance issue on Win and Linux as well. Thanks for the hints, given that only the 2.2 release notes mentioned Gradle 6 compatibility, I have not considered downgrading.

@rkrisztian
Copy link

rkrisztian commented Jan 22, 2020

Sorry for asking, but do you think it's really a good idea to add up-to-date checks and cacheability to this npmInstall task? (Or whatever would run npm ci for that matter.) Because, NPM has its own ways for checking up-to-dateness, so doing the same in Gradle feels like reinventing the same functionality of NPM in Gradle. Which is hard to do perfectly, and just as efficiently. Modules like webdriver-manager tend to add files inside node_modules, e.g. as you execute ng e2e, making build caching feature not work anymore in Gradle (unless you hunt down each and every such difference and add it as an exclusion). Also, caching the downloaded dependencies is also weird, as NPM already caches stuff in: ~/.npm. So we're duplicating a lot of files too, possibly of huge sizes. (And a generic rule of thumb is that rather CPU-heavy tasks should be cached, not I/O-heavy ones.)

@deepy
Copy link
Member

deepy commented Jan 23, 2020

There's no cacheability being added to npmInstall, it's just for up-to-date checks and it's not a perfect situation

@bsautel
Copy link
Contributor

bsautel commented Jan 24, 2020

Thanks @rkrisztian for this relevant question.

  1. Regarding the cacheability, you are right, I am not sure this makes sense. You are right, npm already has a cache. Moreover, some npm packages such as node-sass contain some native code. After installing it on Linux, the node_modules/node-sass/vendor directory contains:
└── linux-x64-72
    └── binding.node

If we want to make it cacheable, we have to ensure that the cache is compatible with the system, that could be very tricky.

So you are right, we probably don't want this task to be cacheable.

  1. Regarding, up-to-date checking, I tried to fix this issue (Severe performance degradation after upgrading to 2.2.0 #63) when you do not need to ignore some files inside the node_modules directory (issue [2.1.0] Up-to-date checks are worse than before #38). My fix seems to be good but I need to adjust some tests.

I did a small experiment based on your project. As you suggested, I run ./gradlew npmRunLint --no-parallel twice in a row (and I only care about the second execution because we are talking about up-to-date checking). This runs the nodeSetup, npmSetup, npmInstall and npmRunLint tasks on projects test1 and test2.

When I force npmInstall to run by removing the output directory and replacing it by outputs.upToDateWhen { false }, I rely on npm's up-to-date checking ability. The whole Gradle command takes about 15 seconds on my Linux laptop.

When using the plugin containing my fix, the npmInstall task does not run because it is already up-to-date. The whole Gradle command takes about 1 second.

On Linux and with this size of project, it appears that Gradle's up-to-date checking is much faster than npm's one. As far as I know, on Windows, Gradle's up-to-date checking is much slower, that would be interesting to make this same experiment.

@bsautel
Copy link
Contributor

bsautel commented Jan 25, 2020

I did the same experiment on a Windows VM on my laptop.
When using Gradle's up-to-date checking: about 3 seconds.
When forcing npm install to run: about 30 seconds.

It appears that Gradle's up-to-date mechanism (using @OutputDir) is much faster than npm's one also on Windows (at least for that type of npm project).

I think that we should let Gradle's up-to-date checking enabled for this task.

@rkrisztian
Copy link

rkrisztian commented Jan 25, 2020

I see now, thanks for the explanations, @deepy & @bsautel.

@bsautel bsautel changed the title Severe performance degradation after upgrading from the original plugin Severe performance degradation after upgrading to 2.2.0 Jan 25, 2020
bsautel added a commit that referenced this issue Jan 25, 2020
…te detection much slower. We now use @OutputDirectory unless a node_modules output filter is defined.
@bsautel
Copy link
Contributor

bsautel commented Jan 25, 2020

I created a 2.x branch in which I fixed this severe performance degradation.

Could one of you @rkrisztian or @nemeth-z test it please in real conditions before we publish it?

Doing that is very simple:

  1. Clone the source code repository and check the 2.x branch out.
  2. Edit the settings.gradle file of the project in which you use the plugin and add this line:
includeBuild "../path/to/gradle-node-plugin" // Don't forget to adjust the path to the gradle-node-plugin repository directory

Gradle will understand that this is the source code of the plugin and will use it instead of the jar downloaded from the repository. As it is a Gradle project, it knows how to compile it.
3. Remove any usage of nodeModulesOutputFilter as a workaround of this issue
4. Build your project as usual.
5. Is it faster now?

bsautel added a commit that referenced this issue Jan 25, 2020
…te detection much slower. We now use @OutputDirectory unless a node_modules output filter is defined.
@nemeth-z
Copy link
Author

I tested with the branch and it works correctly. Checked the commit and I am curious though, what is the use case for using the tree filter (and falling back to output files)?

@bsautel
Copy link
Contributor

bsautel commented Jan 27, 2020

Ok, nice! Thanks @nemeth-z!

The file tree filter is to fix issue #38. Some npm commands (the Ember.js build in this issue) modify some files of the installed packages in node_modules. I don't think that doing that is a very good idea, but in this case the npmInstall task is never up-to-date since Gradle notices that the output directory changed and it decides to rerun it.

Indeed, when using @OutputDirectory, Gradle ignores new files but reruns the task when one file is modified or deleted.

To fix issue #38, we need a way to tell Gradle to ignore some files. This is the reason why use replaced the @OutputDirectory by a @OutputFiles file tree (on which we can exclude some files). But this is much slower and in most cases we do not need to be able to filter it.

@nemeth-z
Copy link
Author

I see. Thanks for the quick fix and explanation.
When is the next release planned by the way? I just realized there was a fix for 2.2 solving the missing input/output annotation on the execrunner and it breaks the plugin verification (I have my own wrapper plugin).

@bsautel
Copy link
Contributor

bsautel commented Jan 27, 2020

I think we are ready to release a 2.2.1 version including this fix. @deepy, what do you think about that?

I am afraid I did not understand what you mean @nemeth-z. Could you give me more details please?

@deepy
Copy link
Member

deepy commented Jan 27, 2020

We can probably release 2.2.1, but having moved the output into project.afterEvaluate we're now one step further from lazy tasks on master :-(

Although I guess we don't really need that solved before releasing 3.x

@nemeth-z
Copy link
Author

@bsautel I have a plugin that extends the Gradle node plugin with some additional configuration and custom npm tasks. With Gradle 6 and node plugin 2.1.1, the built in verification breaks as some of the properties are not annotated (extending from npmTask that has violations). These problems are already fixed in 2.2 so if possible I would like to use a version containing both fixes.

As for afterEval, have you though about a configuration tasks? Basically moving the logic from AfterEval to a task that is depended by the install tasks.

@bsautel
Copy link
Contributor

bsautel commented Jan 28, 2020

Ok, I now see what you mean. The version we are ready to release is the equivalent of the 2.2.0 plus the fix of this issue. We did not change anything else. In this coming version, all the properties are annotated and the validatePlugins task will report no issues.

I talked about the 2.1.1 version above as a temporary workaround until we release a fix because the cause of this performance drop is a change in the 2.2.0 version.

@deepy and @nemeth-z, that's right, one more afterEval 😟. I think it is not an issue for the 2.x branch, but that would be nice we find another solution in the 3.x branch. For now, I don't know think about any other way of doing that. @nemeth-z, could you explain a little more your idea based on configuration tasks? This is maybe a Gradle concept I don't know.

@nemeth-z
Copy link
Author

To my surprise I no longer be able to find a reference to this approach (which makes me wonder whether it is idiomatic). The idea is roughly:

task npmInstallTaskConfig {
  doLast {
    // output setup code, same as in the current afterEval
  }
}
npmInstall.dependsOn ...

This way task configuration is pushed to execution time to the latest possible moment, with the added benefit of only doing it when npmInstall is actually executed.

@deepy
Copy link
Member

deepy commented Jan 28, 2020

I think it still is idiomatic but mostly got replaced by Providers and Properties

@nemeth-z
Copy link
Author

@bsautel, @deepy, about the hot-fix release: what is the plan, when we can expect it?

@deepy
Copy link
Member

deepy commented Jan 31, 2020

My bad, I misunderstood your previous comment about versions, I'll release it in a bit

@deepy
Copy link
Member

deepy commented Jan 31, 2020

@nemeth-z 2.2.1 is available on the plugin portal now, sorry about the delay!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants