Hang on Windows after "Adding to the cache ..." #11494

Open
2 of 5 tasks
kevinoid opened this issue Mar 17, 2023 · 12 comments
Labels
bug Something isn't working external

Comments

@kevinoid

Description:
In one of my repositories, actions/setup-node hangs after "Adding to the cache ..." until the job times out.

Action version:
v3

Platform:

  • [ ] Ubuntu
  • [ ] macOS
  • [x] Windows

Runner type:

  • [x] Hosted
  • [ ] Self-hosted

Tools version:
I've observed the error with node-version: '14.18' and node-version: '>=17.1'.

Repro steps:

I'm observing the issue in my eslint-config-kevinoid repository. I've created a minimal workflow that reproduces the issue; it produced this failing workflow run.

Minimal Workflow Issue Reproduction YAML
name: Failing Workflow
on:
  push: {}
  workflow_dispatch: {}
jobs:
  test:
    name: Node ${{ matrix.node }} ${{ matrix.arch }} on ${{ matrix.os }}
    timeout-minutes: 5
    runs-on: ${{ matrix.os }}
    strategy:
      matrix:
        arch:
        - x64
        os:
        - windows-latest
        node:
        - '14.18'
    steps:
    - uses: actions/checkout@v3
    - name: Set up Node.js ${{ matrix.node }}
      uses: actions/setup-node@v3
      with:
        node-version: ${{ matrix.node }}
        architecture: ${{ matrix.arch }}
        check-latest: ${{ matrix.node == '*' }}

Note that the 5-minute timeout in the reproduction was added arbitrarily to make the issue easier to test and demonstrate. In the original workflow, the actions/setup-node@v3 step failed after 6 hours.

Also note that the issue does not occur if the actions/checkout@v3 step is removed.

Expected behavior:
The actions/setup-node@v3 step would complete in a reasonable amount of time.

Actual behavior:
The actions/setup-node@v3 step does not complete within 5 minutes (or 6 hours).

@kevinoid kevinoid added bug Something isn't working needs triage labels Mar 17, 2023
@dmitry-shibanov
Contributor

Hello @kevinoid. Thank you for your report. We'll investigate the issue.

@kevinoid
Author

kevinoid commented Apr 5, 2023

@dmitry-shibanov Were you able to investigate the issue? Is there anything else that I can do to assist? It appears that builds are still hanging in "Adding to the cache ...".

@panticmilos panticmilos self-assigned this Apr 12, 2023
@dusan-trickovic

Hello, @kevinoid! I'm sorry for the late response. I just wanted to give you a little ping to see if your issue has been resolved in the meantime. :)

@kevinoid
Author

kevinoid commented May 5, 2023

Hi @dusan-trickovic, Thanks for checking in! Nope, it's still not resolved. Is there anything I can do to help investigate?

@dusan-trickovic

dusan-trickovic commented May 5, 2023

Understood. I will investigate it and reach out to you again when I have a solution or if I need some more clarification (or at least when I have some updates / suggestions). Thank you very much for your cooperation! :)

@kevinoid
Author

kevinoid commented May 7, 2023

I've determined the cause of the issue: the invocation of C:\npm\prefix\yarn.cmd --version in printEnvDetailsAndSetOutput() hangs because the repository contains a file named node.js. The .cmd shim that recent versions of npm generate for yarn has a regression (npm/cmd-shim#64, npm/cmd-shim#71) that causes it to invoke node.js instead of node.exe.
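
To illustrate the lookup behind this, here is a minimal TypeScript sketch of how cmd.exe resolves an extensionless command name. It assumes that the shim ends up asking for plain `node`, that the current directory is searched before the PATH entries, and that the PATHEXT extensions are tried in order within each directory. The function `resolveCommand`, the directory layout, and the shortened PATHEXT list are hypothetical and only model the behavior; this is not code from setup-node or cmd-shim.

```typescript
// Simplified model of command lookup in cmd.exe (an illustration, not real shim code).
// Assumption: the current directory is searched first, then each PATH entry,
// and within a directory the PATHEXT extensions are tried in order.

type DirListing = Record<string, string[]>; // directory -> file names it contains

function resolveCommand(
  name: string,
  cwd: string,
  pathDirs: string[],
  pathExt: string[],
  files: DirListing,
): string | undefined {
  for (const dir of [cwd, ...pathDirs]) {
    // File names on Windows are matched case-insensitively.
    const present = new Set((files[dir] ?? []).map((f) => f.toLowerCase()));
    for (const ext of pathExt) {
      const candidate = (name + ext).toLowerCase();
      if (present.has(candidate)) {
        return `${dir}\\${candidate}`;
      }
    }
  }
  return undefined;
}

// Hypothetical layout: a checked-out repository that contains node.js,
// and a Node.js installation directory on PATH.
const files: DirListing = {
  "D:\\repo": ["index.js", "node.js", "package.json"],
  "C:\\nodejs": ["node.exe", "npm.cmd"],
};
// Shortened version of the default PATHEXT list.
const defaultPathExt = [".COM", ".EXE", ".BAT", ".CMD", ".VBS", ".VBE", ".JS"];

const withRepoFile = resolveCommand(
  "node", "D:\\repo", ["C:\\nodejs"], defaultPathExt, files,
);
const withoutJsExt = resolveCommand(
  "node", "D:\\repo", ["C:\\nodejs"],
  defaultPathExt.filter((ext) => ext !== ".JS"), files,
);

console.log(withRepoFile); // D:\repo\node.js
console.log(withoutJsExt); // C:\nodejs\node.exe
```

In this model the repository's node.js wins because the working directory is consulted before PATH and .JS is among the default extensions. That also matches the observation that the hang disappears when the checkout step is removed.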

Although the root cause of the issue is not in setup-node, I would suggest adding a reasonable timeout for the --version invocations along with a useful error message to aid future users affected by similar issues.
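
As a rough TypeScript sketch of what such a timeout could look like: `withTimeout` below is a hypothetical helper, not part of setup-node or @actions/exec, and the never-settling promise is a stand-in for a hung `yarn --version` call. The helper only stops waiting. A real implementation would also need to terminate the child process (Node's `child_process.execFile`, for example, accepts a `timeout` option).

```typescript
// Hypothetical helper: rejects with a descriptive error if `work` has not
// settled within `ms` milliseconds. It does not cancel the underlying work.
function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => {
      reject(
        new Error(
          `${label} did not complete within ${ms} ms; ` +
            "check for files in the working directory that shadow the tool",
        ),
      );
    }, ms);
    work.then(
      (value) => {
        clearTimeout(timer); // finished in time, so cancel the pending timer
        resolve(value);
      },
      (err) => {
        clearTimeout(timer);
        reject(err);
      },
    );
  });
}

// Stand-in for a `yarn --version` invocation that never returns.
const hangingCheck = new Promise<string>(() => {});

withTimeout(hangingCheck, 50, "yarn --version")
  .then((version) => console.log(`yarn: ${version}`))
  .catch((err: Error) => console.log(err.message));
```

The error message names the command that stalled, which is the kind of hint that would have shortened the diagnosis here.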

kevinoid referenced this issue in kevinoid/eslint-config-kevinoid May 7, 2023
To avoid inadvertently running node.js (using Windows Script Host)
instead of node.exe when .js is present in %PATHEXT% (as it is by
default).  The problem is exacerbated by a regression in .cmd shims
generated by npm (npm/cmd-shim#64, npm/cmd-shim#71) and has already
caused problems in CI (https://github.com/actions/setup-node/issues/720).

Continue to export node and node.js from the package for backward
compatibility.  These may be removed in a future version.

Signed-off-by: Kevin Locke <[email protected]>
@dusan-trickovic

Hi, @kevinoid ! Thanks for the update, I'm glad you've found the culprit behind this issue :) And thank you for the suggestion - I will investigate it and forward it to my team as well :)

@mahabaleshwars

Hello @kevinoid, thank you for your investigation. In order to investigate further and work on the issue, could you help us by providing the repro steps? This will assist us in replicating the issue and resolving it more effectively. Appreciate your help!


@kevinoid
Author

kevinoid commented Dec 9, 2024

> In order to investigate further and work on the issue, could you help us by providing the repro steps?

Thanks @mahabaleshwars. I've copied the reproduction into https://github.com/kevinoid/setup-node-issue-720. To reproduce the issue, simply clone the repository and run the workflow in GitHub Actions.

@mahabaleshwars

mahabaleshwars commented Jan 6, 2025

Hi @kevinoid,

This issue needs to be fixed in the Windows runner image instead of implementing a timeout for --version invocations.
While adding a timeout in setup-node may serve as a reasonable short-term workaround, it is not the ideal long-term solution. A few considerations:

  • Temporary Fix: Users could still face similar issues with other tools, as the core problem lies in the Windows runner or npm's behavior.
  • Potential for False Positives: If the timeout is set too aggressively, legitimate operations may be cut off prematurely, potentially leading to false failures in environments where the invocation takes longer than expected.

Addressing the root cause in the Windows runner image would provide a more stable and lasting resolution.

Could you please raise this issue with runner-images?

@kevinoid
Author

Hi @mahabaleshwars,

I agree that fixing the root cause is preferable to adding a timeout. Since the root cause is an old bug in npm (npm/cmd-shim#64 and npm/cmd-shim#71), I don't see much value in opening an issue in runner-images. Are you hoping they'll come up with a workaround, or that it'll generate more interest, or something else?

@HarithaVattikuti
Collaborator

HarithaVattikuti commented Jan 28, 2025

Moving to runner-images team for further investigation

@HarithaVattikuti HarithaVattikuti transferred this issue from actions/setup-node Jan 28, 2025
8 participants
@kevinoid @dmitry-shibanov @HarithaVattikuti @panticmilos @dusan-trickovic @priya-kinthali @mahabaleshwars and others