[CI][Packaging][Release] Jobs that run on ARM self-hosted runners are flaky and failing with communication lost #44418

Open
raulcd opened this issue Oct 15, 2024 · 7 comments

Comments

@raulcd
Member

raulcd commented Oct 15, 2024

Describe the bug, including details regarding any error messages, version, and platform.

The k8s self-hosted runners solution is slightly flaky lately. See for example:

The error:

The self-hosted runner: k8s-runners-linux-arm-8g6tn-gpmc7 lost communication with the server.

I am seeing this on the maintenance branch for the release too.

Component(s)

Continuous Integration, Packaging, Release

@raulcd
Member Author

raulcd commented Oct 15, 2024

cc @assignUser

@assignUser
Member

Will investigate

@assignUser
Member

This type of error usually happens when the runner pod gets OOM-killed or CPU-killed. Did we increase the feature set that's built, or change something similar that might increase memory or CPU use?
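
As a rough sketch of how the OOM case could be checked: Kubernetes records the reason for a container's last termination in the pod status under `status.containerStatuses[].lastState.terminated.reason`, and an out-of-memory kill is reported there as `OOMKilled`. The snippet below assumes the pod description was dumped with `kubectl get pod <name> -o json`. The function name `oom_killed_containers` and the sample data are hypothetical, not taken from the actual runner setup.

```python
import json


def oom_killed_containers(pod: dict) -> list[str]:
    """Return names of containers whose last termination reason was OOMKilled."""
    names = []
    for cs in pod.get("status", {}).get("containerStatuses", []):
        # lastState is empty when the container has never been restarted.
        terminated = (cs.get("lastState") or {}).get("terminated") or {}
        if terminated.get("reason") == "OOMKilled":
            names.append(cs.get("name", "<unknown>"))
    return names


# Hypothetical sample, shaped like `kubectl get pod <name> -o json` output.
sample = json.loads("""
{
  "status": {
    "containerStatuses": [
      {"name": "runner",
       "lastState": {"terminated": {"reason": "OOMKilled", "exitCode": 137}}},
      {"name": "sidecar", "lastState": {}}
    ]
  }
}
""")

print(oom_killed_containers(sample))
```

If the pod has already been deleted by the time the job fails, this status is gone as well, so the check only helps while the pod object still exists.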

@kou
Member

kou commented Oct 16, 2024

#44348 may be related. It enables the Azure file system.

@kou
Member

kou commented Oct 16, 2024

Can we increase the resources assigned to the runner?

@assignUser
Member

Ah yeah that could do it, I'll see what I can do.

@assignUser
Member

The runner resources were increased; the change should take effect soon!


3 participants