libc/int: skip TestUpdateDevices* on i386 #4595
Signed-off-by: Kir Kolyshkin <[email protected]>
Works as intended:
Generally speaking, maybe we should just skip the entirety of |
If this is making CI fail, I think it is ok to merge it. But I don't understand:
I've been trying to run the "broken" tests locally and they seem to pass just fine. I'll do more stress tests next week, but this smells like something that only fails in CI.
Simply speaking, we don't support i386; it just happens to work. We added this job merely to ensure our code is 32-bit clean; see #2768 and this comment (runc/.github/workflows/test.yml, lines 186 to 189 in 8702864).
AFAIK there's still no 32bit ARM support in GHA, so we are stuck with i386.
This started happening about two months ago, and is not correlated with any changes we made at that time. So I suspect there's something wrong with Azure Linux/i386 support, or some particular hardware, or perhaps some i386 libraries, or maybe the Go i386 port. Ideally, I'd love to get to the bottom of it, and there is a chance of finding something interesting and useful. Practically, though, I don't have time for this, given how hard it is to reproduce, and it does not look like anyone else does either. So I think the best decision is to skip these tests, allowing for better use of developers' time.
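For illustration only, an architecture-keyed skip of the kind discussed here could look roughly like the sketch below. The PR's actual diff is not shown in this conversation, so the helper name `skipIf386`, the `skipper` interface, and the `fakeT` stand-in are all hypothetical; the only real API relied on is `runtime.GOARCH`, which is `"386"` on i386 builds.

```go
package main

import (
	"fmt"
	"runtime"
)

// skipper is the small subset of *testing.T this sketch needs.
// *testing.T satisfies it, so a real test could pass t directly.
type skipper interface {
	Skip(args ...any)
}

// skipIf386 is a hypothetical helper (not taken from runc): it skips the
// calling test when goarch is "386", the value of runtime.GOARCH on i386.
func skipIf386(t skipper, goarch string) {
	if goarch == "386" {
		t.Skip("flaky on i386, see #4594")
	}
}

// fakeT stands in for *testing.T so the sketch runs outside "go test".
type fakeT struct{ skipped bool }

func (f *fakeT) Skip(args ...any) { f.skipped = true }

func main() {
	t := &fakeT{}
	skipIf386(t, runtime.GOARCH)
	fmt.Printf("GOARCH=%s skipped=%v\n", runtime.GOARCH, t.skipped)
}
```

In a real test the call would be `skipIf386(t, runtime.GOARCH)` at the top of each affected test function, so the 32-bit job still compiles and runs everything else.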
@kolyshkin LGTM. Thanks for the link, I didn't know we added this job as a proxy. Makes sense to skip it.
Below are some details about my findings, but I'm giving up too. This seems to be a regression outside of runc, and while I'd love to report it, it is too hard to repro.
I could repro it twice on a local Ubuntu 22.04 machine, which makes Azure quite unlikely to be the cause. I guess the most likely scenario is some package update in Ubuntu that causes this sporadic failure, given that this started happening at a "random" point in time. But it could be a Go issue (locally I'm trying with 1.23.5, CI is using 1.23.4).
The error seems to come from decoding JSON, and we are not wrapping it with more text, so it is unclear where it comes from (here are the logs from when I hit it):
time="2025-01-27T16:14:02+01:00" level=info msg="removing old filter 1 from cgroup" id=7085 name= run_count=0 runtime=0s tag=fb6cb1c301453333 type=CGroupDevice
update_test.go:67: unexpected error: unable to start container process: invalid character 'ÿ' looking for beginning of object key string
--- FAIL: TestUpdateDevices (0.68s)
=== RUN TestUpdateDevicesSystemd
update_test.go:41: unexpected error: unable to start container process: error during container init: invalid character 'ÿ' looking for beginning of object key string
update_test.go:38: exec: Wait was already called
--- FAIL: TestUpdateDevicesSystemd (0.16s)
I've tried running it another 1400 times with more info and I can't repro :(.
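To make the point about the unwrapped error concrete, here is a minimal sketch in Go. It feeds `encoding/json` a stray `0xFF` byte where an object key should start, which yields the same kind of syntax error as in the logs above, and wraps it with `fmt.Errorf` and `%w` so the message names the decode site. The helper names `decodeConfig` and `isSyntaxError` and the `"init config"` label are assumptions made for the example, not runc code, and this does not claim to identify where the bad byte actually comes from.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// decodeConfig is a hypothetical helper: it decodes data as a JSON object
// and wraps any failure with a label that identifies the call site.
func decodeConfig(label string, data []byte) (map[string]any, error) {
	var v map[string]any
	if err := json.Unmarshal(data, &v); err != nil {
		return nil, fmt.Errorf("decoding %s: %w", label, err)
	}
	return v, nil
}

// isSyntaxError reports whether err wraps a *json.SyntaxError, which
// remains detectable because %w preserves the error chain.
func isSyntaxError(err error) bool {
	var syn *json.SyntaxError
	return errors.As(err, &syn)
}

func main() {
	// '{' followed by a raw 0xFF byte instead of a quoted key.
	_, err := decodeConfig("init config", []byte("{\xff"))
	fmt.Println(err)
	fmt.Println("syntax error:", isSyntaxError(err))
}
```

With a label like this in the message, a failure in CI would at least show which decode step saw the corrupted input.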
@rata thanks, it is quite interesting that you were able to repro this (I never managed to do that). Did you run an isolated test (like I think that increasing the number of iterations inside the test (currently |
Fixes: #4594