[Core] Instance Preemption Causing Application Level Broken Pipe Issue #48628

Closed
MengjinYan opened this issue Nov 7, 2024 · 0 comments · Fixed by #48636
Labels
bug Something that is supposed to be working; but isn't core Issues that should be addressed in Ray Core P1 Issue that should be fixed within a few weeks

Comments

@MengjinYan
Collaborator

What happened + What you expected to happen

In a recent investigation, we found that when a task writes a message to the local object store while that object store is shutting down, the write fails with a Broken pipe error, and that error is surfaced as an application-level error on the current task. As a result, the task fails without being retried.

Expected behavior: the Broken pipe error should be raised as a system error so that the task can be retried as configured.
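
To make the distinction concrete, here is a minimal, self-contained sketch of the expected classification. It is not Ray's implementation and uses no Ray APIs: `ApplicationLevelError`, `SystemLevelError`, `classify`, and `run_with_retries` are hypothetical names chosen for illustration, and the rule "a `BrokenPipeError` counts as a system failure" is an assumption that mirrors the behavior requested above.

```python
import errno


class ApplicationLevelError(Exception):
    """Hypothetical: failure attributed to user code; not retried here."""


class SystemLevelError(Exception):
    """Hypothetical: failure attributed to infrastructure; eligible for retry."""


def classify(exc):
    # Assumption for this sketch: a broken pipe means the peer (for example,
    # the local object store) went away, so it is treated as a system failure.
    if isinstance(exc, BrokenPipeError):
        return SystemLevelError(str(exc))
    return ApplicationLevelError(str(exc))


def run_with_retries(task, max_retries):
    """Run task(); retry only system-level failures, up to max_retries times.

    Returns a (result, attempts) tuple on success.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            return task(), attempts
        except Exception as exc:
            err = classify(exc)
            if isinstance(err, SystemLevelError) and attempts <= max_retries:
                continue  # system error: try again
            raise err from exc  # application error, or retries exhausted


def make_flaky_write(failures):
    """Build a task that raises BrokenPipeError `failures` times, then succeeds."""
    state = {"remaining": failures}

    def write():
        if state["remaining"] > 0:
            state["remaining"] -= 1
            raise BrokenPipeError(errno.EPIPE, "Broken pipe")
        return "written"

    return write
```

With this classification, `run_with_retries(make_flaky_write(1), max_retries=3)` succeeds on the second attempt, whereas a task that raises `ValueError` fails immediately as an application-level error. The bug described above corresponds to the broken pipe taking the second path.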

Versions / Dependencies

N/A

Reproduction script

N/A

Issue Severity

None

@MengjinYan MengjinYan added bug Something that is supposed to be working; but isn't core Issues that should be addressed in Ray Core labels Nov 7, 2024
@MengjinYan MengjinYan self-assigned this Nov 7, 2024
@jcotant1 jcotant1 added core Issues that should be addressed in Ray Core and removed core Issues that should be addressed in Ray Core labels Nov 13, 2024
@jjyao jjyao added the P1 Issue that should be fixed within a few weeks label Nov 25, 2024

3 participants