Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix race condition when sync with child and grandchild #1237

Merged
merged 1 commit into from
Feb 21, 2017

Conversation

hqhq
Copy link
Contributor

@hqhq hqhq commented Dec 19, 2016

Fixes: #1236

Signed-off-by: Qiang Huang [email protected]

@hqhq
Copy link
Contributor Author

hqhq commented Jan 4, 2017

ping @opencontainers/runc-maintainers

@teddyking
Copy link
Contributor

fwiw it looks very likely that this PR also fixes #1281.

@@ -695,6 +723,10 @@ void nsexec(void)
bail("failed to sync with parent: SYNC_RECVPID_ACK: got %u", s);
}

s = SYNC_CHILD_READY;
if (write(syncfd, &s, sizeof(s)) != sizeof(s))
bail("failed to sync with patent: write(SYNC_CHILD_READY)");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/patent/parent/g

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated, thanks.

@crosbymichael
Copy link
Member

@cyphar can you take a look at this one?

@stevenh
Copy link
Contributor

stevenh commented Jan 27, 2017

I'm assuming this is in a final state now, if so then would be good to get it in ASAP as its causing pretty much all other PR's to have failed tests :(

@cyphar
Copy link
Member

cyphar commented Jan 27, 2017

/me is looking at it. While it does fix the issue, there's some parts I'd like to get cleaned up before merging.

@stevenh
Copy link
Contributor

stevenh commented Jan 27, 2017

Out of interest does that include changing the way the children are created and reaping the initial child which would make #1301 unneeded?

craigfurman pushed a commit to cloudfoundry/garden-runc-release that referenced this pull request Feb 7, 2017
To the branch await-prs-from-138131099 in the fork in
cloudfoundry-incubator. We can get back onto a released version of runc
when the following PRs are merged:

* opencontainers/runc#1237
* opencontainers/runc#1275

dadoo was the main guardian component that had to change here, to
accomodate changes in runc for processes that have TTYs.

[#138131099]
@crosbymichael
Copy link
Member

crosbymichael commented Feb 7, 2017

LGTM

Approved with PullApprove

@hqhq
Copy link
Contributor Author

hqhq commented Feb 15, 2017

ping @cyphar

Copy link
Member

@cyphar cyphar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only have a couple of questions. Otherwise this LGTM. Sorry for not responding on this for so long.

break;
default:
bail("unexpected sync value");
bail("unexpected sync value: %u", s);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I understand this makes the code nicer, remember that the only debugging information we normally get is the return code. Maybe we don't lose too much information, but something to keep in mind.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I don't think we lose any information here.

/* Now sync with grandchild. */

ready = false;
while (!ready) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if the grandchild is ready before the child -- or if the child dies above but the grandchild is still around?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if the grandchild is ready before the child

It won't, grandchild won't do anything before father finishes syncing with child and send SYNC_GRANDCHILD to grandchild.

or if the child dies above but the grandchild is still around

All cases that child dies by error after creating grandchild, child will kill grandchild before exit.

@@ -413,7 +414,7 @@ void nsexec(void)
{
int pipenum;
jmp_buf env;
int syncpipe[2];
int syncChildPipe[2], syncGrandchildPipe[2];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Please use snake_case in C code.

@hqhq
Copy link
Contributor Author

hqhq commented Feb 18, 2017

Updated to use snake case.

@crosbymichael @cyphar PTAL.

@cyphar
Copy link
Member

cyphar commented Feb 21, 2017

Failure is because shfmt is broken on Go 1.6.

LGTM

Approved with PullApprove

@cyphar
Copy link
Member

cyphar commented Feb 21, 2017

I've made an issue (mvdan/sh#66) to track it.

@hqhq
Copy link
Contributor Author

hqhq commented Feb 21, 2017

ping @crosbymichael

@crosbymichael
Copy link
Member

crosbymichael commented Feb 21, 2017

LGTM

Approved with PullApprove

@crosbymichael crosbymichael merged commit 8438b26 into opencontainers:master Feb 21, 2017
@hqhq hqhq deleted the fix_sync_race branch February 21, 2017 01:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants