checkpoint: support lazy migration #1541

Merged: 3 commits, Sep 7, 2017

Contributor

@adrianreber adrianreber commented Aug 1, 2017

With the help of userfaultfd CRIU supports lazy migration. Lazy
migration means that memory pages are only transferred from the
migration source to the migration destination on page fault.

This makes it possible to reduce the downtime during process or
container migration to a minimum, as the memory pages do not need to be
transferred while the process is stopped.

Lazy migration currently depends on userfaultfd being available in the
running Linux kernel and on the CRIU version in use supporting lazy
migration. Both dependencies can be checked by asking CRIU via RPC
whether the lazy migration feature is available. Using feature checking
instead of version comparison enables runC to use CRIU features from the
criu-dev branch. This way the user can decide whether lazy migration
should be available by choosing the right kernel and CRIU branch.

To use lazy migration, the CRIU process on the source dumps everything
except the memory pages and then opens a network port, waiting for
remote page fault requests:

 # runc checkpoint httpd --lazy-pages --page-server 0.0.0.0:27 \
  --status-fd /tmp/postcopy-pipe

In this example CRIU blocks once it has opened the network port and
waits for a network connection. As runC waits for CRIU to finish, it
also blocks until the lazy migration has finished. To know when the
restore on the destination side can start, the '--status-fd' parameter
is used:

 # runc checkpoint --help | grep status
  --status-fd value   criu writes \0 to this FD once lazy-pages is ready

The '--status-fd' parameter comes directly from CRIU. This way the
process outside of runC which controls the migration knows exactly when
to transfer the checkpoint (without memory pages) to the destination and
when the restore can be started.

On the destination side it is necessary to start CRIU in 'lazy-pages'
mode like this:

 # criu lazy-pages --page-server --address 192.168.122.3 --port 27 \
  -D checkpoint

and tell runC to do a lazy restore:

 # runc restore -d --image-path checkpoint --work-path checkpoint \
  --lazy-pages httpd

If both processes on the restore side have the same working directory,
'criu lazy-pages' creates a unix domain socket on which it waits for
requests from the actual restore. runC starts CRIU restore in lazy
restore mode and tells 'criu lazy-pages' that it wants to restore
memory pages on demand. CRIU continues to restore the process and, once
the process is running and accesses the first non-existing memory page,
the 'criu lazy-pages' server requests the page from the source system.
In this way all pages from the source system are eventually transferred
to the destination system. Once all pages have been transferred, runC on
the source system exits and the container has finished migrating.

This can also be combined with CRIU's pre-copy support. The combination
of pre-copy and post-copy (lazy migration) makes it possible to migrate
containers with minimal downtime.

Some additional background about post-copy migration can be found in
these articles:

https://lisas.de/~adrian/?p=1253
https://lisas.de/~adrian/?p=1183

Signed-off-by: Adrian Reber [email protected]

@adrianreber
Contributor Author

This includes the same commits as #1535 as it needs them. The only new commits are the two latest commits.

@adrianreber
Contributor Author

@rppt: FYI

@adrianreber
Contributor Author

Updated with the review results of #1535

@adrianreber
Contributor Author

@avagin: Do you have any comments on this?

@crosbymichael
Member

@kolyshkin can you also take a look at this change? Thanks!

@avagin
Contributor

avagin commented Aug 8, 2017

@adrianreber we need a test case for this. How will it be integrated with phaul? (https://github.com/xemul/criu/tree/criu-dev/phaul)

@avagin
Contributor

avagin commented Aug 8, 2017

The lazy migration feature is in a criu development branch and we are going to release it in CRIU 3.4 (Sep 2017).

@adrianreber
Contributor Author

@avagin, I was also thinking about the test case but as the container is not totally destroyed before complete memory migration my first attempts failed as it was not possible to restore the same container with the same name. I will try to change the name, that might work. Let me try to add a working test-case for lazy migration into runC. I will update this PR.

About p.haul: Not that it really belongs here, but p.haul feels kind of abandoned, especially given the difficulties of integrating p.haul with all the container engines in a useful way. So I am also leaning more towards replacing the pre-copy code in runC with CRIU's Go migration library. So the main question is: is p.haul in its current form still alive or not? To me it does not seem very alive.

@xemul

xemul commented Aug 10, 2017

@adrianreber, the Python version of p.haul (which sits in a separate repo) is indeed abandoned, mostly for the reasons you've mentioned: it is too hard to integrate Python code with anything else. At the same time the Go p.haul, which sits in the criu repo, is the place where support for criu live migration features will (well, should) go.

@xemul

xemul commented Aug 10, 2017

@avagin , 3.4 is going to be in August. And lazy pages are thus aimed at 3.5.

@adrianreber
Contributor Author

@avagin test case now included

@adrianreber
Contributor Author

Rebased after recent breakage. Any reviewers?


data := make([]byte, 1)
count, _ := r.Read(data)
logrus.Debugf("%d:%s", count, status)
Member

can you remove these debug statements?

func waitForCriuLazyServer(r *os.File, status string) error {

data := make([]byte, 1)
count, _ := r.Read(data)
Member

Can you add error handling for various calls in this function? Read, OpenFile, and Write are all missing

Before adding the actual lazy migration support, this adds the feature
check for lazy-pages. Right now lazy migration, which is based on
userfaultfd, is only available in the criu-dev branch and not yet in a
release. As the check does not depend on a certain version but on a
CRIU feature which can be queried, it can be part of runC without a new
version check depending on a feature from criu-dev.

Signed-off-by: Adrian Reber <[email protected]>
The lazy-pages test case is not as straightforward as the other test
cases. This is related to the fact that restoring requires a different
name if the container is restored on the same host. During 'runc
checkpoint' the container is not destroyed before all memory pages have
been transferred to the destination and thus the same container name
cannot be used.

As real world usage will rather migrate a container from one system to
another than lazy migrate a container on the same host this is only
problematic for this test case.

Another reason is that it requires starting 'runc checkpoint' and 'criu
lazy-pages' in the background, as those processes need to be running
before the final 'runc restore' can be started.

CRIU upstream is currently discussing whether to start 'criu lazy-pages'
automatically, which would simplify the lazy-pages test case a bit.

The handling and checking of the background processes make the test case
less than elegant, as at one point a 'sleep 2' is required to make sure
that 'runc checkpoint' has had time to do its thing before the log files
are inspected.

Before running the actual test, criu is called in feature checking mode
to make sure lazy migration is enabled in the criu used by the test
case. If not, the test is skipped.

Signed-off-by: Adrian Reber <[email protected]>
@crosbymichael
Member

crosbymichael commented Sep 6, 2017

LGTM

Approved with PullApprove

@mrunalp
Contributor

mrunalp commented Sep 7, 2017

LGTM

Approved with PullApprove

@mrunalp mrunalp merged commit 7e036aa into opencontainers:master Sep 7, 2017
@adrianreber adrianreber deleted the lazy branch September 8, 2017 05:34
# there is some basic error. If the lazy migration is ready can
# be handled by $lazy_pipe. Which probably will always be ready
# after sleeping two seconds.
sleep 2
Contributor

It isn't only ugly, it is probably a signal that the interface is not designed properly

Contributor

@adrianreber I think we can discuss this interface on LPC

Contributor Author

sure
