-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
roachprod: improve the way cockroach is run #64177
Conversation
Hm, it looks like |
56a5c29
to
95b0fda
Compare
I think a better way to do this would be to use |
Cool, really appreciate it. I agree that it makes sense to productionize the ad-hoc starting/stopping we have right now. Let me know how I can help. |
95b0fda
to
97bdb9d
Compare
97bdb9d
to
2fae05a
Compare
Updated - we now run with |
4347aa8
to
e6297b2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 3 files at r1.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @andreimatei, @cucaroach, @irfansharif, and @tbg)
cat > ${HOME}/run-cockroach.sh <<'EOF' && \ | ||
sudo systemd-run --unit cockroach --uid $uid --gid $gid -p MemoryMax=%[9]s bash ${HOME}/run-cockroach.sh | ||
#!/bin/bash | ||
ulimit -c unlimited |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no set -euo pipefail
?
# Output a message visible when checking the status (via sudo systemctl status). | ||
echo "Starting cockroach (output redirected to %[1]s)" | ||
export ROACHPROD=%[3]d%[4]s; | ||
set -x |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you want -e
and +e
here? x
prints the commands the shell executes but that doesn't seem what you'd want here.
code=$? | ||
set +x | ||
echo "cockroach exited with exit code $code" >> %[1]s/roachprod.log | ||
echo "cockroach exited with exit code $code" >> %[1]s/cockroach.stderr.log |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consider throwing a $(date) in these so that we can figure out the exact time it stopped
uid=$(id -u) | ||
gid=$(id -g) | ||
cat > ${HOME}/run-cockroach.sh <<'EOF' && \ | ||
sudo systemd-run --unit cockroach --uid $uid --gid $gid -p MemoryMax=%[9]s bash ${HOME}/run-cockroach.sh |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some more useful flags that we may want to add
-r, --remain-after-exit
--working-directory=
I also wonder, don't we need some probe to determine when to return? --background
returns when the ports are bound (so that you don't get spurious connection refused when connecting right after c.Start()
). What prevents that flake here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
--user
Talk to the service manager of the calling user, rather than the service manager of the system.
Maybe this avoids the need for going through root
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried that, the memory limit doesn't work with --user.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question about the probe.. Are you sure it's not the same currently? AFAICT with --background
we fork at the very beginning: https://github.com/cockroachdb/cockroach/blob/master/pkg/cli/start.go#L283
// NB: this is awkward as when the process fails, the test runner will show an | ||
// unhelpful empty error (since everything has been redirected away). This is | ||
// unfortunately equally awkward to address. | ||
if h.c.IsLocal() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We shouldn't let these diverge as much. Of course we can't ask for systemd to be installed locally, but can we share the run-cockroach.sh
script and make the only difference that in local
we invoke it with a --background
arg directly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @andreimatei, @irfansharif, and @RaduBerinde)
pkg/cmd/roachprod/install/cockroach.go, line 427 at r1 (raw file):
// strict memory limit and provides tools to monitor the process. cmd := fmt.Sprintf(` sudo systemctl reset-failed cockroach
Is sudo guaranteed to work in all envs roachprod supports, is this a new requirement?
pkg/cmd/roachprod/install/cockroach.go, line 454 at r1 (raw file):
gid=$(id -g) cat > ${HOME}/run-cockroach.sh <<'EOF' && \ sudo systemd-run --unit cockroach --uid $uid --gid $gid -p MemoryMax=%[9]s bash ${HOME}/run-cockroach.sh
I think we'll probably want the ability to play with MemoryMax % and also specify MemoryHigh which appears to be a "soft" limit (https://www.freedesktop.org/software/systemd/man/systemd.resource-control.html#MemoryHigh=bytes)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 2 of 0 LGTMs obtained (waiting on @irfansharif, @RaduBerinde, and @tbg)
pkg/cmd/roachprod/install/cluster_synced.go, line 159 at r1 (raw file):
} // Stop TODO(peter): document
would you mind taking the opportunity and document the wait
?
pkg/cmd/roachprod/install/cluster_synced.go, line 176 at r1 (raw file):
var resetServiceCmd string if !c.IsLocal() { // Since we're waiting (the default), clean up the service.
consider saying a few more words about what this reset-failed
does.
pkg/cmd/roachprod/install/cockroach.go, line 451 at r1 (raw file):
fi uid=$(id -u)
please write something about how --uid
and --gid
relate to --user
, and how we can't use --user
.
pkg/cmd/roachprod/install/cockroach.go, line 461 at r1 (raw file):
echo ">>> roachprod start: $starttime" >> %[1]s/roachprod.log ps axeww -o pid -o command >> %[1]s/roachprod.log [ -x /usr/bin/lslocks ] && /usr/bin/lslocks >> %[1]s/roachprod.log
If you know why this is here, consider putting a comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTRs! I'll work more on it.
Reviewable status:
complete! 2 of 0 LGTMs obtained (waiting on @andreimatei, @cucaroach, @irfansharif, @RaduBerinde, and @tbg)
pkg/cmd/roachprod/install/cockroach.go, line 399 at r1 (raw file):
Previously, tbg (Tobias Grieger) wrote…
We shouldn't let these diverge as much. Of course we can't ask for systemd to be installed locally, but can we share the
run-cockroach.sh
script and make the only difference that inlocal
we invoke it with a--background
arg directly?
I don't want to create a script on your machine.. But I can extract the common stuff in the code here
pkg/cmd/roachprod/install/cockroach.go, line 427 at r1 (raw file):
Previously, cucaroach (Tommy Reilly) wrote…
Is sudo guaranteed to work in all envs roachprod supports, is this a new requirement?
It's a new requirement, but it should be available in all VMs that we set up. We won't use it for "local" clusters.
pkg/cmd/roachprod/install/cockroach.go, line 454 at r1 (raw file):
Previously, cucaroach (Tommy Reilly) wrote…
I think we'll probably want the ability to play with MemoryMax % and also specify MemoryHigh which appears to be a "soft" limit (https://www.freedesktop.org/software/systemd/man/systemd.resource-control.html#MemoryHigh=bytes)
We can play with that and add a config if we conclude it's useful. I am hesitant to use the soft limit (which slows down the program). It would be very hard to debug that slowdown, and it could be especially problematic with benchmarks.
pkg/cmd/roachprod/install/cockroach.go, line 461 at r1 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
If you know why this is here, consider putting a comment.
No idea, does anyone else?
pkg/cmd/roachprod/install/cockroach.go, line 468 at r1 (raw file):
Previously, tbg (Tobias Grieger) wrote…
do you want
-e
and+e
here?x
prints the commands the shell executes but that doesn't seem what you'd want here.
We definitely don't want -e
, we want to be able to run the code after it when it fails. I wanted to show the command so you can see it if you do systemctl status cockroach
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 2 of 0 LGTMs obtained (waiting on @andreimatei, @cucaroach, @irfansharif, @RaduBerinde, and @tbg)
pkg/cmd/roachprod/install/cockroach.go, line 427 at r1 (raw file):
Previously, RaduBerinde wrote…
It's a new requirement, but it should be available in all VMs that we set up. We won't use it for "local" clusters.
We use sudo
in SyncedCluster.Wipe
and some other SyncedCluster
operations.
pkg/cmd/roachprod/install/cockroach.go, line 460 at r1 (raw file):
Do we need these |
e6297b2
to
9b6f4b0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @ajwerner, @andreimatei, @cucaroach, @irfansharif, and @tbg)
pkg/cmd/roachprod/install/cluster_synced.go, line 159 at r1 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
would you mind taking the opportunity and document the
wait
?
Done.
pkg/cmd/roachprod/install/cluster_synced.go, line 176 at r1 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
consider saying a few more words about what this
reset-failed
does.
Removed.
pkg/cmd/roachprod/install/cockroach.go, line 451 at r1 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
please write something about how
--uid
and--gid
relate to--user
, and how we can't use--user
.
Done.
pkg/cmd/roachprod/install/cockroach.go, line 456 at r1 (raw file):
Previously, tbg (Tobias Grieger) wrote…
no
set -euo pipefail
?
Done.
pkg/cmd/roachprod/install/cockroach.go, line 475 at r1 (raw file):
Previously, tbg (Tobias Grieger) wrote…
consider throwing a $(date) in these so that we can figure out the exact time it stopped
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @ajwerner, @andreimatei, @cucaroach, @irfansharif, and @RaduBerinde)
pkg/cmd/roachprod/install/cluster_synced.go, line 159 at r1 (raw file):
Previously, RaduBerinde wrote…
Done.
But this takes the signal and wait, so there's no default here. Did you intend to put this comment elsewhere?
pkg/cmd/roachprod/install/cockroach.go, line 399 at r1 (raw file):
Previously, RaduBerinde wrote…
I don't want to create a script on your machine.. But I can extract the common stuff in the code here
Why not? I appreciate that you've introduced reuse here but... this stuff is just so hard to maintain or even read. I thought we would have a single script that simply knows whether it's invoked in local or remote mode. I have this PR checked out locally and it's so hard. I think we lost GOTRACEBACK=crash for the non-local case? We need to clean this up avoid the divergence between the paths or it'll be so hard to maintain and get right.
I spent some time on this - pushed a commit - pretty happy with how it turned out. We are of course pushing the boundaries here of what is reasonable but this is a lot less terrible than what we had before this PR.
pkg/cmd/roachprod/install/cockroach.go, line 454 at r1 (raw file):
--background
uses sdnotify.Exec
:
cockroach/pkg/cli/start_unix.go
Line 87 in 8d24277
return true, sdnotify.Exec(cmd) |
In other words, the parent waits until the child calls sdnotify.Ready
, which it only does after binding the ports:
Lines 422 to 425 in 6b1b69a
// ReadyFn will be called when the server has started listening on | |
// its network sockets, but perhaps before it has done bootstrapping | |
// and thus before Start() completes. | |
serverCfg.ReadyFn = func(waitForInit bool) { |
But I think this is... then already set up for success? We are running it under systemd now. The only question is, does systemd know that cockroach will signal systemd? It would have to decide to wait for the signal if things were to work out. I think that means using a service type of "Notify"?
https://askubuntu.com/questions/1120023/how-to-use-systemd-notify
If you set up a service using Type=notify, systemd will automatically set up a communication socket back to systemd and will export its path to the service under $NOTIFY_SOCKET.
pkg/cmd/roachprod/install/cockroach.go, line 454 at r1 (raw file):
Previously, RaduBerinde wrote…
We can play with that and add a config if we conclude it's useful. I am hesitant to use the soft limit (which slows down the program). It would be very hard to debug that slowdown, and it could be especially problematic with benchmarks.
It's easy enough to play with them on a branch, I don't think we need to add knobs at this point.
pkg/cmd/roachprod/install/cockroach.go, line 460 at r1 (raw file):
Previously, RaduBerinde wrote…
Do we need these
ps axeww
s in the roachprod log? They are very unwieldy
No, they similarly come from past debugging. Let's remove.
pkg/cmd/roachprod/install/cockroach.go, line 461 at r1 (raw file):
Previously, RaduBerinde wrote…
No idea, does anyone else?
We sometimes run into issues in which cockroach start
fails because .... well, we don't know, but one suspicion was that we were holding on to some RocksDB lock. This stuff hasn't been useful, let's remove. (Done, in my commit)
pkg/cmd/roachprod/install/cockroach.go, line 475 at r1 (raw file):
Previously, RaduBerinde wrote…
Done.
That'll be in UTC, right?
Yep, just checked:
ubuntu@tobias-1619593141-08-n5cpu16-0001:~$ date
Wed Apr 28 07:29:57 UTC 2021
pkg/cmd/roachprod/install/cockroach.go, line 399 at r1 (raw file): Previously, tbg (Tobias Grieger) wrote…
Very nice!! |
b1f4d48
to
31bf2d9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tbg this looks good to me. If this looks good to you, I will squash and merge.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @ajwerner, @andreimatei, @cucaroach, @irfansharif, @RaduBerinde, and @tbg)
pkg/cmd/roachprod/install/cluster_synced.go, line 159 at r1 (raw file):
Previously, tbg (Tobias Grieger) wrote…
But this takes the signal and wait, so there's no default here. Did you intend to put this comment elsewhere?
Rephrased.
pkg/cmd/roachprod/install/cockroach.go, line 454 at r1 (raw file):
Previously, tbg (Tobias Grieger) wrote…
--background
usessdnotify.Exec
:cockroach/pkg/cli/start_unix.go
Line 87 in 8d24277
return true, sdnotify.Exec(cmd) In other words, the parent waits until the child calls
sdnotify.Ready
, which it only does after binding the ports:Lines 422 to 425 in 6b1b69a
// ReadyFn will be called when the server has started listening on // its network sockets, but perhaps before it has done bootstrapping // and thus before Start() completes. serverCfg.ReadyFn = func(waitForInit bool) { But I think this is... then already set up for success? We are running it under systemd now. The only question is, does systemd know that cockroach will signal systemd? It would have to decide to wait for the signal if things were to work out. I think that means using a service type of "Notify"?
https://askubuntu.com/questions/1120023/how-to-use-systemd-notify
If you set up a service using Type=notify, systemd will automatically set up a communication socket back to systemd and will export its path to the service under $NOTIFY_SOCKET.
Very cool! Yes, it works. I had to add NotifyAccess=all
because the cockroach is a different PID than the one started by systemd (bash). I verified that it waits by adding a time.Sleep before calling sdnotify in the cockroach binary. In the meantime systemctl status cockroach
shows it like this, very useful: Active: activating (start) since Wed 2021-04-28 21:30:40 UTC; 8s ago
.
pkg/cmd/roachprod/install/cockroach.go, line 391 at r4 (raw file):
# \EOF ensures that no parameter expansion occurs in the heredoc. See: # https://unix.stackexchange.com/questions/376103/here-document-without-interpreting-escape-sequences-but-with-interpolation cat > "${helper}" << \EOF && chmod +x "${helper}" && "${helper}" "${verb}"
We are using 'EOF'
below which disables everything, made it the same.
pkg/cmd/roachprod/install/cockroach.go, line 407 at r4 (raw file):
Previously, RaduBerinde wrote…
I don't think this is needed (because of the ||)
Removed and it still works (verified that the error code makes it to roachprod.log)
416e59d
to
af7df9a
Compare
While running some stress tests with TPCH, I observed two big problems with roachprod: - the out-of-memory behavior is very bad: instead of the process being killed, the system enters a thrashing mode where everything in the VM slows to a crawl (to the point where just sshing in can take minutes). - when the cockroach process exits, the exit code is not recorded anywhere, making it impossible in some cases to figure out why it stopped. In my particular case, we were exiting with exit code 8 (which is `exit.TimeoutAfterFatalError()`) because writing to the logs was unacceptably slow. This commit attempts to improve things on both these fronts. Instead of running with `--background`, we use `systemd-run` to run cockroach as a service unit. This has several advantages: - we have much better monitoring infrastructure via `systemctl status cockroach` - we can now run code after the exit, allowing us to record it in various logs. - we can set a strict cgroups memory limit (set to `95%`) so that the process gets oom-killed before the system starts to thrash. As part of the commit, we also print out information about the status of cockroach when logging in. Fixes cockroachdb#64176. Release note: None
af7df9a
to
b937cc8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bors r+
🤞 for the nightly run, I expect to see a few more OOMs on clusters that were previously just scraping by.
Reviewed 1 of 3 files at r1, 2 of 3 files at r5, 1 of 1 files at r6.
Dismissed @cucaroach, and @cucaroach from 5 discussions.
Reviewable status:complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @andreimatei, @irfansharif, and @RaduBerinde)
Build succeeded: |
Same here. I almost want to start a nightly run with a lower MemoryMax and adjust all the tests so they use less than 80-85%. If we let them scrape by, any small change can drive them over the limit. |
While running some stress tests with TPCH, I observed two big problems
with roachprod:
being killed, the system enters a thrashing mode where everything
in the VM slows to a crawl (to the point where just sshing in can
take minutes).
anywhere, making it impossible in some cases to figure out why it
stopped. In my particular case, we were exiting with exit code 8
(which is
exit.TimeoutAfterFatalError()
) because writing to thelogs was unacceptably slow.
This commit attempts to improve things on both these fronts. Instead
of running with
--background
, we usesystemd-run
to run cockroachas a service unit. This has several advantages:
systemctl status cockroach
various logs.
95%
) so that theprocess gets oom-killed before the system starts to thrash.
As part of the commit, we also print out information about the status
of cockroach when logging in.
Fixes #64176.
Release note: None