Feature Description
Online DDL's cut-over threshold value is used both to determine whether a migration is ready to cut over and to set the timeout for the cut-over operation, as follows:
One of the indicators that a migration is ready to cut over is the vreplication lag (which is different from replication lag, but is certainly correlated with it and can be caused by it). If vreplication lag is higher than the cut-over threshold, we say the migration is not ready to complete.
Once we enter the cut-over phase, there are a number of operations that we time out: the locks on the table, the rename statement, and the wrapping query buffering. The value used for those timeouts is (based on) the cut-over threshold.
Just to get this out of the way: it makes sense to use the same value for both cases, as they closely correlate with each other.
Now, there is a default cut-over threshold, hard-coded as 10s. The user can supply a different value via a DDL strategy flag, e.g. --cut-over-threshold=15s.
But as things stand, that value then stays constant for the duration of the migration. Sometimes you see a migration struggling to complete under load. There's a variety of techniques to help it get through: just let it retry; throttle it and then unthrottle it at a less busy time; force completion (kill blocking queries). However, we also want to add the ability to modify the cut-over threshold on a running migration.
This would be achieved by a query such as ALTER VITESS_MIGRATION '<uuid>' CUTOVER_THRESHOLD='15s'. We should limit the cut-over threshold to reasonable values. I'd say 5s would be the bare minimum, and 30s is pushing it on the upper limit (the effect could be a table being locked for 30s).
Use Case(s)
Dynamic control over Online DDL migrations, solving ongoing cut-over issues.