-
Notifications
You must be signed in to change notification settings - Fork 36
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(1-1-restore): adds 1-1-restore cli and service stub #4221
feat(1-1-restore): adds 1-1-restore cli and service stub #4221
Conversation
f07e746
to
94d7777
Compare
pkg/service/one2onerestore/model.go
Outdated
SourceClusterID string `json:"source_cluster_id"` | ||
SnapshotTag string `json:"snapshot_tag"` | ||
NodesMapping []nodeMapping `json:"nodes_mapping"` | ||
BatchSize int `json:"batch_size,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think batch size is needed, it's ok to call RClone to copy all SSTables of the given keyspace/table of a given node at once.
RClone tracks the progress on its own, SM just calls for the progress using jobID.
pkg/service/one2onerestore/model.go
Outdated
Parallel int `json:"parallel,omitempty"` | ||
Transfers int `json:"transfers"` | ||
RateLimit []DCLimit `json:"rate_limit,omitempty"` | ||
AllowCompaction bool `json:"allow_compaction,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don;t need this parameter.
Compaction should be disabled to not waste CPU cycles and disk IO.
But it doesn't have an impact on the correctness of the restore.
This adds a new task type for fast vnode restore. Fixes: #4200
This introduces new cli command for fast vnode restore procedure. Fixes: #4200
This adds support for the fast restore task type to the task scheduler
This is the result of running `make docs`
This renames fastrestore to 1-1 restore and also makes it a subcommand of restore cmd.
This describes what 'same topology' means. Also adds references to new subcommands to the 'restore' cli page.
…s uuid This removes 1-1-restore options that are not needed at the moment (we can added them later if nedded). Also changed source-cluster-id type to uuid.UUID
37409c0
to
7e9f877
Compare
@karol-kokoszka @Michal-Leszczynski |
This dowgrades some of the deps that was updated by accident, so now it looks more like the master branch.
@@ -27,6 +27,7 @@ const ( | |||
UnknownTask TaskType = "unknown" | |||
BackupTask TaskType = "backup" | |||
RestoreTask TaskType = "restore" | |||
One2OneRestoreTask TaskType = "1_1_restore" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: yet another version of writing the 1-1-restore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought that using 1-1-restore
when applicable makes sense. In other cases, we can use one2onerestore
.
@VAveryanov8 why do you think that 1_1_restore
fits better here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is nit. Please don't spend much time on deciding what name to use. 1_1_restore
is perfectly fine.
This adds usage of IsSnapshotTag and uuid.Nil. Also aligns 1-1-restore service naming convention across the code, comment and docs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks !
I have no major concerns.
pkg/cmd/scylla-manager/server.go
Outdated
healthSvc *healthcheck.Service | ||
backupSvc *backup.Service | ||
restoreSvc *restore.Service | ||
one2OneRestoreSvc *one2onerestore.Service |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you define the interface for the one2OneRestoreSvc.Service (like Servicer) and reference it here instead of the struct ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure, I can easily do this, but I'd like to clarify one detail in advance. I would like to define interface in scylla-manger/server.go
package, but I see that in the SM code base it's often other way around (e.g ConfigCacher defined in configcache
pkg). i think it's better to define interfaces on the consumer side https://go.dev/wiki/CodeReviewComments#interfaces
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand that the practice in go is to define the interface in the package that consumes the service, but here, the interface defines the API for the whole service which definition and implementation stays in the "one2onerestore" package.
So, for the clarity of what this package is expected to do, I prefer to have this interface that defines the API.
Returning interface from the constructor aligns with the intention to make the service’s API explicit and maintain a clear package boundary, even though it deviates a bit from the “accept interfaces, return concrete types” idiom.
Btw, packages like net/http
http.Handler (defined in the same package).
Another example of the same approach (breaking the idiom), is in go-kit https://github.com/go-kit/examples/blob/master/addsvc/pkg/addservice/service.go
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand that the practice in go is to define the interface in the package that consumes the service, but here, the interface defines the API for the whole service which definition and implementation stays in the "one2onerestore" package.
So, for the clarity of what this package is expected to do, I prefer to have this interface that defines the API.
That make sense and it is a good intention, but why public methods of a Service struct doesn't define API and boundaries of a "one2onerestore" package?
In provided examples, they all have different pattern - interface is public, but implementation is private. In std lib it make sense - when you promise backward compatibility you don't want to expose some helper structs that can be changed in near future, e.g. http.FileServer. (and they use this pattern only for some helper functions).
In go-kit
they use public interface and private struct for the services (business logic), because they have a lot of wrappers around Service, to hide details about various transports (e.g. http, grpc) that can be used, see https://github.com/go-kit/examples/blob/master/addsvc/pkg/addendpoint/set.go#L34
P.S. I'm not trying to argue for the sake of arguing or to resist making the change, I'm modestly trying to understand and learn why we want to do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That make sense and it is a good intention, but why public methods of a Service struct doesn't define API and boundaries of a "one2onerestore" package?
@VAveryanov8 I appreciate your will to dive into the discussion and challenge the proposal. Here is why I believe we should expose public API via interface instead of relying on public methods from the struct itself.
Although the public methods on the Service struct do define what the service does, an interface makes that contract explicit. It tells consumers (even though Scylla Manager is the consumer), "This is what you can rely on," without revealing any implementation details that aren’t meant to be part of the public API.
By keeping the concrete type unexported and only exposing an interface, we hide any internal methods and data that we might not want external users to depend on. Responsibility of the 1-1-restore service is to just do the restore, maybe it should expose methods to track the progress as well.
In provided examples, they all have different pattern - interface is public, but implementation is private. In std lib it make sense - when you promise backward compatibility you don't want to expose some helper structs that can be changed in near future, e.g. http.FileServer. (and they use this pattern only for some helper functions).
In go-kit they use public interface and private struct for the services (business logic), because they have a lot of wrappers around Service, to hide details about various transports (e.g. http, grpc) that can be used, see https://github.com/go-kit/examples/blob/master/addsvc/pkg/addendpoint/set.go#L34
We have black-box testing (integration-tests) that are validating things like GetTarget etc (I think this the reason why originally some of the methods are public). The change to use interface came with the refactor of healthcheck service. Originally, everything was defined by struct.
For me, personally, the biggest advantage of using interface and defining it in the same package is just the clarity and simplicity.
An interface in the service package serves as a clear contract of what the package is meant to do. This clarity benefits anyone consuming the service (or reviewing the code), as they see a well-defined set of methods without being distracted by internal details.
Returning the interface from the constructor reinforces that our intent is to expose only the API that matters to consumers. It’s a deliberate decision that communicates, “This is the boundary of our package.”
P.S. I'm not trying to argue for the sake of arguing or to resist making the change, I'm modestly trying to understand and learn why we want to do that.
No worries, just don't keep this discussion too long :)
Another example of breaking the interface idiom from k8s repo-> https://github.com/kubernetes/client-go/blob/master/informers/factory.go
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, let's wrap it up :) I'm going to do that - in the end of the day, it's not gonna break something and I'm not gonna hit by electricity from my computer for doing that :D
Just to have some rules for the future implementations:
- Do we always define service interfaces on the implementation side?
- Do we need to define service as a private struct?
Another example of breaking the interface idiom from k8s repo-> https://github.com/kubernetes/client-go/blob/master/informers/factory.go
Yes, in general, I'm not against "breaking the rule", if there is a clear reason for that. In k8s if it's not a factory, then they always follow the pattern:
https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/bootstrap/bootstrapsigner.go#L95
https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/certificates/certificate_controller.go#L55
https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/clusterroleaggregation/clusterroleaggregation_controller.go#L55
...
And literally every service under pkg/controller
if it's not an exception, like implementing Options pattern or Factory pattern when there are several versions (implementations) of a service.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to have some rules for the future implementations:
Do we always define service interfaces on the implementation side?
That's my preference for the services in Scylla Manager. All services have well-defined responsibilities.
Services from `pkg/service'.
Do we need to define service as a private struct?
Would be good, but check the impact on integration-tests first - these are black-box.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've made a change! Please review
P.S. https://github.com/scylladb/scylla-manager/blob/master/pkg/restapi/services.go 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm on the side of following the convention, as we don't face any exception here.
After the introduction of the interface in one2onerestore
pkg we have an interesting situation where:
// Servicer is an interface that defines one2onerestore service API contract.
type Servicer interface {
// One2OneRestore restores data (tables) from the source cluster backup to the target cluster if they have equal topolgy:
// nodes count, shards count, token ownership should be exactly the same.
One2OneRestore(ctx context.Context, clusterID, taskID, runID uuid.UUID, properties json.RawMessage) error
// Runner creates a Runner that handels 1-1-restore operations.
Runner() Runner
}
- service constructor returns interface defined in the pkg
- runner constructor returns struct which implements interface from consumer pkg
The only thing important for the scheduler is the Runner
which actually runs the task (whatever it might do). Right now the One2OneRestore
interface method is not even used, right? (If we really want, we can use it in the Runner
, but I don't see benefit in that).
Also, if we want to equip 1-1-restore
task with progress, then we would still need to create additional interface in .../restapi/services.go
(as pasted in the comment above)? That's messy.
So IIUC, by defining the interface on the one2onerestore
pkg side, we:
- don't follow the general go convention
- don't follow SM repo patters (other task services [backup/restore/repair] don't expose their own interfaces)
In terms of black-box testing, we should just think about what types, fields, functions, methods needs to be exported, and which shouldn't. We can also choose if we want to have access to the unexported functionalities by choosing the test file package (service
or service_test
). We can even use export_test.go
for accessing some things that other pkgs shouldn't touch, but we still would like to test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The rationale of why I want to keep the interface in front of the service is in the comments above.
Thanks for making the change. Feel free to merge !
// One2OneRestore restores data (tables) from the source cluster backup to the target cluster if they have equal topolgy: | ||
// nodes count, shards count, token ownership should be exactly the same. | ||
One2OneRestore(ctx context.Context, clusterID, taskID, runID uuid.UUID, properties json.RawMessage) error | ||
// Runner creates a Runner that handels 1-1-restore operations. | ||
Runner() Runner |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
topolgy
->topology
handels
->handles
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm leaving approve, but in case you want to keep negotiating for not including interface in the producer pkg, I'm happy to support this cause:)
This adds 1-1 restore cli and stubs.
For now cli just schedules the 1-1 restore task type and when task is executed, service logs "Not yet implemented".
1-1-restore is added as a subcommand to the restore cli, so it looks like below:
Note: I've temporarily used
v3/pkg/managerclient
from theva/fast-restore-part-1
branch as requried changed not in the master yetP.S. Feel free to suggest another name for this cli and service 😄
Please make sure that: