cluster-driver-registrar permissions #44
Comments
Spoke with @msau42 about this. Sounds like the proposal is to keep the CSIDriver object but have driver deployments create it manually instead of running this sidecar. Overall seems reasonable and will reduce complexity. So I support it.
Yeah. That's what I'm thinking.
This also eliminates the need to run a long-running sidecar that uses up memory but isn't actively reconciling anything.
I'm ok with this as long as we document somewhere what to do.
We decided in today's CSI Meeting to go ahead and discontinue development of the cluster-driver-registrar. We need to deprecate and update docs:
/assign @lpabon
One thing I started looking at was a need to be able to query a plugin's capabilities (i.e., from an external controller; perhaps I want to see if the provided storage supports snapshots, cloning, etc.). I think it's a fairly important usage feature. I was chatting with @lpabon and @msau42 about this and where it would live; it sounds like removing this may or may not have some implications.
Being able to inspect the driver in order to build an appropriate csidriver object would be useful. But having a whole agent for that running all the time sounds like too much to me. What about a Go CLI tool that does the inspection and dumps out to stdout a first stab at a csidriver object?
@kfox1111 I think it would be tough to ask customers to have a separate tool. I think the only CLI we can depend on is kubectl.
I think there is a disconnect here... I don't expect customers to build up the kubernetes packaging themselves. They will deploy static manifests or helm charts to load k8s objects that load the drivers into their cluster. As such, they will have an already-built csidriver object as part of the deployment. The separate tool could optionally be used by the person building the k8s packaging for the driver.
@kfox1111 Yeah, I see where you are going. Do you mind attending tomorrow's kube-csi meeting to discuss it with the team?
Ahh, yeah, that doesn't help my use case; say, for example, I have a controller running that periodically takes snapshots: it would be useful for that controller to be able to determine if the PVC assigned to it can even support snapshots. Anyway, seems like a good primer for discussing this in tomorrow's meeting.
I can try. It's been historically hard for me to make it, though.
@j-griffith that's what csidriver is for though, I think? You can query k8s to see what features the csidriver supports?
Spoke too soon. Looks like it doesn't have a flag today to say if it supports snapshots. Maybe it should, though?
Yep, I was getting at proposing something to actually reflect the response of a full GetCapabilities call. I'm not sure how to request a describe on the csidriver object; anyway, that's why I raised the point: gather all the pieces and bits of knowledge and see what we can come up with. It may not have any impact on the registrar removal at all.
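For illustration only, reflecting a GetCapabilities response in the API object might look something like the sketch below. Note that the `capabilities` field shown here does not exist in the real CSIDriver API today, and the driver name is a placeholder:

```yaml
apiVersion: storage.k8s.io/v1beta1
kind: CSIDriver
metadata:
  name: hostpath.csi.k8s.io   # placeholder driver name
spec:
  attachRequired: false
  # HYPOTHETICAL field, not part of the real API: a place to surface
  # the driver's reported capabilities so controllers could query them.
  capabilities:
    - CREATE_DELETE_SNAPSHOT
    - CLONE_VOLUME
```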
@j-griffith to support your use case I think there would need to be two pieces to implement:
I agree with @kfox1111 that we shouldn't need a long-running process to do this; some outside utility or some non-long-running sidecar should be sufficient. There's also the issue that some volume features don't actually have a specific capability (like raw block), and that capability can only be determined by trying to use it and getting back a runtime error.
@j-griffith can you describe your controller in more detail and what it does? It watches PVCs and then calls CreateSnapshot on the csi driver?
This still could be made a flag in CSIDriver and annotated by the packager. It just can't automatically be done.
@msau42 That was an arbitrary example, but I'll expand on the idea a bit; I think this deserves its own issue (not sure which project it would belong to, but I lean towards the Kubernetes API). There are two things that stand out IMO:
All of this is more about usability for a standard cluster user, not a developer or a cluster admin. Don't know if that helps or not; I can be more precise if needed. The problem with "optional" capabilities is that they're "optional", so as a user I would like to have an authoritative source to determine what capabilities are or aren't available to me.
Part 1 is interesting in that csidriver wouldn't necessarily solve it either, as there is a disconnect between driver and storageclass. Users can't necessarily see the storage class details they are using. So part 2 sounds like something like an operator, automating what the person might do in part 1. This is an interesting set of use cases for sure. Definitely bigger than this container, though. This almost feels like it's big enough to need a KEP?
Thanks for the details, this helps me understand what perspectives we're looking at this from. I'd like to expand a little bit on each persona:
Yeah, I think the gist of what I'm thinking here is to be able to use the Kubernetes API to say "hey, what capabilities are available for this StorageClass?" I apologize, I'm not completely familiar with what's being suggested WRT the CSIDriver object; I believe folks are talking about an internal object that, again, isn't exposed via the Kube API. I am inclined to agree that I don't think the registrar is where something like this should live.
CSIDriver is a K8s API object that currently contains some information about an installed driver: https://kubernetes-csi.github.io/docs/csi-driver-object.html
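For concreteness, a minimal example of that object, using the fields documented for storage.k8s.io/v1beta1 (the driver name is just a placeholder):

```yaml
apiVersion: storage.k8s.io/v1beta1
kind: CSIDriver
metadata:
  name: hostpath.csi.k8s.io   # placeholder driver name
spec:
  attachRequired: true    # whether volumes must be attached before mounting
  podInfoOnMount: false   # whether kubelet passes pod info on NodePublishVolume
```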
Are you running in a 1.14 cluster? We're currently recommending that drivers create the CSIDriver object manually as part of their deployment specs, instead of running the cluster-driver-registrar.
Yeah, with my user persona hat on, I can see the problem. Both storageclass and csidriver are kind of scoped to be cluster admin tools. There isn't really an object that is user facing with the required info in it. Usually that would be stuck in documentation somewhere provided by the admin saying what storageclasses existed and what things each supports: the gold storage class provides better HA, the fast storage class provides x IOPS, and so on. But being able to ask the cluster what storageclasses support snapshotting is really important too. The csidriver should provide the needed info, but the user can't see it, plus they don't use it directly anyway. The storageclass is what they use, but they typically can't see that either. The problem, I guess, is you need to see some of the csidriver info through the lens of a user looking at the storage class. Maybe an option would be to extend the storageclass API to have a functionality endpoint? So, in RBAC, something like:

```yaml
- apiGroups:
  - "storage.k8s.io"
  resources:
  - storageclasses/functionality
  verbs:
  - get
```

That would allow the user to ask for the safe bits of the associated csidriver object via the corresponding storageclass. Definitely into needing a KEP though, I think.
I think we may be overengineering this. At its simplest, we use cluster-driver-registrar as it is today: one container creating a CSIDriver object that provides information about the driver to the caller (user/admin). The issue is that this is a running process wasting memory, since once the CSIDriver object is created, it does not need to update it any longer. We have discussed the possibility of changing cluster-driver-registrar from a running loop to a begin-end program so that it can be run in a Job. That way it will:
If this is not desired, we can have another sidecar container (the provisioner?) register the CSIDriver. It is already running, and it has leader election support.
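To sketch what the Job variant could look like: note that the `--run-once` flag below is hypothetical (the current sidecar has no such mode), and the image tag and socket path are placeholders:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: csi-driver-register
spec:
  template:
    spec:
      serviceAccountName: csi-cluster-driver-registrar  # needs create/delete on csidrivers
      restartPolicy: OnFailure
      containers:
        - name: cluster-driver-registrar
          image: quay.io/k8scsi/csi-cluster-driver-registrar:canary  # placeholder tag
          args:
            - "--csi-address=/csi/csi.sock"
            - "--run-once"  # HYPOTHETICAL: exit after the CSIDriver object is created
          volumeMounts:
            - name: socket-dir
              mountPath: /csi
      volumes:
        - name: socket-dir
          hostPath:
            path: /var/lib/kubelet/plugins/hostpath.csi.k8s.io  # placeholder socket dir
            type: Directory
```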
I spoke with @liggitt about the self-registration and permissions aspects from the original concern. The fact that this is a controller-level deployment, and can therefore be run on dedicated/restricted nodes (which can control specifically what workloads run on them), means that we're not concerned about this level of self-reporting compared to something like self-reporting Nodes. @kfox1111's concern about auditing I think still applies: it's not easy to see the spec created before deployment, since it's created on demand. But I think manually crafting the CSIDriver object also makes it more likely that it's going to get out of date (and maybe that's ok). Maybe we can support both models? The sidecar that creates the object on demand, but it also has a "dry run" or "offline" mode that only writes the spec out to a file.
I think we can leave it up to the vendors then to decide which model to use.
It's also up to the operators, which is my primary hat. I'd much rather see a CSIDriver object up front than give a driver access to register it itself. If you let the vendors do too much, it makes it harder on the operators to validate. So no easy answer there.
Also, the risk of getting out of date applies to all of the packaging, not just the csidriver object. For example, what if the driver starts saying "I support snapshots" but the packaging doesn't add the sidecar for snapshot support yet? Having the csidriver packaging decoupled from the driver allows the specific csidriver info to be at the package level, stating what the whole k8s driver (csi driver + k8s packaging) supports.
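To make that coupling concrete: snapshot support lives in the packaging as its own sidecar, so the packager, not the driver, decides whether it is present. A sketch of a controller StatefulSet (names, images, and tags are illustrative):

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: csi-example-controller
spec:
  serviceName: csi-example-controller
  replicas: 1
  selector:
    matchLabels:
      app: csi-example-controller
  template:
    metadata:
      labels:
        app: csi-example-controller
    spec:
      containers:
        - name: csi-driver
          image: example.com/csi-driver:v1.0.0  # vendor's driver image (placeholder)
          volumeMounts:
            - name: socket-dir
              mountPath: /csi
        # Omit this container and the deployment has no snapshot support,
        # regardless of what the driver itself reports.
        - name: csi-snapshotter
          image: quay.io/k8scsi/csi-snapshotter:v1.1.0  # placeholder tag
          args: ["--csi-address=/csi/csi.sock"]
          volumeMounts:
            - name: socket-dir
              mountPath: /csi
      volumes:
        - name: socket-dir
          emptyDir: {}
```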
Note that I'm not opposed to the admin-created object; I just wasn't especially concerned about drivers self-reporting. I do think it needs to be clearer whether the CSIDriver spec is:
Hmm, I wasn't thinking of it that way, but that is interesting. I did not see it as an admin object but a packager object. So maybe there really are 3 sets of different information:
Support for the admin to override it might be useful. I still think the csidriver object should be a packager object, though. If the driver was fully self-describing, it should have enough info that you could run a command-line tool against the csidriver and it would fully generate all the kubernetes packaging automatically: statefulsets/daemonsets/csidriver/rbac/etc. Doing just csidriver but not the rest feels like it could cause issues.
Did we reach any conclusion here? Our documentation says:
It has been two releases without any progress in the refactoring / redesign. The documentation also says that CSIDriver instances should be created "by installation manifest". I am proposing deprecation of this sidecar in #48.
Issues go stale after 90d of inactivity. If this issue is safe to close now, please do so with `/close`. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
All documentation has been updated to mark this as deprecated and e2e tests updated.
@msau42: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
cluster-driver-registrar design feels like it has some issues with respect to permissions and encourages what I think is an antipattern.
The trend of loading controllers into k8s and giving them enough access to self-register CRDs and other things has always felt a little bad. If you're deploying with helm, helm is already dealing with registering objects and has all the permissions needed to register them. The permissions to do anything stop after it's installed, so it can't be used to do further things to the cluster. This seems good. But if you give a long-running process permissions to do such things, then it can be used to do bad things later. Another facet is auditing. If it's in helm, it can be easily parsed by a human and verified to be safe before running. This is very hard to do if it's buried inside code.
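To put that in concrete terms: a self-registering sidecar has to hold something like the ClusterRole sketched below for its entire lifetime, whereas a helm-rendered CSIDriver manifest needs those rights only at install time (the role name and rules are illustrative, not the sidecar's actual shipped RBAC):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: cluster-driver-registrar-role  # illustrative name
rules:
  # Standing, cluster-wide rights the long-running sidecar keeps holding
  # after registration is done; this is the exposure described above.
  - apiGroups: ["storage.k8s.io"]
    resources: ["csidrivers"]
    verbs: ["create", "delete"]
```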