diff --git a/keps/sig-storage/20200212-volume-group.md b/keps/sig-storage/20200212-volume-group.md
new file mode 100644
index 000000000000..5116e447fb7a
--- /dev/null
+++ b/keps/sig-storage/20200212-volume-group.md
@@ -0,0 +1,366 @@
+---
+title: Volume Group
+authors:
+  - "@xing-yang"
+  - "@jingxu97"
+owning-sig: sig-storage
+participating-sigs:
+  - sig-storage
+reviewers:
+  - "@msau42"
+  - "@saad-ali"
+  - "@thockin"
+approvers:
+  - "@msau42"
+  - "@saad-ali"
+  - "@thockin"
+editor: TBD
+creation-date: 2020-02-12
+last-updated: 2020-02-12
+status: provisional
+see-also:
+  - n/a
+replaces:
+  - n/a
+superseded-by:
+  - n/a
+---
+
+# Title
+
+Volume Group
+
+## Table of Contents
+
+
+- [Summary](#summary)
+- [Motivation](#motivation)
+  - [Goals](#goals)
+  - [Non-Goals](#non-goals)
+- [Proposal](#proposal)
+  - [API Definitions](#api-definitions)
+  - [Example Yaml Files](#example-yaml-files)
+    - [Volume Group Snapshot](#volume-group-snapshot)
+    - [Volume Placement](#volume-placement)
+
+
+## Summary
+
+This proposal introduces a VolumeGroup API to manage multiple volumes together and a GroupSnapshot API to take a snapshot of a VolumeGroup.
+
+## Motivation
+
+While there is already a KEP (https://github.com/kubernetes/enhancements/pull/1051) that introduces APIs for application snapshot, backup, and restore, there are other use cases not covered by that KEP.
+
+Use case 1:
+A VolumeGroup allows users to manage multiple volumes belonging to the same application together, which makes it broadly useful. For example, it can be used to group all volumes in the same StatefulSet together.
+
+Use case 2:
+Some storage systems always manage volumes in a group. To implement a create volume function in Kubernetes, these systems have to create a group even for a single volume. A VolumeGroup API would be very convenient for them.
+
+Use case 3:
+Instead of taking individual snapshots one after another, VolumeGroup can be used as a source for taking a snapshot of all the volumes in the same volume group. This may be a storage-level consistent group snapshot if the storage system supports it. In any case, when used together with ExecutionHook, this group snapshot can be application consistent. For this use case, we will introduce another CRD, GroupSnapshot.
+
+Use case 4:
+VolumeGroup can be used to manage group replication or consistency group replication if the storage system supports it. Note that replication is out of scope for this proposal. It is mentioned here as a potential future use case.
+
+Use case 5:
+VolumeGroup can be used to manage volume placement to either spread the volumes across storage pools or stack the volumes on the same storage pool. Related KEPs proposing the concept of a storage pool for volume placement are as follows:
+  https://github.com/kubernetes/enhancements/pull/1353
+  https://github.com/kubernetes/enhancements/pull/1347
+We may not really need a VolumeGroup for this use case. A StoragePool is probably enough. This is to be determined.
+
+Use case 6:
+VolumeGroup can also be used together with application snapshot. It can be a resource managed by the ApplicationSnapshot CRD.
+
+Use case 7:
+Some applications may not want to use the ApplicationSnapshot CRD because they don’t use Kubernetes workload APIs such as StatefulSet, Deployment, etc. Instead, they have developed their own operators. In this case it is more convenient to use VolumeGroup to manage persistent volumes used in those applications.
+
+### Goals
+
+* Provide an API to manage multiple volumes together in a group.
+* Provide an API to take a snapshot of a group of volumes.
+* Provide a design to facilitate volume placement using the group API (To be determined).
+* The group API should be generic and extensible so that it may be used to support other features in the future.
+
+### Non-Goals
+
+* A VolumeGroup may potentially be used to support replication groups in the future, but providing a design for replication groups is not in the scope of this KEP. This can be discussed in the future.
+
+## Proposal
+
+This proposal introduces new CRDs VolumeGroup, VolumeGroupClass, and GroupSnapshot.
+
+Create new group:
+There are two ways to create a new VolumeGroup.
+* Create a group with existing volumes, either with a list of PVCs or using a selector.
+* Create an empty group first, then create each individual volume with group_id which will add a volume to the already created group.
+
+Snapshot:
+A GroupSnapshot can be created with a VolumeGroup as the source.
+
+Restore:
+* In the restore case, a VolumeGroup can be created from a GroupSnapshot source.
+* We could also achieve this by creating a volume from a snapshot for all the snapshots in the GroupSnapshot, and then creating a group with those restored volumes.
+
+For volume placement support, the proposal assumes that storage pools already exist on the storage systems. In VolumeGroupClass, there is an AllowedTopologies field that can be used to specify the accessibility of the group of volumes to storage pools and nodes. However, it won’t have a field to track the capacities of the storage pools.
+
+### API Definitions
+
+API definitions are as follows:
+
+```
+type VolumeGroupClass struct {
+    metav1.TypeMeta
+    // +optional
+    metav1.ObjectMeta
+
+    // Driver is the driver expected to handle this VolumeGroupClass.
+    // This value may not be empty.
+    Driver string
+
+    // Parameters holds parameters for driver.
+    // These values are opaque to the system and are passed directly
+    // to the driver.
+    // +optional
+    Parameters map[string]string
+
+    // This field specifies whether group snapshot is supported.
+    // The default is false.
+    // +optional
+    GroupSnapshot *bool
+
+    // Restrict the topologies where a group of volumes can be located.
+    // Each driver defines its own supported topology specifications.
+    // An empty TopologySelectorTerm list means there is no topology restriction.
+    // This field is passed on to the drivers to handle placement of a group of
+    // volumes on storage pools.
+    // +optional
+    AllowedTopologies []api.TopologySelectorTerm
+}
+
+// VolumeGroup is a user's request for a group of volumes
+type VolumeGroup struct {
+    metav1.TypeMeta
+    // +optional
+    metav1.ObjectMeta
+
+    // Spec defines the volume group requested by a user
+    Spec VolumeGroupSpec
+
+    // Status represents the current information about a volume group
+    // +optional
+    Status *VolumeGroupStatus
+}
+
+// VolumeGroupSpec describes the common attributes of group storage devices
+// and allows a Source for provider-specific attributes
+type VolumeGroupSpec struct {
+    // +optional
+    VolumeGroupClassName *string
+
+    // If Source is nil, an empty volume group will be created.
+    // Otherwise, a volume group will be created with PVCs (if PVCList or Selector is set) or
+    // with a GroupSnapshot as data source
+    // +optional
+    Source *VolumeGroupSource
+}
+
+// VolumeGroupSource contains 3 options. If VolumeGroupSource is not nil,
+// one of the 3 options must be defined.
+type VolumeGroupSource struct {
+    // A list of existing persistent volume claims
+    // +optional
+    PVCList []PersistentVolumeClaim
+
+    // A label query over existing persistent volume claims to be added to the volume group.
+    // +optional
+    Selector *metav1.LabelSelector
+
+    // This field specifies the source of a volume group. (this is for restore)
+    // Supported Kind is GroupSnapshot
+    // +optional
+    GroupDataSource *TypedLocalObjectReference
+}
+
+type VolumeGroupStatus struct {
+    GroupCreationTime *metav1.Time
+
+    // A list of persistent volume claims
+    // +optional
+    PVCList []PersistentVolumeClaim
+
+    Ready *bool
+
+    // Last error encountered during group creation
+    Error *VolumeGroupError
+}
+
+// Describes an error encountered on the group
+type VolumeGroupError struct {
+    // time is the timestamp when the error was encountered.
+    // +optional
+    Time *metav1.Time
+
+    // message details the encountered error
+    // +optional
+    Message *string
+}
+
+// GroupSnapshot is a user's request for taking a group snapshot.
+type GroupSnapshot struct {
+    metav1.TypeMeta `json:",inline"`
+    // Standard object's metadata.
+    // +optional
+    metav1.ObjectMeta `json:"metadata,omitempty" protobuf:"bytes,1,opt,name=metadata"`
+
+    // Spec defines the desired characteristics of a group snapshot requested by a user.
+    Spec GroupSnapshotSpec `json:"spec" protobuf:"bytes,2,opt,name=spec"`
+
+    // Status represents the latest observed state of the group snapshot
+    // +optional
+    Status *GroupSnapshotStatus `json:"status,omitempty" protobuf:"bytes,3,opt,name=status"`
+}
+
+// GroupSnapshotSpec describes the common attributes of a group snapshot
+type GroupSnapshotSpec struct {
+    // Source has the information about where the group snapshot is created from.
+    // Supported Kind is VolumeGroup
+    // +optional
+    Source *TypedLocalObjectReference `json:"source" protobuf:"bytes,1,opt,name=source"`
+}
+
+type GroupSnapshotStatus struct {
+
+    // ReadyToUse becomes true when ReadyToUse on all individual snapshots becomes true
+    // +optional
+    ReadyToUse *bool
+
+    // List of volume snapshots
+    SnapshotList []VolumeSnapshot
+}
+
+type PersistentVolumeClaimSpec struct {
+    ......
+    VolumeGroupNames []string
+    ......
+}
+
+
+type VolumeSnapshotSpec struct{
+    ......
+    GroupSnapshotName *string
+    ......
+}
+```
+
+### Example Yaml Files
+
+#### Volume Group Snapshot
+
+Example yaml files that define a VolumeGroupClass and a VolumeGroup are shown below.
+
+A VolumeGroupClass that supports groupSnapshot:
+```
+apiVersion: volumegroup.storage.k8s.io/v1alpha1
+kind: VolumeGroupClass
+metadata:
+  name: volumeGroupClass1
+spec:
+  parameters:
+    …...
+  groupSnapshot: true
+```
+
+A VolumeGroup that belongs to this VolumeGroupClass:
+```
+apiVersion: volumegroup.storage.k8s.io/v1alpha1
+kind: VolumeGroup
+metadata:
+  name: volumeGroup1
+spec:
+  volumeGroupClassName: volumeGroupClass1
+  selector:
+    matchLabels:
+      app: my-app
+```
+
+A GroupSnapshot taken from the VolumeGroup:
+```
+apiVersion: volumegroup.storage.k8s.io/v1alpha1
+kind: GroupSnapshot
+metadata:
+  name: my-group-snapshot
+spec:
+  source:
+    name: volumeGroup1
+    kind: VolumeGroup
+    apiGroup: volumegroup.storage.k8s.io
+```
+
+A PVC that belongs to the volume group which supports groupSnapshot:
+```
+apiVersion: v1
+kind: PersistentVolumeClaim
+metadata:
+  name: pvc1
+  annotations:
+spec:
+  accessModes:
+  - ReadWriteOnce
+  dataSource: null
+  resources:
+    requests:
+      storage: 1Gi
+  storageClassName: storageClass1
+  volumeMode: Filesystem
+  volumeGroupNames: [volumeGroup1]
+```
+
+#### Volume Placement
+
+A VolumeGroupClass that supports placement:
+```
+apiVersion: volumegroup.storage.k8s.io/v1alpha1
+kind: VolumeGroupClass
+metadata:
+  name: placementGroupClass1
+spec:
+  parameters:
+    …...
+  allowedTopologies: [failure-domain.example.com/placement: storagePool1]
+```
+```
+apiVersion: volumegroup.storage.k8s.io/v1alpha1
+kind: VolumeGroup
+metadata:
+  name: placementGroup1
+spec:
+  volumeGroupClassName: placementGroupClass1
+```
+
+A PVC that belongs to both the volume group with groupSnapshot support and the placement group:
+```
+apiVersion: v1
+kind: PersistentVolumeClaim
+metadata:
+  name: pvc1
+  annotations:
+spec:
+  accessModes:
+  - ReadWriteOnce
+  dataSource: null
+  resources:
+    requests:
+      storage: 1Gi
+  storageClassName: storageClass1
+  volumeMode: Filesystem
+  volumeGroupNames: [volumeGroup1, placementGroup1]
+```
+
+Note: More details on VolumeGroup and VolumeGroupClass related changes in the CSI Spec will be added after we decide how to proceed with the design, i.e., whether we should drop the placement part.
+
+A new external controller will handle VolumeGroupClass and VolumeGroup resources.
+The external provisioner will be modified to read information from volume groups (through volumeGroupNames) and pass it down to the CSI driver.
+
+If both a placement group and a volume group with groupSnapshot support are defined, it is possible for the same volume to join both groups. For example, a volume group with groupSnapshot support may include volume members from two placement groups as they belong to the same application.
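
Reviewer sketch (not part of the patch): the closing paragraphs say the external provisioner would read `volumeGroupNames` from a PVC and that one volume may join both a snapshot group and a placement group. The Go sketch below illustrates one way such a lookup could classify a PVC's groups. Everything in it is an assumption made for illustration: the trimmed-down `VolumeGroupClass` and `VolumeGroup` stand-ins, the `GroupInfo` type, the `resolveGroups` helper, and the choice to represent `AllowedTopologies` as plain strings. The KEP does not define this logic.

```go
package main

import "fmt"

// VolumeGroupClass is a hypothetical, simplified stand-in for the KEP type.
// AllowedTopologies is reduced to strings here; the KEP uses
// []api.TopologySelectorTerm.
type VolumeGroupClass struct {
	Name              string
	GroupSnapshot     bool
	AllowedTopologies []string
}

// VolumeGroup is a hypothetical, simplified stand-in for the KEP type.
type VolumeGroup struct {
	Name      string
	ClassName string
}

// GroupInfo (hypothetical) records which of a PVC's groups support group
// snapshots and which carry placement constraints.
type GroupInfo struct {
	SnapshotGroups  []string
	PlacementGroups []string
}

// resolveGroups (hypothetical helper) looks up every name listed in a PVC's
// volumeGroupNames and classifies it by the capabilities of its class.
// A group can land in both lists if its class sets both fields.
func resolveGroups(names []string, groups map[string]VolumeGroup,
	classes map[string]VolumeGroupClass) (GroupInfo, error) {
	var info GroupInfo
	for _, n := range names {
		g, ok := groups[n]
		if !ok {
			return GroupInfo{}, fmt.Errorf("volume group %q not found", n)
		}
		c, ok := classes[g.ClassName]
		if !ok {
			return GroupInfo{}, fmt.Errorf("volume group class %q not found", g.ClassName)
		}
		if c.GroupSnapshot {
			info.SnapshotGroups = append(info.SnapshotGroups, n)
		}
		if len(c.AllowedTopologies) > 0 {
			info.PlacementGroups = append(info.PlacementGroups, n)
		}
	}
	return info, nil
}

func main() {
	classes := map[string]VolumeGroupClass{
		"volumeGroupClass1":    {Name: "volumeGroupClass1", GroupSnapshot: true},
		"placementGroupClass1": {Name: "placementGroupClass1", AllowedTopologies: []string{"storagePool1"}},
	}
	groups := map[string]VolumeGroup{
		"volumeGroup1":    {Name: "volumeGroup1", ClassName: "volumeGroupClass1"},
		"placementGroup1": {Name: "placementGroup1", ClassName: "placementGroupClass1"},
	}
	// Mirrors the last PVC example: volumeGroupNames: [volumeGroup1, placementGroup1].
	info, err := resolveGroups([]string{"volumeGroup1", "placementGroup1"}, groups, classes)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("snapshot groups:", info.SnapshotGroups)
	fmt.Println("placement groups:", info.PlacementGroups)
}
```

Returning an error for an unknown group or class, rather than skipping it, is a design choice of this sketch; the KEP does not say how the provisioner should treat a PVC that names a group that does not exist.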