---
title: Volume Group
authors:
- "@xing-yang"
- "@jingxu97"
owning-sig: sig-storage
participating-sigs:
- sig-storage
reviewers:
- "@msau42"
- "@saad-ali"
- "@thockin"
approvers:
- "@msau42"
- "@saad-ali"
- "@thockin"
editor: TBD
creation-date: 2020-02-12
last-updated: 2020-02-12
status: provisional
see-also:
- n/a
replaces:
- n/a
superseded-by:
- n/a
---

# Volume Group

## Table of Contents

<!-- toc -->
- [Summary](#summary)
- [Motivation](#motivation)
- [Goals](#goals)
- [Non-Goals](#non-goals)
- [Proposal](#proposal)
- [API Definitions](#api-definitions)
- [Example Yaml Files](#example-yaml-files)
- [Volume Group Snapshot](#volume-group-snapshot)
- [Volume Placement](#volume-placement)
<!-- /toc -->

## Summary

This proposal introduces a VolumeGroup API to manage multiple volumes together and a GroupSnapshot API to take a snapshot of a VolumeGroup.

## Motivation

While there is already a KEP (https://github.com/kubernetes/enhancements/pull/1051) that introduces APIs to do application snapshot, backup, and restore, there are other use cases not covered by that KEP.

Use case 1:
A VolumeGroup allows users to manage multiple volumes belonging to the same application together, which is broadly useful. For example, it can be used to group all volumes in the same StatefulSet together.

Use case 2:
Some storage systems always manage volumes in a group. To implement a create-volume function in Kubernetes, these systems would have to create a group for every single volume. Providing a VolumeGroup API will be much more natural for them.

Use case 3:
Instead of taking individual snapshots one after another, a VolumeGroup can be used as the source for taking a snapshot of all the volumes in the same volume group. This may be a storage-level consistent group snapshot if the storage system supports it. In any case, when used together with ExecutionHook, the group snapshot can be application consistent. For this use case, we will introduce another CRD, GroupSnapshot.

Use case 4:
VolumeGroup can be used to manage group replication or consistency group replication if the storage system supports it. Note replication is out of scope for this proposal. It is mentioned here as a potential future use case.

Use case 5:
VolumeGroup can be used to manage volume placement, either to spread the volumes across storage pools or to stack the volumes on the same storage pool. Related KEPs proposing the concept of a storage pool for volume placement are:
* https://github.com/kubernetes/enhancements/pull/1353
* https://github.com/kubernetes/enhancements/pull/1347

We may not really need a VolumeGroup for this use case; a StoragePool is probably enough. This is to be determined.

Use case 6:
VolumeGroup can also be used together with application snapshot. It can be a resource managed by the ApplicationSnapshot CRD.

Use case 7:
Some applications may not want to use ApplicationSnapshot CRD because they don’t use Kubernetes workload APIs such as StatefulSet, Deployment, etc. Instead, they have developed their own operators. In this case it is more convenient to use VolumeGroup to manage persistent volumes used in those applications.

### Goals

* Provide an API to manage multiple volumes together in a group.
* Provide an API to take a snapshot of a group of volumes.
* Provide a design to facilitate volume placement using the group API (To be determined).
* The group API should be generic and extensible so that it may be used to support other features in the future.

### Non-Goals

* A VolumeGroup may potentially be used to support replication groups in the future, but the design of replication groups is out of scope for this KEP and can be discussed in the future.

## Proposal

This proposal introduces new CRDs VolumeGroup, VolumeGroupClass, and GroupSnapshot.

Create a new group:
There are two ways to create a new VolumeGroup.
* Create a group with existing volumes, either with a list of PVCs or using a selector.
* Create an empty group first, then create each individual volume with a group_id, which adds the volume to the already created group (see the sketch below).
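
The following is a minimal sketch of the second flow. It assumes the apiVersion and the PVC field `volumeGroupNames` used in the examples later in this KEP; the exact field names are still under discussion, and the object names `emptyGroup1` and `pvc-in-group` are placeholders.
```
# An empty VolumeGroup: no source, so no volumes are added at creation time.
apiVersion: volumegroup.storage.k8s.io/v1alpha1
kind: VolumeGroup
metadata:
  name: emptyGroup1
spec:
  volumeGroupClassName: volumeGroupClass1
---
# A new PVC that joins the already created group when it is provisioned.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-in-group
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: storageClass1
  volumeGroupNames: [emptyGroup1]
```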

Snapshot:
A GroupSnapshot can be created with a VolumeGroup as the source.

Restore:
* In the restore case, a VolumeGroup can be created from a GroupSnapshot source (see the example below).
* We could also achieve this by creating a volume from each snapshot in the GroupSnapshot, and then creating a group with those restored volumes.
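
As an illustration, a hypothetical restore manifest. It assumes that the GroupDataSource field defined in the API Definitions section below surfaces in yaml as `spec.source`; the object name `restoredGroup1` is a placeholder.
```
# Restore: create a new VolumeGroup with a GroupSnapshot as its data source.
apiVersion: volumegroup.storage.k8s.io/v1alpha1
kind: VolumeGroup
metadata:
  name: restoredGroup1
spec:
  volumeGroupClassName: volumeGroupClass1
  source:
    apiGroup: volumegroup.storage.k8s.io
    kind: GroupSnapshot
    name: my-group-snapshot
```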

Volume placement support assumes that storage pools already exist on the storage systems. VolumeGroupClass has an AllowedTopologies field that can be used to specify the accessibility of the group of volumes to storage pools and nodes. However, it will not have a field to track the capacities of the storage pools.

### API Definitions

API definitions are as follows:

```
type VolumeGroupClass struct {
    metav1.TypeMeta
    // +optional
    metav1.ObjectMeta

    // Driver is the driver expected to handle this VolumeGroupClass.
    // This value may not be empty.
    Driver string

    // Parameters holds parameters for the driver.
    // These values are opaque to the system and are passed directly
    // to the driver.
    // +optional
    Parameters map[string]string

    // This field specifies whether group snapshot is supported.
    // The default is false.
    // +optional
    GroupSnapshot *bool

    // Restrict the topologies where a group of volumes can be located.
    // Each driver defines its own supported topology specifications.
    // An empty TopologySelectorTerm list means there is no topology restriction.
    // This field is passed on to the drivers to handle placement of a group of
    // volumes on storage pools.
    // +optional
    AllowedTopologies []api.TopologySelectorTerm
}

// VolumeGroup is a user's request for a group of volumes.
type VolumeGroup struct {
    metav1.TypeMeta
    // +optional
    metav1.ObjectMeta

    // Spec defines the volume group requested by a user.
    Spec VolumeGroupSpec

    // Status represents the current information about a volume group.
    // +optional
    Status *VolumeGroupStatus
}

// VolumeGroupSpec describes the common attributes of group storage devices
// and allows a Source for provider-specific attributes.
type VolumeGroupSpec struct {
    // +optional
    VolumeGroupClassName *string

    // If Source is nil, an empty volume group will be created.
    // Otherwise, a volume group will be created with PVCs (if PVCList or
    // Selector is set) or with a GroupSnapshot as the data source.
    // +optional
    Source *VolumeGroupSource
}

// VolumeGroupSource contains three options. If VolumeGroupSource is not nil,
// one of the three options must be defined.
type VolumeGroupSource struct {
    // A list of existing persistent volume claims.
    // +optional
    PVCList []PersistentVolumeClaim

    // A label query over existing persistent volume claims to be added to
    // the volume group.
    // +optional
    Selector *metav1.LabelSelector

    // This field specifies the source of a volume group (this is for restore).
    // Supported Kind is GroupSnapshot.
    // +optional
    GroupDataSource *TypedLocalObjectReference
}

type VolumeGroupStatus struct {
    GroupCreationTime *metav1.Time

    // A list of persistent volume claims that belong to the group.
    // +optional
    PVCList []PersistentVolumeClaim

    Ready *bool

    // Last error encountered during group creation.
    Error *VolumeGroupError
}

// VolumeGroupError describes an error encountered on the group.
type VolumeGroupError struct {
    // Time is the timestamp when the error was encountered.
    // +optional
    Time *metav1.Time

    // Message details the encountered error.
    // +optional
    Message *string
}

// GroupSnapshot is a user's request for taking a group snapshot.
type GroupSnapshot struct {
    metav1.TypeMeta `json:",inline"`
    // Standard object's metadata.
    // +optional
    metav1.ObjectMeta `json:"metadata,omitempty" protobuf:"bytes,1,opt,name=metadata"`

    // Spec defines the desired characteristics of a group snapshot requested by a user.
    Spec GroupSnapshotSpec `json:"spec" protobuf:"bytes,2,opt,name=spec"`

    // Status represents the latest observed state of the group snapshot.
    // +optional
    Status *GroupSnapshotStatus `json:"status,omitempty" protobuf:"bytes,3,opt,name=status"`
}

// GroupSnapshotSpec describes the common attributes of a group snapshot.
type GroupSnapshotSpec struct {
    // Source has the information about where the group snapshot is created from.
    // Supported Kind is VolumeGroup.
    // +optional
    Source *TypedLocalObjectReference `json:"source" protobuf:"bytes,1,opt,name=source"`
}

type GroupSnapshotStatus struct {
    // ReadyToUse becomes true when ReadyToUse on all individual snapshots becomes true.
    // +optional
    ReadyToUse *bool

    // List of volume snapshots that belong to this group snapshot.
    SnapshotList []VolumeSnapshot
}

// PersistentVolumeClaimSpec gains a new field that references the volume group
// the claim belongs to.
type PersistentVolumeClaimSpec struct {
    ......
    VolumeGroupId *string
    ......
}

// VolumeSnapshotSpec gains a new field that references the group snapshot
// the snapshot belongs to.
type VolumeSnapshotSpec struct {
    ......
    GroupSnapshotId *string
    ......
}
```

### Example Yaml Files

#### Volume Group Snapshot

Example yaml files that define a VolumeGroupClass and a VolumeGroup are shown below.

A VolumeGroupClass that supports groupSnapshot:
```
apiVersion: volumegroup.storage.k8s.io/v1alpha1
kind: VolumeGroupClass
metadata:
  name: volumeGroupClass1
spec:
  parameters:
    …...
  groupSnapshot: true
```

A VolumeGroup that belongs to this VolumeGroupClass:
```
apiVersion: volumegroup.storage.k8s.io/v1alpha1
kind: VolumeGroup
metadata:
  name: volumeGroup1
spec:
  volumeGroupClassName: volumeGroupClass1
  selector:
    matchLabels:
      app: my-app
```

A GroupSnapshot taken from the VolumeGroup:
```
apiVersion: volumegroup.storage.k8s.io/v1alpha1
kind: GroupSnapshot
metadata:
  name: my-group-snapshot
spec:
  source:
    name: volumeGroup1
    kind: VolumeGroup
    apiGroup: volumegroup.storage.k8s.io
```

A PVC that belongs to the volume group which supports groupSnapshot:
```
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc1
  annotations:
spec:
  accessModes:
    - ReadWriteOnce
  dataSource: null
  resources:
    requests:
      storage: 1Gi
  storageClassName: storageClass1
  volumeMode: Filesystem
  volumeGroupNames: [volumeGroup1]
```

#### Volume Placement

A VolumeGroupClass that supports placement:
```
apiVersion: volumegroup.storage.k8s.io/v1alpha1
kind: VolumeGroupClass
metadata:
  name: placementGroupClass1
spec:
  parameters:
    …...
  allowedTopologies: [failure-domain.example.com/placement: storagePool1]
```

A VolumeGroup that belongs to this VolumeGroupClass:
```
apiVersion: volumegroup.storage.k8s.io/v1alpha1
kind: VolumeGroup
metadata:
  name: placementGroup1
spec:
  volumeGroupClassName: placementGroupClass1
```

A PVC that belongs to both the volume group with groupSnapshot support and the placement group:
```
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc1
  annotations:
spec:
  accessModes:
    - ReadWriteOnce
  dataSource: null
  resources:
    requests:
      storage: 1Gi
  storageClassName: storageClass1
  volumeMode: Filesystem
  volumeGroupNames: [volumeGroup1, placementGroup1]
```

Note: More details on VolumeGroup and VolumeGroupClass related changes in the CSI spec will be added after we decide how to proceed with the design, i.e., whether to drop the placement part.

A new external controller will handle VolumeGroupClass and VolumeGroup resources.
The external provisioner will be modified to read information from volume groups (through volumeGroupNames) and pass it down to the CSI driver.

If both a placement group and a volume group with groupSnapshot support are defined, the same volume can join both groups. For example, a volume group with groupSnapshot support may include volume members from two placement groups because they belong to the same application.
