Design: Storage in Cluster-API architecture

Create Volume

Dynamic provisioning in Guest Cluster

  1. The user runs kubectl apply -f pvc.yaml, and the request is sent to kube-apiserver.

  2. kube-controller creates the PVC, but its status remains Pending.

  3. gcCSI watches for the creation of the PVC, then sends a request to the kube-apiserver in the Management Cluster to create the volume.

  4. kube-controller in the Management Cluster creates the PVC.

  5. mgmtCSI watches for the creation of the PVC and in turn calls the Storage Infra API to create the volume, polling the status until provisioning completes.

    5.1 When the volume is provisioned, mgmtCSI creates the PV in the Management Cluster. The PV creation is actually done by the external-provisioner, which runs as a sidecar container within the controller plugin of mgmtCSI.

  6. kube-controller listens for events from the PV. Once the PV's status is Bound, it binds the PVC to the PV.

  7. gcCSI waits for the PVC creation in the Management Cluster to complete and creates the PV in the Guest Cluster accordingly.

  8. kube-controller in the Guest Cluster eventually binds the PVC to the PV.
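The pvc.yaml applied in step 1 might look like the following sketch. The PVC name and the storageClassName are placeholders for illustration, not names defined by this design:

```yaml
# Hypothetical PVC a Guest Cluster user applies in step 1.
# The storageClassName is a placeholder for whatever class the
# Guest Cluster exposes for gcCSI-provisioned volumes.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  storageClassName: gc-storage-class
```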

Static provisioning in Guest Cluster

Static provisioning in a Guest Cluster is a little bit tricky. If we only have one K8S cluster, the user needs to create the volume manually first, and then create a PV that binds to the manually created volume (refer to Volume Provisioning for more details).

However, things are different in the Cluster API picture. First, the volume has to be provisioned (no matter whether via dynamic or static provisioning) in the Management Cluster, so it can be used by one or multiple Guest Clusters. "Volume provisioned" here means that the PV and PVC are ready in the Management Cluster. Second, the user needs to create a PV in the Guest Cluster that refers to the PVC in the Management Cluster.

The differences between static provisioning in a Guest Cluster and static provisioning in a single cluster are:

  • A single cluster has just one layer: the PV refers to the volume the user manually created. A Guest Cluster has multiple layers, and its PV refers to the PVC in the Management Cluster instead of the volume.

  • TBA
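A statically provisioned PV in the Guest Cluster might look like the following sketch. The CSI driver name and the convention of putting the Management Cluster PVC name in volumeHandle are assumptions for illustration, not part of this design:

```yaml
# Hypothetical statically provisioned PV in the Guest Cluster.
# Instead of pointing at a raw volume, its volumeHandle refers to
# the PVC in the Management Cluster.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: static-pv
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  csi:
    driver: gc.csi.example.com    # assumed gcCSI driver name
    volumeHandle: mgmt-pvc-name   # name of the PVC in the Management Cluster
```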

Open questions

How to deal with ReclaimPolicy

You probably noticed that we might have two ReclaimPolicy values: one in the Management Cluster and one in the Guest Cluster.

  • In the dynamic provisioning case, they should be the same in the StorageClass spec.

  • In the static provisioning case, the ReclaimPolicy in the PV could differ between the Management Cluster and the Guest Cluster. If the Guest Cluster user wants the ReclaimPolicy to be Delete, the ReclaimPolicy in the Management Cluster needs to be Delete as well, because the user wants the volume to be deleted when the PV gets deleted. If the Guest Cluster user wants the ReclaimPolicy to be Retain, the ReclaimPolicy in the Management Cluster could be Delete (it does not necessarily need to be Delete; things could differ based on business logic), because the user wants the volume to be retained when the PV gets deleted, so gcCSI should not invoke any API calls to mgmtCSI to remove the volume.
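For the dynamic provisioning case, the matching-policy requirement can be sketched as a pair of StorageClass fragments. The class names and the provisioner names are placeholders:

```yaml
# Sketch: the StorageClass in the Guest Cluster and the corresponding
# one in the Management Cluster declare the same reclaimPolicy.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gc-storage-class          # class seen by the Guest Cluster user
provisioner: gc.csi.example.com   # assumed gcCSI driver name
reclaimPolicy: Delete             # must match the Management Cluster class
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: mgmt-storage-class          # backing class in the Management Cluster
provisioner: mgmt.csi.example.com   # assumed mgmtCSI driver name
reclaimPolicy: Delete               # same policy as the Guest Cluster class
```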

Attach Volume

When a volume gets created by mgmtCSI in Create Volume step 5 above, it needs to be attached to the node where the Pod is running.

  • gcCSI knows where the Pod is scheduled and updates VirtualMachine.Spec.Volumes.

  • VM-Operator, running in the Management Cluster, watches VirtualMachine.Spec.Volumes. If new volumes are added, VM-Operator creates VolumeAttachment instances accordingly with the NodeUUID and VolumeName, the two pieces of information mgmtCSI needs to attach volumes to a node.

  • Once the attach operation completes (no matter whether it succeeded or failed), mgmtCSI updates VolumeAttachmentStatus.

  • VM-Operator watches changes to VolumeAttachmentStatus and updates VirtualMachine.Status.Volumes accordingly.

  • gcCSI watches changes to VirtualMachine.Status.Volumes and updates the PVC accordingly.
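As a sketch of the first two steps above, the VirtualMachine update and the resulting VolumeAttachment might look like the following. The VM-Operator API group/version, the driver names, and the object names are assumptions; the actual CRD schema may differ:

```yaml
# Illustrative only: the VM-Operator CRD schema is assumed here.
apiVersion: vmoperator.example.com/v1alpha1   # assumed group/version
kind: VirtualMachine
metadata:
  name: worker-node-1
spec:
  volumes:
    - name: example-pvc          # volume added by gcCSI in the first step
      persistentVolumeClaim:
        claimName: example-pvc   # PVC in the Management Cluster
---
# VolumeAttachment that VM-Operator would create in response,
# carrying the two fields mgmtCSI needs: the node and the volume.
apiVersion: storage.k8s.io/v1
kind: VolumeAttachment
metadata:
  name: attach-example-pvc
spec:
  attacher: mgmt.csi.example.com   # assumed mgmtCSI driver name
  nodeName: <NodeUUID>
  source:
    persistentVolumeName: <VolumeName>
```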
