Rancher + vSphere Out-of-Tree Provider

This stumped me for a while since I could not find any article that explained it in a TL;DR way. Frankly, I still don't understand it fully, but it works, and for now I'm going to leave it at that. 🙂

As always, if you see something that is not correct or can be improved upon or just want to leave a thanks, please leave a comment below.

Why should we do this?

Historically, cloud providers were built into the Kubernetes code base ("in-tree"). Since a vendor like VMware had to wait for a new Kubernetes release to get new code out there, this was not a very efficient process. On top of that, the code base gets bulkier with all these features shipped in every release. I'd guess most people don't use more than one cloud provider in a cluster, so that means a larger footprint for features that go unused.

That said, there is a functioning "in-tree" provider, but it won't be getting new features, so VMware has made it clear where the future is.

Some prep work

First we need to create a VM Storage Policy in vSphere. This is to abstract the way storage is chosen when someone asks for a persistent volume. Go to Menu -> Policies and Profiles and then click on VM Storage Policies.

Disclaimer: before this I had never even touched Storage Policies in VMware, so you might want to read up a bit yourself before following me blindly into the unknown.

There are likely a few of them there already. I created my own, but if you have one you like you can just skip this step. The wizard was pretty simple: in my case I gave it a name, enabled host-based rules, chose custom encryption and left it at the default, noted that all my local disks were compatible, and then clicked Finish.

All you need here is one Storage Policy to use as the default way for VMware to choose disks for persistent volumes. But if you have different tiers of disk, like slow disks and fast disks, you can work with e.g. tagging and have different Storage Policies named, say, Fast and Slow. Gold, Silver and Bronze also seem popular with the cool kids.

Installing the vSphere provider

First, the installation. I simply followed the guide that Rancher provides, but read the three sections below before you do anything so you're well prepared. The Marketplace installation is really easy. Warning: if you're adding this to an existing cluster you will probably cause disruptions. I did it, but this is my home lab so I did not care much. I did not lose any data, but your experience may vary!

If you have an existing cluster

If you have an existing cluster you need to pre-taint your nodes with node.cloudprovider.kubernetes.io/uninitialized=true:NoSchedule. The reason behind this is that the vSphere controller will double-check that the nodes are OK before they enter the cluster.

for node in $(kubectl get nodes --no-headers | awk '{print $1}'); do
    kubectl taint node "$node" node.cloudprovider.kubernetes.io/uninitialized=true:NoSchedule
done

If you have an existing cluster you also need to change rancher_kubernetes_engine_config.cloud_provider.name to external. This re-configured my cluster but did not destroy my nodes.
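In Rancher's cluster YAML (Edit Cluster -> Edit as YAML), the relevant fragment ends up looking something like this sketch; everything else in the config is omitted here:

```yaml
# Sketch of the relevant part of the Rancher cluster config.
# Setting the cloud provider to "external" tells RKE to leave
# provider integration to the out-of-tree vSphere controller.
rancher_kubernetes_engine_config:
  cloud_provider:
    name: external
```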

Notes regarding the installation

  • Log in with the full user notation, e.g. domain.local\rancher, as opposed to just rancher.
  • In the step where you configure the StorageClass in the CSI installation you refer to the VMware Storage Policy you created earlier. I left the StorageClass name as default and entered my vSphere Storage Policy.
Only later did I find out that you can have spaces in the names of Storage Policies. Does my chosen naming convention annoy me? Yes. Will I change it? No. A good reminder to read more before doing! 😉
  • I left the node configuration blank. My nodes are managed by Rancher and based on Ubuntu + RKE.

VM Machine version

The last thing you might want to check is whether your nodes are running on a machine version of 13 or higher. If not, there will be issues later on when trying to attach persistent volumes to the nodes. You can check this in the VM details under Compatibility.

If it's below 13, I would advise you to schedule an upgrade by right-clicking the VM, choosing Compatibility and scheduling an upgrade. After this is done, cordon off the node in the Rancher cluster manager and reboot it to trigger the upgrade.
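If you prefer the CLI, the check and the upgrade can also be done with govc. This is just a sketch; the inventory path and the target version are assumptions based on my lab, and the exact JSON key casing varies between govc releases, hence the crude grep:

```shell
# Print the VM hardware version, e.g. "vmx-13" (inventory path is hypothetical)
govc vm.info -json /Datacenter/vm/rancher-prod3 | grep -o 'vmx-[0-9]*' | head -n 1

# Upgrade the hardware version; the VM must be powered off first
govc vm.upgrade -version=15 /Datacenter/vm/rancher-prod3
```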

You might also want to change the default machine version by right-clicking the data center in vCenter and clicking Edit Default VM Compatibility. This did not help me when creating new nodes, though, as the template I was using had a compatibility version of 10. To fix this I had to convert my template to a virtual machine, upgrade the machine version, and then convert it back to a template.

You also might want to keep an eye on this GitHub issue for a way to do this via the Node template in Rancher.

Creating and using a Persistent Volume

Woah, this turned into a longer article than I originally planned. Now to the fun stuff: actually putting what we just prepared into practice. From what I've read, the normal way to provision persistent storage in Kubernetes is to first pre-allocate disk space by creating a PersistentVolume, then create a PersistentVolumeClaim to claim part of, or the whole of, that volume.

With the CSI provider, vSphere takes care of the PersistentVolume for you; all you need is to define a PersistentVolumeClaim with the correct storage class name.

Example Persistent Volume Claim

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: grafana-pvc
  namespace: grafana
spec:
  storageClassName: vsphere-csi-sc
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi

Note that the storage class here is the default storage class name chosen for you when installing the CSI driver. Don't confuse a Kubernetes StorageClass with a vSphere Storage Policy!
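If you're unsure what the StorageClass ended up being called in your cluster, you can just list them; the CSI-backed one is recognizable by its provisioner:

```shell
# List storage classes; the vSphere CSI one uses the
# csi.vsphere.vmware.com provisioner
kubectl get storageclass
```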

Using the volume

There are probably thousands of examples out there, but to make this post complete, here's one example of how to run Grafana with the PVC above:

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: grafana
  name: grafana
  namespace: grafana
spec:
  replicas: 1
  selector:
    matchLabels:
      app: grafana
  template:
    metadata:
      labels:
        app: grafana
    spec:
      containers:
      - image: grafana/grafana:7.5.2
        name: grafana
        ports:
        - containerPort: 3000
          name: http
        volumeMounts:
          - name: grafana-storage
            mountPath: /var/lib/grafana
      volumes:
        - name: grafana-storage
          persistentVolumeClaim:
            claimName: grafana-pvc
      securityContext:
        runAsNonRoot: true
        runAsUser: 65534
        fsGroup: 472

Chaos monkey-ish

Find out where the pod is running:
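In my case that was a kubectl one-liner, using the namespace from the deployment above:

```shell
# Show which node the Grafana pod landed on
kubectl -n grafana get pods -o wide
```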

Kill the node using vCenter:

Grafana still lives! Let’s delete the node:
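Deleting the node object can be done from the Rancher UI or with kubectl; the node name here is from my lab:

```shell
# Remove the dead node from the cluster so Rancher replaces it
kubectl delete node rancher-prod3
```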

Rancher immediately starts to spin up a new machine; Grafana lives:

And when rancher-prod3 is shutting down, vCenter attaches the disk to the next available node:

Success!

Some final words

When you create the persistent volume claim above, vSphere creates a disk and attaches it to the node running the pod in question. This happens automatically, and vSphere will move the disk around for you if nodes are restarted. You can see all of this happening in real time in vCenter!

One last thing you might want to note is that newly provisioned nodes will carry the aforementioned taint for a while before the vSphere controller marks them as ready.
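You can watch that taint come and go on a fresh node; the node name below is hypothetical:

```shell
# An empty result means the vSphere controller has initialized the node
kubectl get node rancher-prod4 -o jsonpath='{.spec.taints}'
```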
