Deployment of Kubernetes Cluster onto Various Clouds

Kubernetes is the de facto orchestration engine for containers on all popular cloud platforms, so it is necessary to be able to deploy Kubernetes clusters on those clouds. Rancher provides a portal similar to ECP. Regardless of which user interface is used in the future, here is the list of clouds to be supported:

  1. OpenStack
  2. Google Cloud Platform
  3. Microsoft Azure
  4. Docker Enterprise Edition
  5. Amazon Web Services

There are two major aspects of running Kubernetes: compute and storage. Storage almost always needs to be set up explicitly for containers running in Kubernetes.

Cloud Provider | Cluster                           | Storage               | Workload         | Persistent Volume Claim (PVC)
AWS (EKS)      | kops commands                     | Helm chart            | Rfam StatefulSet | nfs-server-provisioner
DEE (VMWare)   | Built-in on VMs                   | Built-in NFS servers  | Rfam StatefulSet | Built-in standard-01, standard-02
GCP (GKE)      | Terraform module / gcloud command | NFS storage via GUI   | Rfam StatefulSet | nfs-client-provisioner
MSA (AKS)      | Terraform script                  | Helm chart            | Rfam StatefulSet | nfs-server-provisioner
OSK (VMs)      | cpa-kubernetes                    | Helm chart            | Rfam StatefulSet | nfs-client-provisioner

Different methods are needed to set up the cluster and storage on each cloud, but the workload and PVC can be quite similar afterwards. Note that only NFS has been tried at present. Additional storage methods such as an S3 object store and a NoSQL database need to be investigated.

Note that nfs-server-provisioner is not production ready: it uses emptyDir as its PV, so the Helm chart needs to be modified to use a real PV. However, neither AWS nor MSA provides a managed NFS service, which is why it is used on those two clouds.
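
The chart's persistence values can usually be overridden so that the NFS server itself is backed by a real volume rather than emptyDir. The following is only a sketch, assuming the stable/nfs-server-provisioner chart and Helm 2 syntax; the release name, size and storage class name are placeholders rather than values from the sample project.

# Back the NFS server with a dynamically provisioned PV instead of emptyDir (sketch).
helm install stable/nfs-server-provisioner --name nfs-provisioner \
  --set persistence.enabled=true \
  --set persistence.size=50Gi \
  --set storageClass.name=nfs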

Here is the project with the sample code for all five clouds. It contains scripts to create the clusters and to deploy a StatefulSet or a Deployment with NFS storage. A real pipeline, Rfam, is used in the sample. Download the sample via git clone onto a laptop, Azure Cloud Shell or Google Cloud Shell.
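
Across the five clouds the workload part follows the same pattern: a shared PVC bound to the NFS-backed storage class, plus a StatefulSet whose volumeClaimTemplates request private storage per pod. The sketch below shows that pattern only; the storage class name nfs, the sizes, the labels and the busybox image are placeholders, not the actual Rfam manifests from the sample project.

# Shared PVC bound to the NFS-backed storage class (name is a placeholder).
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-data
spec:
  accessModes: [ "ReadWriteMany" ]
  storageClassName: nfs
  resources:
    requests:
      storage: 10Gi
EOF

# StatefulSet skeleton with shared storage plus per-pod private storage via
# volumeClaimTemplates. A headless Service named demo is assumed to exist.
cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: demo
spec:
  serviceName: demo
  replicas: 2
  selector:
    matchLabels:
      app: demo
  template:
    metadata:
      labels:
        app: demo
    spec:
      containers:
      - name: demo
        image: busybox
        command: [ "sleep", "3600" ]
        volumeMounts:
        - name: shared
          mountPath: /shared
        - name: private
          mountPath: /private
      volumes:
      - name: shared
        persistentVolumeClaim:
          claimName: shared-data
  volumeClaimTemplates:
  - metadata:
      name: private
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: nfs
      resources:
        requests:
          storage: 1Gi
EOF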

Command Line Interfaces

Both GCP and MSA provide Cloud Shells, and both provide options to upload and download files. It is therefore optional to install the CLIs for these two clouds locally. However, a local CLI is still convenient from time to time, especially when the Cloud Shells fail to initialize, time out too quickly, or keep deleting handy tools such as git or helm.

CLI for AWS

AWS does not have a CLI integrated into the browser as Google and Microsoft do. It is necessary to install it on a local machine:

pip3 install awscli --upgrade --user

Make sure that the install path /Users/davidyuan/Library/Python/3.7/bin is added to /etc/paths. Run the same command to upgrade the client.

Configure the CLI with the aws configure command. Generate and download an access key from the IAM service in the AWS console.
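
For reference, a first-time configuration and a quick sanity check might look like the following (the region is just an example):

aws configure                  # prompts for the access key ID, secret access key, default region (e.g. eu-west-1) and output format
aws sts get-caller-identity    # verifies that the credentials are picked up correctly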

More information can be found here.

Configure kubectl for EKS

Download aws-iam-authenticator and make it executable under /usr/local/bin:

curl -o /usr/local/bin/aws-iam-authenticator https://amazon-eks.s3-us-west-2.amazonaws.com/1.11.5/2018-12-06/bin/darwin/amd64/aws-iam-authenticator
chmod +x /usr/local/bin/aws-iam-authenticator
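
To confirm that the authenticator is usable before wiring it into kubeconfig, something like the following can be run (the cluster name is a placeholder):

aws-iam-authenticator help                     # confirms the binary is on the PATH and executable
aws-iam-authenticator token -i <cluster-name>  # prints a token for the named EKS cluster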

CLI for GCP

If it is necessary to interact with GCP outside Google Cloud Shell, the Google Cloud SDK can be installed on Mac OS directly:

brew cask install google-cloud-sdk
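
After the SDK is installed, it still needs to be initialised and pointed at a project, for example:

gcloud init                              # logs in and selects a default project and zone
gcloud config set project <project-id>   # switches project later if needed (placeholder ID)
gcloud components install kubectl        # optional: install kubectl alongside the SDK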

CLI for MSA

Installing and configuring the Azure CLI is extremely simple:

brew update && brew install azure-cli
az login

To start using kubectl, merge the credentials into the configuration contexts in ~/.kube/config and set the cluster as the current context:

az aks get-credentials --resource-group azure-k8scluster --name k8scluster
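
A quick check that the context has been switched correctly:

kubectl config current-context   # should print k8scluster
kubectl get nodes                # confirms that the AKS cluster is reachable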

CLI for OSK

Instructions for installing the OpenStack CLI on Mac OS X are detailed in a separate document.

Amazon Web Services

The AWS GUI for EKS does not seem to work, and neither does the Terraform provider. kops is able to provide a production-grade Kubernetes cluster; however, that cluster is not integrated into EKS.

  1. Log onto AWS.
  2. Follow a few simple steps to install kops and aws.
  3. Register a domain 4ebi.uk, and create an S3 bucket for kops state if not done already.
  4. Build cluster configuration and deploy the new cluster onto AWS.
  5. Deploy Helm chart nfs-server-provisioner to define a storage class.
  6. Deploy PVC for shared storage.
  7. Deploy Rfam StatefulSet for the workload and private storage.

Note: run aws configure if needed. kops creates a master node and resets the current-context on the client machine; thus, it can end up creating two masters.
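
The following is a minimal sketch of steps 3 to 5, assuming a Route 53 hosted zone exists for 4ebi.uk and Helm 2 is in use; the bucket name, cluster name, zone and node count are placeholders, not the exact values from the sample project.

# S3 bucket for the kops state (bucket name is a placeholder).
aws s3api create-bucket --bucket <kops-state-bucket> --region eu-west-1 \
  --create-bucket-configuration LocationConstraint=eu-west-1
export KOPS_STATE_STORE=s3://<kops-state-bucket>

# Build the cluster configuration, deploy it, then wait for validation to pass.
kops create cluster --name k8s.4ebi.uk --zones eu-west-1a --node-count 2
kops update cluster k8s.4ebi.uk --yes
kops validate cluster

# Storage class backed by nfs-server-provisioner.
helm install stable/nfs-server-provisioner --name nfs-provisioner --set storageClass.name=nfs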

Docker Enterprise Edition

This is the simplest environment to work with: the cluster, storage and storage classes are all built in. The tricky part is setting up the CLI. Due to the way RBAC is set up in this environment, a lot of Helm charts may not work. However, this is not a very efficient environment, as the cluster is built on top of VMware VMs.

  1. Log onto HX-UCP or HH-UCP.
  2. Install the UCP client bundle and the Kubernetes CLI.
  3. Deploy PVC for shared storage.
  4. Deploy Rfam StatefulSet for the workload and private storage.

Note: cd into hx-ucp-bundle-xxx or hh-ucp-bundle-xxx and run source env.sh to reset the current-context for Kubernetes if needed.
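
Since the storage classes are built in, step 3 reduces to a PVC that references one of them. A minimal sketch, assuming standard-01 accepts ReadWriteMany claims (the claim name and size are placeholders):

# Shared PVC against the built-in storage class.
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-data
spec:
  accessModes: [ "ReadWriteMany" ]
  storageClassName: standard-01
  resources:
    requests:
      storage: 10Gi
EOF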

Google Cloud Platform

Google provides three ways to create a cluster: GUI, CLI and Terraform module. All of them can produce the same cluster consistently. The GCP GUI also has a wizard to provision an NFS server; there seems to be no CLI for this.

  1. Log onto GCP.
  2. Create a GKE cluster via GUI, gcloud command or Terraform module.
  3. Create NFS storage via GUI. Take note of the IP address and export path.
  4. Deploy Helm chart nfs-client-provisioner to define a storage class.
  5. Deploy PVC for shared storage.
  6. Deploy Rfam StatefulSet for the workload and private storage.

Note: run all commands from the cloud shell.

NFS Filestore

GCP provides the NFS Filestore, which can be created and managed at https://console.cloud.google.com/filestore. Take note of the IP address and export path for the NFS mount or PVC.
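
A minimal sketch of steps 2 and 4, assuming a zonal cluster created with gcloud and the stable/nfs-client-provisioner chart with Helm 2 syntax; the cluster name, zone, Filestore IP and export path are placeholders to be replaced with the values noted above.

# Create the GKE cluster and fetch kubectl credentials (names and zone are placeholders).
gcloud container clusters create k8scluster --zone europe-west2-a --num-nodes 2
gcloud container clusters get-credentials k8scluster --zone europe-west2-a

# Storage class backed by the Filestore export noted above.
helm install stable/nfs-client-provisioner --name nfs-client \
  --set nfs.server=<filestore-ip> \
  --set nfs.path=</export/path> \
  --set storageClass.name=nfs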

Microsoft Azure

Azure provides three ways to create a cluster: GUI, CLI and Terraform module, with Terraform covered in tutorials. Cloud Shell supports both PowerShell and Bash by default. In addition, Terraform is installed in Cloud Shell by default, too.

  1. Log onto MSA.
  2. Create an AKS cluster via GUI, az command or Terraform module (see the sketch after this list).
  3. Deploy Helm chart nfs-server-provisioner to define a storage class.
  4. Deploy PVC for shared storage.
  5. Deploy Rfam StatefulSet for the workload and private storage.
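
The following is a minimal sketch of steps 2 and 3 using the az CLI rather than Terraform; the resource group and cluster names match the get-credentials example above, and the node count and chart values are assumptions.

# Create the AKS cluster (a service principal for RBAC is required, see the note below).
az group create --name azure-k8scluster --location westeurope
az aks create --resource-group azure-k8scluster --name k8scluster --node-count 2 --generate-ssh-keys
az aks get-credentials --resource-group azure-k8scluster --name k8scluster

# Storage class backed by nfs-server-provisioner, since Azure provides no managed NFS service here.
helm install stable/nfs-server-provisioner --name nfs-provisioner --set storageClass.name=nfs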

Note: Initialize Terraform with the instructions. Make sure the storage container is created and Terraform is initialized to use it for state. Also make sure that the service principal for RBAC is created.
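
The prerequisites in the note can be sketched with az commands as follows; the storage account, container and service principal names are placeholders, and the backend-config keys assume an azurerm backend block is already declared in the Terraform scripts.

# Storage account and container for the Terraform state (names are placeholders).
az storage account create --name tfstate0001 --resource-group azure-k8scluster --sku Standard_LRS
az storage container create --name tfstate --account-name tfstate0001

# Service principal for AKS RBAC.
az ad sp create-for-rbac --name k8scluster-sp --skip-assignment

# Initialise Terraform against the state container.
terraform init \
  -backend-config="storage_account_name=tfstate0001" \
  -backend-config="container_name=tfstate" \
  -backend-config="key=aks.tfstate"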

Tip: clouddrive unmount may be needed if the subscription with the storage for clouddrive is disabled or deleted.

OpenStack

The Embassy cloud is based on the Red Hat distribution of OpenStack and does not support Magnum. Therefore, Kubernetes must run on VMs on Embassy. The home-grown cpa-kubernetes is used.

  1. Log onto ECP.
  2. Import https://github.com/EMBL-EBI-TSI/cpa-kubernetes into ECP.
  3. Deploy the two CPAs. Take note of the IP address and export path. Note that if an NFS server is available, cpa-nfs-server-and-vol does not need to be deployed.
  4. Deploy Helm chart nfs-client-provisioner to define a storage class.
  5. Deploy PVC for shared storage.
  6. Deploy Rfam StatefulSet for the workload and private storage.

Note: cpa-kubernetes creates a single master. Run all commands from the master.
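
Once the CPAs and the storage class are in place, a quick sanity check from the master confirms that storage is ready before the Rfam StatefulSet is deployed (assuming the storage class was named nfs when the provisioner was installed):

kubectl get nodes          # the master and worker VMs created by cpa-kubernetes should be Ready
kubectl get storageclass   # the class created by nfs-client-provisioner should be listed
kubectl get pvc            # the shared PVC should report STATUS Bound once provisioning works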