HPC with Azure CycleCloud
=========================

Azure CycleCloud is designed to support HPC in the cloud, specifically on Azure. It is tightly integrated with the vendor technologies. The controller is containerized and can run anywhere under Docker. However, the only supported cloud is Azure, so it makes little sense to run the container anywhere other than Azure. Install Azure CLI as documented in `Deployment of Kubernetes Cluster onto Various Clouds `_ to get started.

Cost
----

CycleCloud appears expensive. Idling a single 4-core master node of a Slurm cluster alone costs roughly £1 per hour.

HPC on CycleCloud
-----------------

Once deployed, CycleCloud provides an FQDN (e.g. `cyclecloud.westeurope.azurecontainer.io`, as named in the deployment) that maps to an external IP. As shown on our deployment `https://cyclecloud.westeurope.azurecontainer.io/cloud/cluster_list `_, the following schedulers are supported natively:

1. Slurm
2. PBS
3. HTCondor
4. Grid Engine

The following file systems are supported natively:

1. BeeGFS
2. GlusterFS
3. NFS

Most interesting to us is that both Docker and Singularity are supported natively by CycleCloud.

.. image:: /static/images/Tech-tips/CycleCloudGUI.png

Configuration
-------------

Configure Azure CycleCloud by following the instructions in the GUI. The tricky part is providing the service principal details so that an Azure subscription becomes available as a cloud provider.

CLI
---

The command-line interface is critical for any real workload. Download it from the GUI (e.g. https://..azurecontainer.io/download/tools/cyclecloud-cli.zip). Unzip it and run `install.sh`. `/etc/paths` may need to be updated to include the installation directory (`/Users/davidyuan/bin` in our case) in `PATH`.

Here are two tutorials on customizing Azure CycleCloud:

* `To modify a cluster template `_
* `To deploy customer application `_

Subscription
------------

HPC consumes a significant amount of resources.
It is a good idea to create a separate subscription for each project to enforce the separation of resources and accounting. It also makes scripting a bit easier by allowing some parameters to be hard-coded. `az login` reports a list of subscriptions and which one is the default. The same information can also be found via `az account list`. Create a new one via `the portal `_. It is always a good idea to set the current working subscription as the default::

    az account set --subscription ""

Service principal
-----------------

At least one service principal is needed to allow CycleCloud to access Azure cloud resources in a subscription. It must be created at the subscription scope::

    az ad sp create-for-rbac --scopes="/subscriptions/"

Take note of the JSON response. The information is needed to create the cloud provider account in the CycleCloud GUI. It is quite hard to find it again via `az ad sp list`, and the application secret will be hidden.

Resource group
--------------

Use `az account list-locations` to find a valid location code for a subscription. Note that not all services are available in all locations. Create a resource group to organize resources for CycleCloud::

    az group create --name ${CIName} --location ${Location}

Vnet and subnet
---------------

CycleCloud requires three subnets for production. They are needed to create HPC clusters in the GUI.

* cycle: The subnet in which the CycleCloud server is started
* compute: A /22 subnet for the HPC clusters
* user: The subnet for creating user logins

For non-production, one subnet is enough::

    az network vnet create --name ${CIName} --resource-group ${CIName} --address-prefix 10.0.0.0/16
    az network vnet subnet create --resource-group ${CIName} --vnet-name ${CIName} --name compute --address-prefix 10.0.0.0/22

Container instance
------------------

CycleCloud is packaged as an RPM, a DEB or a container. The container does not support Kubernetes at present.
This means that it cannot run on AKS but can be installed on Azure Container Instances::

    az container create \
      --resource-group ${CIName} \
      --location ${Location} \
      --name ${CIName} \
      --dns-name-label ${CIName} \
      --image mcr.microsoft.com/hpc/azure-cyclecloud \
      --ip-address public \
      --ports 80 443 \
      --cpu 2 \
      --memory 4 \
      -e JAVA_HEAP_SIZE=2048 FQDN="${FQDN}"
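
The commands in this guide use the shell variables `${CIName}`, `${Location}` and `${FQDN}` without defining them. The following is a minimal sketch of how they might be set before running the commands. The values `cyclecloud` and `westeurope` are assumptions taken from the example FQDN in this guide. The sketch also assumes that the public name is formed by joining the DNS name label, the location and `azurecontainer.io` with dots, as in that example::

    # Assumed values taken from the example FQDN in this guide; adjust per project.
    CIName="cyclecloud"      # reused for the resource group, vnet, container and DNS label
    Location="westeurope"    # choose a code reported by: az account list-locations

    # Assumption: the FQDN follows the pattern seen in the example deployment.
    FQDN="${CIName}.${Location}.azurecontainer.io"

    echo "CycleCloud is expected at https://${FQDN}"

Setting the variables once in the same shell session keeps the resource group, vnet and container names consistent across all of the `az` commands.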