HPC with Slurm on GCP
=====================

A Slurm cluster can be created easily on GCP by following the instructions in the git repository `Slurm on Google Cloud Platform <https://github.com/SchedMD/slurm-gcp#stand-alone-cluster-in-google-cloud-platform>`_. The Terraform script is in beta. The Deployment Manager script is of production quality, with an excellent security design.

Accessing with CLI
------------------

The newly created cluster has a dedicated login node. In the most secure configuration, no public IPs are assigned to any nodes, and the firewall only allows ICMP and TCP port 22. Follow the instructions in `Accessing GCP node from CLI `_ to access the Slurm cluster via SSH, SCP, rsync, etc.

Enable GCSFuse
--------------

GCSFuse presents storage objects as files in shared directories. This gives you access to petabytes of storage without pre-allocating anything. No downloading or uploading is needed, and you do not need to hard-code any keys or passwords.

To create Slurm clusters with GCSFuse enabled, edit ``basic.tfvars`` if you are using the Terraform script, or ``slurm-cluster.yaml`` if you are using Deployment Manager. Here are the steps in Terraform:

Add ``"https://www.googleapis.com/auth/devstorage.full_control"`` to ``compute_node_scopes``, for example::

    compute_node_scopes = [
      "https://www.googleapis.com/auth/monitoring.write",
      "https://www.googleapis.com/auth/logging.write",
      "https://www.googleapis.com/auth/devstorage.full_control"
    ]

(Optional) If you are using a non-default service account, make sure it has the "Storage Admin" role. If you are using the default service account (i.e., ``compute_node_service_account = "default"``), nothing needs to be done.

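
Before running Terraform, the two conditions above (the storage scope is present, and a non-default service account needs the "Storage Admin" role) can be sanity-checked in a few lines of Python. This is only a minimal sketch, not part of the slurm-gcp tooling: the helper names ``has_gcs_scope`` and ``service_account_needs_role`` are hypothetical, and the values are assumed to be copied by hand from ``basic.tfvars`` rather than parsed from it.

```python
# Scope that the step above asks to add so compute nodes can use GCSFuse.
GCS_SCOPE = "https://www.googleapis.com/auth/devstorage.full_control"


def has_gcs_scope(scopes):
    """Return True if the scope list contains the storage scope named above."""
    return GCS_SCOPE in scopes


def service_account_needs_role(service_account):
    """Return True for a non-default account, which must be given "Storage Admin"."""
    return service_account != "default"


# Values copied by hand from basic.tfvars (assumption: no tfvars parsing here).
compute_node_scopes = [
    "https://www.googleapis.com/auth/monitoring.write",
    "https://www.googleapis.com/auth/logging.write",
    "https://www.googleapis.com/auth/devstorage.full_control",
]
compute_node_service_account = "default"

if not has_gcs_scope(compute_node_scopes):
    print("Add", GCS_SCOPE, "to compute_node_scopes")
if service_account_needs_role(compute_node_service_account):
    print('Grant the "Storage Admin" role to', compute_node_service_account)
```

With the example values shown, both checks pass and nothing is printed.
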

Edit ``network_storage`` for worker nodes, for example::

    network_storage = [
      {
        server_ip     = "none"
        remote_mount  = "dy-test-301718"
        local_mount   = "/data"
        fs_type       = "gcsfuse"
        mount_options = "file_mode=666,dir_mode=777,allow_other"
      }
    ]

This example can be found in our slurm-master repo, where we keep the latest working code from the official repo: https://gitlab.ebi.ac.uk/TSI/slurm-main/-/blob/master/tf/examples/basic/basic.tfvars.sample

You can fork this repo, add your changes to the tfvars file, and follow the README to set up CI/CD for your Slurm cluster infrastructure.

Monitoring
----------

Google Stackdriver should be enabled for monitoring. The dashboard can be accessed from Google Cloud Console, for example https://console.cloud.google.com/monitoring/dashboards/resourceList/gce_instance?project=citc-slurm&timeDomain=1h.

References
----------

#. HPC made easy: Announcing new features for Slurm on GCP, https://cloud.google.com/blog/products/compute/hpc-made-easy-announcing-new-features-for-slurm-on-gcp
#. Slurm on Google Cloud Platform, https://github.com/SchedMD/slurm-gcp#stand-alone-cluster-in-google-cloud-platform