Google GKE

Fusion streamlines the deployment of Nextflow pipelines in Kubernetes because it removes the need to configure and maintain a shared file system in your cluster.

Platform Google GKE compute environments

Seqera Platform supports Fusion in Google GKE compute environments.

See Google GKE for Platform instructions to enable Fusion.

Nextflow CLI

This feature requires Nextflow 23.02.1-edge or later.
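
A quick way to check the version requirement before going further is to compare version strings in the shell. The sketch below is only an illustration: `version_ge` and `INSTALLED` are hypothetical names, the installed version is typed in by hand (copy it from `nextflow -version`), and the comparison relies on `sort -V`, which is assumed to be available on your system. The `-edge` suffix is ignored for the comparison.

```shell
#!/bin/sh
set -eu

# Succeeds when version $1 is the same as or newer than version $2.
# Suffixes such as "-edge" are stripped before comparing.
version_ge() {
  have="${1%%-*}"
  want="${2%%-*}"
  lowest="$(printf '%s\n%s\n' "$have" "$want" | sort -V | head -n 1)"
  [ "$lowest" = "$want" ]
}

INSTALLED="23.04.0"   # hypothetical value; copy yours from `nextflow -version`

if version_ge "$INSTALLED" "23.02.1-edge"; then
  echo "Nextflow $INSTALLED meets the 23.02.1-edge minimum"
else
  echo "Nextflow $INSTALLED is older than 23.02.1-edge; update before enabling Fusion"
fi
```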

To use Fusion directly in Nextflow with a Google GKE cluster, you must configure a cluster, namespace, and service account, and update your Nextflow configuration.

Kubernetes configuration

Create a GKE "standard" cluster ("Autopilot" is not supported). See Creating a zonal cluster for more information.
Use instance types with 2 or more CPUs and SSD storage (families: n1, n2, c2, m1, m2, m3).
Enable the Workload Identity feature when you create or update the cluster:
- Enable Workload Identity in the cluster Security settings.
- Enable GKE Metadata Server in the node group Security settings.
See Authenticate to Google Cloud APIs from GKE workloads to configure the cluster.
Replace the following example values with values corresponding to your environment:
- CLUSTER_NAME: the GKE cluster name (example: cluster-1)
- COMPUTE_REGION: the GKE cluster region (example: europe-west1)
- NAMESPACE: the Kubernetes namespace in the GKE cluster (example: fusion-demo)
- KSA_NAME: the Kubernetes service account name (example: fusion-sa)
- GSA_NAME: the Google service account name (example: gsa-demo)
- GSA_PROJECT: the ID of the Google Cloud project that contains the Google service account (example: my-nf-project-261815)
- PROJECT_ID: the ID of the Google Cloud project that contains the GKE cluster (example: my-nf-project-261815)
- ROLE_NAME: the role that grants access to the Google Cloud Storage bucket (example: roles/storage.admin)
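
As a rough guide to where each of these values ends up, the sketch below strings together the `gcloud` and `kubectl` commands that the Workload Identity setup typically involves. It is a sketch under stated assumptions rather than a definitive procedure. The `run` helper, the `workload_identity_setup` function, and the `NODE_POOL` value (`default-pool`) are assumptions not taken from this guide. The cluster is assumed to be addressed by region, and the active `gcloud` project is assumed to be `PROJECT_ID`. By default the script only prints the commands; set `APPLY=1` to execute them.

```shell
#!/bin/sh
set -eu

# Example values from the list above; replace them with your own.
CLUSTER_NAME=cluster-1
COMPUTE_REGION=europe-west1
NAMESPACE=fusion-demo
KSA_NAME=fusion-sa
GSA_NAME=gsa-demo
GSA_PROJECT=my-nf-project-261815
PROJECT_ID=my-nf-project-261815
ROLE_NAME=roles/storage.admin
NODE_POOL=default-pool   # assumption: the node pool is not named in this guide
GSA_EMAIL="${GSA_NAME}@${GSA_PROJECT}.iam.gserviceaccount.com"

# Print each command; also execute it only when APPLY=1 is set.
run() {
  printf '+ %s\n' "$*"
  if [ "${APPLY:-0}" = "1" ]; then "$@"; fi
}

workload_identity_setup() {
  # Enable Workload Identity on the cluster and the GKE metadata server on the node pool.
  run gcloud container clusters update "$CLUSTER_NAME" --region="$COMPUTE_REGION" \
    --workload-pool="${PROJECT_ID}.svc.id.goog"
  run gcloud container node-pools update "$NODE_POOL" --cluster="$CLUSTER_NAME" \
    --region="$COMPUTE_REGION" --workload-metadata=GKE_METADATA

  # Point kubectl at the cluster, then create the namespace and Kubernetes service account.
  run gcloud container clusters get-credentials "$CLUSTER_NAME" --region="$COMPUTE_REGION"
  run kubectl create namespace "$NAMESPACE"
  run kubectl create serviceaccount "$KSA_NAME" --namespace "$NAMESPACE"

  # Create the Google service account and grant it the storage role.
  run gcloud iam service-accounts create "$GSA_NAME" --project="$GSA_PROJECT"
  run gcloud projects add-iam-policy-binding "$PROJECT_ID" \
    --member "serviceAccount:${GSA_EMAIL}" --role "$ROLE_NAME"

  # Let the Kubernetes service account act as the Google service account.
  run gcloud iam service-accounts add-iam-policy-binding "$GSA_EMAIL" \
    --role roles/iam.workloadIdentityUser \
    --member "serviceAccount:${PROJECT_ID}.svc.id.goog[${NAMESPACE}/${KSA_NAME}]"
  run kubectl annotate serviceaccount "$KSA_NAME" --namespace "$NAMESPACE" \
    "iam.gke.io/gcp-service-account=${GSA_EMAIL}"
}

workload_identity_setup
```

Reviewing the printed commands first makes it easier to spot a wrong project or namespace before anything changes in the cluster. Commands that create resources fail if the resource already exists, so skip the steps you have already completed.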

Create the Kubernetes Role and RoleBinding required to run Nextflow, along with a token Secret for the service account, by applying the following Kubernetes configuration:

---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: fusion-demo
  name: fusion-role
rules:
  - apiGroups: [""]
    resources: ["pods", "pods/status", "pods/log", "pods/exec"]
    verbs: ["get", "list", "watch", "create", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  namespace: fusion-demo
  name: fusion-rolebind
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: fusion-role
subjects:
  - kind: ServiceAccount
    name: fusion-sa
---
apiVersion: v1
kind: Secret
metadata:
  namespace: fusion-demo
  name: fusion-sa-token
  annotations:
    kubernetes.io/service-account.name: fusion-sa
type: kubernetes.io/service-account-token
...
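
One possible way to apply the manifest and confirm that the permissions took effect is sketched below. It assumes the manifest above was saved as `fusion-rbac.yaml` (a hypothetical file name) and that the `fusion-demo` namespace and the `fusion-sa` service account already exist. The `run` helper and `apply_and_check_rbac` function are illustrative names. The script prints the commands by default and executes them only when `APPLY=1` is set.

```shell
#!/bin/sh
set -eu

NAMESPACE=fusion-demo
KSA_NAME=fusion-sa
MANIFEST=fusion-rbac.yaml   # hypothetical file name for the manifest above

# Print each command; also execute it only when APPLY=1 is set.
run() {
  printf '+ %s\n' "$*"
  if [ "${APPLY:-0}" = "1" ]; then "$@"; fi
}

apply_and_check_rbac() {
  run kubectl apply -f "$MANIFEST"
  # Ask the API server whether the service account holds each verb granted by fusion-role.
  for verb in get list watch create delete; do
    run kubectl auth can-i "$verb" pods --namespace "$NAMESPACE" \
      --as "system:serviceaccount:${NAMESPACE}:${KSA_NAME}"
  done
}

apply_and_check_rbac
```

When executed, `kubectl auth can-i` answers `yes` or `no` for each verb and exits with a non-zero status on `no`, so the script stops at the first missing permission.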

Nextflow configuration

Add the following to your nextflow.config file:

wave.enabled = true
fusion.enabled = true
process.executor = 'k8s'
process.scratch = false
k8s.context = '<YOUR-GKE-CLUSTER-CONTEXT>'
k8s.namespace = 'fusion-demo'
k8s.serviceAccount = 'fusion-sa'
k8s.pod.nodeSelector = 'iam.gke.io/gke-metadata-server-enabled=true'

Replace <YOUR-GKE-CLUSTER-CONTEXT> with the context name in your Kubernetes configuration.
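
If the cluster credentials were fetched with `gcloud container clusters get-credentials`, the context name normally follows the pattern `gke_<project>_<location>_<cluster>`. The sketch below builds that name from the example values used earlier; `gke_context_name` is a hypothetical helper, and a context that was renamed or created by other means will not match the pattern.

```shell
#!/bin/sh
set -eu

# Build the context name in the form gke_<project>_<location>_<cluster>.
# Arguments: project ID, cluster region or zone, cluster name.
gke_context_name() {
  printf 'gke_%s_%s_%s\n' "$1" "$2" "$3"
}

gke_context_name my-nf-project-261815 europe-west1 cluster-1
# prints: gke_my-nf-project-261815_europe-west1_cluster-1
```

To confirm the value against your own kubeconfig, `kubectl config get-contexts -o name` lists every context and `kubectl config current-context` shows the active one.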

Run the pipeline with the usual run command:
```
nextflow run <YOUR PIPELINE SCRIPT> -w gs://<YOUR-BUCKET>/work
```
Replace <YOUR-BUCKET> with a Google Cloud Storage bucket to which you have read-write access.
