Kubernetes cluster Connection¶
The Kubernetes cluster connection type enables SparkKubernetesOperator and KubernetesPodOperator tasks to connect to a Kubernetes cluster.
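For example, a task can point at this connection through its kubernetes_conn_id argument. The following is a minimal sketch, not a complete DAG; the connection ID my_k8s_cluster, the image, and the command are illustrative assumptions, and the operator's import path can vary between provider versions.

```python
# Minimal sketch: a pod task that uses a named Kubernetes cluster connection.
# "my_k8s_cluster", the image, and the command are placeholder values.
from airflow.providers.cncf.kubernetes.operators.pod import KubernetesPodOperator

run_on_cluster = KubernetesPodOperator(
    task_id="run_on_cluster",
    kubernetes_conn_id="my_k8s_cluster",  # omit to fall back to "kubernetes_default"
    image="python:3.11-slim",
    cmds=["python", "-c", "print('hello from the cluster')"],
    name="example-pod",
)
```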
Authenticating to Kubernetes cluster¶
There are several ways to connect to a Kubernetes cluster using Airflow:
- Use the kube config that resides in its default location on the machine (~/.kube/config): leave all fields empty.
- Use the in-cluster configuration, if Airflow itself runs inside the Kubernetes cluster: check In cluster configuration.
- Use a kube config from a different location: put the path into Kube config path (see the sketch after this list).
- Use a kube config stored in the connection itself: paste its contents into Kube config (JSON format).
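As a sketch of the third option, the connection can also be registered from Python code (for example in a setup script or a test); the connection ID, path, and namespace below are placeholders, and the same extras can equally be set through the UI or an environment variable as shown further down.

```python
# Minimal sketch: register a Kubernetes connection whose kube config lives at a custom path.
# The connection ID, path, and namespace are placeholder values.
import json
from airflow.models import Connection

conn = Connection(
    conn_id="my_k8s_cluster",
    conn_type="kubernetes",
    extra=json.dumps(
        {
            "kube_config_path": "/opt/airflow/.kube/config",  # kube config at a custom location
            "namespace": "data-pipelines",                    # default namespace for the connection
        }
    ),
)
print(conn.get_uri())  # URI suitable for an AIRFLOW_CONN_* environment variable
```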
Default Connection IDs¶
The default connection ID is kubernetes_default.
Configuring the Connection¶
- In cluster configuration: Use the in-cluster configuration.
- Kube config path: Custom path to a kube config file.
- Kube config (JSON format): Kube config content (JSON) used to connect to the Kubernetes cluster.
- Namespace: Default Kubernetes namespace for the connection.
- Cluster context: When using a kube config, the context to use.
- Disable verify SSL: Optionally disable SSL certificate verification. By default SSL is verified.
- Disable TCP keepalive: TCP keepalive is a feature (enabled by default) that tries to keep long-running connections alive. Set this parameter to True to disable it.
- Xcom sidecar image: The image used by the PodDefaults.SIDECAR_CONTAINER (defaults to "alpine"), allowing private repositories as well as custom image overrides. These fields, expressed as connection extras, are sketched after this list.
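The sketch below shows these fields expressed as connection extras. The exact extra key names used here (cluster_context, disable_verify_ssl, disable_tcp_keepalive, xcom_sidecar_container_image) are assumptions and can differ between provider versions, so check the documentation for your installed provider.

```python
# Minimal sketch: a connection that sets the remaining options as extras.
# The extra key names are assumptions and may vary between provider versions.
import json
from airflow.models import Connection

conn = Connection(
    conn_id="my_k8s_cluster",
    conn_type="kubernetes",
    extra=json.dumps(
        {
            "kube_config_path": "~/.kube/config",
            "cluster_context": "staging-cluster",   # which kube config context to use
            "namespace": "data-pipelines",          # default namespace for pods
            "disable_verify_ssl": True,             # skip SSL certificate verification
            "disable_tcp_keepalive": True,          # turn off TCP keepalive
            "xcom_sidecar_container_image": "registry.example.com/alpine:3.19",
        }
    ),
)
```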
Example of storing the connection in an environment variable using the URI format:
AIRFLOW_CONN_KUBERNETES_DEFAULT='kubernetes://?in_cluster=True&kube_config_path=~%2F.kube%2Fconfig&kube_config=kubeconfig+json&namespace=namespace'
And using JSON format:
AIRFLOW_CONN_KUBERNETES_DEFAULT='{"conn_type": "kubernetes", "extra": {"in_cluster": true, "kube_config_path": "~/.kube/config", "namespace": "my-namespace"}}'