Kubernetes cluster Connection
The Kubernetes cluster Connection type enables SparkKubernetesOperator tasks and KubernetesPodOperator tasks to connect to a Kubernetes cluster.
Authenticating to Kubernetes cluster
There are several ways to connect to Kubernetes using Airflow:
Use the kube_config that resides in the default location on the machine (~/.kube/config): leave all fields empty.
Use the in_cluster config: if Airflow runs inside a Kubernetes cluster, take the configuration from the cluster by selecting In cluster configuration.
Use a kube_config from a different location: enter the path in Kube config path.
Use a kube_config in JSON format from the connection configuration: paste the kube_config into Kube config (JSON format).
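The four options above amount to a choice driven by which connection fields are filled in. The following is a minimal sketch of that choice, not the provider's implementation: `describe_auth_mode` is a hypothetical helper, the key names are taken from the connection examples on this page, and the precedence applied when several fields are set at once is an assumption, since the text does not say how such combinations are handled.

```python
from typing import Any, Mapping


def describe_auth_mode(extra: Mapping[str, Any]) -> str:
    """Return which of the four authentication options a set of fields selects.

    Hypothetical helper for illustration only; it is not part of Airflow.
    The order of the checks is an assumption of this sketch.
    """
    if extra.get("in_cluster"):
        return "in_cluster"
    if extra.get("kube_config_path"):
        return "kube_config_path"
    if extra.get("kube_config"):
        return "kube_config_json"
    # All fields empty: fall back to the default kube config location.
    return "default_kube_config"


print(describe_auth_mode({}))
print(describe_auth_mode({"in_cluster": True}))
print(describe_auth_mode({"kube_config_path": "/opt/airflow/kubeconfig"}))
```

In practice it is safest to fill in only one of the three fields, so that the result does not depend on any precedence rule.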
Default Connection IDs
The default connection ID is kubernetes_default.
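Airflow looks up connections in environment variables named AIRFLOW_CONN_ followed by the upper-cased connection ID, which is why the environment variable examples on this page use AIRFLOW_CONN_KUBERNETES_DEFAULT. A small sketch of that naming rule, where `connection_env_var` is a hypothetical helper rather than an Airflow function:

```python
def connection_env_var(conn_id: str) -> str:
    """Return the environment variable name that corresponds to a connection ID.

    Hypothetical helper for illustration only; it is not part of Airflow.
    """
    return f"AIRFLOW_CONN_{conn_id.upper()}"


print(connection_env_var("kubernetes_default"))  # AIRFLOW_CONN_KUBERNETES_DEFAULT
```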
Configuring the Connection
- In cluster configuration
Use the in-cluster configuration, for when Airflow runs inside the Kubernetes cluster.
- Kube config path
Path to a kube config file in a custom location.
- Kube config (JSON format)
Kube config, in JSON format, that is used to connect to the Kubernetes cluster.
- Namespace
Default Kubernetes namespace for the connection.
- Cluster context
When using a kube config, specifies which context to use.
- Disable verify SSL
Optionally disables SSL certificate verification. By default, SSL certificates are verified.
- Disable TCP keepalive
TCP keepalive is a feature (enabled by default) that tries to keep long-running connections alive. Set this parameter to True to disable this feature.
- Xcom sidecar image
Defines the image used by the PodDefaults.SIDECAR_CONTAINER (defaults to "alpine"), to allow images from private repositories as well as custom image overrides.
Example of storing the connection in an environment variable using the URI format:
AIRFLOW_CONN_KUBERNETES_DEFAULT='kubernetes://?in_cluster=True&kube_config_path=~%2F.kube%2Fconfig&kube_config=kubeconfig+json&namespace=namespace'
And using the JSON format:
AIRFLOW_CONN_KUBERNETES_DEFAULT='{"conn_type": "kubernetes", "extra": {"in_cluster": true, "kube_config_path": "~/.kube/config", "namespace": "my-namespace"}}'
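Both values can also be generated rather than written by hand, which avoids URL-encoding mistakes in the URI form. The sketch below uses only the Python standard library; `build_uri` and `build_json` are hypothetical helper names, and the field values are placeholders.

```python
import json
import os
from typing import Any, Mapping
from urllib.parse import urlencode


def build_uri(extra: Mapping[str, Any]) -> str:
    """Encode the connection fields as query parameters of a kubernetes:// URI."""
    # urlencode percent-encodes "/" as %2F, as in the URI example above.
    return f"kubernetes://?{urlencode(extra)}"


def build_json(extra: Mapping[str, Any]) -> str:
    """Serialize the connection in the JSON format shown above."""
    return json.dumps({"conn_type": "kubernetes", "extra": dict(extra)})


# Placeholder values; only one way of locating the cluster is set here.
fields = {"kube_config_path": "~/.kube/config", "namespace": "my-namespace"}

print(build_uri(fields))
print(build_json(fields))

# Either string can be exported as the connection's environment variable.
os.environ["AIRFLOW_CONN_KUBERNETES_DEFAULT"] = build_json(fields)
```

The JSON form keeps booleans and nested values as they are, while the URI form turns every value into a string, so the JSON form is usually easier to read and review.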