Production Guide

The following are things to consider when using this Helm chart in a production environment.

Database

You will want to use an external database instead of the one deployed with the chart by default. Both PostgreSQL and MySQL are supported. Supported versions can be found on the Set up a Database Backend page.

# Don't deploy postgres
postgresql:
  enabled: false

# Use an external database
data:
  metadataConnection:
    user: ...
    pass: ...
    protocol: postgresql  # or 'mysql'
    host: ...
    port: ...
    db: ...
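
As a rough illustration of how these fields relate to a single connection string, the sketch below assembles a SQLAlchemy-style URI from them. The function name `build_metadata_uri` and the example values are hypothetical, and the chart's own template may build the string differently; the point is only that special characters in the user name or password need percent-encoding.

```python
from urllib.parse import quote


def build_metadata_uri(user, password, protocol, host, port, db):
    """Assemble a SQLAlchemy-style URI from the metadataConnection fields.

    Hypothetical helper for illustration; not part of the chart.
    """
    # Percent-encode credentials so characters such as '@' or '/' do not
    # get mistaken for URI delimiters.
    safe_user = quote(user, safe="")
    safe_password = quote(password, safe="")
    return f"{protocol}://{safe_user}:{safe_password}@{host}:{port}/{db}"


if __name__ == "__main__":
    # Example values only; substitute your own host and credentials.
    print(build_metadata_uri("airflow", "p@ss/word", "postgresql",
                             "db.example.internal", 5432, "airflow"))
```

Printing the result before deploying can help spot a mistyped host or port early.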

PgBouncer

If you are using PostgreSQL as your database, you will likely want to enable PgBouncer as well. Airflow can open a lot of database connections due to its distributed nature, and using a connection pooler can significantly reduce the number of open connections on the database.

pgbouncer:
  enabled: true

Depending on the size of your Airflow instance, you may want to adjust the following as well (defaults are shown):

pgbouncer:
  # The maximum number of connections to PgBouncer
  maxClientConn: 100
  # The maximum number of server connections to the metadata database from PgBouncer
  metadataPoolSize: 10
  # The maximum number of server connections to the result backend database from PgBouncer
  resultBackendPoolSize: 5
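
To make the effect of these settings concrete, here is a back-of-the-envelope sketch. It assumes each pool size caps the server connections for its database, with a single database user per pool and no reserve pool; real PgBouncer behaviour depends on further settings, so treat the numbers as an approximation. The function name `pooled_connection_limits` is hypothetical.

```python
def pooled_connection_limits(max_client_conn, metadata_pool_size,
                             result_backend_pool_size):
    """Estimate connection ceilings on each side of PgBouncer.

    Simplifying assumption: one database user per pool and no reserve pool,
    so the server-side ceiling is just the sum of the two pool sizes.
    """
    return {
        # Connections Airflow components may open to PgBouncer.
        "client_side_limit": max_client_conn,
        # Connections PgBouncer may open to the database.
        "server_side_limit": metadata_pool_size + result_backend_pool_size,
    }


if __name__ == "__main__":
    # Chart defaults from the values shown above.
    print(pooled_connection_limits(100, 10, 5))
```

Under these assumptions, up to 100 client connections with the defaults are served by at most 15 database connections.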

Webserver Secret Key

You should set a static webserver secret key when deploying with this chart as it will help ensure your Airflow components only restart when necessary.

Warning

You should use a different secret key for every instance you run, as this key is used to sign session cookies and perform other security-related functions!

First, generate a strong secret key:

python3 -c 'import secrets; print(secrets.token_hex(16))'

Now add the secret to your values file:

webserverSecretKey: <secret_key>

Alternatively, create a Kubernetes Secret and use webserverSecretKeySecretName:

webserverSecretKeySecretName: my-webserver-secret
# where the random key is under `webserver-secret-key` in the k8s Secret

Example of creating a Kubernetes Secret with kubectl:

kubectl create secret generic my-webserver-secret --from-literal="webserver-secret-key=$(python3 -c 'import secrets; print(secrets.token_hex(16))')"
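
If you prefer to write the Secret manifest yourself rather than use --from-literal, the value under `data` has to be base64-encoded. The following sketch generates a key the same way as the one-liner above and returns both forms; the function name `new_webserver_secret` is hypothetical.

```python
import base64
import secrets


def new_webserver_secret():
    """Return (plain_key, base64_key) for a freshly generated secret key."""
    # 16 random bytes rendered as 32 hexadecimal characters.
    key = secrets.token_hex(16)
    # Kubernetes stores values in a Secret's `data` field base64-encoded.
    encoded = base64.b64encode(key.encode("ascii")).decode("ascii")
    return key, encoded


if __name__ == "__main__":
    plain, encoded = new_webserver_secret()
    print("plain key  :", plain)
    print("base64 form:", encoded)
```

The plain key is what belongs in webserverSecretKey; the base64 form is what would go under the `webserver-secret-key` entry in a hand-written Secret's `data`.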

Extending and customizing Airflow Image

The Apache Airflow community releases Docker images that serve as reference images for Apache Airflow. However, Airflow has more than 60 community-managed providers (installable via extras), and not everyone uses the extras/providers installed by default. Sometimes other extras/providers are needed, and very often you need to add your own custom dependencies, packages, or even custom providers, or add custom tools and binaries that your deployment requires.

In Kubernetes and Docker terms this means that you need another image with your specific requirements. This is why you should learn how to build your own Docker (or more properly Container) image.

Typical scenarios where you would like to use your custom image:

  • Adding apt packages

  • Adding PyPI packages

  • Adding binary resources necessary for your deployment

  • Adding custom tools needed in your deployment

See Building the image for more details on how you can extend and customize the Airflow image.

Managing DAG Files

See Manage DAGs files.

knownHosts

If you are using dags.gitSync.sshKeySecret, you should also set dags.gitSync.knownHosts. Here we will show the process for GitHub, but the same can be done for any provider:

Grab GitHub’s public key:

ssh-keyscan -t rsa github.com > github_public_key

Next, print the fingerprint for the public key:

ssh-keygen -lf github_public_key

Compare that output with GitHub’s SSH key fingerprints.

They match, right? Good. Now, add the public key to your values. It’ll look something like this:

dags:
  gitSync:
    knownHosts: |
      github.com ssh-rsa AAAA...FAaQ==
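
The fingerprint comparison can also be done programmatically. The sketch below computes the SHA256 fingerprint in the form OpenSSH prints it (base64 of the SHA-256 digest of the decoded key blob, padding removed). It assumes a plain `host key-type key` entry without marker prefixes such as `@cert-authority`; the function name `sha256_fingerprint` is hypothetical.

```python
import base64
import hashlib


def sha256_fingerprint(known_hosts_line):
    """Return the OpenSSH-style SHA256 fingerprint for one known_hosts entry."""
    # A plain entry looks like "<host> <key-type> <base64-key-blob> [comment]".
    _host, _key_type, blob_b64 = known_hosts_line.split()[:3]
    digest = hashlib.sha256(base64.b64decode(blob_b64)).digest()
    # OpenSSH prints the digest base64-encoded with the '=' padding removed.
    return "SHA256:" + base64.b64encode(digest).decode("ascii").rstrip("=")


if __name__ == "__main__":
    # Reads the file produced by the ssh-keyscan step above.
    with open("github_public_key") as handle:
        for line in handle:
            if line.strip() and not line.startswith("#"):
                print(sha256_fingerprint(line))
```

The printed value should equal the SHA256 string reported by ssh-keygen -lf and the one your Git provider publishes.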

Accessing the Airflow UI

How you access the Airflow UI will depend on your environment. The chart supports the following options:

Ingress

You can create and configure Ingress objects. See the Ingress chart parameters. For more information on Ingress, see the Kubernetes Ingress documentation.

LoadBalancer Service

You can change the Service type for the webserver to be LoadBalancer, and set any necessary annotations:

webserver:
  service:
    type: LoadBalancer
    annotations: {}

For more information on LoadBalancer Services, see the Kubernetes LoadBalancer Service Documentation.

Logging

Depending on your choice of executor, task logs may not work out of the box. All logging choices can be found at Manage logs.

Metrics

The chart can support sending metrics to an existing StatsD instance or provide a Prometheus endpoint.

Prometheus

The metrics endpoint is available at svc/{{ .Release.Name }}-statsd:9102/metrics.

External StatsD

To use an external StatsD instance:

statsd:
  enabled: false
config:
  metrics:  # or 'scheduler' for Airflow 1
    statsd_on: true
    statsd_host: ...
    statsd_port: ...
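
Before pointing Airflow at an external StatsD instance, it can be useful to send a test metric by hand. This minimal sketch emits one counter increment using the plain StatsD line format (`name:value|c`) over UDP. The function name and the metric name `airflow.connectivity_check` are hypothetical, and because UDP is fire-and-forget, a successful send does not prove the server received it; check on the StatsD side.

```python
import socket


def send_statsd_counter(host, port, name, value=1):
    """Send one StatsD counter increment over UDP and return the payload sent."""
    # StatsD counters use the line format "<name>:<value>|c".
    payload = f"{name}:{value}|c".encode("ascii")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))
    return payload


if __name__ == "__main__":
    # Example target only; use the same host and port as statsd_host / statsd_port.
    print(send_statsd_counter("127.0.0.1", 8125, "airflow.connectivity_check"))
```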

Celery Backend

If you are using CeleryExecutor or CeleryKubernetesExecutor, you can bring your own Celery backend.

By default, the chart will deploy Redis. However, you can use any supported Celery backend instead:

redis:
  enabled: false
data:
  brokerUrl: redis://redis-user:password@redis-host:6379/0
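
A broker URL packs several settings into one string, so a quick parse can catch a misplaced port or database index before deployment. The sketch below splits the URL with the standard library; the function name `describe_broker_url` is hypothetical, and the password is deliberately left out of the result so it is not printed.

```python
from urllib.parse import urlsplit


def describe_broker_url(url):
    """Break a broker URL into its parts, omitting the password."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port,
        # For Redis, the path component is the database index.
        "database": parts.path.lstrip("/"),
    }


if __name__ == "__main__":
    print(describe_broker_url("redis://redis-user:password@redis-host:6379/0"))
```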

For more information about setting up a Celery broker, refer to the exhaustive Celery documentation on the topic.

Security Context Constraints

A Security Context Constraint (SCC) is an OpenShift construct that works like an RBAC rule, except that it targets Pods instead of users. When defining an SCC, one can control the actions a Pod can perform and the resources it can access during startup and at runtime.

The SCCs are split into different levels or categories with the restricted SCC being the default one assigned to Pods. When deploying Airflow to OpenShift, one can leverage the SCCs and allow the Pods to start containers utilizing the anyuid SCC.

In order to enable the usage of SCCs, one must set the parameter rbac.createSCCRoleBinding to true as shown below:

rbac:
  create: true
  createSCCRoleBinding: true

In this chart, SCCs are bound to the Pods via RoleBindings, meaning that the option rbac.create must also be set to true in order to fully enable SCC usage.

For more information about SCCs and what can be achieved with this construct, please refer to Managing security context constraints.
