Setting Configuration Options

The first time you run Airflow, it creates a file called airflow.cfg in your $AIRFLOW_HOME directory (~/airflow by default). This makes it easy to “play” with the Airflow configuration.

For production use, however, you are advised to generate the configuration from the command line:

airflow config list --defaults

This command produces output that you can copy to your configuration file and edit.

The output contains all the default configuration options, with examples, commented out, so you only need to uncomment and modify the ones you want to change. This makes it easy to keep track of which options you changed from their defaults. It also makes upgrades easier: when a new version of Airflow changes the default for an option you have not overridden, your installation uses the new default automatically.

You can also redirect the output directly to your configuration file and then edit it:

airflow config list --defaults > "${AIRFLOW_HOME}/airflow.cfg"

You can also set options with environment variables by using this format: AIRFLOW__{SECTION}__{KEY} (note the double underscores).

For example, the metadata database connection string can either be set in airflow.cfg like this:

[database]
sql_alchemy_conn = my_conn_string

or by creating a corresponding environment variable:

export AIRFLOW__DATABASE__SQL_ALCHEMY_CONN=my_conn_string

Note that when the section name contains a dot, you must replace the dot with an underscore when setting the environment variable. For example, consider a hypothetical section named providers.some_provider:

[providers.some_provider]
this_param = true

The corresponding environment variable is:

export AIRFLOW__PROVIDERS_SOME_PROVIDER__THIS_PARAM=true
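The naming rule above is mechanical, so it can be expressed as a small helper. The following is a minimal sketch; the function name airflow_env_var is hypothetical and is not part of Airflow's API.

```python
def airflow_env_var(section: str, key: str) -> str:
    """Build the environment variable name for an option (hypothetical helper).

    Follows the AIRFLOW__{SECTION}__{KEY} format: upper-case both parts,
    replace dots in the section name with underscores, and join the parts
    with double underscores.
    """
    return f"AIRFLOW__{section.replace('.', '_').upper()}__{key.upper()}"


if __name__ == "__main__":
    print(airflow_env_var("database", "sql_alchemy_conn"))
    print(airflow_env_var("providers.some_provider", "this_param"))
```

Running the sketch prints the two variable names used in the examples above.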

You can also derive the connection string at run time by appending _cmd to the key. The value is then a command that is run to produce the option value, like this:

[database]
sql_alchemy_conn_cmd = bash_command_to_run
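Conceptually, a _cmd option means "run this command and use what it prints as the value". The sketch below illustrates that idea with the standard library only. It is not Airflow's implementation: the function name value_from_command is hypothetical, and stripping surrounding whitespace from the output is an assumption made for this example.

```python
import shlex
import subprocess
import sys


def value_from_command(command: str) -> str:
    """Run a command and return its standard output as the option value.

    Hypothetical helper. Raises subprocess.CalledProcessError if the
    command exits with a non-zero status.
    """
    result = subprocess.run(
        shlex.split(command),  # split the string into argv, no shell involved
        capture_output=True,
        text=True,
        check=True,
    )
    # Assumption for this sketch: drop the trailing newline and other
    # surrounding whitespace from the command output.
    return result.stdout.strip()


if __name__ == "__main__":
    # Use the current Python interpreter as a portable stand-in for
    # "bash_command_to_run".
    cmd = f"{shlex.quote(sys.executable)} -c \"print('my_conn_string')\""
    print(value_from_command(cmd))
```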

You can also retrieve the connection string from a secrets backend at run time by appending _secret to the key like this:

[database]
sql_alchemy_conn_secret = sql_alchemy_conn
# You can also add a nested path
# example:
# sql_alchemy_conn_secret = database/sql_alchemy_conn

This retrieves the config option from a secrets backend, e.g. HashiCorp Vault. See Secrets Backends for more details.

The following config options support the _cmd and _secret variants:

  • sql_alchemy_conn in [database] section

  • fernet_key in [core] section

  • broker_url in [celery] section

  • flower_basic_auth in [celery] section

  • result_backend in [celery] section

  • password in [atlas] section

  • smtp_password in [smtp] section

  • secret_key in [webserver] section

The _cmd config options can also be set using a corresponding environment variable, in the same way as regular config options. For example:

export AIRFLOW__DATABASE__SQL_ALCHEMY_CONN_CMD=bash_command_to_run

Similarly, _secret config options can also be set using a corresponding environment variable. For example:

export AIRFLOW__DATABASE__SQL_ALCHEMY_CONN_SECRET=sql_alchemy_conn

Note

The config options must follow the config prefix naming convention defined within the secrets backend. This means that sql_alchemy_conn is defined with the config prefix, not with the connection prefix. For example, it should be named airflow/config/sql_alchemy_conn.

The idea behind this is to avoid storing passwords in plain text files on your machines.

The universal order of precedence for all configuration options is as follows:

  1. set as an environment variable (AIRFLOW__DATABASE__SQL_ALCHEMY_CONN)

  2. set as a command environment variable (AIRFLOW__DATABASE__SQL_ALCHEMY_CONN_CMD)

  3. set as a secret environment variable (AIRFLOW__DATABASE__SQL_ALCHEMY_CONN_SECRET)

  4. set in airflow.cfg

  5. command in airflow.cfg

  6. secret key in airflow.cfg

  7. Airflow’s built-in defaults
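The order of precedence above can be sketched as a simple lookup chain. This is an illustration only, not Airflow's implementation: the function resolve_option and its parameters are hypothetical, the environment, the configuration file, and the defaults are modeled as plain dictionaries, and the sketch ignores the fact that only the options listed earlier support the _cmd and _secret variants.

```python
from typing import Callable, Mapping, Optional

Config = Mapping[str, Mapping[str, str]]


def resolve_option(
    section: str,
    key: str,
    env: Mapping[str, str],
    cfg: Config,
    defaults: Config,
    run_command: Callable[[str], str],
    get_secret: Callable[[str], str],
) -> Optional[str]:
    """Return the value of an option following the documented precedence."""
    name = f"AIRFLOW__{section.replace('.', '_').upper()}__{key.upper()}"
    file_options = cfg.get(section, {})

    # 1-3: environment variable, then command env var, then secret env var.
    if name in env:
        return env[name]
    if f"{name}_CMD" in env:
        return run_command(env[f"{name}_CMD"])
    if f"{name}_SECRET" in env:
        return get_secret(env[f"{name}_SECRET"])

    # 4-6: plain value, then command, then secret key in airflow.cfg.
    if key in file_options:
        return file_options[key]
    if f"{key}_cmd" in file_options:
        return run_command(file_options[f"{key}_cmd"])
    if f"{key}_secret" in file_options:
        return get_secret(file_options[f"{key}_secret"])

    # 7: built-in defaults (None if the option is unknown).
    return defaults.get(section, {}).get(key)
```

In this sketch, a value set through AIRFLOW__DATABASE__SQL_ALCHEMY_CONN_CMD wins over a plain sql_alchemy_conn entry in the configuration file, because every environment variable form is checked before the file is consulted.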

Note

In Airflow versions >= 2.2.1 and < 2.3.0, Airflow’s built-in defaults took precedence over the command and secret key in airflow.cfg in some circumstances.

You can check the current configuration with the airflow config list command.

If you only want to see the value of one option, you can use the airflow config get-value command, as in the example below.

$ airflow config get-value core executor
SequentialExecutor

Note

For more information on configuration options, see Configuration Reference.

Note

See Modules Management for details on how Python and Airflow manage modules.

Note

Use the same configuration across all the Airflow components. While each component does not require every option, some options must be the same everywhere, otherwise the components will not work as expected. A good example is secret_key, which should be the same on the Webserver and the Worker so that the Webserver can fetch logs from the Worker.

The webserver key is also used to authorize requests to Celery workers when logs are retrieved. The token generated using the secret key has a short expiry time, though, so make sure that the time on ALL the machines that run Airflow components is synchronized (for example using ntpd); otherwise you might get “forbidden” errors when the logs are accessed.

Configuring local settings

Some Airflow configuration is done via local settings, because it requires changes in code that is executed when Airflow is initialized. The detailed documentation usually mentions where you can configure such local settings; this is usually done in the airflow_local_settings.py file.

You should create an airflow_local_settings.py file and put it in a directory on sys.path or in the $AIRFLOW_HOME/config folder. (Airflow adds $AIRFLOW_HOME/config to sys.path when it is initialized.)
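As an illustration, a cluster policy is one kind of setting that lives in airflow_local_settings.py. The sketch below assumes a task_policy hook that receives each task as it is loaded; the MIN_RETRIES threshold is a made-up value for this example, not an Airflow option. The type annotation is left out so that the file does not need to import Airflow, although in a real deployment the argument would be an operator instance.

```python
# Hypothetical contents of $AIRFLOW_HOME/config/airflow_local_settings.py

MIN_RETRIES = 1  # illustrative threshold chosen for this sketch


def task_policy(task) -> None:
    """Cluster policy hook: mutate a task in place as it is loaded.

    This sketch makes sure every task is retried at least MIN_RETRIES times.
    """
    if task.retries < MIN_RETRIES:
        task.retries = MIN_RETRIES
```

Because the hook only reads and writes attributes on the object it receives, it can be exercised in isolation with any object that has a retries attribute.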

You can see the example of such local settings here:

Example settings you can configure this way:

Configuring Flask Application for Airflow Webserver

Airflow uses Flask to render the web UI. When you initialize the Airflow webserver, a predefined configuration is used, based on the webserver section of the airflow.cfg file. However, you can override these settings and add extra settings by adding Flask configuration to the webserver_config.py file in your $AIRFLOW_HOME directory. This file is automatically loaded by the webserver.

For example, if you would like to change the rate limit strategy to “moving window”, you can set RATELIMIT_STRATEGY to moving-window.
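In webserver_config.py, which is a regular Python module, that setting is a single module-level assignment. A minimal sketch of the relevant line, assuming the rest of the file is left unchanged:

```python
# Fragment of $AIRFLOW_HOME/webserver_config.py
# Switch the rate limit strategy to "moving window".
RATELIMIT_STRATEGY = "moving-window"
```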

You could also enhance or modify the underlying Flask app directly, as the app context is pushed to webserver_config.py:

from flask import current_app as app


@app.before_request
def print_custom_message() -> None:
    print("Executing before every request")
