Alternative secrets backend¶
In addition to retrieving connections and variables from environment variables or the metastore database, you can enable an alternative secrets backend to retrieve Airflow connections or Airflow variables, such as AWS SSM Parameter Store, Hashicorp Vault Secrets, or GCP Secrets Manager, or you can roll your own.
Search path¶
When looking up a connection/variable, by default Airflow will search environment variables first and metastore database second.
If you enable an alternative secrets backend, it will be searched first, followed by environment variables, then metastore. This search ordering is not configurable.
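The search order can be pictured as a chain that stops at the first source returning a value. The following is a minimal sketch of that idea and not Airflow's actual implementation; the resolve function and the three dictionaries are hypothetical stand-ins for the secrets backend, the environment variables, and the metastore.

```python
from typing import Callable, List, Optional

# Each lookup takes an id and returns a value, or None when it has no entry.
Lookup = Callable[[str], Optional[str]]


def resolve(secret_id: str, lookups: List[Lookup]) -> Optional[str]:
    """Return the first non-None result, trying the lookups in order."""
    for lookup in lookups:
        value = lookup(secret_id)
        if value is not None:
            return value
    return None


# Hypothetical stand-ins for the three sources (illustration only).
secrets_backend = {"smtp_default": "smtps://user:pass@relay.example.com:465"}
environment = {"smtp_default": "smtp://env-host:25", "db": "postgresql://env-host/db"}
metastore = {"db": "postgresql://meta-host/db", "legacy": "mysql://meta-host/legacy"}

# Alternative secrets backend first, then environment variables, then metastore.
search_path = [secrets_backend.get, environment.get, metastore.get]
```

With this ordering, an entry defined in the secrets backend shadows an entry with the same id in the environment or the metastore.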
Configuration¶
The [secrets] section has the following options:
[secrets]
backend =
backend_kwargs =
Set backend to the fully qualified class name of the backend you want to enable.
You can provide backend_kwargs as JSON, and it will be passed as keyword arguments to the __init__ method of your secrets backend.
See AWS SSM Parameter Store for an example configuration.
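To make the relationship between the two options concrete, here is a rough sketch of how a dotted class name and a JSON string of kwargs could be turned into a backend instance. It illustrates the mechanism only and is not Airflow's own loader; load_backend is a hypothetical helper, and a standard-library class stands in for a real secrets backend so the example runs anywhere.

```python
import json
from importlib import import_module


def load_backend(backend: str, backend_kwargs: str):
    """Import `backend` by its dotted path and instantiate it with JSON kwargs."""
    module_path, class_name = backend.rsplit(".", 1)
    backend_cls = getattr(import_module(module_path), class_name)
    # An empty backend_kwargs value means "no extra arguments".
    kwargs = json.loads(backend_kwargs) if backend_kwargs else {}
    return backend_cls(**kwargs)


# collections.OrderedDict is only a stand-in for a secrets backend class here.
from_config = load_backend(
    "collections.OrderedDict",
    '{"connections_prefix": "/airflow/connections"}',
)
```

Because the JSON object is unpacked into keyword arguments, every key in backend_kwargs has to match a parameter name accepted by the backend's __init__ method.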
AWS SSM Parameter Store Secrets Backend¶
To enable SSM Parameter Store, specify SystemsManagerParameterStoreBackend as the backend in the [secrets] section of airflow.cfg.
Here is a sample configuration:
[secrets]
backend = airflow.contrib.secrets.aws_systems_manager.SystemsManagerParameterStoreBackend
backend_kwargs = {"connections_prefix": "/airflow/connections", "variables_prefix": "/airflow/variables", "profile_name": "default"}
Storing and Retrieving Connections¶
If you have set connections_prefix as /airflow/connections, then for a connection id of smtp_default, you would want to store your connection at /airflow/connections/smtp_default.
Optionally, you can supply a profile name to reference an AWS profile, e.g. one defined in ~/.aws/config.
The value of the SSM parameter must be the connection URI representation of the connection object.
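As an illustration of the naming rule and the URI requirement, the sketch below builds a parameter name from a prefix and an id, and assembles a connection URI from its parts. Both helpers are hypothetical and written only for this example; the URI layout (conn_type://login:password@host:port/schema with percent-encoded credentials) and the trimming of a trailing slash from the prefix are assumptions of the sketch, not behaviour confirmed by this page.

```python
from urllib.parse import quote


def ssm_parameter_name(prefix: str, secret_id: str) -> str:
    """Join a prefix such as /airflow/connections with a connection or variable id."""
    # Assumption: a trailing slash on the prefix is tolerated and trimmed.
    return f"{prefix.rstrip('/')}/{secret_id}"


def connection_uri(conn_type, host, login="", password="", port=None, schema=""):
    """Assemble a URI, percent-encoding the login and password."""
    credentials = ""
    if login:
        credentials = quote(login, safe="")
        if password:
            credentials += ":" + quote(password, safe="")
        credentials += "@"
    netloc = credentials + host + (f":{port}" if port else "")
    return f"{conn_type}://{netloc}" + (f"/{schema}" if schema else "")


name = ssm_parameter_name("/airflow/connections", "smtp_default")
value = connection_uri("smtps", "relay.example.com", "user", "p@ss/word", 465)
```

The resulting name and value are what you would store as the SSM parameter; characters such as @ and / in the password are percent-encoded so that the URI stays unambiguous.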
Storing and Retrieving Variables¶
If you have set variables_prefix as /airflow/variables, then for a Variable key of hello, you would want to store your Variable at /airflow/variables/hello.
Optionally, you can supply a profile name to reference an AWS profile, e.g. one defined in ~/.aws/config.
Hashicorp Vault Secrets Backend¶
To enable Hashicorp Vault to retrieve Airflow connections and variables, specify VaultBackend as the backend in the [secrets] section of airflow.cfg.
Here is a sample configuration:
[secrets]
backend = airflow.contrib.secrets.hashicorp_vault.VaultBackend
backend_kwargs = {"connections_path": "connections", "variables_path": "variables", "mount_point": "airflow", "url": "http://127.0.0.1:8200"}
The default KV engine version is 2. Pass kv_engine_version: 1 in backend_kwargs if you use KV Secrets Engine Version 1.
You can also pass values to the Vault client by setting environment variables. All the environment variables listed at https://www.vaultproject.io/docs/commands/#environment-variables are supported.
Hence, if you set the VAULT_ADDR environment variable as shown below, you do not need to pass the url key in backend_kwargs:
export VAULT_ADDR="http://127.0.0.1:8200"
Storing and Retrieving Connections¶
If you have set connections_path as connections and mount_point as airflow, then for a connection id of smtp_default, you would want to store your secret as:
vault kv put airflow/connections/smtp_default conn_uri=smtps://user:host@relay.example.com:465
Note that the Key is conn_uri, the Value is smtps://user:host@relay.example.com:465, and the mount_point is airflow.
You can make a mount_point for airflow as follows:
vault secrets enable -path=airflow -version=2 kv
Verify that you can get the secret from Vault:
❯ vault kv get airflow/connections/smtp_default
====== Metadata ======
Key Value
--- -----
created_time 2020-03-19T19:17:51.281721Z
deletion_time n/a
destroyed false
version 1
====== Data ======
Key Value
--- -----
conn_uri smtps://user:host@relay.example.com:465
The value stored under the conn_uri key must be the connection URI representation of the connection object.
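To see what a connection URI encodes, the following sketch splits the conn_uri value from the example above into its parts using only the standard library. The parse_conn_uri helper and the field names it returns are illustrative assumptions and are not part of Airflow or the Vault client.

```python
from urllib.parse import unquote, urlsplit


def parse_conn_uri(uri: str) -> dict:
    """Split a connection URI into the parts a connection object would hold."""
    parts = urlsplit(uri)
    return {
        "conn_type": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        # Credentials are percent-decoded; missing parts become None.
        "login": unquote(parts.username) if parts.username else None,
        "password": unquote(parts.password) if parts.password else None,
        "schema": parts.path.lstrip("/") or None,
    }


# The data returned by `vault kv get` in the example above, as a dictionary.
secret_data = {"conn_uri": "smtps://user:host@relay.example.com:465"}
parsed = parse_conn_uri(secret_data["conn_uri"])
```

Note that in the sample URI the text after the colon in the credentials section (host) is the password, and the server name is the part after the @ sign.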
Storing and Retrieving Variables¶
If you have set variables_path as variables and mount_point as airflow, then for a variable with hello as key, you would want to store your secret as:
vault kv put airflow/variables/hello value=world
Verify that you can get the secret from Vault:
❯ vault kv get airflow/variables/hello
====== Metadata ======
Key Value
--- -----
created_time 2020-03-28T02:10:54.301784Z
deletion_time n/a
destroyed false
version 1
==== Data ====
Key Value
--- -----
value world
Note that the secret Key is value, the secret Value is world, and the mount_point is airflow.
GCP Secrets Manager Backend¶
To enable GCP Secrets Manager to retrieve connections and variables, specify CloudSecretsManagerBackend as the backend in the [secrets] section of airflow.cfg.
Available parameters to backend_kwargs:
connections_prefix: Specifies the prefix of the secret to read to get Connections.
variables_prefix: Specifies the prefix of the secret to read to get Variables.
gcp_key_path: Path to GCP Credential JSON file.
gcp_scopes: Comma-separated string containing GCP scopes.
sep: Separator used to concatenate connections_prefix and conn_id. Default: "-"
Note: The full GCP Secrets Manager secret id should follow the pattern “[a-zA-Z0-9-_]”.
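The way the prefix, separator, and id combine, together with the character restriction in the note, can be sketched as follows. The build_secret_id helper is hypothetical, and the sketch assumes the pattern in the note describes the set of characters allowed anywhere in the full secret id.

```python
import re

# Assumption: every character of the full secret id must be in this set.
SECRET_ID_PATTERN = re.compile(r"[a-zA-Z0-9_-]+")


def build_secret_id(prefix: str, secret_id: str, sep: str = "-") -> str:
    """Concatenate prefix, separator, and id, rejecting disallowed characters."""
    full_id = f"{prefix}{sep}{secret_id}"
    if not SECRET_ID_PATTERN.fullmatch(full_id):
        raise ValueError(f"Invalid secret id: {full_id!r}")
    return full_id


conn_secret = build_secret_id("airflow-connections", "smtp_default")
var_secret = build_secret_id("airflow-variables", "hello")
```

Under this assumption, a separator such as "/" or a connection id containing a dot would be rejected, which is one reason to keep the default "-" separator.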
Here is a sample configuration if you want to just retrieve connections:
[secrets]
backend = airflow.contrib.secrets.gcp_secrets_manager.CloudSecretsManagerBackend
backend_kwargs = {"connections_prefix": "airflow-connections", "sep": "-"}
Here is a sample configuration if you want to just retrieve variables:
[secrets]
backend = airflow.contrib.secrets.gcp_secrets_manager.CloudSecretsManagerBackend
backend_kwargs = {"variables_prefix": "airflow-variables", "sep": "-"}
If you want to retrieve both variables and connections, use the following sample configuration:
[secrets]
backend = airflow.contrib.secrets.gcp_secrets_manager.CloudSecretsManagerBackend
backend_kwargs = {"connections_prefix": "airflow-connections", "variables_prefix": "airflow-variables", "sep": "-"}
When gcp_key_path is not provided, the backend will use the Application Default Credentials in the current environment. You can set up the credentials with:
# 1. GOOGLE_APPLICATION_CREDENTIALS environment variable
export GOOGLE_APPLICATION_CREDENTIALS=path/to/key-file.json
# 2. Set with SDK
gcloud auth application-default login
# If the Cloud SDK has an active project, the project ID is returned. The active project can be set using:
gcloud config set project
The value of the Secrets Manager secret must be the connection URI representation of the connection object.
Roll your own secrets backend¶
A secrets backend is a subclass of airflow.secrets.BaseSecretsBackend, and it must provide a working get_connections() method.
There are two options:
Option 1: a base implementation of get_connections() is provided; you just need to implement the get_conn_uri() method to make it functional.
Option 2: simply override the get_connections() method.
Just create your class and put its fully qualified class name in the backend key in the [secrets] section of airflow.cfg. You can also pass kwargs to __init__ by supplying JSON to the backend_kwargs config param. See Configuration for more details, and SSM Parameter Store for an example.
Note
If you are rolling your own secrets backend, you don’t strictly need to use airflow’s URI format. But doing so makes it easier to switch between environment variables, the metastore, and your secrets backend.
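A minimal sketch of Option 1 might look like the following. The DictSecretsBackend class and its connections argument are hypothetical and exist only for illustration; the sketch assumes that get_conn_uri() receives the connection id and returns a URI string, or None when the id is unknown. The fallback base class is a stand-in so that the example also runs where Airflow is not installed.

```python
try:
    from airflow.secrets import BaseSecretsBackend
except ImportError:
    class BaseSecretsBackend:
        """Stand-in base class, used only when Airflow is not installed."""


class DictSecretsBackend(BaseSecretsBackend):
    """Hypothetical backend that serves connection URIs from an in-memory mapping."""

    def __init__(self, connections=None, **kwargs):
        super().__init__()
        # `connections` would arrive through backend_kwargs as a JSON object.
        self.connections = dict(connections or {})

    def get_conn_uri(self, conn_id):
        # Returning None signals that this backend has no such connection.
        return self.connections.get(conn_id)


backend = DictSecretsBackend(
    connections={"smtp_default": "smtps://user:host@relay.example.com:465"}
)
```

With a class like this importable as, say, my_package.secrets.DictSecretsBackend (a made-up path), the backend option would hold that dotted name and backend_kwargs would hold the JSON for the connections argument.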