Initializing a Database Backend

If you want to take a real test drive of Airflow, you should consider setting up a real database backend and switching to the LocalExecutor.

Airflow was built to interact with its metadata using SQLAlchemy, with MySQL, Postgres and SQLite as supported backends (SQLite is used primarily for development purposes).

See also

Scheduler HA Database Requirements if you plan on running more than one scheduler

Note

We rely on stricter ANSI SQL settings for MySQL in order to have sane defaults. Make sure to specify explicit_defaults_for_timestamp=1 in your my.cnf under [mysqld].
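To illustrate the setting above, here is a minimal sketch that checks a my.cnf-style text for the option. The function name has_explicit_timestamp_defaults is hypothetical, and the sketch assumes a simple file that Python's configparser can read; real my.cnf files that use !include directives or dash-separated option names are not handled.

```python
import configparser


def has_explicit_timestamp_defaults(my_cnf_text):
    """Return True if [mysqld] explicitly enables explicit_defaults_for_timestamp."""
    # my.cnf permits bare keys without "= value", so allow_no_value is required.
    # strict=False tolerates repeated sections or options in the same file.
    parser = configparser.ConfigParser(
        allow_no_value=True, strict=False, interpolation=None
    )
    parser.read_string(my_cnf_text)
    # A missing section or option yields the fallback; a bare key yields None.
    value = parser.get("mysqld", "explicit_defaults_for_timestamp", fallback="")
    return (value or "").strip().lower() in {"1", "on", "true"}


sample = "[mysqld]\nexplicit_defaults_for_timestamp=1\n"
print(has_explicit_timestamp_defaults(sample))
```

This only inspects the file contents; it does not confirm what a running MySQL server actually uses.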

Note

If you decide to use MySQL, we recommend using the mysqlclient driver and specifying it in your SQLAlchemy connection string (e.g., mysql+mysqldb://<user>:<password>@<host>[:<port>]/<dbname>). We also support the mysql-connector-python driver (e.g., mysql+mysqlconnector://<user>:<password>@<host>[:<port>]/<dbname>), which lets you connect through SSL without any cert options provided. If you want to use another driver, see the SQLAlchemy docs for more information on downloading the driver and setting up the SQLAlchemy connection.

Note

If you decide to use Postgres, we recommend using the psycopg2 driver and specifying it in your SQLAlchemy connection string (e.g., postgresql+psycopg2://<user>:<password>@<host>/<db>). Also note that since SQLAlchemy does not expose a way to target a specific schema in the Postgres connection URI, you may want to set a default schema for your role with a command similar to ALTER ROLE username SET search_path = airflow, foobar;
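The MySQL and Postgres connection strings in the notes above share the same shape, so they can be assembled with a small helper. The sketch below is illustrative only: build_conn_uri is a hypothetical name, the credentials and host are placeholders, and it assumes you want special characters in the user name and password percent-encoded so they do not break the URI.

```python
from urllib.parse import quote_plus


def build_conn_uri(scheme, user, password, host, dbname, port=None):
    """Assemble a <scheme>://<user>:<password>@<host>[:<port>]/<dbname> URI."""
    # Percent-encode credentials so characters like "@" or "/" stay unambiguous.
    netloc = f"{quote_plus(user)}:{quote_plus(password)}@{host}"
    if port is not None:
        netloc += f":{port}"
    return f"{scheme}://{netloc}/{dbname}"


# Placeholder values for illustration; substitute your own.
mysql_uri = build_conn_uri(
    "mysql+mysqldb", "airflow", "airflow", "localhost", "airflow", port=3306
)
postgres_uri = build_conn_uri(
    "postgresql+psycopg2", "airflow", "p@ss/word", "localhost", "airflow"
)
print(mysql_uri)
print(postgres_uri)
```

The scheme argument carries both the dialect and the driver, so switching to mysql+mysqlconnector only changes the first argument.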

Set up your database to host Airflow

Create a database called airflow and a database user that Airflow will use to access this database.

Example, for MySQL:

CREATE DATABASE airflow CHARACTER SET utf8 COLLATE utf8_unicode_ci;
CREATE USER 'airflow' IDENTIFIED BY 'airflow';
GRANT ALL PRIVILEGES ON airflow.* TO 'airflow';

Example, for Postgres:

CREATE DATABASE airflow;
CREATE USER airflow WITH PASSWORD 'airflow';
GRANT ALL PRIVILEGES ON DATABASE airflow TO airflow;

You may need to update your Postgres pg_hba.conf to add the airflow user to the database access control list, and then reload the database configuration to apply your change. See The pg_hba.conf File in the Postgres documentation to learn more.

Configure Airflow's database connection string

Once you have set up your database to host Airflow, you'll need to alter the SQLAlchemy connection string in the sql_alchemy_conn option of the [core] section in your configuration file, $AIRFLOW_HOME/airflow.cfg.

You can also define the connection URI using the AIRFLOW__CORE__SQL_ALCHEMY_CONN environment variable.
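The two ways of supplying the connection string can be sketched as a simple lookup in which the environment variable, when set, takes precedence over the value in the configuration file. This is not Airflow's own loader, which does considerably more; resolve_sql_alchemy_conn is a hypothetical helper and the sample values are placeholders.

```python
import configparser
import os

ENV_VAR = "AIRFLOW__CORE__SQL_ALCHEMY_CONN"


def resolve_sql_alchemy_conn(cfg_text, environ=None):
    """Return the connection string from the environment, else from [core]."""
    environ = os.environ if environ is None else environ
    # The environment variable wins when it is present.
    if ENV_VAR in environ:
        return environ[ENV_VAR]
    # Otherwise fall back to sql_alchemy_conn in the [core] section.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    return parser.get("core", "sql_alchemy_conn", fallback=None)


# Placeholder configuration text standing in for $AIRFLOW_HOME/airflow.cfg.
cfg = "[core]\nsql_alchemy_conn = postgresql+psycopg2://airflow:airflow@localhost/airflow\n"
print(resolve_sql_alchemy_conn(cfg, environ={}))
```

Passing environ explicitly keeps the sketch easy to try without touching your real environment.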

Configure a worker that supports parallelism

You should then also change the executor option in the [core] section to use LocalExecutor, an executor that can parallelize task instances locally.
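Before switching executors, it can help to confirm that the connection string really points at a non-SQLite backend. The sketch below is a hypothetical pre-flight check, not part of Airflow; it assumes, as the opening of this section suggests, that LocalExecutor should be paired with MySQL or Postgres rather than SQLite.

```python
def preflight(executor, sql_alchemy_conn):
    """Return a list of warnings for an executor/backend pairing."""
    warnings = []
    # The part before "://" is "<dialect>" or "<dialect>+<driver>".
    scheme = sql_alchemy_conn.split("://", 1)[0]
    dialect = scheme.split("+", 1)[0]
    # Assumption: a SQLite backend is not suitable for a parallel executor.
    if executor == "LocalExecutor" and dialect == "sqlite":
        warnings.append(
            "LocalExecutor is configured but sql_alchemy_conn points at SQLite"
        )
    return warnings


# Placeholder values for illustration.
print(preflight("LocalExecutor", "sqlite:////tmp/airflow.db"))
print(preflight("LocalExecutor", "postgresql+psycopg2://airflow:airflow@localhost/airflow"))
```

Airflow performs its own configuration validation at startup, so a check like this is only a convenience for catching the mismatch earlier.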

Initialize the database

# initialize the database
airflow db init
