Command Line Interface Reference¶
Airflow has a rich command line interface that supports many types of operations on a DAG, starts services, and aids development and testing.
usage: airflow [-h]
{backfill,list_dag_runs,list_tasks,clear,pause,unpause,trigger_dag,delete_dag,show_dag,pool,variables,kerberos,render,run,initdb,list_dags,dag_state,task_failed_deps,task_state,serve_logs,test,webserver,resetdb,upgradedb,checkdb,shell,scheduler,worker,flower,version,connections,create_user,delete_user,list_users,sync_perm,next_execution,rotate_fernet_key,config,info}
...
Positional Arguments¶
- subcommand
Possible choices: backfill, list_dag_runs, list_tasks, clear, pause, unpause, trigger_dag, delete_dag, show_dag, pool, variables, kerberos, render, run, initdb, list_dags, dag_state, task_failed_deps, task_state, serve_logs, test, webserver, resetdb, upgradedb, checkdb, shell, scheduler, worker, flower, version, connections, create_user, delete_user, list_users, sync_perm, next_execution, rotate_fernet_key, config, info
sub-command help
Sub-commands:¶
backfill¶
Run subsections of a DAG for a specified date range. If the reset_dagruns option is used, backfill first prompts the user to confirm whether Airflow should clear all previous dag_runs and task_instances within the backfill date range. If the rerun_failed_tasks option is used, backfill automatically re-runs the previously failed task instances within the backfill date range.
airflow backfill [-h] [-t TASK_REGEX] [-s START_DATE] [-e END_DATE] [-m] [-l]
[-x] [-y] [-i] [-I] [-sd SUBDIR] [--pool POOL]
[--delay_on_limit DELAY_ON_LIMIT] [-dr] [-v] [-c CONF]
[--reset_dagruns] [--rerun_failed_tasks] [-B]
dag_id
Positional Arguments¶
- dag_id
The id of the dag
Named Arguments¶
- -t, --task_regex
The regex to filter specific task_ids to backfill (optional)
- -s, --start_date
Override start_date YYYY-MM-DD
- -e, --end_date
Override end_date YYYY-MM-DD
- -m, --mark_success
Mark jobs as succeeded without running them
Default: False
- -l, --local
Run the task using the LocalExecutor
Default: False
- -x, --donot_pickle
Do not attempt to pickle the DAG object to send over to the workers, just tell the workers to run their version of the code.
Default: False
- -y, --yes
Do not prompt to confirm reset. Use with care!
Default: False
- -i, --ignore_dependencies
Skip upstream tasks, run only the tasks matching the regexp. Only works in conjunction with task_regex
Default: False
- -I, --ignore_first_depends_on_past
Ignores depends_on_past dependencies for the first set of tasks only (subsequent executions in the backfill DO respect depends_on_past).
Default: False
- -sd, --subdir
File location or directory from which to look for the dag. Defaults to ‘[AIRFLOW_HOME]/dags’, where [AIRFLOW_HOME] is the value of the ‘AIRFLOW_HOME’ setting in ‘airflow.cfg’
Default: “[AIRFLOW_HOME]/dags”
- --pool
Resource pool to use
- --delay_on_limit
Amount of time in seconds to wait when the limit on maximum active dag runs (max_active_runs) has been reached before trying to execute a dag run again.
Default: 1.0
- -dr, --dry_run
Perform a dry run for each task. Only renders Template Fields for each task, nothing else
Default: False
- -v, --verbose
Make logging output more verbose
Default: False
- -c, --conf
JSON string that gets pickled into the DagRun’s conf attribute
- --reset_dagruns
If set, the backfill will delete existing backfill-related DAG runs and start anew with fresh, running DAG runs
Default: False
- --rerun_failed_tasks
If set, the backfill will automatically rerun all failed tasks in the backfill date range instead of throwing exceptions
Default: False
- -B, --run_backwards
If set, the backfill will run tasks from the most recent day first. If any task sets depends_on_past, this option will throw an exception
Default: False
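The flags above can be combined as in the following minimal sketch, which only assembles and prints a backfill command rather than running it. The helper name `backfill_command`, the DAG id `example_dag`, the dates, and the task regex are all hypothetical placeholders.

```python
import shlex

def backfill_command(dag_id, start_date, end_date, task_regex=None, rerun_failed=False):
    """Build the argv list for `airflow backfill` from the flags documented above."""
    argv = ["airflow", "backfill", "-s", start_date, "-e", end_date]
    if task_regex is not None:
        argv += ["-t", task_regex]            # restrict the backfill to matching task_ids
    if rerun_failed:
        argv.append("--rerun_failed_tasks")   # re-run failed tasks instead of raising
    argv.append(dag_id)                       # dag_id is the positional argument
    return argv

# Placeholder values, not taken from a real deployment.
argv = backfill_command("example_dag", "2020-01-01", "2020-01-07",
                        task_regex="^load_", rerun_failed=True)
print(" ".join(shlex.quote(a) for a in argv))
```

The resulting list could be passed to `subprocess.run(argv)` on a machine where Airflow is installed.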
list_dag_runs¶
List dag runs given a DAG id. If the state option is given, it will only search for dagruns with the given state. If the no_backfill option is given, it will filter out all backfill dagruns for the given dag id.
airflow list_dag_runs [-h] [--no_backfill] [--state STATE] dag_id
Positional Arguments¶
- dag_id
The id of the dag
Named Arguments¶
- --no_backfill
Filter out all backfill dagruns for the given dag id
Default: False
- --state
Only list the dag runs corresponding to the state
list_tasks¶
List the tasks within a DAG
airflow list_tasks [-h] [-t] [-sd SUBDIR] dag_id
Positional Arguments¶
- dag_id
The id of the dag
Named Arguments¶
- -t, --tree
Tree view
Default: False
- -sd, --subdir
File location or directory from which to look for the dag. Defaults to ‘[AIRFLOW_HOME]/dags’, where [AIRFLOW_HOME] is the value of the ‘AIRFLOW_HOME’ setting in ‘airflow.cfg’
Default: “[AIRFLOW_HOME]/dags”
clear¶
Clear a set of task instances, as if they never ran
airflow clear [-h] [-t TASK_REGEX] [-s START_DATE] [-e END_DATE] [-sd SUBDIR]
[-u] [-d] [-c] [-f] [-r] [-x] [-xp] [-dx]
dag_id
Positional Arguments¶
- dag_id
The id of the dag
Named Arguments¶
- -t, --task_regex
The regex to filter specific task_ids to clear (optional)
- -s, --start_date
Override start_date YYYY-MM-DD
- -e, --end_date
Override end_date YYYY-MM-DD
- -sd, --subdir
File location or directory from which to look for the dag. Defaults to ‘[AIRFLOW_HOME]/dags’, where [AIRFLOW_HOME] is the value of the ‘AIRFLOW_HOME’ setting in ‘airflow.cfg’
Default: “[AIRFLOW_HOME]/dags”
- -u, --upstream
Include upstream tasks
Default: False
- -d, --downstream
Include downstream tasks
Default: False
- -c, --no_confirm
Do not request confirmation
Default: False
- -f, --only_failed
Only failed jobs
Default: False
- -r, --only_running
Only running jobs
Default: False
- -x, --exclude_subdags
Exclude subdags
Default: False
- -xp, --exclude_parentdag
Exclude ParentDAGS if the task cleared is a part of a SubDAG
Default: False
- -dx, --dag_regex
Search dag_id as regex instead of exact string
Default: False
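As a rough illustration of how the clear flags compose, the sketch below builds a command that clears only failed task instances matching a regex, together with their downstream tasks. The helper `clear_command`, the DAG id, and the regex are hypothetical; nothing is executed.

```python
import shlex

def clear_command(dag_id, task_regex=None, only_failed=False,
                  downstream=False, no_confirm=False):
    """Build the argv list for `airflow clear` from the flags documented above."""
    argv = ["airflow", "clear"]
    if task_regex is not None:
        argv += ["-t", task_regex]
    if only_failed:
        argv.append("-f")   # --only_failed
    if downstream:
        argv.append("-d")   # --downstream
    if no_confirm:
        argv.append("-c")   # --no_confirm: skips the confirmation prompt
    argv.append(dag_id)
    return argv

argv = clear_command("example_dag", task_regex="^load_",
                     only_failed=True, downstream=True)
print(" ".join(shlex.quote(a) for a in argv))
```

Leaving `no_confirm` off by default keeps the interactive confirmation in place, which is the safer choice for a destructive command.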
pause¶
Pause a DAG
airflow pause [-h] [-sd SUBDIR] dag_id
Positional Arguments¶
- dag_id
The id of the dag
Named Arguments¶
- -sd, --subdir
File location or directory from which to look for the dag. Defaults to ‘[AIRFLOW_HOME]/dags’, where [AIRFLOW_HOME] is the value of the ‘AIRFLOW_HOME’ setting in ‘airflow.cfg’
Default: “[AIRFLOW_HOME]/dags”
unpause¶
Resume a paused DAG
airflow unpause [-h] [-sd SUBDIR] dag_id
Positional Arguments¶
- dag_id
The id of the dag
Named Arguments¶
- -sd, --subdir
File location or directory from which to look for the dag. Defaults to ‘[AIRFLOW_HOME]/dags’, where [AIRFLOW_HOME] is the value of the ‘AIRFLOW_HOME’ setting in ‘airflow.cfg’
Default: “[AIRFLOW_HOME]/dags”
trigger_dag¶
Trigger a DAG run
airflow trigger_dag [-h] [-sd SUBDIR] [-r RUN_ID] [-c CONF] [-e EXEC_DATE]
dag_id
Positional Arguments¶
- dag_id
The id of the dag
Named Arguments¶
- -sd, --subdir
File location or directory from which to look for the dag. Defaults to ‘[AIRFLOW_HOME]/dags’, where [AIRFLOW_HOME] is the value of the ‘AIRFLOW_HOME’ setting in ‘airflow.cfg’
Default: “[AIRFLOW_HOME]/dags”
- -r, --run_id
Helps to identify this run
- -c, --conf
JSON string that gets pickled into the DagRun’s conf attribute
- -e, --exec_date
The execution date of the DAG
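Because `--conf` expects a JSON string, it is usually easier to serialize a dictionary than to hand-write the JSON. The following sketch assumes a hypothetical DAG id, run id, and conf payload, and only prints the command.

```python
import json
import shlex

# Hypothetical payload; the keys are illustrative, not required by Airflow.
conf = {"source": "manual", "limit": 10}

argv = [
    "airflow", "trigger_dag",
    "-r", "manual_run_1",       # run_id that helps identify this run
    "-c", json.dumps(conf),     # conf must be passed as a JSON string
    "example_dag",
]

# shlex.quote protects the JSON braces and quotes if the command is pasted into a shell.
print(" ".join(shlex.quote(a) for a in argv))
```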
delete_dag¶
Delete all DB records related to the specified DAG
airflow delete_dag [-h] [-y] dag_id
Positional Arguments¶
- dag_id
The id of the dag
Named Arguments¶
- -y, --yes
Do not prompt to confirm reset. Use with care!
Default: False
show_dag¶
Displays DAG’s tasks with their dependencies
airflow show_dag [-h] [-sd SUBDIR] [-s SAVE] [--imgcat] dag_id
Positional Arguments¶
- dag_id
The id of the dag
Named Arguments¶
- -sd, --subdir
File location or directory from which to look for the dag. Defaults to ‘[AIRFLOW_HOME]/dags’, where [AIRFLOW_HOME] is the value of the ‘AIRFLOW_HOME’ setting in ‘airflow.cfg’
Default: “[AIRFLOW_HOME]/dags”
- -s, --save
Saves the result to the indicated file.
The file format is determined by the file extension. For more information about supported format, see: https://www.graphviz.org/doc/info/output.html
If you want to create a PNG file then you should execute the following command: airflow show_dag <DAG_ID> --save output.png
If you want to create a DOT file then you should execute the following command: airflow show_dag <DAG_ID> --save output.dot
- --imgcat
Displays graph using the imgcat tool.
For more information, see: https://www.iterm2.com/documentation-images.html
Default: False
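Since the output format follows the file extension passed to `--save`, the same helper can produce either a PNG or a DOT file. This is a minimal sketch with a hypothetical helper `show_dag_command` and a placeholder DAG id; it prints the commands without running them.

```python
import os
import shlex

def show_dag_command(dag_id, save_path=None):
    """Build the argv list for `airflow show_dag`, optionally saving to a file."""
    argv = ["airflow", "show_dag"]
    if save_path is not None:
        argv += ["--save", save_path]
    argv.append(dag_id)
    return argv

for path in ("output.png", "output.dot"):
    # The extension is what selects the output format.
    extension = os.path.splitext(path)[1].lstrip(".")
    argv = show_dag_command("example_dag", save_path=path)
    print(extension, "->", " ".join(shlex.quote(a) for a in argv))
```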
pool¶
CRUD operations on pools
airflow pool [-h] [-s NAME SLOT_COUNT POOL_DESCRIPTION] [-g NAME] [-x NAME]
[-i FILEPATH] [-e FILEPATH]
Named Arguments¶
- -s, --set
Set pool slot count and description, respectively
- -g, --get
Get pool info
- -x, --delete
Delete a pool
- -i, --import
Import pool from JSON file
- -e, --export
Export pool to JSON file
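The `--set` option takes three values in a fixed order (name, slot count, description). The sketch below shows one way to assemble the set, get, and delete forms; the helper names and the pool `etl_pool` are hypothetical.

```python
import shlex

def pool_set_command(name, slot_count, description):
    """`-s` takes NAME SLOT_COUNT POOL_DESCRIPTION, in that order."""
    return ["airflow", "pool", "-s", name, str(slot_count), description]

def pool_get_command(name):
    return ["airflow", "pool", "-g", name]

def pool_delete_command(name):
    return ["airflow", "pool", "-x", name]

commands = [
    pool_set_command("etl_pool", 5, "Slots for ETL tasks"),
    pool_get_command("etl_pool"),
    pool_delete_command("etl_pool"),
]
for argv in commands:
    print(" ".join(shlex.quote(a) for a in argv))
```

Passing the description as a single list element keeps its spaces intact when the list is handed to `subprocess.run`.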
variables¶
CRUD operations on variables
airflow variables [-h] [-s KEY VAL] [-g KEY] [-j] [-d VAL] [-i FILEPATH]
[-e FILEPATH] [-x KEY]
Named Arguments¶
- -s, --set
Set a variable
- -g, --get
Get value of a variable
- -j, --json
Deserialize JSON variable
Default: False
- -d, --default
Default value returned if variable does not exist
- -i, --import
Import variables from JSON file
- -e, --export
Export variables to JSON file
- -x, --delete
Delete a variable
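The set and get options can be paired as in this minimal sketch, which stores a dictionary as a JSON string and reads it back with `-j`. The helper names, the key `etl_settings`, and the serialization of non-string values with `json.dumps` are assumptions made for illustration.

```python
import json
import shlex

def variable_set_command(key, value):
    """`-s` takes KEY VAL; non-string values are serialized to JSON here (an assumption)."""
    val = value if isinstance(value, str) else json.dumps(value)
    return ["airflow", "variables", "-s", key, val]

def variable_get_command(key, as_json=False, default=None):
    argv = ["airflow", "variables", "-g", key]
    if as_json:
        argv.append("-j")          # deserialize the stored value as JSON
    if default is not None:
        argv += ["-d", default]    # returned if the variable does not exist
    return argv

set_argv = variable_set_command("etl_settings", {"batch_size": 100})
get_argv = variable_get_command("etl_settings", as_json=True, default="{}")
for argv in (set_argv, get_argv):
    print(" ".join(shlex.quote(a) for a in argv))
```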
kerberos¶
Start a kerberos ticket renewer
airflow kerberos [-h] [-kt [KEYTAB]] [--pid [PID]] [-D] [--stdout STDOUT]
[--stderr STDERR] [-l LOG_FILE]
[principal]
Positional Arguments¶
- principal
kerberos principal
Named Arguments¶
- -kt, --keytab
keytab
Default: “airflow.keytab”
- --pid
PID file location
- -D, --daemon
Daemonize instead of running in the foreground
Default: False
- --stdout
Redirect stdout to this file
- --stderr
Redirect stderr to this file
- -l, --log-file
Location of the log file
render¶
Render a task instance’s template(s)
airflow render [-h] [-sd SUBDIR] dag_id task_id execution_date
Positional Arguments¶
- dag_id
The id of the dag
- task_id
The id of the task
- execution_date
The execution date of the DAG
Named Arguments¶
- -sd, --subdir
File location or directory from which to look for the dag. Defaults to ‘[AIRFLOW_HOME]/dags’, where [AIRFLOW_HOME] is the value of the ‘AIRFLOW_HOME’ setting in ‘airflow.cfg’
Default: “[AIRFLOW_HOME]/dags”
run¶
Run a single task instance
airflow run [-h] [-sd SUBDIR] [-m] [-f] [--pool POOL] [--cfg_path CFG_PATH]
[-l] [-A] [-i] [-I] [--ship_dag] [-p PICKLE] [-int]
dag_id task_id execution_date
Positional Arguments¶
- dag_id
The id of the dag
- task_id
The id of the task
- execution_date
The execution date of the DAG
Named Arguments¶
- -sd, --subdir
File location or directory from which to look for the dag. Defaults to ‘[AIRFLOW_HOME]/dags’, where [AIRFLOW_HOME] is the value of the ‘AIRFLOW_HOME’ setting in ‘airflow.cfg’
Default: “[AIRFLOW_HOME]/dags”
- -m, --mark_success
Mark jobs as succeeded without running them
Default: False
- -f, --force
Ignore previous task instance state and rerun regardless of whether the task already succeeded or failed
Default: False
- --pool
Resource pool to use
- --cfg_path
Path to config file to use instead of airflow.cfg
- -l, --local
Run the task using the LocalExecutor
Default: False
- -A, --ignore_all_dependencies
Ignores all non-critical dependencies, including ignore_ti_state and ignore_task_deps
Default: False
- -i, --ignore_dependencies
Ignore task-specific dependencies, e.g. upstream, depends_on_past, and retry delay dependencies
Default: False
- -I, --ignore_depends_on_past
Ignore depends_on_past dependencies (but respect upstream dependencies)
Default: False
- --ship_dag
Pickles (serializes) the DAG and ships it to the worker
Default: False
- -p, --pickle
Serialized pickle object of the entire dag (used internally)
- -int, --interactive
Do not capture standard output and error streams (useful for interactive debugging)
Default: False
list_dags¶
List all the DAGs
airflow list_dags [-h] [-sd SUBDIR] [-r]
Named Arguments¶
- -sd, --subdir
File location or directory from which to look for the dag. Defaults to ‘[AIRFLOW_HOME]/dags’, where [AIRFLOW_HOME] is the value of the ‘AIRFLOW_HOME’ setting in ‘airflow.cfg’
Default: “[AIRFLOW_HOME]/dags”
- -r, --report
Show DagBag loading report
Default: False
dag_state¶
Get the status of a dag run
airflow dag_state [-h] [-sd SUBDIR] dag_id execution_date
Positional Arguments¶
- dag_id
The id of the dag
- execution_date
The execution date of the DAG
Named Arguments¶
- -sd, --subdir
File location or directory from which to look for the dag. Defaults to ‘[AIRFLOW_HOME]/dags’, where [AIRFLOW_HOME] is the value of the ‘AIRFLOW_HOME’ setting in ‘airflow.cfg’
Default: “[AIRFLOW_HOME]/dags”
task_failed_deps¶
Returns the unmet dependencies for a task instance from the perspective of the scheduler. In other words, it explains why a task instance doesn’t get scheduled, then queued by the scheduler, and then run by an executor.
airflow task_failed_deps [-h] [-sd SUBDIR] dag_id task_id execution_date
Positional Arguments¶
- dag_id
The id of the dag
- task_id
The id of the task
- execution_date
The execution date of the DAG
Named Arguments¶
- -sd, --subdir
File location or directory from which to look for the dag. Defaults to ‘[AIRFLOW_HOME]/dags’, where [AIRFLOW_HOME] is the value of the ‘AIRFLOW_HOME’ setting in ‘airflow.cfg’
Default: “[AIRFLOW_HOME]/dags”
task_state¶
Get the status of a task instance
airflow task_state [-h] [-sd SUBDIR] dag_id task_id execution_date
Positional Arguments¶
- dag_id
The id of the dag
- task_id
The id of the task
- execution_date
The execution date of the DAG
Named Arguments¶
- -sd, --subdir
File location or directory from which to look for the dag. Defaults to ‘[AIRFLOW_HOME]/dags’, where [AIRFLOW_HOME] is the value of the ‘AIRFLOW_HOME’ setting in ‘airflow.cfg’
Default: “[AIRFLOW_HOME]/dags”
test¶
Test a task instance. This will run a task without checking for dependencies or recording its state in the database.
airflow test [-h] [-sd SUBDIR] [-dr] [-tp TASK_PARAMS] [-pm]
dag_id task_id execution_date
Positional Arguments¶
- dag_id
The id of the dag
- task_id
The id of the task
- execution_date
The execution date of the DAG
Named Arguments¶
- -sd, --subdir
File location or directory from which to look for the dag. Defaults to ‘[AIRFLOW_HOME]/dags’, where [AIRFLOW_HOME] is the value of the ‘AIRFLOW_HOME’ setting in ‘airflow.cfg’
Default: “[AIRFLOW_HOME]/dags”
- -dr, --dry_run
Perform a dry run for each task. Only renders Template Fields for each task, nothing else
Default: False
- -tp, --task_params
Sends a JSON params dict to the task
- -pm, --post_mortem
Open debugger on uncaught exception
Default: False
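A test invocation with task parameters might be assembled as in the sketch below. The helper `task_test_command`, the DAG and task ids, the parameter dictionary, and the `YYYY-MM-DD` execution date are placeholders chosen for illustration; the command is printed, not run.

```python
import json
import shlex

def task_test_command(dag_id, task_id, execution_date,
                      task_params=None, dry_run=False, post_mortem=False):
    """Build the argv list for `airflow test` from the flags documented above."""
    argv = ["airflow", "test"]
    if dry_run:
        argv.append("-dr")                          # only render template fields
    if task_params is not None:
        argv += ["-tp", json.dumps(task_params)]    # params dict sent as JSON
    if post_mortem:
        argv.append("-pm")                          # open debugger on uncaught exception
    argv += [dag_id, task_id, execution_date]       # the three positional arguments
    return argv

argv = task_test_command("example_dag", "load_table", "2020-01-01",
                         task_params={"table": "events"})
print(" ".join(shlex.quote(a) for a in argv))
```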
webserver¶
Start an Airflow webserver instance
airflow webserver [-h] [-p PORT] [-w WORKERS]
[-k {sync,eventlet,gevent,tornado}] [-t WORKER_TIMEOUT]
[-hn HOSTNAME] [--pid [PID]] [-D] [--stdout STDOUT]
[--stderr STDERR] [-A ACCESS_LOGFILE] [-E ERROR_LOGFILE]
[-l LOG_FILE] [--ssl_cert SSL_CERT] [--ssl_key SSL_KEY] [-d]
Named Arguments¶
- -p, --port
The port on which to run the server
Default: 8080
- -w, --workers
Number of workers to run the webserver on
Default: 4
- -k, --workerclass
Possible choices: sync, eventlet, gevent, tornado
The worker class to use for Gunicorn
Default: “sync”
- -t, --worker_timeout
The timeout for waiting on webserver workers
Default: 120
- -hn, --hostname
Set the hostname on which to run the web server
Default: “0.0.0.0”
- --pid
PID file location
- -D, --daemon
Daemonize instead of running in the foreground
Default: False
- --stdout
Redirect stdout to this file
- --stderr
Redirect stderr to this file
- -A, --access_logfile
The logfile to store the webserver access log. Use ‘-’ to print to stderr.
Default: “/Users/kaxilnaik/airflow/logs/webserver/webserver_logs.log”
- -E, --error_logfile
The logfile to store the webserver error log. Use ‘-’ to print to stderr.
Default: “-”
- -l, --log-file
Location of the log file
- --ssl_cert
Path to the SSL certificate for the webserver
- --ssl_key
Path to the key to use with the SSL certificate
- -d, --debug
Use the server that ships with Flask in debug mode
Default: False
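The following sketch builds a webserver command and checks the worker class against the choices listed above before assembling it. The helper name `webserver_command` is hypothetical; the defaults mirror the documented ones.

```python
import shlex

# The choices accepted by -k/--workerclass, as listed above.
WORKER_CLASSES = ("sync", "eventlet", "gevent", "tornado")

def webserver_command(port=8080, workers=4, workerclass="sync", daemon=False):
    """Build the argv list for `airflow webserver`."""
    if workerclass not in WORKER_CLASSES:
        raise ValueError("workerclass must be one of %s" % ", ".join(WORKER_CLASSES))
    argv = ["airflow", "webserver",
            "-p", str(port), "-w", str(workers), "-k", workerclass]
    if daemon:
        argv.append("-D")   # daemonize instead of running in the foreground
    return argv

argv = webserver_command(port=8081, workers=2, workerclass="gevent", daemon=True)
print(" ".join(shlex.quote(a) for a in argv))
```

Validating the worker class up front gives a clearer error than waiting for the CLI to reject the argument.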
resetdb¶
Burn down and rebuild the metadata database
airflow resetdb [-h] [-y]
Named Arguments¶
- -y, --yes
Do not prompt to confirm reset. Use with care!
Default: False
scheduler¶
Start a scheduler instance
airflow scheduler [-h] [-d DAG_ID] [-sd SUBDIR] [-r RUN_DURATION]
[-n NUM_RUNS] [-p] [--pid [PID]] [-D] [--stdout STDOUT]
[--stderr STDERR] [-l LOG_FILE]
Named Arguments¶
- -d, --dag_id
The id of the dag to run
- -sd, --subdir
File location or directory from which to look for the dag. Defaults to ‘[AIRFLOW_HOME]/dags’, where [AIRFLOW_HOME] is the value of the ‘AIRFLOW_HOME’ setting in ‘airflow.cfg’
Default: “[AIRFLOW_HOME]/dags”
- -r, --run-duration
Set number of seconds to execute before exiting
- -n, --num_runs
Set the number of runs to execute before exiting
Default: -1
- -p, --do_pickle
Attempt to pickle the DAG object to send over to the workers, instead of letting workers run their version of the code.
Default: False
- --pid
PID file location
- -D, --daemon
Daemonize instead of running in the foreground
Default: False
- --stdout
Redirect stdout to this file
- --stderr
Redirect stderr to this file
- -l, --log-file
Location of the log file
worker¶
Start a Celery worker node
airflow worker [-h] [-p] [-q QUEUES] [-c CONCURRENCY] [-cn CELERY_HOSTNAME]
[--pid [PID]] [-D] [--stdout STDOUT] [--stderr STDERR]
[-l LOG_FILE] [-a AUTOSCALE] [-s]
Named Arguments¶
- -p, --do_pickle
Attempt to pickle the DAG object to send over to the workers, instead of letting workers run their version of the code.
Default: False
- -q, --queues
Comma delimited list of queues to serve
Default: “default”
- -c, --concurrency
The number of worker processes
Default: 8
- -cn, --celery_hostname
Set the hostname of celery worker if you have multiple workers on a single machine.
- --pid
PID file location
- -D, --daemon
Daemonize instead of running in the foreground
Default: False
- --stdout
Redirect stdout to this file
- --stderr
Redirect stderr to this file
- -l, --log-file
Location of the log file
- -a, --autoscale
Minimum and maximum number of workers to autoscale
- -s, --skip_serve_logs
Don’t start the serve logs process along with the workers.
Default: False
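Because `--queues` takes a comma-delimited list, a worker that serves several queues can be described with a sequence and joined at the last step. In this sketch the helper `worker_command` and the queue names `etl` and `reports` are hypothetical.

```python
import shlex

def worker_command(queues=("default",), concurrency=8, skip_serve_logs=False):
    """Build the argv list for `airflow worker`."""
    argv = ["airflow", "worker",
            "-q", ",".join(queues),     # comma-delimited list of queues to serve
            "-c", str(concurrency)]     # number of worker processes
    if skip_serve_logs:
        argv.append("-s")               # do not start the serve logs process
    return argv

argv = worker_command(queues=("etl", "reports"), concurrency=4)
print(" ".join(shlex.quote(a) for a in argv))
```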
flower¶
Start a Celery Flower
airflow flower [-h] [-hn HOSTNAME] [-p PORT] [-fc FLOWER_CONF] [-u URL_PREFIX]
[-ba BASIC_AUTH] [-a BROKER_API] [--pid [PID]] [-D]
[--stdout STDOUT] [--stderr STDERR] [-l LOG_FILE]
Named Arguments¶
- -hn, --hostname
Set the hostname on which to run the server
Default: “0.0.0.0”
- -p, --port
The port on which to run the server
Default: 5555
- -fc, --flower_conf
Configuration file for flower
- -u, --url_prefix
URL prefix for Flower
- -ba, --basic_auth
Securing Flower with Basic Authentication. Accepts user:password pairs separated by a comma. Example: flower_basic_auth = user1:password1,user2:password2
- -a, --broker_api
Broker api
- --pid
PID file location
- -D, --daemon
Daemonize instead of running in the foreground
Default: False
- --stdout
Redirect stdout to this file
- --stderr
Redirect stderr to this file
- -l, --log-file
Location of the log file
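The `--basic_auth` value is a comma-separated list of `user:password` pairs, which can be built from a list of credentials as sketched here. The helper `basic_auth_value` and the credentials are hypothetical, and rejecting commas and colons in user names is a defensive assumption of this sketch rather than documented Flower behavior.

```python
import shlex

def basic_auth_value(credentials):
    """Join (user, password) tuples into the user:password,user:password form."""
    pairs = []
    for user, password in credentials:
        # Assumption: these characters would make the joined value ambiguous.
        if ":" in user or "," in user or "," in password:
            raise ValueError("unsupported character in credentials for %r" % user)
        pairs.append("%s:%s" % (user, password))
    return ",".join(pairs)

value = basic_auth_value([("user1", "password1"), ("user2", "password2")])
argv = ["airflow", "flower", "-p", "5555", "-ba", value]
print(" ".join(shlex.quote(a) for a in argv))
```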
connections¶
List/Add/Delete connections
airflow connections [-h] [-l] [-a] [-d] [--conn_id CONN_ID]
[--conn_uri CONN_URI] [--conn_extra CONN_EXTRA]
[--conn_type CONN_TYPE] [--conn_host CONN_HOST]
[--conn_login CONN_LOGIN] [--conn_password CONN_PASSWORD]
[--conn_schema CONN_SCHEMA] [--conn_port CONN_PORT]
Named Arguments¶
- -l, --list
List all connections
Default: False
- -a, --add
Add a connection
Default: False
- -d, --delete
Delete a connection
Default: False
- --conn_id
Connection id, required to add/delete a connection
- --conn_uri
Connection URI, required to add a connection without conn_type
- --conn_extra
Connection Extra field, optional when adding a connection
- --conn_type
Connection type, required to add a connection without conn_uri
- --conn_host
Connection host, optional when adding a connection
- --conn_login
Connection login, optional when adding a connection
- --conn_password
Connection password, optional when adding a connection
- --conn_schema
Connection schema, optional when adding a connection
- --conn_port
Connection port, optional when adding a connection
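The requirements above (a connection id is always needed, plus either a URI or a connection type) can be checked before the command is assembled. This is a minimal sketch; the helper `connection_add_command`, the connection id, and the URI are hypothetical values.

```python
import shlex

def connection_add_command(conn_id, conn_uri=None, conn_type=None,
                           conn_host=None, conn_port=None):
    """Build `airflow connections --add`, enforcing the documented requirements."""
    if not conn_id:
        raise ValueError("conn_id is required to add a connection")
    if conn_uri is None and conn_type is None:
        raise ValueError("either conn_uri or conn_type is required")
    argv = ["airflow", "connections", "-a", "--conn_id", conn_id]
    if conn_uri is not None:
        argv += ["--conn_uri", conn_uri]
    else:
        argv += ["--conn_type", conn_type]
        if conn_host is not None:
            argv += ["--conn_host", conn_host]
        if conn_port is not None:
            argv += ["--conn_port", str(conn_port)]
    return argv

by_uri = connection_add_command(
    "analytics_db", conn_uri="postgres://user:secret@db.example.com:5432/analytics")
by_type = connection_add_command(
    "analytics_db", conn_type="postgres", conn_host="db.example.com", conn_port=5432)
for argv in (by_uri, by_type):
    print(" ".join(shlex.quote(a) for a in argv))
```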
create_user¶
Create an account for the Web UI (FAB-based)
airflow create_user [-h] [-r ROLE] [-u USERNAME] [-e EMAIL] [-f FIRSTNAME]
[-l LASTNAME] [-p PASSWORD] [--use_random_password]
Named Arguments¶
- -r, --role
Role of the user. Existing roles include Admin, User, Op, Viewer, and Public
- -u, --username
Username of the user
- -e, --email
Email of the user
- -f, --firstname
First name of the user
- -l, --lastname
Last name of the user
- -p, --password
Password of the user
- --use_random_password
Do not prompt for password. Use random string instead
Default: False
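A user can be created either with an explicit password or with `--use_random_password`. The sketch below picks between the two; the helper `create_user_command` and all account details are hypothetical placeholders.

```python
import shlex

def create_user_command(username, email, firstname, lastname, role, password=None):
    """Build the argv list for `airflow create_user`."""
    argv = ["airflow", "create_user",
            "-r", role, "-u", username, "-e", email,
            "-f", firstname, "-l", lastname]
    if password is None:
        argv.append("--use_random_password")   # avoids the password prompt
    else:
        argv += ["-p", password]
    return argv

argv = create_user_command("jdoe", "jdoe@example.com", "Jane", "Doe", "Viewer")
print(" ".join(shlex.quote(a) for a in argv))
```

Passing a password with `-p` may expose it in shell history or process listings, so the random-password form is used as the default here.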
delete_user¶
Delete an account for the Web UI
airflow delete_user [-h] [-u USERNAME]
Named Arguments¶
- -u, --username
Username of the user
next_execution¶
Get the next execution datetime of a DAG.
airflow next_execution [-h] [-sd SUBDIR] dag_id
Positional Arguments¶
- dag_id
The id of the dag
Named Arguments¶
- -sd, --subdir
File location or directory from which to look for the dag. Defaults to ‘[AIRFLOW_HOME]/dags’, where [AIRFLOW_HOME] is the value of the ‘AIRFLOW_HOME’ setting in ‘airflow.cfg’
Default: “[AIRFLOW_HOME]/dags”
rotate_fernet_key¶
Rotate all encrypted connection credentials and variables; see https://airflow.readthedocs.io/en/stable/howto/secure-connections.html#rotating-encryption-keys.
airflow rotate_fernet_key [-h]
config¶
Show current application configuration
airflow config [-h] [--color {on,auto,off}]
Named Arguments¶
- --color
Possible choices: on, auto, off
Whether to emit colored output (default: auto)
Default: “auto”
info¶
Show information about current Airflow and environment
airflow info [-h] [--anonymize] [--file-io]
Named Arguments¶
- --anonymize
Minimize any personally identifiable information. Use it when sharing output with others.
Default: False
- --file-io
Send output to the file.io service and return a link.
Default: False