Metrics
Airflow can be set up to send metrics to StatsD.
Setup
First, install the statsd extra:
pip install 'apache-airflow[statsd]'
Note
In November 2020, a new version of pip (20.3) was released with a new 2020 resolver. This resolver
does not yet work with Apache Airflow and might lead to installation errors, depending on your choice
of extras. To install Airflow, either downgrade pip to version 20.2.4
pip install --upgrade pip==20.2.4
or, if you use pip 20.3, add the option
--use-deprecated legacy-resolver
to your pip install command.
Add the following lines to your configuration file, e.g. airflow.cfg:
[scheduler]
statsd_on = True
statsd_host = localhost
statsd_port = 8125
statsd_prefix = airflow
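To check that these settings take effect, it can help to watch the datagrams arrive. The following is a minimal sketch of a throwaway UDP listener, assuming the metrics are sent as plain-text StatsD datagrams to the statsd_host and statsd_port configured above. The helper names open_listener and read_metrics are hypothetical and are not part of Airflow.

```python
import socket


def open_listener(host="127.0.0.1", port=8125):
    """Bind a UDP socket where StatsD datagrams are expected to arrive.

    The defaults mirror statsd_host = localhost and statsd_port = 8125.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def read_metrics(sock, bufsize=4096):
    """Block until one datagram arrives and return its metric lines.

    A single datagram may carry several newline-separated metrics.
    """
    data, _sender = sock.recvfrom(bufsize)
    return data.decode("utf-8", errors="replace").splitlines()
```

Calling read_metrics in a loop on a listener bound to the configured port should print lines that begin with the configured statsd_prefix. A real StatsD daemon already bound to the same port would typically have to be stopped first.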
If you want to avoid sending all available metrics to StatsD, you can configure an allow list of prefixes so that only metrics whose names start with one of the listed elements are sent:
[scheduler]
statsd_allow_list = scheduler,executor,dagrun
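The prefix rule can be sketched in a few lines of Python. This illustrates the matching logic only and is not Airflow's actual implementation; the function names and the sample metric names in the usage note are assumptions made for the example.

```python
def parse_allow_list(raw):
    """Split a comma-separated option value into a tuple of prefixes."""
    return tuple(item.strip() for item in raw.split(",") if item.strip())


def is_allowed(metric_name, allow_list):
    """Return True if the metric would pass the allow list.

    An empty allow list is treated as "send everything" in this sketch.
    """
    if not allow_list:
        return True
    # str.startswith accepts a tuple and matches any of its elements.
    return metric_name.startswith(allow_list)
```

With the value shown above, a hypothetical metric named scheduler_example would pass the filter, while one named pool_example would be dropped.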
Counters
| Name | Description |
|---|---|
| | Number of started |
| | Number of ended |
| | Operator |
| | Operator |
| | Overall task instances failures |
| | Overall task instances successes |
| | Zombie tasks killed |
| | Scheduler heartbeats |
| | Number of currently running DAG parsing processes |
| | Number of tasks killed externally |
Gauges
| Name | Description |
|---|---|
| | DAG bag size |
| | Number of errors from trying to parse DAG files |
| | Seconds taken to scan and import all DAG files once |
| | Seconds spent processing |
| | Seconds since |
| | Number of file processors that have been killed due to taking too long |
| | Number of open slots on executor |
| | Number of queued tasks on executor |
| | Number of running tasks on executor |
| | Number of open slots in the pool |
| | Number of used slots in the pool |
| | Number of starving tasks in the pool |
Timers
| Name | Description |
|---|---|
| | Milliseconds taken to check DAG dependencies |
| | Milliseconds taken to finish a task |
| | Milliseconds taken to load the given DAG file |
| | Milliseconds taken for a DagRun to reach success state |
| | Milliseconds taken for a DagRun to reach failed state |
| | Milliseconds of delay between the scheduled DagRun start date and the actual DagRun start date |
| | Milliseconds elapsed between first task start_date and dagrun expected start |
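The three tables correspond to the StatsD metric types counter, gauge, and timer, which the StatsD line format marks with the type codes c, g, and ms. As a minimal sketch (the function name and the sample metric names are hypothetical), a received line can be split into its name, value, and type like this:

```python
# Type codes from the StatsD line format: <name>:<value>|<type>[|@<rate>]
STATSD_TYPES = {"c": "counter", "g": "gauge", "ms": "timer"}


def parse_statsd_line(line):
    """Split one StatsD line into (name, value, kind)."""
    name, _, rest = line.partition(":")
    value, _, type_part = rest.partition("|")
    # Drop an optional sample-rate suffix such as "|@0.5".
    type_code = type_part.split("|")[0]
    return name, float(value), STATSD_TYPES.get(type_code, "unknown")
```

For example, the line airflow.example_timer:250|ms would be reported as a timer with a value of 250 milliseconds.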