airflow.models.serialized_dag
¶
Serialzed DAG table in database.
Module Contents¶
-
class
airflow.models.serialized_dag.
SerializedDagModel
(dag)[source]¶ Bases:
airflow.models.base.Base
A table for serialized DAGs.
serialized_dag table is a snapshot of DAG files synchronized by scheduler. This feature is controlled by:
[core] store_serialized_dags = True
: enable this feature[core] min_serialized_dag_update_interval = 30
(s): serialized DAGs are updated in DB when a file gets processed by scheduler, to reduce DB write rate, there is a minimal interval of updating serialized DAGs.[scheduler] dag_dir_list_interval = 300
(s): interval of deleting serialized DAGs in DB when the files are deleted, suggest to use a smaller interval such as 60
It is used by webserver to load dagbags when
store_serialized_dags=True
. Because reading from database is lightweight compared to importing from files, it solves the webserver scalability issue.-
static
dag_fileloc_hash
(full_filepath)[source]¶ “Hashing file location for indexing.
- Parameters
full_filepath – full filepath of DAG file
- Returns
hashed full_filepath
-
classmethod
write_dag
(cls, dag, min_update_interval=None, session=None)[source]¶ Serializes a DAG and writes it into database.
- Parameters
dag – a DAG to be written into database
min_update_interval – minimal interval in seconds to update serialized DAG
session – ORM Session
-
classmethod
read_all_dags
(cls, session=None)[source]¶ Reads all DAGs in serialized_dag table.
- Parameters
session – ORM Session
- Returns
a dict of DAGs read from database
-
classmethod
remove_dag
(cls, dag_id, session=None)[source]¶ Deletes a DAG with given dag_id.
- Parameters
dag_id (str) – dag_id to be deleted
session – ORM Session
-
classmethod
remove_deleted_dags
(cls, alive_dag_filelocs, session=None)[source]¶ Deletes DAGs not included in alive_dag_filelocs.
- Parameters
alive_dag_filelocs (list) – file paths of alive DAGs
session – ORM Session