airflow.models.serialized_dag

Serialized DAG table in database.

Module Contents

airflow.models.serialized_dag.log[source]
class airflow.models.serialized_dag.SerializedDagModel(dag)[source]

Bases: airflow.models.base.Base

A table for serialized DAGs.

The serialized_dag table is a snapshot of the DAG files, kept in sync by the scheduler. This feature is controlled by:

  • [core] store_serialized_dags = True: enable this feature

  • [core] min_serialized_dag_update_interval = 30 (s): serialized DAGs are updated in the DB whenever the scheduler processes a file. To reduce the DB write rate, a serialized DAG is not rewritten more often than this minimum interval.

  • [scheduler] dag_dir_list_interval = 300 (s): interval at which serialized DAGs are deleted from the DB after their files have been deleted. A smaller interval, such as 60, is suggested.

The webserver uses this table to load DagBags when store_serialized_dags=True. Because reading from the database is lightweight compared to importing DAG files, this addresses the webserver scalability issue.
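The settings listed above can be pictured as an airflow.cfg fragment. The sketch below parses such a fragment from an in-memory string with the standard-library configparser. It is only an illustration of the option names and units, not how Airflow itself loads its configuration; the name CFG_TEXT is an assumption, and dag_dir_list_interval is set to the suggested 60 rather than 300.

```python
import configparser

# Illustrative airflow.cfg fragment (assumption: held in a string here,
# not read from a real airflow.cfg). Intervals are in seconds.
CFG_TEXT = """
[core]
store_serialized_dags = True
min_serialized_dag_update_interval = 30

[scheduler]
dag_dir_list_interval = 60
"""

parser = configparser.ConfigParser()
parser.read_string(CFG_TEXT)

# Enables DAG serialization.
store_serialized = parser.getboolean("core", "store_serialized_dags")
# Minimum number of seconds between rewrites of one serialized DAG.
min_update_interval = parser.getint("core", "min_serialized_dag_update_interval")
# How often, in seconds, rows for deleted DAG files are cleaned up.
dir_list_interval = parser.getint("scheduler", "dag_dir_list_interval")

print(store_serialized, min_update_interval, dir_list_interval)
```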

__tablename__ = serialized_dag[source]
dag_id[source]
fileloc[source]
fileloc_hash[source]
data[source]
last_updated[source]
__table_args__[source]
dag[source]

The DAG deserialized from the data column

static dag_fileloc_hash(full_filepath)[source]

Hashes the file location for indexing.

Parameters

full_filepath – full filepath of DAG file

Returns

hashed full_filepath
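As a rough illustration of why a path is hashed for indexing, the sketch below maps a file path to a stable integer small enough for a signed 64-bit column. The function name fileloc_hash_sketch and the choice of SHA-1 are assumptions made for this example. It should not be read as the exact implementation behind dag_fileloc_hash.

```python
import hashlib
import struct


def fileloc_hash_sketch(full_filepath: str) -> int:
    """Map a file path to a stable integer that fits a signed 64-bit column."""
    digest = hashlib.sha1(full_filepath.encode("utf-8")).digest()
    # Read the last 8 bytes as an unsigned big-endian integer, then drop
    # 8 bits so the result stays below 2**56.
    return struct.unpack(">Q", digest[-8:])[0] >> 8


# The same path always yields the same hash, so it can serve as an index key.
path = "/dags/example_dag.py"  # hypothetical path
print(fileloc_hash_sketch(path) == fileloc_hash_sketch(path))
```

An integer hash is cheaper to index and compare than a long path string, which is presumably why the table keeps both fileloc and fileloc_hash.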

classmethod write_dag(cls, dag, min_update_interval=None, session=None)[source]

Serializes a DAG and writes it into database.

Parameters
  • dag – a DAG to be written into database

  • min_update_interval – minimal interval in seconds to update serialized DAG

  • session – ORM Session

classmethod read_all_dags(cls, session=None)[source]

Reads all DAGs in serialized_dag table.

Parameters

session – ORM Session

Returns

a dict of DAGs read from database

classmethod remove_dag(cls, dag_id, session=None)[source]

Deletes a DAG with given dag_id.

Parameters
  • dag_id (str) – dag_id to be deleted

  • session – ORM Session

classmethod remove_deleted_dags(cls, alive_dag_filelocs, session=None)[source]

Deletes DAGs not included in alive_dag_filelocs.

Parameters
  • alive_dag_filelocs (list) – file paths of alive DAGs

  • session – ORM Session
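The cleanup rule ("delete whatever is not in alive_dag_filelocs") amounts to a set-membership filter. The sketch below applies it to a plain dict. The names remove_deleted_sketch and rows are assumptions for illustration, and the real method operates on the database through the ORM session.

```python
from typing import Dict, Iterable, List


def remove_deleted_sketch(rows: Dict[str, str], alive_filelocs: Iterable[str]) -> List[str]:
    """Delete rows whose file is no longer alive.

    rows maps dag_id -> fileloc and is modified in place.
    Returns the dag_ids that were removed.
    """
    alive = set(alive_filelocs)
    removed = [dag_id for dag_id, fileloc in rows.items() if fileloc not in alive]
    for dag_id in removed:
        del rows[dag_id]
    return removed


# Hypothetical table contents: two DAGs, one of whose files has been deleted.
rows = {"etl_daily": "/dags/etl.py", "old_report": "/dags/old.py"}
print(remove_deleted_sketch(rows, ["/dags/etl.py"]))
```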

classmethod has_dag(cls, dag_id, session=None)[source]

Checks whether a DAG exists in the serialized_dag table.

Parameters
  • dag_id (str) – the DAG to check

  • session – ORM Session

Return type

bool

classmethod get(cls, dag_id, session=None)[source]

Get the SerializedDAG for the given dag ID. It will cope with being passed the ID of a subdag by looking up the root dag_id from the DAG table.

Parameters
  • dag_id – the DAG to fetch

  • session – ORM Session
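The subdag fallback described for get can be sketched with two dicts standing in for the serialized_dag and DAG tables. The names get_sketch, serialized and root_ids are hypothetical, and the sketch only mirrors the lookup order described above, not the actual query.

```python
from typing import Dict, Optional


def get_sketch(
    dag_id: str,
    serialized: Dict[str, str],
    root_ids: Dict[str, str],
) -> Optional[str]:
    """Return the serialized row for dag_id, falling back to its root DAG.

    serialized: dag_id -> serialized data (stand-in for serialized_dag)
    root_ids:   subdag id -> root dag_id  (stand-in for the DAG table)
    """
    row = serialized.get(dag_id)
    if row is not None:
        return row
    # Not stored under its own id: it may be a subdag, so try the root DAG.
    root_id = root_ids.get(dag_id)
    if root_id is None:
        return None
    return serialized.get(root_id)
```

In this sketch, asking for a subdag id returns the row of its root DAG, on the assumption that subdags are serialized as part of their parent.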
