airflow.models.serialized_dag

Serialized DAG table in database.

Module Contents

Classes

SerializedDagModel

A table for serialized DAGs.

Attributes

log

airflow.models.serialized_dag.log[source]
class airflow.models.serialized_dag.SerializedDagModel(dag: airflow.models.dag.DAG)[source]

Bases: airflow.models.base.Base

A table for serialized DAGs.

serialized_dag table is a snapshot of DAG files synchronized by scheduler. This feature is controlled by:

  • [core] min_serialized_dag_update_interval = 30 (s): serialized DAGs are updated in DB when a file gets processed by scheduler, to reduce DB write rate, there is a minimal interval of updating serialized DAGs.

  • [scheduler] dag_dir_list_interval = 300 (s): interval of deleting serialized DAGs in DB when the files are deleted, suggest to use a smaller interval such as 60

It is used by webserver to load dags because reading from database is lightweight compared to importing from files, it solves the webserver scalability issue.

__tablename__ = serialized_dag[source]
dag_id[source]
fileloc[source]
fileloc_hash[source]
data[source]
last_updated[source]
dag_hash[source]
__table_args__[source]
dag_runs[source]
dag_model[source]
__repr__(self)[source]

Return repr(self).

classmethod write_dag(cls, dag: airflow.models.dag.DAG, min_update_interval: Optional[int] = None, session: sqlalchemy.orm.Session = None) bool[source]

Serializes a DAG and writes it into database. If the record already exists, it checks if the Serialized DAG changed or not. If it is changed, it updates the record, ignores otherwise.

Parameters
  • dag – a DAG to be written into database

  • min_update_interval – minimal interval in seconds to update serialized DAG

  • session – ORM Session

Returns

Boolean indicating if the DAG was written to the DB

classmethod read_all_dags(cls, session: sqlalchemy.orm.Session = None) Dict[str, airflow.serialization.serialized_objects.SerializedDAG][source]

Reads all DAGs in serialized_dag table.

Parameters

session – ORM Session

Returns

a dict of DAGs read from database

property dag(self)[source]

The DAG deserialized from the data column

classmethod remove_dag(cls, dag_id: str, session: sqlalchemy.orm.Session = None)[source]

Deletes a DAG with given dag_id. :param dag_id: dag_id to be deleted :param session: ORM Session

classmethod remove_deleted_dags(cls, alive_dag_filelocs: List[str], session=None)[source]

Deletes DAGs not included in alive_dag_filelocs.

Parameters
  • alive_dag_filelocs – file paths of alive DAGs

  • session – ORM Session

classmethod has_dag(cls, dag_id: str, session: sqlalchemy.orm.Session = None) bool[source]

Checks a DAG exist in serialized_dag table.

Parameters
  • dag_id – the DAG to check

  • session – ORM Session

classmethod get(cls, dag_id: str, session: sqlalchemy.orm.Session = None) Optional[SerializedDagModel][source]

Get the SerializedDAG for the given dag ID. It will cope with being passed the ID of a subdag by looking up the root dag_id from the DAG table.

Parameters
  • dag_id – the DAG to fetch

  • session – ORM Session

static bulk_sync_to_db(dags: List[airflow.models.dag.DAG], session: sqlalchemy.orm.Session = None)[source]

Saves DAGs as Serialized DAG objects in the database. Each DAG is saved in a separate database query.

Parameters
Returns

None

classmethod get_last_updated_datetime(cls, dag_id: str, session: sqlalchemy.orm.Session = None) Optional[datetime.datetime][source]

Get the date when the Serialized DAG associated to DAG was last updated in serialized_dag table

Parameters
  • dag_id (str) – DAG ID

  • session (Session) – ORM Session

classmethod get_max_last_updated_datetime(cls, session: sqlalchemy.orm.Session = None) Optional[datetime.datetime][source]

Get the maximum date when any DAG was last updated in serialized_dag table

Parameters

session (Session) – ORM Session

classmethod get_latest_version_hash(cls, dag_id: str, session: sqlalchemy.orm.Session = None) Optional[str][source]

Get the latest DAG version for a given DAG ID.

Parameters
  • dag_id (str) – DAG ID

  • session (Session) – ORM Session

Returns

DAG Hash, or None if the DAG is not found

Return type

str | None

classmethod get_dag_dependencies(cls, session: sqlalchemy.orm.Session = None) Dict[str, List[airflow.serialization.serialized_objects.DagDependency]][source]

Get the dependencies between DAGs

Parameters

session (Session) – ORM Session

Was this entry helpful?