airflow.models.serialized_dag
¶
Serialized DAG table in database.
Module Contents¶
Classes¶
A table for serialized DAGs. |
Attributes¶
- class airflow.models.serialized_dag.SerializedDagModel(dag)[source]¶
Bases:
airflow.models.base.Base
A table for serialized DAGs.
serialized_dag table is a snapshot of DAG files synchronized by scheduler. This feature is controlled by:
[core] min_serialized_dag_update_interval = 30
(s): serialized DAGs are updated in DB when a file gets processed by scheduler, to reduce DB write rate, there is a minimal interval of updating serialized DAGs.[scheduler] dag_dir_list_interval = 300
(s): interval of deleting serialized DAGs in DB when the files are deleted, suggest to use a smaller interval such as 60[core] compress_serialized_dags
: whether compressing the dag data to the Database.
It is used by webserver to load dags because reading from database is lightweight compared to importing from files, it solves the webserver scalability issue.
- classmethod write_dag(dag, min_update_interval=None, session=None)[source]¶
Serializes a DAG and writes it into database. If the record already exists, it checks if the Serialized DAG changed or not. If it is changed, it updates the record, ignores otherwise.
- Parameters
dag (airflow.models.dag.DAG) – a DAG to be written into database
min_update_interval (Optional[int]) – minimal interval in seconds to update serialized DAG
session (sqlalchemy.orm.Session) – ORM Session
- Returns
Boolean indicating if the DAG was written to the DB
- Return type
- classmethod read_all_dags(session=None)[source]¶
Reads all DAGs in serialized_dag table.
- Parameters
session (sqlalchemy.orm.Session) – ORM Session
- Returns
a dict of DAGs read from database
- Return type
Dict[str, airflow.serialization.serialized_objects.SerializedDAG]
- classmethod remove_dag(dag_id, session=None)[source]¶
Deletes a DAG with given dag_id. :param dag_id: dag_id to be deleted :param session: ORM Session
- classmethod remove_deleted_dags(alive_dag_filelocs, session=None)[source]¶
Deletes DAGs not included in alive_dag_filelocs.
- Parameters
alive_dag_filelocs (List[str]) – file paths of alive DAGs
session – ORM Session
- classmethod has_dag(dag_id, session=None)[source]¶
Checks a DAG exist in serialized_dag table.
- Parameters
dag_id (str) – the DAG to check
session (sqlalchemy.orm.Session) – ORM Session
- classmethod get(dag_id, session=None)[source]¶
Get the SerializedDAG for the given dag ID. It will cope with being passed the ID of a subdag by looking up the root dag_id from the DAG table.
- Parameters
dag_id (str) – the DAG to fetch
session (sqlalchemy.orm.Session) – ORM Session
- static bulk_sync_to_db(dags, session=None)[source]¶
Saves DAGs as Serialized DAG objects in the database. Each DAG is saved in a separate database query.
- Parameters
dags (List[airflow.models.dag.DAG]) – the DAG objects to save to the DB
session (sqlalchemy.orm.Session) – ORM Session
- Returns
None
- classmethod get_last_updated_datetime(dag_id, session=None)[source]¶
Get the date when the Serialized DAG associated to DAG was last updated in serialized_dag table
- Parameters
dag_id (str) – DAG ID
session (sqlalchemy.orm.Session) – ORM Session
- classmethod get_max_last_updated_datetime(session=None)[source]¶
Get the maximum date when any DAG was last updated in serialized_dag table
- Parameters
session (sqlalchemy.orm.Session) – ORM Session
- classmethod get_latest_version_hash(dag_id, session=None)[source]¶
Get the latest DAG version for a given DAG ID.
- Parameters
dag_id (str) – DAG ID
session (sqlalchemy.orm.Session) – ORM Session
- Returns
DAG Hash, or None if the DAG is not found
- Return type
str | None
- classmethod get_dag_dependencies(session=None)[source]¶
Get the dependencies between DAGs
- Parameters
session (sqlalchemy.orm.Session) – ORM Session