airflow.models.serialized_dag
¶
Serialized DAG table in database.
Module Contents¶
Classes¶
A table for serialized DAGs. |
Attributes¶
- class airflow.models.serialized_dag.SerializedDagModel(dag: airflow.models.dag.DAG)[source]¶
Bases:
airflow.models.base.Base
A table for serialized DAGs.
serialized_dag table is a snapshot of DAG files synchronized by scheduler. This feature is controlled by:
[core] min_serialized_dag_update_interval = 30
(s): serialized DAGs are updated in DB when a file gets processed by scheduler, to reduce DB write rate, there is a minimal interval of updating serialized DAGs.[scheduler] dag_dir_list_interval = 300
(s): interval of deleting serialized DAGs in DB when the files are deleted, suggest to use a smaller interval such as 60
It is used by webserver to load dags because reading from database is lightweight compared to importing from files, it solves the webserver scalability issue.
- classmethod write_dag(cls, dag: airflow.models.dag.DAG, min_update_interval: Optional[int] = None, session: sqlalchemy.orm.Session = None) bool [source]¶
Serializes a DAG and writes it into database. If the record already exists, it checks if the Serialized DAG changed or not. If it is changed, it updates the record, ignores otherwise.
- Parameters
dag – a DAG to be written into database
min_update_interval – minimal interval in seconds to update serialized DAG
session – ORM Session
- Returns
Boolean indicating if the DAG was written to the DB
- classmethod read_all_dags(cls, session: sqlalchemy.orm.Session = None) Dict[str, airflow.serialization.serialized_objects.SerializedDAG] [source]¶
Reads all DAGs in serialized_dag table.
- Parameters
session – ORM Session
- Returns
a dict of DAGs read from database
- classmethod remove_dag(cls, dag_id: str, session: sqlalchemy.orm.Session = None)[source]¶
Deletes a DAG with given dag_id. :param dag_id: dag_id to be deleted :param session: ORM Session
- classmethod remove_deleted_dags(cls, alive_dag_filelocs: List[str], session=None)[source]¶
Deletes DAGs not included in alive_dag_filelocs.
- Parameters
alive_dag_filelocs – file paths of alive DAGs
session – ORM Session
- classmethod has_dag(cls, dag_id: str, session: sqlalchemy.orm.Session = None) bool [source]¶
Checks a DAG exist in serialized_dag table.
- Parameters
dag_id – the DAG to check
session – ORM Session
- classmethod get(cls, dag_id: str, session: sqlalchemy.orm.Session = None) Optional[SerializedDagModel] [source]¶
Get the SerializedDAG for the given dag ID. It will cope with being passed the ID of a subdag by looking up the root dag_id from the DAG table.
- Parameters
dag_id – the DAG to fetch
session – ORM Session
- static bulk_sync_to_db(dags: List[airflow.models.dag.DAG], session: sqlalchemy.orm.Session = None)[source]¶
Saves DAGs as Serialized DAG objects in the database. Each DAG is saved in a separate database query.
- Parameters
dags (List[airflow.models.dag.DAG]) – the DAG objects to save to the DB
session (Session) – ORM Session
- Returns
None
- classmethod get_last_updated_datetime(cls, dag_id: str, session: sqlalchemy.orm.Session = None) Optional[datetime.datetime] [source]¶
Get the date when the Serialized DAG associated to DAG was last updated in serialized_dag table
- Parameters
dag_id (str) – DAG ID
session (Session) – ORM Session
- classmethod get_max_last_updated_datetime(cls, session: sqlalchemy.orm.Session = None) Optional[datetime.datetime] [source]¶
Get the maximum date when any DAG was last updated in serialized_dag table
- Parameters
session (Session) – ORM Session
- classmethod get_latest_version_hash(cls, dag_id: str, session: sqlalchemy.orm.Session = None) Optional[str] [source]¶
Get the latest DAG version for a given DAG ID.
- classmethod get_dag_dependencies(cls, session: sqlalchemy.orm.Session = None) Dict[str, List[airflow.serialization.serialized_objects.DagDependency]] [source]¶
Get the dependencies between DAGs
- Parameters
session (Session) – ORM Session