airflow.models.taskinstance
Module Contents
-
airflow.models.taskinstance.set_current_context(context: Context)
Sets the current execution context to the provided context object.
This method should be called once per Task execution, before calling operator.execute.
-
airflow.models.taskinstance.load_error_file(fd: IO[bytes]) → Optional[Union[str, Exception]]
Loads and returns the error stored in an error file.
-
airflow.models.taskinstance.set_error_file(error_file: str, error: Union[str, Exception]) → None
Writes the error to the error file at the given path.
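The two helpers above form a write/read pair. A rough sketch of such a round trip, assuming the error is serialized with `pickle` (an assumption, since the on-disk format is not stated here), might look as follows. `write_error` and `read_error` are hypothetical stand-ins, not the Airflow functions.

```python
import os
import pickle
import tempfile
from typing import IO, Optional, Union


def write_error(path: str, error: Union[str, Exception]) -> None:
    """Serialize `error` to the file at `path` (hypothetical helper)."""
    try:
        payload = pickle.dumps(error)
    except Exception:
        # Some exceptions cannot be pickled; keep their message instead.
        payload = pickle.dumps(str(error))
    with open(path, "wb") as fd:
        fd.write(payload)


def read_error(fd: IO[bytes]) -> Optional[Union[str, Exception]]:
    """Read back an error written by write_error, or None if the file is empty."""
    fd.seek(0)
    data = fd.read()
    if not data:
        return None
    return pickle.loads(data)


# Round trip through a file in a temporary directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    error_path = os.path.join(tmp_dir, "task_error.bin")
    write_error(error_path, ValueError("bad input"))
    with open(error_path, "rb") as error_fd:
        loaded = read_error(error_fd)
```

Passing a file object to the reader and a path to the writer mirrors the two signatures documented above.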
-
airflow.models.taskinstance.clear_task_instances(tis, session, activate_dag_runs=True, dag=None)
Clears a set of task instances, but makes sure the running ones get killed.
- Parameters
tis – a list of task instances
session – current session
activate_dag_runs – flag to check for active dag run
dag – DAG object
-
class airflow.models.taskinstance.TaskInstanceKey
Bases: typing.NamedTuple
Key used to identify task instance.
-
class airflow.models.taskinstance.TaskInstance(task, execution_date: datetime, state: Optional[str] = None)
Bases: airflow.models.base.Base, airflow.utils.log.logging_mixin.LoggingMixin
Task instances store the state of a task instance. This table is the authority and single source of truth for which tasks have run and the state they are in.
The SQLAlchemy model deliberately has no SQLAlchemy foreign key to the task or dag model, to allow more control over transactions.
Database transactions on this table should guard against double triggers and any confusion about which task instances are or aren’t ready to run, even while multiple schedulers may be firing task instances.
-
try_number
Returns the try number that this task instance will have when it is actually run.
If the TaskInstance is currently running, this matches the column in the database; in all other cases the database value is incremented.
-
prev_attempted_tries
Based on this instance’s try_number, this will calculate the number of previously attempted tries, defaulting to 0.
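The relationship between the stored counter and the two properties above can be sketched with a small stand-in class. This is an illustration under assumptions, not the Airflow model: `TryCounter` and `stored_try_number` are hypothetical names, the increment is assumed to be one, and `prev_attempted_tries` is assumed to report the stored value directly.

```python
from dataclasses import dataclass
from typing import Optional

RUNNING = "running"  # stand-in for the running-state constant


@dataclass
class TryCounter:
    """Hypothetical stand-in holding only the fields the two properties need."""

    stored_try_number: int = 0  # value of the database column
    state: Optional[str] = None

    @property
    def try_number(self) -> int:
        # While running, the stored value already describes the current try.
        if self.state == RUNNING:
            return self.stored_try_number
        # Otherwise report the number the next run will have.
        return self.stored_try_number + 1

    @property
    def prev_attempted_tries(self) -> int:
        # Assumption: the stored counter equals the number of past attempts.
        return self.stored_try_number


fresh = TryCounter()
running = TryCounter(stored_try_number=1, state=RUNNING)
```

Under these assumptions a task instance that has never run reports `try_number` 1 with 0 previous attempts, and the same instance still reports 1 while its first try is running.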
-
is_premature
Returns whether a task is in UP_FOR_RETRY state and its retry interval has not yet elapsed.
-
previous_ti
This attribute is deprecated. Please use the airflow.models.taskinstance.TaskInstance.get_previous_ti method.
-
previous_ti_success
This attribute is deprecated. Please use the airflow.models.taskinstance.TaskInstance.get_previous_ti method.
-
previous_start_date_success
This attribute is deprecated. Please use the airflow.models.taskinstance.TaskInstance.get_previous_start_date method.
-
command_as_list(self, mark_success=False, ignore_all_deps=False, ignore_task_deps=False, ignore_depends_on_past=False, ignore_ti_state=False, local=False, pickle_id=None, raw=False, job_id=None, pool=None, cfg_path=None)
Returns a command that can be executed anywhere Airflow is installed. This command is part of the message sent to executors by the orchestrator.
-
static generate_command(dag_id: str, task_id: str, execution_date: datetime, mark_success: bool = False, ignore_all_deps: bool = False, ignore_depends_on_past: bool = False, ignore_task_deps: bool = False, ignore_ti_state: bool = False, local: bool = False, pickle_id: Optional[int] = None, file_path: Optional[str] = None, raw: bool = False, job_id: Optional[str] = None, pool: Optional[str] = None, cfg_path: Optional[str] = None)
Generates the shell command required to execute this task instance.
- Parameters
dag_id (str) – DAG ID
task_id (str) – Task ID
execution_date (datetime) – Execution date for the task
mark_success (bool) – Whether to mark the task as successful
ignore_all_deps (bool) – Ignore all ignorable dependencies. Overrides the other ignore_* parameters.
ignore_depends_on_past (bool) – Ignore depends_on_past parameter of DAGs (e.g. for Backfills)
ignore_task_deps (bool) – Ignore task-specific dependencies such as depends_on_past and trigger rule
ignore_ti_state (bool) – Ignore the task instance’s previous failure/success
local (bool) – Whether to run the task locally
pickle_id (Optional[int]) – If the DAG was serialized to the DB, the ID associated with the pickled DAG
file_path (Optional[str]) – path to the file containing the DAG definition
raw (bool) – raw mode (needs more details)
job_id (Optional[str]) – job ID (needs more details)
pool (Optional[str]) – the Airflow pool that the task should run in
cfg_path (Optional[str]) – the path to the configuration file
- Returns
shell command that can be used to run the task instance
- Return type
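How boolean and optional parameters turn into command-line arguments can be sketched as below. This is a reduced illustration, not the Airflow method: `build_run_command` is a hypothetical helper covering only a few of the parameters listed above, returning an argument list is an assumption, and the flag spellings follow the Airflow 2 `airflow tasks run` command as recalled, so they should be checked against `airflow tasks run --help` for the version in use.

```python
from datetime import datetime
from typing import List, Optional


def build_run_command(
    dag_id: str,
    task_id: str,
    execution_date: datetime,
    mark_success: bool = False,
    local: bool = False,
    raw: bool = False,
    pool: Optional[str] = None,
) -> List[str]:
    """Assemble an argument list for running one task instance (illustrative)."""
    cmd = ["airflow", "tasks", "run", dag_id, task_id, execution_date.isoformat()]
    # Boolean parameters become bare flags (spellings are assumptions).
    if mark_success:
        cmd.append("--mark-success")
    if local:
        cmd.append("--local")
    if raw:
        cmd.append("--raw")
    # Optional values become a flag followed by its value.
    if pool:
        cmd.extend(["--pool", pool])
    return cmd


command = build_run_command(
    "etl", "load", datetime(2021, 1, 1), local=True, pool="default_pool"
)
```

Keeping the command as a list of arguments avoids shell quoting problems when identifiers contain spaces or special characters.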
-
current_state(self, session=None)
Gets the very latest state from the database. If a session is passed, it is used and the state lookup becomes part of that session; otherwise a new session is used.
- Parameters
session (Session) – SQLAlchemy ORM Session
-
error(self, session=None)
Forces the task instance’s state to FAILED in the database.
- Parameters
session (Session) – SQLAlchemy ORM Session
-
refresh_from_db(self, session=None, lock_for_update=False)
Refreshes the task instance from the database based on the primary key.
- Parameters
session (Session) – SQLAlchemy ORM Session
lock_for_update (bool) – if True, indicates that the database should lock the TaskInstance (issuing a FOR UPDATE clause) until the session is committed.
-
refresh_from_task(self, task, pool_override=None)
Copy common attributes from the given task.
- Parameters
task (airflow.models.BaseOperator) – The task object to copy from
pool_override (str) – Use the pool_override instead of task’s pool
-
clear_xcom_data(self, session=None)
Clears all XCom data from the database for the task instance.
- Parameters
session (Session) – SQLAlchemy ORM Session
-
set_state(self, state: str, session=None)
Set TaskInstance state.
- Parameters
state (str) – State to set for the TI
session (Session) – SQLAlchemy ORM Session
-
are_dependents_done(self, session=None)
Checks whether the immediate dependents of this task instance have succeeded or have been skipped. This is meant to be used by wait_for_downstream.
This is useful when you do not want to start processing the next schedule of a task until its dependents are done, for instance if the task DROPs and recreates a table.
- Parameters
session (Session) – SQLAlchemy ORM Session
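The check described above amounts to asking whether every immediate downstream task ended in a success or skipped state. A minimal sketch, assuming the states are already available as a dictionary rather than queried from the database; `dependents_done` and `DONE_STATES` are hypothetical names, and the state strings are stand-ins.

```python
from typing import Dict, Iterable

# Stand-ins for the success and skipped state constants.
DONE_STATES = {"success", "skipped"}


def dependents_done(downstream_task_ids: Iterable[str], states: Dict[str, str]) -> bool:
    """True when every immediate downstream task finished as success or skipped."""
    # A task with no recorded state counts as not done.
    return all(states.get(task_id) in DONE_STATES for task_id in downstream_task_ids)


all_done = dependents_done(["load", "report"], {"load": "success", "report": "skipped"})
one_failed = dependents_done(["load", "report"], {"load": "success", "report": "failed"})
```

In the DROP-and-recreate example, the next run of the task would wait until this check passes, so that downstream readers never see the table disappear mid-run.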
-
get_previous_ti(self, state: Optional[str] = None, session: Session = None)
The task instance for the task that ran before this task instance.
- Parameters
state – If passed, only task instances in that specific state are taken into account.
session – SQLAlchemy ORM Session
-
get_previous_execution_date(self, state: Optional[str] = None, session: Session = None)
The execution date from property previous_ti_success.
- Parameters
state – If passed, only task instances in that specific state are taken into account.
session – SQLAlchemy ORM Session
-
get_previous_start_date(self, state: Optional[str] = None, session: Session = None)
The start date from property previous_ti_success.
- Parameters
state – If passed, only task instances in that specific state are taken into account.
session – SQLAlchemy ORM Session
-
are_dependencies_met(self, dep_context=None, session=None, verbose=False)
Returns whether or not all the conditions are met for this task instance to be run given the context for the dependencies (e.g. a task instance being force run from the UI will ignore some dependencies).
- Parameters
dep_context (DepContext) – The execution context that determines the dependencies that should be evaluated.
session (sqlalchemy.orm.session.Session) – database session
verbose (bool) – whether to log details on failed dependencies at the info or debug log level
-
next_retry_datetime(self)
Get datetime of the next retry if the task instance fails. For exponential backoff, retry_delay is used as base and will be converted to seconds.
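The backoff arithmetic can be sketched as follows. This is a simplified model and not the Airflow implementation: `next_retry`, `attempt`, `exponential`, and `max_delay` are hypothetical names, the delay is assumed to double with each attempt, and any jitter or other adjustment the real method may apply is left out.

```python
from datetime import datetime, timedelta
from typing import Optional


def next_retry(
    end_date: datetime,
    retry_delay: timedelta,
    attempt: int,
    exponential: bool = False,
    max_delay: Optional[timedelta] = None,
) -> datetime:
    """Return when the next retry may start, counted from the end of the failed try."""
    # The base delay is converted to seconds before any scaling.
    delay_seconds = retry_delay.total_seconds()
    if exponential:
        # Double the base delay for each attempt after the first.
        delay_seconds *= 2 ** max(attempt - 1, 0)
    if max_delay is not None:
        # Optional cap so the delay cannot grow without bound.
        delay_seconds = min(delay_seconds, max_delay.total_seconds())
    return end_date + timedelta(seconds=delay_seconds)


ended = datetime(2021, 1, 1, 12, 0)
fixed = next_retry(ended, timedelta(minutes=5), attempt=3)
backoff = next_retry(ended, timedelta(minutes=5), attempt=3, exponential=True)
capped = next_retry(
    ended, timedelta(minutes=5), attempt=3, exponential=True,
    max_delay=timedelta(minutes=10),
)
```

With a 5-minute base delay, the third attempt waits 5 minutes without backoff, 20 minutes with doubling, and 10 minutes once the cap applies.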
-
ready_for_retry(self)
Checks whether the task instance is in the right state and timeframe to be retried.
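The check can be pictured as a state test combined with a time comparison. The sketch below is a stand-in rather than the Airflow method: it takes the state, the next retry time, and the current time as plain arguments, and `can_retry_now` and `UP_FOR_RETRY` are hypothetical names.

```python
from datetime import datetime
from typing import Optional

UP_FOR_RETRY = "up_for_retry"  # stand-in for the retry state constant


def can_retry_now(state: Optional[str], next_retry_at: datetime, now: datetime) -> bool:
    """True when the instance awaits a retry and its waiting period is over."""
    return state == UP_FOR_RETRY and next_retry_at < now


retry_at = datetime(2021, 1, 1, 12, 5)
too_early = can_retry_now(UP_FOR_RETRY, retry_at, datetime(2021, 1, 1, 12, 0))
on_time = can_retry_now(UP_FOR_RETRY, retry_at, datetime(2021, 1, 1, 12, 6))
wrong_state = can_retry_now("failed", retry_at, datetime(2021, 1, 1, 12, 6))
```

Passing the current time explicitly keeps the function deterministic, which makes the timing rule easy to test.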
-
get_dagrun(self, session: Session = None)
Returns the DagRun for this TaskInstance.
- Parameters
session – SQLAlchemy ORM Session
- Returns
DagRun
-
check_and_change_state_before_execution(self, verbose: bool = True, ignore_all_deps: bool = False, ignore_depends_on_past: bool = False, ignore_task_deps: bool = False, ignore_ti_state: bool = False, mark_success: bool = False, test_mode: bool = False, job_id: Optional[str] = None, pool: Optional[str] = None, session=None)
Checks dependencies and then sets state to RUNNING if they are met. Returns True if and only if state is set to RUNNING, which implies that the task should be executed, in preparation for _run_raw_task.
- Parameters
verbose (bool) – whether to turn on more verbose logging
ignore_all_deps (bool) – Ignore all of the non-critical dependencies and just run the task
ignore_depends_on_past (bool) – Ignore depends_on_past DAG attribute
ignore_task_deps (bool) – Don’t check the dependencies of this TaskInstance’s task
ignore_ti_state (bool) – Disregards previous task instance state
mark_success (bool) – Don’t run the task, mark its state as success
test_mode (bool) – Doesn’t record success or failure in the DB
job_id (str) – Job (BackfillJob / LocalTaskJob / SchedulerJob) ID
pool (str) – specifies the pool to use to run the task instance
session (Session) – SQLAlchemy ORM Session
- Returns
whether the state was changed to running or not
- Return type
-
_run_raw_task(self, mark_success: bool = False, test_mode: bool = False, job_id: Optional[str] = None, pool: Optional[str] = None, error_file: Optional[str] = None, session=None)
Immediately runs the task (without checking or changing db state before execution) and then sets the appropriate final state after completion and runs any post-execute callbacks. Meant to be called only after another function changes the state to running.
-
_execute_task(self, context, task_copy)
Executes the task (optionally with a timeout) and pushes XCom results.
-
_run_execute_callback(self, context: Context, task)
Functions that need to be run before a Task is executed.
-
_run_finished_callback(self, error: Optional[Union[str, Exception]] = None)
Calls the callback defined for the finished state change.
NOTE: Only invoke this function from the caller of self._run_raw_task or self.run.
-
run(self, verbose: bool = True, ignore_all_deps: bool = False, ignore_depends_on_past: bool = False, ignore_task_deps: bool = False, ignore_ti_state: bool = False, mark_success: bool = False, test_mode: bool = False, job_id: Optional[str] = None, pool: Optional[str] = None, session=None)
Run TaskInstance.
-
_handle_reschedule(self, actual_start_date, reschedule_exception, test_mode=False, session=None)
-
handle_failure(self, error: Union[str, Exception], test_mode: Optional[bool] = None, force_fail: bool = False, error_file: Optional[str] = None, session=None)
Handle failure for the TaskInstance.
-
handle_failure_with_callback(self, error: Union[str, Exception], test_mode: Optional[bool] = None, force_fail: bool = False, session=None)
-
overwrite_params_with_dag_run_conf(self, params, dag_run)
Overwrite Task Params with DagRun.conf.
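Overwriting task params with a run configuration can be pictured as an in-place dictionary update. The sketch below assumes exactly that and nothing more: `overwrite_params` is a hypothetical helper that takes the configuration dictionary directly instead of a DagRun object, and any additional conditions the real method applies are not modelled.

```python
from typing import Any, Dict, Optional


def overwrite_params(params: Dict[str, Any], run_conf: Optional[Dict[str, Any]]) -> None:
    """Update `params` in place with values from the run configuration, if any."""
    if run_conf:
        # Keys present in the run configuration win over the task defaults.
        params.update(run_conf)


task_params = {"limit": 100, "mode": "full"}
overwrite_params(task_params, {"limit": 10})

untouched = {"limit": 100}
overwrite_params(untouched, None)
```

Keys that the run configuration does not mention keep their task-level defaults.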
-
render_templates(self, context: Optional[Context] = None)
Render templates in the operator fields.
-
xcom_push(self, key: str, value: Any, execution_date: Optional[datetime] = None, session: Session = None)
Make an XCom available for tasks to pull.
- Parameters
key (str) – A key for the XCom
value (any picklable object) – A value for the XCom. The value is pickled and stored in the database.
execution_date (datetime) – if provided, the XCom will not be visible until this date. This can be used, for example, to send a message to a task on a future date without it being immediately visible.
session (Session) – SQLAlchemy ORM Session
-
xcom_pull(self, task_ids: Optional[Union[str, Iterable[str]]] = None, dag_id: Optional[str] = None, key: str = XCOM_RETURN_KEY, include_prior_dates: bool = False, session: Session = None)
Pull XComs that optionally meet certain criteria.
The default value for key limits the search to XComs that were returned by other tasks (as opposed to those that were pushed manually). To remove this filter, pass key=None (or any desired value).
If a single task_id string is provided, the result is the value of the most recent matching XCom from that task_id. If multiple task_ids are provided, a tuple of matching values is returned. None is returned whenever no matches are found.
- Parameters
key (str) – A key for the XCom. If provided, only XComs with matching keys will be returned. The default key is ‘return_value’, also available as a constant XCOM_RETURN_KEY. This key is automatically given to XComs returned by tasks (as opposed to being pushed manually). To remove the filter, pass key=None.
task_ids (str or iterable of strings (representing task_ids)) – Only XComs from tasks with matching ids will be pulled. Can pass None to remove the filter.
dag_id (str) – If provided, only pulls XComs from this DAG. If None (default), the DAG of the calling task is used.
include_prior_dates (bool) – If False, only XComs from the current execution_date are returned. If True, XComs from previous dates are returned as well.
session (Session) – SQLAlchemy ORM Session
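The difference between pulling with a single task id and with several, and the effect of the default key, can be sketched with an in-memory dictionary. This is a toy model, not the Airflow API: `push`, `pull`, and the `store` dictionary are hypothetical, there is no database, DAG, or execution date, and the `key=None` and `task_ids=None` cases described above are not modelled.

```python
from typing import Any, Dict, Iterable, Tuple, Union

XCOM_RETURN_KEY = "return_value"

# Hypothetical in-memory store keyed by (task_id, key).
Store = Dict[Tuple[str, str], Any]


def push(store: Store, task_id: str, key: str, value: Any) -> None:
    """Record a value under the given task id and key."""
    store[(task_id, key)] = value


def pull(
    store: Store,
    task_ids: Union[str, Iterable[str]],
    key: str = XCOM_RETURN_KEY,
) -> Any:
    """Return one value for a single task id, or a tuple of matches for several."""
    if isinstance(task_ids, str):
        # Single id: the matching value, or None when nothing matches.
        return store.get((task_ids, key))
    # Several ids: only the values that actually match are collected.
    return tuple(
        store[(task_id, key)] for task_id in task_ids if (task_id, key) in store
    )


store: Store = {}
push(store, "extract", XCOM_RETURN_KEY, 42)  # as if returned by the task
push(store, "transform", "rows", 7)          # as if pushed manually

returned = pull(store, "extract")
missing = pull(store, "transform")
manual = pull(store, "transform", key="rows")
several = pull(store, ["extract", "transform"])
```

Because the default key matches only returned values, the manually pushed `rows` entry is invisible until its key is passed explicitly.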
-
-
class airflow.models.taskinstance.SimpleTaskInstance(ti: TaskInstance)
Simplified Task Instance.
Used to send data between processes via Queues.
-
construct_task_instance(self, session=None, lock_for_update=False)
Construct a TaskInstance from the database based on the primary key.
- Parameters
session – DB session.
lock_for_update – if True, indicates that the database should lock the TaskInstance (issuing a FOR UPDATE clause) until the session is committed.
- Returns
the task instance constructed
-