airflow.providers.databricks.triggers.databricks

Module Contents

Classes

DatabricksExecutionTrigger

The trigger handles the logic of async communication with the Databricks API.

class airflow.providers.databricks.triggers.databricks.DatabricksExecutionTrigger(run_id, databricks_conn_id, polling_period_seconds=30, retry_limit=3, retry_delay=10, retry_args=None, run_page_url=None, repair_run=False, caller='DatabricksExecutionTrigger')[source]

Bases: airflow.triggers.base.BaseTrigger

The trigger handles the logic of async communication with the Databricks API.

Parameters
  • run_id (int) – ID of the run.

  • databricks_conn_id (str) – Reference to the Databricks connection.

  • polling_period_seconds (int) – Controls the rate at which the trigger polls for the result of this run. By default, the trigger polls every 30 seconds.

  • retry_limit (int) – The number of times to retry the connection in case of service outages.

  • retry_delay (int) – The number of seconds to wait between retries.

  • retry_args (dict[Any, Any] | None) – An optional dictionary of arguments passed to the tenacity.Retrying class.

  • run_page_url (str | None) – The URL of the run page in the Databricks UI.

  • repair_run (bool) – Whether the run should be repaired if it ends in a failed state; the flag is passed back in the trigger's event payload.

  • caller (str) – The name of the calling entity, used to identify the caller to the Databricks hook.
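
A usage sketch (illustrative, not part of the provider): a deferrable operator constructs the trigger, hands control to the triggerer with self.defer(), and resumes in the named method once the trigger fires. The operator class and method names below are assumptions for the example.

    from __future__ import annotations

    from typing import Any

    from airflow.models import BaseOperator
    from airflow.providers.databricks.triggers.databricks import DatabricksExecutionTrigger


    class WaitForDatabricksRun(BaseOperator):
        """Hypothetical operator that defers until a Databricks run reaches a terminal state."""

        def __init__(self, run_id: int, databricks_conn_id: str = "databricks_default", **kwargs: Any) -> None:
            super().__init__(**kwargs)
            self.run_id = run_id
            self.databricks_conn_id = databricks_conn_id

        def execute(self, context: Any) -> None:
            # Free the worker slot; the trigger polls asynchronously on the triggerer.
            self.defer(
                trigger=DatabricksExecutionTrigger(
                    run_id=self.run_id,
                    databricks_conn_id=self.databricks_conn_id,
                    polling_period_seconds=30,
                ),
                method_name="execute_complete",
            )

        def execute_complete(self, context: Any, event: dict[str, Any]) -> None:
            # Called when the trigger yields its TriggerEvent; the event
            # payload carries the run state the trigger observed.
            self.log.info("Databricks run finished: %s", event)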

serialize()[source]

Return the information needed to reconstruct this Trigger.

Returns

Tuple of (class path, keyword arguments needed to re-instantiate).

Return type

tuple[str, dict[str, Any]]
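
A sketch of the shape this returns, assuming the keyword arguments mirror the constructor parameters documented above (illustrative, not the provider's exact source):

    from typing import Any

    def serialize(self) -> tuple[str, dict[str, Any]]:
        # Class path first, then the kwargs the triggerer needs to rebuild
        # this trigger after it has been persisted and handed off.
        return (
            "airflow.providers.databricks.triggers.databricks.DatabricksExecutionTrigger",
            {
                "run_id": self.run_id,
                "databricks_conn_id": self.databricks_conn_id,
                "polling_period_seconds": self.polling_period_seconds,
                "retry_limit": self.retry_limit,
                "retry_delay": self.retry_delay,
                "retry_args": self.retry_args,
                "run_page_url": self.run_page_url,
                "repair_run": self.repair_run,
            },
        )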

async run()[source]

Run the trigger in an asynchronous context.

The trigger should yield an Event whenever it wants to fire off an event, and return None if it is finished. Single-event triggers should thus yield and then immediately return.

If it yields, it is likely that it will be resumed very quickly, but it may not be (e.g. if the workload is being moved to another triggerer process, or a multi-event trigger was being used for a single-event task defer).

In either case, Trigger classes should assume they will be persisted, and then rely on cleanup() being called when they are no longer needed.
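
A minimal sketch of the yield-then-return contract described above, with a hypothetical status check standing in for the Databricks run-state poll:

    import asyncio
    from typing import Any, AsyncIterator

    from airflow.triggers.base import BaseTrigger, TriggerEvent


    class PollingTriggerSketch(BaseTrigger):
        """Hypothetical single-event trigger illustrating the run() contract."""

        def __init__(self, polling_period_seconds: int = 30) -> None:
            super().__init__()
            self.polling_period_seconds = polling_period_seconds

        def serialize(self) -> tuple[str, dict[str, Any]]:
            # Hypothetical class path; see serialize() above for the contract.
            return ("example.PollingTriggerSketch", {"polling_period_seconds": self.polling_period_seconds})

        async def run(self) -> AsyncIterator[TriggerEvent]:
            while True:
                state = await self._get_state()  # hypothetical async API call
                if state["life_cycle_state"] == "TERMINATED":
                    # Single-event trigger: yield once, then return immediately.
                    yield TriggerEvent({"state": state})
                    return
                await asyncio.sleep(self.polling_period_seconds)

        async def _get_state(self) -> dict[str, Any]:
            # Stand-in for polling the run state via the Databricks API.
            return {"life_cycle_state": "TERMINATED"}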
