airflow.providers.microsoft.azure.operators.data_factory

Module Contents

Classes

AzureDataFactoryPipelineRunLink

Constructs a link to monitor a pipeline run in Azure Data Factory.

AzureDataFactoryRunPipelineOperator

Executes a data factory pipeline.

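class airflow.providers.microsoft.azure.operators.data_factory.AzureDataFactoryPipelineRunLink[source]

Bases: airflow.models.BaseOperatorLink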

Constructs a link to monitor a pipeline run in Azure Data Factory.

name = Monitor Pipeline Run[source]

Link to external system.

Note: The old signature of this function was (self, operator, dttm: datetime). That is still supported at runtime but is deprecated.

Parameters
Returns

link to external system

Return type

str

class airflow.providers.microsoft.azure.operators.data_factory.AzureDataFactoryRunPipelineOperator(*, pipeline_name, azure_data_factory_conn_id=AzureDataFactoryHook.default_conn_name, wait_for_termination=True, resource_group_name=None, factory_name=None, reference_pipeline_run_id=None, is_recovery=None, start_activity_name=None, start_from_failure=None, parameters=None, timeout=60 * 60 * 24 * 7, check_interval=60, **kwargs)[source]

Bases: airflow.models.BaseOperator

Executes a data factory pipeline.

See also

For more information on how to use this operator, take a look at the guide: AzureDataFactoryRunPipelineOperator

Parameters
  • azure_data_factory_conn_id (str) – The connection identifier for connecting to Azure Data Factory.

  • pipeline_name (str) – The name of the pipeline to execute.

  • wait_for_termination (bool) – Flag to wait on a pipeline run’s termination. Enabled by default. Disable it to wait asynchronously on a long-running pipeline execution using the AzureDataFactoryPipelineRunSensor.

  • resource_group_name (Optional[str]) – The resource group name. If a value is not passed to the operator, the AzureDataFactoryHook will attempt to use the resource group name provided in the corresponding connection.

  • factory_name (Optional[str]) – The data factory name. If a value is not passed to the operator, the AzureDataFactoryHook will attempt to use the factory name provided in the corresponding connection.

  • reference_pipeline_run_id (Optional[str]) – The pipeline run identifier. If this run ID is specified, the parameters of the specified run will be used to create a new run.

  • is_recovery (Optional[bool]) – Recovery mode flag. If recovery mode is set to True, the referenced pipeline run and the new run will be grouped under the same groupId.

  • start_activity_name (Optional[str]) – In recovery mode, the rerun will start from this activity. If not specified, all activities will run.

  • start_from_failure (Optional[bool]) – In recovery mode, if set to True, the rerun will start from failed activities. The property will be used only if start_activity_name is not specified.

  • parameters (Optional[Dict[str, Any]]) – Parameters of the pipeline run. These parameters are referenced in a pipeline via @pipeline().parameters.parameterName and will be used only if the reference_pipeline_run_id is not specified.

  • timeout (int) – Time in seconds to wait for a pipeline to reach a terminal status for non-asynchronous waits. Used only if wait_for_termination is True.

  • check_interval (int) – Time in seconds to check on a pipeline run’s status for non-asynchronous waits. Used only if wait_for_termination is True.
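
The interplay of timeout and check_interval can be pictured as a simple polling loop. The following is a minimal sketch under stated assumptions, not the provider's implementation: wait_for_terminal_status, get_status, sleep, and clock are hypothetical names introduced for illustration, and the status strings in TERMINAL_STATUSES are assumed values for a finished pipeline run.

```python
import time
from typing import Callable

# Assumption: statuses that mark a pipeline run as finished.
TERMINAL_STATUSES = {"Succeeded", "Failed", "Cancelled"}


def wait_for_terminal_status(
    get_status: Callable[[], str],
    timeout: float = 60 * 60 * 24 * 7,  # same default as the operator: 7 days
    check_interval: float = 60,         # same default as the operator: 60 seconds
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll get_status() until it returns a terminal status or timeout elapses.

    get_status is a hypothetical callable standing in for a status lookup
    against Azure Data Factory.
    """
    deadline = clock() + timeout
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError(
                f"Pipeline run still '{status}' after {timeout} seconds"
            )
        # Wait check_interval seconds before asking for the status again.
        sleep(check_interval)


# Usage with a fake status source that finishes on the third check.
statuses = iter(["Queued", "InProgress", "Succeeded"])
result = wait_for_terminal_status(
    lambda: next(statuses), timeout=5, check_interval=0
)
```

A smaller check_interval notices completion sooner at the cost of more status requests. With wait_for_termination set to False, no such loop applies and both values are unused.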

template_fields :Sequence[str] = ['azure_data_factory_conn_id', 'resource_group_name', 'factory_name', 'pipeline_name',...[source]
template_fields_renderers[source]
ui_color = #0678d4[source]
execute(self, context)[source]

This is the main method to derive when creating an operator. Context is the same dictionary used when rendering Jinja templates.

Refer to get_template_context for more context.

on_kill(self)[source]

Override this method to clean up subprocesses when a task instance gets killed. Any use of the threading, subprocess or multiprocessing module within an operator needs to be cleaned up, or it will leave ghost processes behind.
