airflow.providers.microsoft.azure.operators.synapse
¶
Module Contents¶
Classes¶
Executes a Spark job on Azure Synapse. |
|
Constructs a link to monitor a pipeline run in Azure Synapse. |
|
Executes a Synapse Pipeline. |
- class airflow.providers.microsoft.azure.operators.synapse.AzureSynapseRunSparkBatchOperator(*, azure_synapse_conn_id=AzureSynapseHook.default_conn_name, wait_for_termination=True, spark_pool='', payload, timeout=60 * 60 * 24 * 7, check_interval=60, **kwargs)[source]¶
Bases:
airflow.models.BaseOperator
Executes a Spark job on Azure Synapse.
- Parameters
azure_synapse_conn_id (str) – The connection identifier for connecting to Azure Synapse.
wait_for_termination (bool) – Flag to wait on a job run’s termination.
spark_pool (str) – The target synapse spark pool used to submit the job
payload (azure.synapse.spark.models.SparkBatchJobOptions) – Livy compatible payload which represents the spark job that a user wants to submit
timeout (int) – Time in seconds to wait for a job to reach a terminal status for non-asynchronous waits. Used only if
wait_for_termination
is True.check_interval (int) – Time in seconds to check on a job run’s status for non-asynchronous waits. Used only if
wait_for_termination
is True.
- class airflow.providers.microsoft.azure.operators.synapse.AzureSynapsePipelineRunLink[source]¶
Bases:
airflow.models.BaseOperatorLink
Constructs a link to monitor a pipeline run in Azure Synapse.
- get_fields_from_url(workspace_url)[source]¶
Extracts the workspace_name, subscription_id and resource_group from the Synapse workspace url.
- Parameters
workspace_url – The workspace url.
- get_link(operator, *, ti_key)[source]¶
Link to external system.
Note: The old signature of this function was
(self, operator, dttm: datetime)
. That is still supported at runtime but is deprecated.- Parameters
operator (airflow.models.BaseOperator) – The Airflow operator object this link is associated to.
ti_key (airflow.models.taskinstancekey.TaskInstanceKey) – TaskInstance ID to return link for.
- Returns
link to external system
- class airflow.providers.microsoft.azure.operators.synapse.AzureSynapseRunPipelineOperator(pipeline_name, azure_synapse_conn_id, azure_synapse_workspace_dev_endpoint, wait_for_termination=True, reference_pipeline_run_id=None, is_recovery=None, start_activity_name=None, parameters=None, timeout=60 * 60 * 24 * 7, check_interval=60, **kwargs)[source]¶
Bases:
airflow.models.BaseOperator
Executes a Synapse Pipeline.
- Parameters
pipeline_name (str) – The name of the pipeline to execute.
azure_synapse_conn_id (str) – The Airflow connection ID for Azure Synapse.
azure_synapse_workspace_dev_endpoint (str) – The Azure Synapse workspace development endpoint.
wait_for_termination (bool) – Flag to wait on a pipeline run’s termination.
reference_pipeline_run_id (str | None) – The pipeline run identifier. If this run ID is specified the parameters of the specified run will be used to create a new run.
is_recovery (bool | None) – Recovery mode flag. If recovery mode is set to True, the specified referenced pipeline run and the new run will be grouped under the same
groupId
.start_activity_name (str | None) – In recovery mode, the rerun will start from this activity. If not specified, all activities will run.
parameters (dict[str, Any] | None) – Parameters of the pipeline run. These parameters are referenced in a pipeline via
@pipeline().parameters.parameterName
and will be used only if thereference_pipeline_run_id
is not specified.timeout (int) – Time in seconds to wait for a pipeline to reach a terminal status for non-asynchronous waits. Used only if
wait_for_termination
is True.check_interval (int) – Time in seconds to check on a pipeline run’s status for non-asynchronous waits. Used only if
wait_for_termination
is True.
- execute(context)[source]¶
Derive when creating an operator.
Context is the same dictionary used as when rendering jinja templates.
Refer to get_template_context for more context.