airflow.providers.google.cloud.operators.datapipeline

This module contains Google Data Pipelines operators.

Module Contents

Classes

CreateDataPipelineOperator

Creates a new Data Pipelines instance using the Data Pipelines API.

RunDataPipelineOperator

Runs a Data Pipelines instance using the Data Pipelines API.

class airflow.providers.google.cloud.operators.datapipeline.CreateDataPipelineOperator(*, body, project_id=PROVIDE_PROJECT_ID, location=DEFAULT_DATAPIPELINE_LOCATION, gcp_conn_id='google_cloud_default', impersonation_chain=None, **kwargs)[source]

Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator

Creates a new Data Pipelines instance using the Data Pipelines API.

Parameters
  • body (dict) – The request body, which contains an instance of Pipeline. See: https://cloud.google.com/dataflow/docs/reference/data-pipelines/rest/v1/projects.locations.pipelines/create#request-body

  • project_id (str) – The ID of the GCP project that owns the job.

  • location (str) – The location to direct the Data Pipelines instance to (for example us-central1).

  • gcp_conn_id (str) – The connection ID to connect to the Google Cloud Platform.

  • impersonation_chain (str | Sequence[str] | None) –

    Optional service account to impersonate using short-term credentials, or a chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, each identity in the list must grant the Service Account Token Creator IAM role to the directly preceding identity, with the first account in the list granting this role to the originating account (templated).

    Warning

    This option requires Apache Beam 2.39.0 or newer.
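The grant order for a chained impersonation_chain is easy to get backwards. The sketch below, in plain Python with hypothetical account names, lists which account must grant the Service Account Token Creator role to which, following the rule described for this parameter. The helper required_token_creator_grants is illustrative only and is not part of the provider.

```python
from typing import List, Sequence, Tuple, Union


def required_token_creator_grants(
    originating_account: str,
    impersonation_chain: Union[str, Sequence[str]],
) -> List[Tuple[str, str]]:
    """Return (grantor, grantee) pairs for the Service Account Token Creator role.

    Hypothetical helper for illustration; it only restates the documented rule.
    """
    # A single string is treated as a chain of length one.
    if isinstance(impersonation_chain, str):
        chain = [impersonation_chain]
    else:
        chain = list(impersonation_chain)

    grants = []
    preceding = originating_account
    for account in chain:
        # Each account grants the role to the identity directly preceding it.
        grants.append((account, preceding))
        preceding = account
    return grants


# Hypothetical account names; the last entry is the account that gets impersonated.
ORIGIN = "airflow-worker@example-project.iam.gserviceaccount.com"
CHAIN = [
    "hop-1@example-project.iam.gserviceaccount.com",
    "target@example-project.iam.gserviceaccount.com",
]

grants = required_token_creator_grants(ORIGIN, CHAIN)
for grantor, grantee in grants:
    print(f"{grantor} must grant Token Creator to {grantee}")
```

With a two-element chain, the first listed account grants the role to the originating account and the second grants it to the first; requests are then made as the second account.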

Returns the created Data Pipelines instance in JSON representation.
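As a minimal sketch of how the body parameter might be assembled, the example below builds a batch Pipeline that launches a Dataflow Flex Template and passes it to the operator. The project, bucket, job, and pipeline names are hypothetical placeholders, and the field names follow the Pipeline resource in the REST reference linked above; verify them against that reference before relying on them. The helpers build_pipeline_body and make_create_task are illustrative and not part of the provider.

```python
from typing import Any, Dict

# All values below are hypothetical placeholders, not provider defaults.
PROJECT_ID = "example-project"
LOCATION = "us-central1"
PIPELINE_ID = "example-pipeline"


def build_pipeline_body(project_id: str, location: str, pipeline_id: str) -> Dict[str, Any]:
    """Build a request body holding a batch Pipeline that launches a Flex Template."""
    return {
        # Full resource name of the pipeline to create.
        "name": f"projects/{project_id}/locations/{location}/pipelines/{pipeline_id}",
        "type": "PIPELINE_TYPE_BATCH",
        "workload": {
            "dataflowFlexTemplateRequest": {
                "projectId": project_id,
                "location": location,
                "launchParameter": {
                    "containerSpecGcsPath": "gs://example-bucket/templates/example.json",
                    "jobName": "example-job",
                    "environment": {"tempLocation": "gs://example-bucket/tmp"},
                    "parameters": {"output": "gs://example-bucket/output"},
                },
            }
        },
    }


def make_create_task(body: Dict[str, Any]):
    """Instantiate the operator; call this inside a DAG definition."""
    # Imported lazily so the body can be built and inspected without Airflow installed.
    from airflow.providers.google.cloud.operators.datapipeline import (
        CreateDataPipelineOperator,
    )

    return CreateDataPipelineOperator(
        task_id="create_data_pipeline",
        body=body,
        project_id=PROJECT_ID,
        location=LOCATION,
    )


body = build_pipeline_body(PROJECT_ID, LOCATION, PIPELINE_ID)
print(body["name"])
```

Building the body in a separate function keeps the request payload testable without an Airflow installation or Google Cloud credentials.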

execute(context)[source]

Derive when creating an operator.

Context is the same dictionary used when rendering Jinja templates.

Refer to get_template_context for more context.

class airflow.providers.google.cloud.operators.datapipeline.RunDataPipelineOperator(data_pipeline_name, project_id=PROVIDE_PROJECT_ID, location=DEFAULT_DATAPIPELINE_LOCATION, gcp_conn_id='google_cloud_default', **kwargs)[source]

Bases: airflow.providers.google.cloud.operators.cloud_base.GoogleCloudBaseOperator

Runs a Data Pipelines instance using the Data Pipelines API.

Parameters
  • data_pipeline_name (str) – The display name of the pipeline. For example, in projects/PROJECT_ID/locations/LOCATION_ID/pipelines/PIPELINE_ID, it is the PIPELINE_ID.

  • project_id (str) – The ID of the GCP project that owns the job.

  • location (str) – The location to direct the Data Pipelines instance to (for example us-central1).

  • gcp_conn_id (str) – The connection ID to connect to the Google Cloud Platform.

Returns the created Job in JSON representation.
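Because data_pipeline_name takes only the final PIPELINE_ID segment, a full resource name has to be split before it is passed in. The sketch below shows one way to do that, assuming the projects/PROJECT_ID/locations/LOCATION_ID/pipelines/PIPELINE_ID layout given in the parameter description. The helpers run_operator_kwargs and make_run_task, and the resource name itself, are hypothetical.

```python
from typing import Dict


def run_operator_kwargs(resource_name: str) -> Dict[str, str]:
    """Split a full pipeline resource name into RunDataPipelineOperator arguments."""
    parts = resource_name.split("/")
    # Even positions hold the fixed collection names, odd positions the IDs.
    has_expected_layout = (
        len(parts) == 6
        and tuple(parts[0::2]) == ("projects", "locations", "pipelines")
        and all(parts[1::2])
    )
    if not has_expected_layout:
        raise ValueError(f"Unexpected pipeline resource name: {resource_name!r}")
    return {
        "project_id": parts[1],
        "location": parts[3],
        "data_pipeline_name": parts[5],
    }


def make_run_task(resource_name: str):
    """Instantiate the operator; call this inside a DAG definition."""
    # Imported lazily so the parsing helper works without Airflow installed.
    from airflow.providers.google.cloud.operators.datapipeline import (
        RunDataPipelineOperator,
    )

    return RunDataPipelineOperator(
        task_id="run_data_pipeline",
        **run_operator_kwargs(resource_name),
    )


# Hypothetical resource name.
kwargs = run_operator_kwargs(
    "projects/example-project/locations/us-central1/pipelines/example-pipeline"
)
print(kwargs["data_pipeline_name"])
```

Rejecting malformed names up front gives a clearer error than letting an incorrect pipeline ID reach the API call.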

execute(context)[source]

Derive when creating an operator.

Context is the same dictionary used when rendering Jinja templates.

Refer to get_template_context for more context.
