airflow.providers.google.cloud.hooks.datafusion

This module contains the Google DataFusion hook.

Module Contents

Classes

PipelineStates

Data Fusion pipeline states

DataFusionHook

Hook for Google DataFusion.

Attributes

Operation

FAILURE_STATES

SUCCESS_STATES

airflow.providers.google.cloud.hooks.datafusion.Operation[source]
class airflow.providers.google.cloud.hooks.datafusion.PipelineStates[source]

Data Fusion pipeline states

PENDING = PENDING[source]
STARTING = STARTING[source]
RUNNING = RUNNING[source]
SUSPENDED = SUSPENDED[source]
RESUMING = RESUMING[source]
COMPLETED = COMPLETED[source]
FAILED = FAILED[source]
KILLED = KILLED[source]
REJECTED = REJECTED[source]
airflow.providers.google.cloud.hooks.datafusion.FAILURE_STATES[source]
airflow.providers.google.cloud.hooks.datafusion.SUCCESS_STATES[source]
class airflow.providers.google.cloud.hooks.datafusion.DataFusionHook(api_version='v1beta1', gcp_conn_id='google_cloud_default', delegate_to=None, impersonation_chain=None)[source]

Bases: airflow.providers.google.common.hooks.base_google.GoogleBaseHook

Hook for Google DataFusion.
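
A minimal usage sketch, assuming a configured Google Cloud connection (the default google_cloud_default is used here); the values mirror the defaults in the signature above:

    from airflow.providers.google.cloud.hooks.datafusion import DataFusionHook

    # Instantiate the hook with the documented defaults.
    hook = DataFusionHook(
        api_version="v1beta1",
        gcp_conn_id="google_cloud_default",
    )

    # Authorized client for the Cloud Data Fusion API (see get_conn() below).
    conn = hook.get_conn()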

wait_for_operation(operation)[source]

Waits for a long-running operation to complete.

wait_for_pipeline_state(pipeline_name, pipeline_id, instance_url, namespace='default', success_states=None, failure_states=None, timeout=5 * 60)[source]

Polls the pipeline state and raises an exception if the state is one of failure_states or the operation times out.
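
A sketch of polling a run until completion; the endpoint and run ID below are hypothetical and would normally come from get_instance() and start_pipeline():

    from airflow.providers.google.cloud.hooks.datafusion import (
        SUCCESS_STATES,
        DataFusionHook,
        PipelineStates,
    )

    hook = DataFusionHook()
    hook.wait_for_pipeline_state(
        pipeline_name="example_pipeline",  # hypothetical pipeline name
        pipeline_id="a1b2c3d4-run-id",     # hypothetical run ID
        instance_url="https://example--instance.datafusion.googleusercontent.com/api",
        namespace="default",
        success_states=[*SUCCESS_STATES, PipelineStates.RUNNING],
        timeout=10 * 60,                   # give up after ten minutes
    )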

get_conn()[source]

Retrieves the connection to DataFusion.

restart_instance(instance_name, location, project_id)[source]

Restarts a single Data Fusion instance. At the end of the operation the instance is fully restarted.

Parameters
  • instance_name (str) – The name of the instance to restart.

  • location (str) – The Cloud Data Fusion location in which to handle the request.

  • project_id (str) – The ID of the Google Cloud project that the instance belongs to.
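
A sketch with hypothetical names, assuming the call returns the long-running operation so it can be handed to wait_for_operation() described above:

    hook = DataFusionHook()
    operation = hook.restart_instance(
        instance_name="example-instance",  # hypothetical instance name
        location="europe-west1",
        project_id="example-project",
    )
    hook.wait_for_operation(operation)     # block until the restart finishes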

delete_instance(instance_name, location, project_id)[source]

Deletes a single Data Fusion instance.

Parameters
  • instance_name (str) – The name of the instance to delete.

  • location (str) – The Cloud Data Fusion location in which to handle the request.

  • project_id (str) – The ID of the Google Cloud project that the instance belongs to.

create_instance(instance_name, instance, location, project_id=PROVIDE_PROJECT_ID)[source]

Creates a new Data Fusion instance in the specified project and location.

Parameters
  • instance_name (str) – The name of the instance to create.

  • instance (dict) – An instance of Instance; see the Instance resource in the Cloud Data Fusion REST API reference for the available fields.

  • location (str) – The Cloud Data Fusion location in which to handle the request.

  • project_id (str) – The ID of the Google Cloud project that the instance belongs to.
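
A sketch with a minimal instance body and hypothetical names; the full set of Instance fields is defined by the Cloud Data Fusion REST API:

    hook = DataFusionHook()
    operation = hook.create_instance(
        instance_name="example-instance",  # hypothetical instance name
        instance={"type": "BASIC"},        # minimal Instance body
        location="europe-west1",
        project_id="example-project",
    )
    hook.wait_for_operation(operation)     # wait for the instance to be provisioned
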
get_instance(instance_name, location, project_id)[source]

Gets details of a single Data Fusion instance.

Parameters
  • instance_name (str) – The name of the instance.

  • location (str) – The Cloud Data Fusion location in which to handle the request.

  • project_id (str) – The ID of the Google Cloud project that the instance belongs to.
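
A sketch with hypothetical names; assuming the returned Instance resource exposes its endpoint in an apiEndpoint field (as in the Data Fusion REST API), it can supply the instance_url expected by the pipeline methods below:

    hook = DataFusionHook()
    instance = hook.get_instance(
        instance_name="example-instance",  # hypothetical instance name
        location="europe-west1",
        project_id="example-project",
    )
    instance_url = instance["apiEndpoint"]  # endpoint for the pipeline methods below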

patch_instance(instance_name, instance, update_mask, location, project_id=PROVIDE_PROJECT_ID)[source]

Updates a single Data Fusion instance.

Parameters
  • instance_name (str) – The name of the instance to update.

  • instance (dict) – An instance of Instance; only the fields listed in update_mask are updated.

  • update_mask (str) – Field mask specifying which fields of the instance resource the update will overwrite, given as a comma-separated list of field names.

  • location (str) – The Cloud Data Fusion location in which to handle the request.

  • project_id (str) – The ID of the Google Cloud project that the instance belongs to.
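
A sketch with hypothetical values, updating only the labels of an instance; update_mask names the fields to overwrite:

    hook = DataFusionHook()
    operation = hook.patch_instance(
        instance_name="example-instance",         # hypothetical instance name
        instance={"labels": {"env": "staging"}},  # only the masked fields are sent
        update_mask="labels",
        location="europe-west1",
        project_id="example-project",
    )
    hook.wait_for_operation(operation)
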
create_pipeline(pipeline_name, pipeline, instance_url, namespace='default')[source]

Creates a Cloud Data Fusion pipeline.

Parameters
  • pipeline_name (str) – Your pipeline name.

  • pipeline (dict) – The pipeline definition; for details, check the CDAP documentation on the pipeline configuration file format.

  • instance_url (str) – Endpoint on which the REST APIs are accessible for the instance.

  • namespace (str) – If your pipeline belongs to a Basic edition instance, the namespace ID is always default. If your pipeline belongs to an Enterprise edition instance, you can create a namespace.
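
A sketch with a hypothetical endpoint; pipeline_definition stands in for an exported CDAP pipeline configuration (a JSON-compatible dict):

    import json

    hook = DataFusionHook()
    with open("example_pipeline.json") as f:  # hypothetical exported pipeline file
        pipeline_definition = json.load(f)

    hook.create_pipeline(
        pipeline_name="example_pipeline",
        pipeline=pipeline_definition,
        instance_url="https://example--instance.datafusion.googleusercontent.com/api",
        namespace="default",
    )
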
delete_pipeline(pipeline_name, instance_url, version_id=None, namespace='default')[source]

Deletes a Cloud Data Fusion pipeline.

Parameters
  • pipeline_name (str) – Your pipeline name.

  • version_id (str | None) – Version of pipeline to delete

  • instance_url (str) – Endpoint on which the REST APIs are accessible for the instance.

  • namespace (str) – If your pipeline belongs to a Basic edition instance, the namespace ID is always default. If your pipeline belongs to an Enterprise edition instance, you can create a namespace.

list_pipelines(instance_url, artifact_name=None, artifact_version=None, namespace='default')[source]

Lists Cloud Data Fusion pipelines.

Parameters
  • artifact_version (str | None) – Artifact version to filter instances

  • artifact_name (str | None) – Artifact name to filter instances

  • instance_url (str) – Endpoint on which the REST APIs are accessible for the instance.

  • namespace (str) – If your pipeline belongs to a Basic edition instance, the namespace ID is always default. If your pipeline belongs to an Enterprise edition instance, you can create a namespace.
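
A sketch with a hypothetical endpoint, optionally filtering by the batch pipeline artifact:

    hook = DataFusionHook()
    pipelines = hook.list_pipelines(
        instance_url="https://example--instance.datafusion.googleusercontent.com/api",
        artifact_name="cdap-data-pipeline",  # e.g. the batch pipeline artifact
        namespace="default",
    )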

get_pipeline_workflow(pipeline_name, instance_url, pipeline_id, namespace='default')[source]
start_pipeline(pipeline_name, instance_url, namespace='default', runtime_args=None)[source]

Starts a Cloud Data Fusion pipeline. Works for both batch and stream pipelines.

Parameters
  • pipeline_name (str) – Your pipeline name.

  • instance_url (str) – Endpoint on which the REST APIs are accessible for the instance.

  • runtime_args (dict[str, Any] | None) – Optional runtime JSON args to be passed to the pipeline.

  • namespace (str) – If your pipeline belongs to a Basic edition instance, the namespace ID is always default. If your pipeline belongs to an Enterprise edition instance, you can create a namespace.
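
A sketch with a hypothetical endpoint and runtime argument; assuming the return value identifies the started run, it can be passed to wait_for_pipeline_state() above as pipeline_id:

    hook = DataFusionHook()
    pipeline_id = hook.start_pipeline(
        pipeline_name="example_pipeline",
        instance_url="https://example--instance.datafusion.googleusercontent.com/api",
        namespace="default",
        runtime_args={"example.arg": "value"},  # hypothetical runtime argument
    )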

stop_pipeline(pipeline_name, instance_url, namespace='default')[source]

Stops a Cloud Data Fusion pipeline. Works for both batch and stream pipelines.

Parameters
  • pipeline_name (str) – Your pipeline name.

  • instance_url (str) – Endpoint on which the REST APIs are accessible for the instance.

  • namespace (str) – If your pipeline belongs to a Basic edition instance, the namespace ID is always default. If your pipeline belongs to an Enterprise edition instance, you can create a namespace.
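
A sketch with a hypothetical endpoint, stopping a running pipeline:

    hook = DataFusionHook()
    hook.stop_pipeline(
        pipeline_name="example_pipeline",
        instance_url="https://example--instance.datafusion.googleusercontent.com/api",
        namespace="default",
    )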
