airflow.providers.google.cloud.hooks.datafusion

This module contains the Google DataFusion hook.

Module Contents

airflow.providers.google.cloud.hooks.datafusion.Operation[source]
class airflow.providers.google.cloud.hooks.datafusion.PipelineStates[source]

Data Fusion pipeline states

PENDING = PENDING[source]
STARTING = STARTING[source]
RUNNING = RUNNING[source]
SUSPENDED = SUSPENDED[source]
RESUMING = RESUMING[source]
COMPLETED = COMPLETED[source]
FAILED = FAILED[source]
KILLED = KILLED[source]
REJECTED = REJECTED[source]
airflow.providers.google.cloud.hooks.datafusion.FAILURE_STATES[source]
airflow.providers.google.cloud.hooks.datafusion.SUCCESS_STATES[source]
class airflow.providers.google.cloud.hooks.datafusion.DataFusionHook(api_version: str = 'v1beta1', gcp_conn_id: str = 'google_cloud_default', delegate_to: Optional[str] = None, impersonation_chain: Optional[Union[str, Sequence[str]]] = None)[source]

Bases: airflow.providers.google.common.hooks.base_google.GoogleBaseHook

Hook for Google DataFusion.
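
A minimal construction sketch; the values below are simply the defaults from the signature above, and the connection id is a placeholder for whatever Google Cloud connection your Airflow deployment defines:

    from airflow.providers.google.cloud.hooks.datafusion import DataFusionHook

    # Build the hook against the default Airflow Google Cloud connection.
    hook = DataFusionHook(
        api_version="v1beta1",
        gcp_conn_id="google_cloud_default",
    )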

_conn :Optional[Resource][source]
wait_for_operation(self, operation: Dict[str, Any])[source]

Waits for a long-running operation to complete.

wait_for_pipeline_state(self, pipeline_name: str, pipeline_id: str, instance_url: str, namespace: str = 'default', success_states: Optional[List[str]] = None, failure_states: Optional[List[str]] = None, timeout: int = 5 * 60)[source]

Polls the pipeline state and raises an exception if the state is one of failure_states or the operation times out.
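
A hedged usage sketch, assuming the `hook` built in the construction example above; the pipeline name, run id, and instance URL are placeholders that would normally come from create_pipeline, start_pipeline, and get_instance:

    from airflow.providers.google.cloud.hooks.datafusion import FAILURE_STATES, PipelineStates

    # Block until the run reaches COMPLETED, or raise if it enters a failure
    # state or the timeout (in seconds) elapses.
    hook.wait_for_pipeline_state(
        pipeline_name="example_pipeline",                          # hypothetical pipeline name
        pipeline_id="example-run-id",                              # hypothetical run id
        instance_url="https://example-instance.example.com/api",   # hypothetical endpoint
        success_states=[PipelineStates.COMPLETED],
        failure_states=list(FAILURE_STATES),
        timeout=10 * 60,
    )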

static _name(project_id: str, location: str, instance_name: str)[source]
static _parent(project_id: str, location: str)[source]
static _base_url(instance_url: str, namespace: str)[source]
_cdap_request(self, url: str, method: str, body: Optional[Union[List, Dict]] = None)[source]
get_conn(self)[source]

Retrieves a connection to DataFusion.

restart_instance(self, instance_name: str, location: str, project_id: str)[source]

Restarts a single Data Fusion instance. At the end of the operation the instance is fully restarted.

Parameters
  • instance_name (str) – The name of the instance to restart.

  • location (str) – The Cloud Data Fusion location in which to handle the request.

  • project_id (str) – The ID of the Google Cloud project that the instance belongs to.
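
A hedged sketch of restarting an instance, assuming the `hook` built earlier and that restart_instance returns the long-running operation dict expected by wait_for_operation; all names are placeholders:

    # Trigger the restart; the returned dict is assumed to be the operation resource.
    operation = hook.restart_instance(
        instance_name="example-instance",   # hypothetical instance name
        location="europe-west1",            # hypothetical location
        project_id="example-project",       # hypothetical project id
    )
    # Block until the long-running operation completes.
    hook.wait_for_operation(operation)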

delete_instance(self, instance_name: str, location: str, project_id: str)[source]

Deletes a single Data Fusion instance.

Parameters
  • instance_name (str) – The name of the instance to delete.

  • location (str) – The Cloud Data Fusion location in which to handle the request.

  • project_id (str) – The ID of the Google Cloud project that the instance belongs to.

create_instance(self, instance_name: str, instance: Dict[str, Any], location: str, project_id: str)[source]

Creates a new Data Fusion instance in the specified project and location.

Parameters
  • instance_name (str) – The name of the instance to create.

  • instance (Dict[str, Any]) – An Instance resource body describing the instance to create.

  • location (str) – The Cloud Data Fusion location in which to handle the request.

  • project_id (str) – The ID of the Google Cloud project that the instance belongs to.
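
A hedged sketch, assuming the `hook` built earlier and that create_instance returns the operation dict accepted by wait_for_operation; the instance body is a minimal, assumed example of the Instance resource:

    operation = hook.create_instance(
        instance_name="example-instance",   # hypothetical instance name
        instance={"type": "BASIC"},         # minimal, assumed Instance body
        location="europe-west1",
        project_id="example-project",
    )
    hook.wait_for_operation(operation)
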
get_instance(self, instance_name: str, location: str, project_id: str)[source]

Gets details of a single Data Fusion instance.

Parameters
  • instance_name (str) – The name of the instance.

  • location (str) – The Cloud Data Fusion location in which to handle the request.

  • project_id (str) – The ID of the Google Cloud project that the instance belongs to.
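
A hedged sketch, assuming the `hook` built earlier; the returned Instance resource is assumed to expose the endpoint that the pipeline methods below take as instance_url:

    instance = hook.get_instance(
        instance_name="example-instance",   # hypothetical instance name
        location="europe-west1",
        project_id="example-project",
    )
    # "apiEndpoint" is assumed to be present on the returned Instance resource.
    instance_url = instance["apiEndpoint"]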

patch_instance(self, instance_name: str, instance: Dict[str, Any], update_mask: str, location: str, project_id: str)[source]

Updates a single Data Fusion instance.

Parameters
  • instance_name (str) – The name of the instance to update.

  • instance (Dict[str, Any]) – An Instance resource body containing the fields to be updated.

  • update_mask (str) – Field mask specifying which fields of the instance resource the update overwrites.

  • location (str) – The Cloud Data Fusion location in which to handle the request.

  • project_id (str) – The ID of the Google Cloud project that the instance belongs to.
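
A hedged sketch, assuming the `hook` built earlier and that patch_instance returns an operation dict; only the fields named in update_mask are intended to be overwritten, and the label values are placeholders:

    operation = hook.patch_instance(
        instance_name="example-instance",       # hypothetical instance name
        instance={"labels": {"env": "dev"}},    # assumed partial Instance body
        update_mask="labels",
        location="europe-west1",
        project_id="example-project",
    )
    hook.wait_for_operation(operation)
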
create_pipeline(self, pipeline_name: str, pipeline: Dict[str, Any], instance_url: str, namespace: str = 'default')[source]

Creates a Cloud Data Fusion pipeline.

Parameters
  • pipeline_name (str) – Your pipeline name.

  • pipeline (Dict[str, Any]) – The pipeline definition.

  • instance_url (str) – Endpoint on which the REST APIs are accessible for the instance.

  • namespace (str) – If your pipeline belongs to a Basic edition instance, the namespace ID is always default. If your pipeline belongs to an Enterprise edition instance, you can create a namespace.
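
A hedged sketch, assuming the `hook` and `instance_url` from the earlier examples and a pipeline definition exported from the Data Fusion UI as JSON; the file path and pipeline name are placeholders:

    import json

    # Load a pipeline definition exported as JSON (hypothetical path).
    with open("/tmp/example_pipeline.json") as f:
        pipeline = json.load(f)

    hook.create_pipeline(
        pipeline_name="example_pipeline",
        pipeline=pipeline,
        instance_url=instance_url,
        namespace="default",
    )
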
delete_pipeline(self, pipeline_name: str, instance_url: str, version_id: Optional[str] = None, namespace: str = 'default')[source]

Deletes a Cloud Data Fusion pipeline.

Parameters
  • pipeline_name (str) – Your pipeline name.

  • version_id (Optional[str]) – The version of the pipeline to delete.

  • instance_url (str) – Endpoint on which the REST APIs are accessible for the instance.

  • namespace (str) – If your pipeline belongs to a Basic edition instance, the namespace ID is always default. If your pipeline belongs to an Enterprise edition instance, you can create a namespace.
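
A hedged sketch, assuming the `hook` and `instance_url` from the earlier examples; version_id is optional and omitted here:

    hook.delete_pipeline(
        pipeline_name="example_pipeline",   # hypothetical pipeline name
        instance_url=instance_url,
        namespace="default",
    )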

list_pipelines(self, instance_url: str, artifact_name: Optional[str] = None, artifact_version: Optional[str] = None, namespace: str = 'default')[source]

Lists Cloud Data Fusion pipelines.

Parameters
  • artifact_version (Optional[str]) – Artifact version used to filter the pipelines.

  • artifact_name (Optional[str]) – Artifact name used to filter the pipelines.

  • instance_url (str) – Endpoint on which the REST APIs are accessible for the instance.

  • namespace (str) – If your pipeline belongs to a Basic edition instance, the namespace ID is always default. If your pipeline belongs to an Enterprise edition instance, you can create a namespace.
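
A hedged sketch, assuming the `hook` and `instance_url` from the earlier examples; the artifact name is an assumed example of a CDAP batch-pipeline artifact, and both filters may be omitted:

    pipelines = hook.list_pipelines(
        instance_url=instance_url,
        artifact_name="cdap-data-pipeline",   # assumed artifact name for batch pipelines
        namespace="default",
    )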

_get_workflow_state(self, pipeline_name: str, instance_url: str, pipeline_id: str, namespace: str = 'default')[source]
start_pipeline(self, pipeline_name: str, instance_url: str, namespace: str = 'default', runtime_args: Optional[Dict[str, Any]] = None)[source]

Starts a Cloud Data Fusion pipeline. Works for both batch and stream pipelines.

Parameters
  • pipeline_name (str) – Your pipeline name.

  • instance_url (str) – Endpoint on which the REST APIs are accessible for the instance.

  • runtime_args (Optional[Dict[str, Any]]) – Optional runtime JSON args to be passed to the pipeline.

  • namespace (str) – If your pipeline belongs to a Basic edition instance, the namespace ID is always default. If your pipeline belongs to an Enterprise edition instance, you can create a namespace.
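
A hedged sketch, assuming the `hook` and `instance_url` from the earlier examples and that start_pipeline returns a run id that wait_for_pipeline_state accepts as pipeline_id; the runtime argument keys are placeholders:

    run_id = hook.start_pipeline(
        pipeline_name="example_pipeline",
        instance_url=instance_url,
        runtime_args={"input.path": "gs://example-bucket/input/"},  # hypothetical runtime args
    )
    # Optionally block until the run finishes (see wait_for_pipeline_state above).
    hook.wait_for_pipeline_state(
        pipeline_name="example_pipeline",
        pipeline_id=run_id,
        instance_url=instance_url,
    )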

stop_pipeline(self, pipeline_name: str, instance_url: str, namespace: str = 'default')[source]

Stops a Cloud Data Fusion pipeline. Works for both batch and stream pipelines.

Parameters
  • pipeline_name (str) – Your pipeline name.

  • instance_url (str) – Endpoint on which the REST APIs are accessible for the instance.

  • namespace (str) – If your pipeline belongs to a Basic edition instance, the namespace ID is always default. If your pipeline belongs to an Enterprise edition instance, you can create a namespace.
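
A hedged sketch, assuming the `hook` and `instance_url` from the earlier examples; the pipeline name is a placeholder:

    hook.stop_pipeline(
        pipeline_name="example_pipeline",
        instance_url=instance_url,
        namespace="default",
    )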
