airflow.providers.google.cloud.hooks.dataprep

This module contains a Google Dataprep hook.

Module Contents

class airflow.providers.google.cloud.hooks.dataprep.GoogleDataprepHook(dataprep_conn_id: str = default_conn_name)

Bases: airflow.hooks.base.BaseHook

Hook for connecting to the Dataprep API. To connect Airflow to Dataprep you need a Dataprep access token; see https://clouddataprep.com/documentation/api#section/Authentication

The token should be added to the Airflow Connection in JSON format.
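As a minimal sketch of what such a JSON value could look like, the snippet below serializes a token and a base URL with the standard json module. The key names and the default base URL are assumptions made for illustration; this page does not state them, so check the provider source for the keys the hook actually reads.

```python
import json

# Assumption: these key names and the base URL are illustrative only;
# they are not confirmed by this page.
TOKEN_KEY = "extra__dataprep__token"
BASE_URL_KEY = "extra__dataprep__base_url"


def build_connection_extra(token: str, base_url: str = "https://api.clouddataprep.com") -> str:
    """Serialize Dataprep credentials as a JSON string for the connection's Extra field."""
    return json.dumps({TOKEN_KEY: token, BASE_URL_KEY: base_url})


# Placeholder token; a real token comes from the Dataprep settings page.
extra = build_connection_extra("my-dataprep-token")
print(extra)
```

The resulting string can be pasted into the Extra field of a connection whose ID matches dataprep_conn_id.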

conn_name_attr = dataprep_conn_id
default_conn_name = dataprep_default
conn_type = dataprep
hook_name = Google Dataprep
_headers
get_jobs_for_job_group(self, job_id: int)

Get information about the batch jobs within a Cloud Dataprep job.

Parameters

job_id (int) – The ID of the job that will be fetched
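The call pattern can be sketched without an Airflow installation by using a stand-in object. FakeDataprepHook and failed_job_ids are hypothetical helpers, and the response shape (a dict with a "data" list whose items carry "id" and "status") is an assumption rather than something this page documents.

```python
from typing import Any, Dict, List


class FakeDataprepHook:
    """Hypothetical stand-in for GoogleDataprepHook so the sketch runs without Airflow."""

    def get_jobs_for_job_group(self, job_id: int) -> Dict[str, Any]:
        # Assumed response shape: a dict with a "data" list of job records.
        return {"data": [{"id": 1, "status": "Complete"}, {"id": 2, "status": "Failed"}]}


def failed_job_ids(hook: Any, job_id: int) -> List[int]:
    """Return the IDs of batch jobs whose (assumed) status field is "Failed"."""
    response = hook.get_jobs_for_job_group(job_id=job_id)
    return [job["id"] for job in response.get("data", []) if job.get("status") == "Failed"]


failed = failed_job_ids(FakeDataprepHook(), job_id=1234)
print(failed)
```

With a real hook, the same function would take GoogleDataprepHook(dataprep_conn_id=...) in place of the stand-in, provided the response has the assumed shape.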

get_job_group(self, job_group_id: int, embed: str, include_deleted: bool)

Get the specified job group. A job group is a job that is executed from a specific node in a flow.

Parameters
  • job_group_id (int) – The ID of the job group that will be fetched

  • embed (str) – Comma-separated list of objects to pull in as part of the response

  • include_deleted (bool) – If True, deleted objects are included in the response
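To illustrate how the three parameters might map onto an HTTP request, the sketch below builds a URL with the standard library only. The base URL, the v4/jobGroups path, and the includeDeleted query name are assumptions for illustration; job_group_url is a hypothetical helper, not part of the hook.

```python
from urllib.parse import urlencode


def job_group_url(base_url: str, job_group_id: int, embed: str, include_deleted: bool) -> str:
    """Build a GET URL for a job group (path and query names are assumed)."""
    params = {
        "embed": embed,  # comma-separated list, percent-encoded by urlencode
        "includeDeleted": str(include_deleted).lower(),  # bool rendered as "true"/"false"
    }
    return f"{base_url.rstrip('/')}/v4/jobGroups/{job_group_id}?{urlencode(params)}"


url = job_group_url("https://api.clouddataprep.com", 42, "jobs,wrangledDataset", False)
print(url)
```

Note that the comma in embed is percent-encoded as %2C, so the list travels as a single query value.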

run_job_group(self, body_request: dict)

Creates a jobGroup, which launches the specified job as the authenticated user. This performs the same action as clicking the Run Job button in the application. To find the recipe_id, see the Dataprep API documentation: https://clouddataprep.com/documentation/api#operation/runJobGroup

Parameters

body_request (dict) – The request body, which carries the identifier of the recipe you would like to run.
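A minimal sketch of a request body follows. The "wrangledDataset" wrapper is an assumption based on the linked runJobGroup documentation and should be verified there; build_run_job_group_body is a hypothetical helper and 1234 is a placeholder recipe ID.

```python
import json
from typing import Any, Dict


def build_run_job_group_body(recipe_id: int) -> Dict[str, Any]:
    """Wrap a recipe ID in the (assumed) structure expected by runJobGroup."""
    # Assumption: the API expects the recipe under the "wrangledDataset" key.
    return {"wrangledDataset": {"id": recipe_id}}


body_request = build_run_job_group_body(recipe_id=1234)
print(json.dumps(body_request))
```

The resulting dict would then be passed as hook.run_job_group(body_request=body_request).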

_raise_for_status(self, response: requests.models.Response)
