airflow.providers.google.cloud.hooks.dataprep
This module contains the Google Dataprep hook.
Module Contents
Classes
| JobGroupStatuses | Types of job group run statuses. |
| GoogleDataprepHook | Hook for connection with Dataprep API. |
- class airflow.providers.google.cloud.hooks.dataprep.JobGroupStatuses
- Types of job group run statuses.
- class airflow.providers.google.cloud.hooks.dataprep.GoogleDataprepHook(dataprep_conn_id=default_conn_name, api_version='v4')
- Bases: airflow.hooks.base.BaseHook
- Hook for connection with Dataprep API.
- To connect Airflow to Dataprep you need a Dataprep token; see https://clouddataprep.com/documentation/api#section/Authentication
- The token should be added to the Airflow Connection in JSON format.
 - get_jobs_for_job_group(job_id)
- Get information about the batch jobs within a Cloud Dataprep job.
- Parameters
- job_id (int) – The ID of the job that will be fetched 
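The class description above says the Dataprep token is stored in the Airflow Connection in JSON format. A minimal sketch of building that JSON follows. The `token` key name and the `build_dataprep_extra` helper are assumptions made for illustration, so check the provider's connection documentation for the exact field names.

```python
import json


def build_dataprep_extra(token: str) -> str:
    """Serialize a Dataprep token as a JSON string for a Connection's extra field.

    The "token" key name is an assumption; it is not confirmed by this page.
    """
    if not token:
        raise ValueError("A non-empty Dataprep token is required")
    return json.dumps({"token": token})


# Placeholder value, not a real token.
extra_json = build_dataprep_extra("my-dataprep-token")
print(extra_json)
```

The resulting string could then be pasted into the connection's extra field for the connection id passed as `dataprep_conn_id`.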
 
 - get_job_group(job_group_id, embed, include_deleted)
- Get the specified job group.
- A job group is a job that is executed from a specific node in a flow.
 - run_job_group(body_request)
- Creates a jobGroup, which launches the specified job as the authenticated user.
- This performs the same action as clicking the Run Job button in the application.
- To get the recipe_id, follow the Dataprep API documentation: https://clouddataprep.com/documentation/api#operation/runJobGroup
- Parameters
- body_request (dict) – Request body containing the identifier of the recipe you would like to run.
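To illustrate the `body_request` argument of `run_job_group`, the sketch below builds a request body from a recipe id. The `wrangledDataset` wrapper is an assumption based on the runJobGroup operation in the Dataprep API documentation, and `build_run_job_group_body` is a hypothetical helper, not part of the hook.

```python
def build_run_job_group_body(recipe_id: int) -> dict:
    """Build a request body for run_job_group.

    The {"wrangledDataset": {"id": ...}} shape is an assumption taken from the
    runJobGroup operation in the Dataprep API documentation.
    """
    if recipe_id <= 0:
        raise ValueError("recipe_id must be a positive integer")
    return {"wrangledDataset": {"id": recipe_id}}


body = build_run_job_group_body(42)
print(body)

# With Airflow installed and a Dataprep connection configured, the body could
# be passed to the hook roughly like this (connection id is a placeholder):
#   hook = GoogleDataprepHook(dataprep_conn_id="my_dataprep_conn")
#   hook.run_job_group(body_request=body)
```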
 
 - create_flow(*, body_request)
- Creates a flow.
- Parameters
- body_request (dict) – Body of the POST request to be sent. For more details check https://clouddataprep.com/documentation/api#operation/createFlow 
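As a sketch of what a `create_flow` request body might look like, the helper below assembles a dict with a name and an optional description. Both field names are assumptions drawn from the createFlow operation linked above, and `build_create_flow_body` is a hypothetical helper.

```python
def build_create_flow_body(name: str, description: str = "") -> dict:
    """Build a request body for create_flow.

    The "name" and "description" keys are assumptions; confirm them against
    the createFlow operation in the Dataprep API documentation.
    """
    if not name:
        raise ValueError("A flow name is required")
    body = {"name": name}
    # Only include the description when one was given.
    if description:
        body["description"] = description
    return body


flow_body = build_create_flow_body("example_flow", "Flow created from Airflow")
print(flow_body)

# Possible usage with a configured hook (keyword-only argument):
#   hook.create_flow(body_request=flow_body)
```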
 
 - copy_flow(*, flow_id, name='', description='', copy_datasources=False)
- Create a copy of the flow with the provided id, as well as all the recipes it contains.
 - delete_flow(*, flow_id)
- Delete the flow with the provided id.
- Parameters
- flow_id (int) – ID of the flow to be deleted
 
 - run_flow(*, flow_id, body_request)
- Runs the flow with the provided id.
 - get_job_group_status(*, job_group_id)
- Check the status of a Dataprep job group, for example to see whether it has finished.
- Parameters
- job_group_id (int) – ID of the job group to check 
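A status check like this is typically called in a loop until the job group reaches a final state. The sketch below shows one way to do that with a generic polling helper. `wait_for_terminal_status` is a hypothetical function, and the status strings in the demo are stand-ins, not confirmed values of `JobGroupStatuses`.

```python
import time
from typing import Callable, Collection, TypeVar

S = TypeVar("S")


def wait_for_terminal_status(
    get_status: Callable[[], S],
    terminal_statuses: Collection[S],
    poll_interval: float = 10.0,
    max_polls: int = 60,
) -> S:
    """Call get_status repeatedly until it returns a terminal status."""
    for attempt in range(max_polls):
        status = get_status()
        if status in terminal_statuses:
            return status
        # Sleep only between polls, not after the last one.
        if attempt < max_polls - 1:
            time.sleep(poll_interval)
    raise TimeoutError(f"No terminal status after {max_polls} polls")


# Stand-in for hook.get_job_group_status(job_group_id=...): replays canned values.
fake_statuses = iter(["InProgress", "InProgress", "Complete"])
final = wait_for_terminal_status(
    lambda: next(fake_statuses),
    terminal_statuses={"Complete", "Failed", "Canceled"},
    poll_interval=0,
)
print(final)
```

With a configured hook, `get_status` could be `lambda: hook.get_job_group_status(job_group_id=job_group_id)`, and the terminal set would hold the matching `JobGroupStatuses` members. Those member names are not listed on this page, so check the enum before relying on them.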
 
 - create_imported_dataset(*, body_request)
- Creates an imported dataset.
- Parameters
- body_request (dict) – Body of the POST request to be sent. For more details check https://clouddataprep.com/documentation/api#operation/createImportedDataset 
 
 - create_wrangled_dataset(*, body_request)
- Creates a wrangled dataset.
- Parameters
- body_request (dict) – Body of the POST request to be sent. For more details check https://clouddataprep.com/documentation/api#operation/createWrangledDataset 
 
 - create_output_object(*, body_request)
- Creates an output object.
- Parameters
- body_request (dict) – Body of the POST request to be sent. For more details check https://clouddataprep.com/documentation/api#operation/createOutputObject 
 
 - create_write_settings(*, body_request)
- Creates write settings.
- Parameters
- body_request (dict) – Body of the POST request to be sent. For more details check https://clouddataprep.com/documentation/api#tag/createWriteSetting 
 
 
