airflow.providers.google.cloud.hooks.dataprep
This module contains Google Dataprep hook.
Module Contents
Classes
JobGroupStatuses | Types of job group run statuses.
GoogleDataprepHook | Hook for connection with Dataprep API.
- class airflow.providers.google.cloud.hooks.dataprep.JobGroupStatuses
Types of job group run statuses.
- class airflow.providers.google.cloud.hooks.dataprep.GoogleDataprepHook(dataprep_conn_id=default_conn_name, api_version='v4')
Bases: airflow.hooks.base.BaseHook
Hook for connection with Dataprep API.
To connect Airflow to Dataprep you need a Dataprep access token. See
https://clouddataprep.com/documentation/api#section/Authentication
The token should be added to the Airflow Connection in JSON format.
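As a rough illustration of the JSON format mentioned above, the sketch below builds a JSON string that could be stored in the Connection's extra field. The key names `token` and `base_url`, the placeholder token, and the URL value are assumptions that this page does not confirm; check the provider's connection documentation before relying on them.

```python
import json

# Hypothetical values: the key names "token" and "base_url" are assumptions,
# and the token is a placeholder, not a real credential.
dataprep_extra = {
    "token": "<DATAPREP_ACCESS_TOKEN>",
    "base_url": "https://api.clouddataprep.com",
}

# Serialize the dictionary to the JSON string stored on the Connection.
extra_json = json.dumps(dataprep_extra)
print(extra_json)
```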
- get_jobs_for_job_group(job_id)
Get information about the batch jobs within a Cloud Dataprep job.
- Parameters
job_id (int) – The ID of the job that will be fetched
- get_job_group(job_group_id, embed, include_deleted)
Get the specified job group.
A job group is a job that is executed from a specific node in a flow.
- run_job_group(body_request)
Creates a jobGroup, which launches the specified job as the authenticated user. This performs the same action as clicking on the Run Job button in the application.
To get recipe_id please follow the Dataprep API documentation https://clouddataprep.com/documentation/api#operation/runJobGroup.
- Parameters
body_request (dict) – Body of the request, containing the identifier of the recipe you would like to run.
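A minimal sketch of how a request body for run_job_group might be assembled. The payload shape `{"wrangledDataset": {"id": ...}}` is an assumption based on the linked runJobGroup API documentation, and the helper names `build_run_job_group_body` and `launch_recipe` are hypothetical. The `hook` argument is any object that exposes `run_job_group`, so the sketch does not need Airflow installed to run.

```python
from typing import Any, Dict


def build_run_job_group_body(recipe_id: int) -> Dict[str, Any]:
    """Build a request body that points at a single recipe."""
    # Assumed payload shape; verify against the runJobGroup API documentation.
    return {"wrangledDataset": {"id": recipe_id}}


def launch_recipe(hook: Any, recipe_id: int) -> Any:
    """Pass the body to run_job_group on any object exposing that method."""
    return hook.run_job_group(body_request=build_run_job_group_body(recipe_id))
```

In a DAG, `hook` would typically be a GoogleDataprepHook instance created with the connection id that holds the Dataprep token.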
- create_flow(*, body_request)
Creates a flow.
- Parameters
body_request (dict) – Body of the POST request to be sent. For more details check https://clouddataprep.com/documentation/api#operation/createFlow
- copy_flow(*, flow_id, name='', description='', copy_datasources=False)
Create a copy of the provided flow id, as well as all contained recipes.
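The leading `*` in the signature means every argument to copy_flow must be passed by keyword. The sketch below shows one possible call; the helper name `copy_flow_with_suffix` and the naming scheme for the copy are hypothetical, and `hook` stands for any object exposing `copy_flow`.

```python
from typing import Any


def copy_flow_with_suffix(hook: Any, flow_id: int, suffix: str) -> Any:
    """Copy a flow, passing every argument by keyword as the signature requires."""
    return hook.copy_flow(
        flow_id=flow_id,
        name=f"copy-{suffix}",  # hypothetical naming scheme for the copy
        description=f"Copy of flow {flow_id}",
        copy_datasources=False,  # same as the documented default
    )
```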
- delete_flow(*, flow_id)
Delete the flow with the provided id.
- Parameters
flow_id (int) – ID of the flow to be deleted
- run_flow(*, flow_id, body_request)
Runs the flow with the provided id.
- get_job_group_status(*, job_group_id)
Check the status of a Dataprep job group, for example to see whether it has finished.
- Parameters
job_group_id (int) – ID of the job group to check
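One common way to use a status check like this is to poll until the job group reaches a terminal state. The sketch below is only an illustration: the helper name `wait_for_job_group` is hypothetical, and the terminal status names are assumptions that should be verified against JobGroupStatuses. It accepts either an Enum member or a plain string as the returned status.

```python
import time
from typing import Any

# Assumed terminal status names; verify them against JobGroupStatuses.
TERMINAL_STATUSES = {"Complete", "Failed", "Canceled"}


def wait_for_job_group(
    hook: Any,
    job_group_id: int,
    poll_seconds: float = 10.0,
    max_polls: int = 60,
) -> str:
    """Poll get_job_group_status until a terminal status or the poll limit."""
    for _ in range(max_polls):
        status = hook.get_job_group_status(job_group_id=job_group_id)
        # Accept either an Enum member (use its value) or a plain string.
        name = str(getattr(status, "value", status))
        if name in TERMINAL_STATUSES:
            return name
        time.sleep(poll_seconds)
    raise TimeoutError(f"Job group {job_group_id} did not finish in time")
```

Inside Airflow, a sensor or a deferrable operator is usually a better fit than a blocking loop, since it does not hold a worker slot for the whole wait.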
- create_imported_dataset(*, body_request)
Creates an imported dataset.
- Parameters
body_request (dict) – Body of the POST request to be sent. For more details check https://clouddataprep.com/documentation/api#operation/createImportedDataset
- create_wrangled_dataset(*, body_request)
Creates a wrangled dataset.
- Parameters
body_request (dict) – Body of the POST request to be sent. For more details check https://clouddataprep.com/documentation/api#operation/createWrangledDataset
- create_output_object(*, body_request)
Creates an output object.
- Parameters
body_request (dict) – Body of the POST request to be sent. For more details check https://clouddataprep.com/documentation/api#operation/createOutputObject
- create_write_settings(*, body_request)
Creates write settings.
- Parameters
body_request (dict) – Body of the POST request to be sent. For more details check https://clouddataprep.com/documentation/api#tag/createWriteSetting