airflow.providers.google.cloud.hooks.dataform

Module Contents

Classes

DataformHook

Hook for Google Cloud Dataform APIs.

class airflow.providers.google.cloud.hooks.dataform.DataformHook(gcp_conn_id='google_cloud_default', delegate_to=None, impersonation_chain=None)[source]

Bases: airflow.providers.google.common.hooks.base_google.GoogleBaseHook

Hook for Google Cloud Dataform APIs.

get_dataform_client()[source]

Retrieves the client library object that allows access to the Cloud Dataform service.

wait_for_workflow_invocation(workflow_invocation_id, repository_id, project_id, region, wait_time=10, timeout=None)[source]

Helper method that polls a workflow invocation until it finishes.

Parameters
  • workflow_invocation_id (str) – ID of the workflow invocation.

  • repository_id (str) – ID of the Dataform repository.

  • project_id (str) – Required. The ID of the Google Cloud project that the workflow invocation belongs to.

  • region (str) – Required. The Google Cloud region in which to handle the request.

  • wait_time (int) – Number of seconds between checks.

  • timeout (int | None) – How many seconds to wait for the workflow invocation to finish. Used only if asynchronous is False.
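The interplay of wait_time and timeout can be illustrated with a generic polling loop. This is a minimal sketch of the pattern, not the hook's actual implementation: poll_until_terminal, fetch_state, and the set of terminal state names are assumptions made for illustration.

```python
import time

# Assumption: these names mark a finished workflow invocation.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELLED"}


def poll_until_terminal(fetch_state, wait_time=10, timeout=None,
                        sleep=time.sleep, clock=time.monotonic):
    """Call fetch_state() every wait_time seconds until it returns a terminal state.

    Raises TimeoutError if timeout seconds pass first. A timeout of None
    means the loop waits indefinitely.
    """
    deadline = None if timeout is None else clock() + timeout
    while True:
        state = fetch_state()
        if state in TERMINAL_STATES:
            return state
        if deadline is not None and clock() >= deadline:
            raise TimeoutError(f"still {state!r} after {timeout} seconds")
        sleep(wait_time)
```

In real use, fetch_state would wrap a call such as get_workflow_invocation and return the name of the invocation's state; sleep and clock are injectable only so the loop can be exercised without waiting.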

create_compilation_result(project_id, region, repository_id, compilation_result, retry=DEFAULT, timeout=None, metadata=())[source]

Creates a new CompilationResult in a given project and location.

Parameters
  • project_id (str) – Required. The ID of the Google Cloud project that the task belongs to.

  • region (str) – Required. The ID of the Google Cloud region that the task belongs to.

  • repository_id (str) – Required. The ID of the Dataform repository that the task belongs to.

  • compilation_result (CompilationResult | dict) – Required. The compilation result to create.

  • retry (Retry | _MethodDefault) – Designation of what errors, if any, should be retried.

  • timeout (float | None) – The timeout for this request.

  • metadata (Sequence[tuple[str, str]]) – Strings which should be sent along with the request as metadata.
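Because compilation_result accepts a plain dict, the request body can be built without importing the client library types. The sketch below compiles the contents of a workspace; the workspace field and the resource-name layout are assumptions based on the Dataform API's CompilationResult message, and both helper names are hypothetical.

```python
def workspace_path(project_id, region, repository_id, workspace_id):
    """Return the full resource name of a workspace (assumed layout)."""
    return (
        f"projects/{project_id}/locations/{region}"
        f"/repositories/{repository_id}/workspaces/{workspace_id}"
    )


def build_compilation_result(project_id, region, repository_id, workspace_id):
    """Build a dict suitable for the compilation_result parameter."""
    # Assumption: "workspace" selects the workspace whose files are compiled.
    return {
        "workspace": workspace_path(project_id, region, repository_id, workspace_id)
    }
```

The resulting dict would then be passed as compilation_result=build_compilation_result(...) alongside project_id, region, and repository_id.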

get_compilation_result(project_id, region, repository_id, compilation_result_id, retry=DEFAULT, timeout=None, metadata=())[source]

Fetches a single CompilationResult.

Parameters
  • project_id (str) – Required. The ID of the Google Cloud project that the task belongs to.

  • region (str) – Required. The ID of the Google Cloud region that the task belongs to.

  • repository_id (str) – Required. The ID of the Dataform repository that the task belongs to.

  • compilation_result_id (str) – The ID of the Dataform compilation result.

  • retry (Retry | _MethodDefault) – Designation of what errors, if any, should be retried.

  • timeout (float | None) – The timeout for this request.

  • metadata (Sequence[tuple[str, str]]) – Strings which should be sent along with the request as metadata.

create_workflow_invocation(project_id, region, repository_id, workflow_invocation, retry=DEFAULT, timeout=None, metadata=())[source]

Creates a new WorkflowInvocation in a given Repository.

Parameters
  • project_id (str) – Required. The ID of the Google Cloud project that the task belongs to.

  • region (str) – Required. The ID of the Google Cloud region that the task belongs to.

  • repository_id (str) – Required. The ID of the Dataform repository that the task belongs to.

  • workflow_invocation (WorkflowInvocation | dict) – Required. The workflow invocation resource to create.

  • retry (Retry | _MethodDefault) – Designation of what errors, if any, should be retried.

  • timeout (float | None) – The timeout for this request.

  • metadata (Sequence[tuple[str, str]]) – Strings which should be sent along with the request as metadata.
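create_compilation_result and create_workflow_invocation are typically chained, since a workflow invocation refers to the compilation result it should run. The sketch below assumes that the returned objects expose their resource name as a name attribute and that the last segment of that name is the ID; compile_and_invoke is a hypothetical helper, and hook is any object that provides the methods documented here.

```python
def compile_and_invoke(hook, project_id, region, repository_id, compilation_result):
    """Create a compilation result, run it, and return the workflow invocation ID."""
    compiled = hook.create_compilation_result(
        project_id=project_id,
        region=region,
        repository_id=repository_id,
        compilation_result=compilation_result,
    )
    invocation = hook.create_workflow_invocation(
        project_id=project_id,
        region=region,
        repository_id=repository_id,
        # Assumption: the invocation points at the compilation result by resource name.
        workflow_invocation={"compilation_result": compiled.name},
    )
    # Assumption: resource names end in ".../workflowInvocations/<id>".
    return invocation.name.split("/")[-1]
```

The returned ID is the kind of value that wait_for_workflow_invocation and get_workflow_invocation take as workflow_invocation_id.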

get_workflow_invocation(project_id, region, repository_id, workflow_invocation_id, retry=DEFAULT, timeout=None, metadata=())[source]

Fetches a single WorkflowInvocation.

Parameters
  • project_id (str) – Required. The ID of the Google Cloud project that the task belongs to.

  • region (str) – Required. The ID of the Google Cloud region that the task belongs to.

  • repository_id (str) – Required. The ID of the Dataform repository that the task belongs to.

  • workflow_invocation_id (str) – Required. The ID of the workflow invocation resource.

  • retry (Retry | _MethodDefault) – Designation of what errors, if any, should be retried.

  • timeout (float | None) – The timeout for this request.

  • metadata (Sequence[tuple[str, str]]) – Strings which should be sent along with the request as metadata.

cancel_workflow_invocation(project_id, region, repository_id, workflow_invocation_id, retry=DEFAULT, timeout=None, metadata=())[source]

Requests cancellation of a running WorkflowInvocation.

Parameters
  • project_id (str) – Required. The ID of the Google Cloud project that the task belongs to.

  • region (str) – Required. The ID of the Google Cloud region that the task belongs to.

  • repository_id (str) – Required. The ID of the Dataform repository that the task belongs to.

  • workflow_invocation_id (str) – Required. The ID of the workflow invocation resource.

  • retry (Retry | _MethodDefault) – Designation of what errors, if any, should be retried.

  • timeout (float | None) – The timeout for this request.

  • metadata (Sequence[tuple[str, str]]) – Strings which should be sent along with the request as metadata.
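Since cancellation applies to a running invocation, a caller may want to check the current state first. The following is a sketch under assumptions: cancel_if_running is a hypothetical helper, and it assumes the object returned by get_workflow_invocation has a state enum whose member name is "RUNNING" while work is in progress.

```python
def cancel_if_running(hook, project_id, region, repository_id, workflow_invocation_id):
    """Request cancellation only if the invocation is still running.

    Returns True if a cancellation was requested, False otherwise.
    """
    target = {
        "project_id": project_id,
        "region": region,
        "repository_id": repository_id,
        "workflow_invocation_id": workflow_invocation_id,
    }
    invocation = hook.get_workflow_invocation(**target)
    # Assumption: state is an enum with a "RUNNING" member.
    if invocation.state.name != "RUNNING":
        return False
    hook.cancel_workflow_invocation(**target)
    return True
```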

create_repository(*, project_id, region, repository_id, retry=DEFAULT, timeout=None, metadata=())[source]

Creates a repository.

Parameters
  • project_id (str) – Required. The ID of the Google Cloud project where the repository should be created.

  • region (str) – Required. The ID of the Google Cloud region where the repository should be created.

  • repository_id (str) – Required. The ID of the new Dataform repository.

  • retry (Retry | _MethodDefault) – Designation of what errors, if any, should be retried.

  • timeout (float | None) – The timeout for this request.

  • metadata (Sequence[tuple[str, str]]) – Strings which should be sent along with the request as metadata.

delete_repository(*, project_id, region, repository_id, force=True, retry=DEFAULT, timeout=None, metadata=())[source]

Deletes a repository.

Parameters
  • project_id (str) – Required. The ID of the Google Cloud project where the repository is located.

  • region (str) – Required. The ID of the Google Cloud region where the repository is located.

  • repository_id (str) – Required. The ID of the Dataform repository that should be deleted.

  • force (bool) – If set to true, any child resources of this repository will also be deleted.

  • retry (Retry | _MethodDefault) – Designation of what errors, if any, should be retried.

  • timeout (float | None) – The timeout for this request.

  • metadata (Sequence[tuple[str, str]]) – Strings which should be sent along with the request as metadata.

create_workspace(*, project_id, region, repository_id, workspace_id, retry=DEFAULT, timeout=None, metadata=())[source]

Creates a workspace.

Parameters
  • project_id (str) – Required. The ID of the Google Cloud project where the workspace should be created.

  • region (str) – Required. The ID of the Google Cloud region where the workspace should be created.

  • repository_id (str) – Required. The ID of the Dataform repository where the workspace should be created.

  • workspace_id (str) – Required. The ID of the new Dataform workspace.

  • retry (Retry | _MethodDefault) – Designation of what errors, if any, should be retried.

  • timeout (float | None) – The timeout for this request.

  • metadata (Sequence[tuple[str, str]]) – Strings which should be sent along with the request as metadata.

delete_workspace(*, project_id, region, repository_id, workspace_id, retry=DEFAULT, timeout=None, metadata=())[source]

Deletes a workspace.

Parameters
  • project_id (str) – Required. The ID of the Google Cloud project where the workspace is located.

  • region (str) – Required. The ID of the Google Cloud region where the workspace is located.

  • repository_id (str) – Required. The ID of the Dataform repository where the workspace is located.

  • workspace_id (str) – Required. The ID of the Dataform workspace that should be deleted.

  • retry (Retry | _MethodDefault) – Designation of what errors, if any, should be retried.

  • timeout (float | None) – The timeout for this request.

  • metadata (Sequence[tuple[str, str]]) – Strings which should be sent along with the request as metadata.
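The repository and workspace methods take keyword-only arguments and pair naturally into a create/clean-up lifecycle. The sketch below wraps that lifecycle in a context manager; temporary_repository is a hypothetical helper, hook is any object providing the methods documented here, and it relies on the documented force=True behaviour of delete_repository to remove the workspace together with the repository.

```python
from contextlib import contextmanager


@contextmanager
def temporary_repository(hook, *, project_id, region, repository_id, workspace_id):
    """Create a repository with one workspace and delete both on exit."""
    location = {
        "project_id": project_id,
        "region": region,
        "repository_id": repository_id,
    }
    hook.create_repository(**location)
    try:
        hook.create_workspace(**location, workspace_id=workspace_id)
        yield workspace_id
    finally:
        # force=True also deletes child resources such as the workspace.
        hook.delete_repository(**location, force=True)
```

With a real hook this could be used as `with temporary_repository(DataformHook(), project_id=..., region=..., repository_id=..., workspace_id=...) as workspace_id:`; the clean-up runs even if the body raises.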

write_file(*, project_id, region, repository_id, workspace_id, filepath, contents, retry=DEFAULT, timeout=None, metadata=())[source]

Writes a new file to the specified workspace.

Parameters
  • project_id (str) – Required. The ID of the Google Cloud project where the workspace is located.

  • region (str) – Required. The ID of the Google Cloud region where the workspace is located.

  • repository_id (str) – Required. The ID of the Dataform repository where the workspace is located.

  • workspace_id (str) – Required. The ID of the Dataform workspace where the file should be created.

  • filepath (str) – Required. Path to the file, including the file name, relative to the workspace root.

  • contents (bytes) – Required. Content of the file to be written.

  • retry (Retry | _MethodDefault) – Designation of what errors, if any, should be retried.

  • timeout (float | None) – The timeout for this request.

  • metadata (Sequence[tuple[str, str]]) – Strings which should be sent along with the request as metadata.

make_directory(*, project_id, region, repository_id, workspace_id, path, retry=DEFAULT, timeout=None, metadata=())[source]

Creates a new directory in the specified workspace.

Parameters
  • project_id (str) – Required. The ID of the Google Cloud project where the workspace is located.

  • region (str) – Required. The ID of the Google Cloud region where the workspace is located.

  • repository_id (str) – Required. The ID of the Dataform repository where the workspace is located.

  • workspace_id (str) – Required. The ID of the Dataform workspace where the directory should be created.

  • path (str) – Required. The directory’s full path including new directory name, relative to the workspace root.

  • retry (Retry | _MethodDefault) – Designation of what errors, if any, should be retried.

  • timeout (float | None) – The timeout for this request.

  • metadata (Sequence[tuple[str, str]]) – Strings which should be sent along with the request as metadata.
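write_file expects contents as bytes and make_directory expects a path relative to the workspace root, so text content has to be encoded before it is sent. The sketch below creates the parent directory first and then writes the file. That ordering is a cautious assumption rather than a documented requirement, and write_text_file is a hypothetical helper that accepts any object providing the two methods.

```python
import posixpath


def write_text_file(hook, *, project_id, region, repository_id, workspace_id,
                    filepath, text):
    """Encode text as UTF-8 and write it to filepath inside the workspace."""
    workspace = {
        "project_id": project_id,
        "region": region,
        "repository_id": repository_id,
        "workspace_id": workspace_id,
    }
    # Workspace paths use forward slashes, so posixpath is used on every OS.
    directory = posixpath.dirname(filepath)
    if directory:
        hook.make_directory(**workspace, path=directory)
    hook.write_file(**workspace, filepath=filepath, contents=text.encode("utf-8"))
```

For example, filepath="definitions/first_view.sqlx" would lead to a make_directory call with path="definitions" followed by a write_file call with the encoded text.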

remove_directory(*, project_id, region, repository_id, workspace_id, path, retry=DEFAULT, timeout=None, metadata=())[source]

Removes a directory in the specified workspace.

Parameters
  • project_id (str) – Required. The ID of the Google Cloud project where the workspace is located.

  • region (str) – Required. The ID of the Google Cloud region where the workspace is located.

  • repository_id (str) – Required. The ID of the Dataform repository where the workspace is located.

  • workspace_id (str) – Required. The ID of the Dataform workspace where the directory is located.

  • path (str) – Required. The directory’s full path including directory name, relative to the workspace root.

  • retry (Retry | _MethodDefault) – Designation of what errors, if any, should be retried.

  • timeout (float | None) – The timeout for this request.

  • metadata (Sequence[tuple[str, str]]) – Strings which should be sent along with the request as metadata.

remove_file(*, project_id, region, repository_id, workspace_id, filepath, retry=DEFAULT, timeout=None, metadata=())[source]

Removes a file in the specified workspace.

Parameters
  • project_id (str) – Required. The ID of the Google Cloud project where the workspace is located.

  • region (str) – Required. The ID of the Google Cloud region where the workspace is located.

  • repository_id (str) – Required. The ID of the Dataform repository where the workspace is located.

  • workspace_id (str) – Required. The ID of the Dataform workspace where the file is located.

  • filepath (str) – Required. The full path including name of the file, relative to the workspace root.

  • retry (Retry | _MethodDefault) – Designation of what errors, if any, should be retried.

  • timeout (float | None) – The timeout for this request.

  • metadata (Sequence[tuple[str, str]]) – Strings which should be sent along with the request as metadata.

install_npm_packages(*, project_id, region, repository_id, workspace_id, retry=DEFAULT, timeout=None, metadata=())[source]

Installs npm dependencies in the provided workspace. Requires a “package.json” file to exist in the workspace.

Parameters
  • project_id (str) – Required. The ID of the Google Cloud project where the workspace is located.

  • region (str) – Required. The ID of the Google Cloud region where the workspace is located.

  • repository_id (str) – Required. The ID of the Dataform repository where the workspace is located.

  • workspace_id (str) – Required. The ID of the Dataform workspace.

  • retry (Retry | _MethodDefault) – Designation of what errors, if any, should be retried.

  • timeout (float | None) – The timeout for this request.

  • metadata (Sequence[tuple[str, str]]) – Strings which should be sent along with the request as metadata.
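Because installation depends on a “package.json” file already being present, one way to satisfy that is to write the file with write_file immediately beforehand. The sketch below does so; install_dependencies is a hypothetical helper, the dependencies mapping is supplied by the caller, and hook is any object providing the two methods.

```python
import json


def install_dependencies(hook, *, project_id, region, repository_id, workspace_id,
                         dependencies):
    """Write a minimal package.json to the workspace root, then install from it."""
    workspace = {
        "project_id": project_id,
        "region": region,
        "repository_id": repository_id,
        "workspace_id": workspace_id,
    }
    # write_file expects bytes, so the JSON text is encoded before sending.
    package_json = json.dumps({"dependencies": dependencies}, indent=2).encode("utf-8")
    hook.write_file(**workspace, filepath="package.json", contents=package_json)
    hook.install_npm_packages(**workspace)
```

A call might pass dependencies={"@dataform/core": "<version>"}, with the version chosen by the caller.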
