airflow.providers.dbt.cloud.hooks.dbt

Module Contents

Classes

TokenAuth

Helper class for Auth when executing requests.

JobRunInfo

Type class for the job_run_info dictionary.

DbtCloudJobRunStatus

dbt Cloud Job statuses.

DbtCloudHook

Interact with dbt Cloud using the V2 (V3 if supported) API.

Functions

fallback_to_default_account(func)

Provide a fallback value for account_id.

provide_account_id(func)

Provide a fallback value for account_id.

Attributes

DBT_CAUSE_MAX_LENGTH

T

airflow.providers.dbt.cloud.hooks.dbt.DBT_CAUSE_MAX_LENGTH = 255[source]
airflow.providers.dbt.cloud.hooks.dbt.fallback_to_default_account(func)[source]

Provide a fallback value for account_id.

If the account_id is None or not passed to the decorated function, the value will be taken from the configured dbt Cloud Airflow Connection.
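
The fallback behaviour described above can be sketched with a plain decorator. This is a minimal illustration, not the provider's implementation: `fallback_to_default`, `FakeHook`, and the `default_account_id` attribute are hypothetical names standing in for the real decorator, the hook, and the account ID stored on the Airflow connection.

```python
import functools
import inspect


def fallback_to_default(func):
    """Fill in account_id when the caller omits it or passes None."""
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        # Map positional and keyword arguments onto parameter names.
        bound = signature.bind(self, *args, **kwargs)
        if bound.arguments.get("account_id") is None:
            # Hypothetical attribute standing in for the account ID
            # configured on the Airflow connection.
            bound.arguments["account_id"] = self.default_account_id
        return func(*bound.args, **bound.kwargs)

    return wrapper


class FakeHook:
    """Stand-in for a hook; not the real DbtCloudHook."""

    default_account_id = 1234

    @fallback_to_default
    def get_account(self, account_id=None):
        return account_id


hook = FakeHook()
print(hook.get_account())    # uses the default account ID
print(hook.get_account(42))  # an explicit value takes precedence
```

Binding the arguments by name means the fallback applies whether account_id is passed positionally, by keyword, or not at all.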

class airflow.providers.dbt.cloud.hooks.dbt.TokenAuth(token)[source]

Bases: requests.auth.AuthBase

Helper class for Auth when executing requests.

__call__(request)[source]
class airflow.providers.dbt.cloud.hooks.dbt.JobRunInfo[source]

Bases: airflow.typing_compat.TypedDict

Type class for the job_run_info dictionary.

account_id: int | None[source]
run_id: int[source]
class airflow.providers.dbt.cloud.hooks.dbt.DbtCloudJobRunStatus[source]

Bases: enum.Enum

dbt Cloud Job statuses.

QUEUED = 1[source]
STARTING = 2[source]
RUNNING = 3[source]
SUCCESS = 10[source]
ERROR = 20[source]
CANCELLED = 30[source]
NON_TERMINAL_STATUSES = ()[source]
TERMINAL_STATUSES = ()[source]
classmethod check_is_valid(statuses)[source]

Validate input statuses are a known value.

classmethod is_terminal(status)[source]

Check if the input status is that of a terminal type.
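
As a rough illustration of how the status codes and the two class methods fit together, the sketch below rebuilds the enum with the documented values. `RunStatus`, `TERMINAL`, and the module-level helper functions are hypothetical, and the grouping of SUCCESS, ERROR, and CANCELLED as terminal is an assumption rather than something this page states.

```python
from enum import Enum


class RunStatus(Enum):
    """Illustrative copy of the documented status codes."""

    QUEUED = 1
    STARTING = 2
    RUNNING = 3
    SUCCESS = 10
    ERROR = 20
    CANCELLED = 30


# Assumption: a run that succeeded, errored, or was cancelled is terminal.
TERMINAL = {RunStatus.SUCCESS, RunStatus.ERROR, RunStatus.CANCELLED}


def check_is_valid(statuses):
    """Raise ValueError if any status code is not a known value."""
    if isinstance(statuses, int):
        statuses = [statuses]
    for status in statuses:
        RunStatus(status)  # Enum lookup raises ValueError for unknown codes


def is_terminal(status):
    """Return True when the status code belongs to the terminal group."""
    check_is_valid(status)
    return RunStatus(status) in TERMINAL


print(is_terminal(RunStatus.SUCCESS.value))
print(is_terminal(RunStatus.RUNNING.value))
```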

exception airflow.providers.dbt.cloud.hooks.dbt.DbtCloudJobRunException[source]

Bases: airflow.exceptions.AirflowException

An exception that indicates a job run failed to complete.

airflow.providers.dbt.cloud.hooks.dbt.T[source]
airflow.providers.dbt.cloud.hooks.dbt.provide_account_id(func)[source]

Provide a fallback value for account_id.

If the account_id is None or not passed to the decorated function, the value will be taken from the configured dbt Cloud Airflow Connection.

class airflow.providers.dbt.cloud.hooks.dbt.DbtCloudHook(dbt_cloud_conn_id=default_conn_name, *args, **kwargs)[source]

Bases: airflow.providers.http.hooks.http.HttpHook

Interact with dbt Cloud using the V2 (V3 if supported) API.

Parameters

dbt_cloud_conn_id (str) – The ID of the dbt Cloud connection.

conn_name_attr = 'dbt_cloud_conn_id'[source]
default_conn_name = 'dbt_cloud_default'[source]
conn_type = 'dbt_cloud'[source]
hook_name = 'dbt Cloud'[source]
classmethod get_ui_field_behaviour()[source]

Build custom field behavior for the dbt Cloud connection form in the Airflow UI.

static get_request_url_params(tenant, endpoint, include_related=None, *, api_version='v2')[source]

Form the request URL from the base URL and the endpoint URL.

Parameters
  • tenant (str) – The tenant domain name to substitute into the base URL.

  • endpoint (str) – The endpoint URL to request.

  • include_related (list[str] | None) – Optional. List of related fields to pull with the run. Valid values are “trigger”, “job”, “repository”, and “environment”.
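
A small sketch of what forming a URL from these parameters might look like. `build_request_url` is a hypothetical helper, and the `https://<tenant>/api/<version>/accounts/<endpoint>` layout is an assumption based on the shape of the public dbt Cloud API, not a description of the hook's code.

```python
def build_request_url(tenant, endpoint, include_related=None, *, api_version="v2"):
    """Return (url, query_params) for an account-scoped endpoint."""
    # Assumed layout: https://<tenant>/api/<version>/accounts/<endpoint>
    url = f"https://{tenant}/api/{api_version}/accounts/{endpoint or ''}"
    # Only send include_related when the caller asked for related fields.
    params = {"include_related": include_related} if include_related else {}
    return url, params


url, params = build_request_url("cloud.getdbt.com", "1234/runs/5678/", ["job"])
print(url)
print(params)
```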

async get_headers_tenants_from_connection()[source]

Get the request headers and tenant from the connection details.

async get_job_details(run_id, account_id=None, include_related=None)[source]

Use an asynchronous HTTP call to retrieve metadata for a specific run of a dbt Cloud job.

Parameters
  • run_id (int) – The ID of a dbt Cloud job run.

  • account_id (int | None) – Optional. The ID of a dbt Cloud account.

  • include_related (list[str] | None) – Optional. List of related fields to pull with the run. Valid values are “trigger”, “job”, “repository”, and “environment”.

async get_job_status(run_id, account_id=None, include_related=None)[source]

Retrieve the status for a specific run of a dbt Cloud job.

Parameters
  • run_id (int) – The ID of a dbt Cloud job run.

  • account_id (int | None) – Optional. The ID of a dbt Cloud account.

  • include_related (list[str] | None) – Optional. List of related fields to pull with the run. Valid values are “trigger”, “job”, “repository”, and “environment”.

connection()[source]
get_conn(*args, **kwargs)[source]

Create a Requests HTTP session.

Parameters

headers – additional headers to be passed through as a dictionary

list_accounts()[source]

Retrieve all of the dbt Cloud accounts the configured API token is authorized to access.

Returns

List of request responses.

Return type

list[requests.models.Response]

get_account(account_id=None)[source]

Retrieve metadata for a specific dbt Cloud account.

Parameters

account_id (int | None) – Optional. The ID of a dbt Cloud account.

Returns

The request response.

Return type

requests.models.Response

list_projects(account_id=None)[source]

Retrieve metadata for all projects tied to a specified dbt Cloud account.

Parameters

account_id (int | None) – Optional. The ID of a dbt Cloud account.

Returns

List of request responses.

Return type

list[requests.models.Response]

get_project(project_id, account_id=None)[source]

Retrieve metadata for a specific project.

Parameters
  • project_id (int) – The ID of a dbt Cloud project.

  • account_id (int | None) – Optional. The ID of a dbt Cloud account.

Returns

The request response.

Return type

requests.models.Response

list_jobs(account_id=None, order_by=None, project_id=None)[source]

Retrieve metadata for all jobs tied to a specified dbt Cloud account.

If a project_id is supplied, only jobs pertaining to this project will be retrieved.

Parameters
  • account_id (int | None) – Optional. The ID of a dbt Cloud account.

  • order_by (str | None) – Optional. Field to order the result by. Prefix the field with '-' to reverse the order. For example, use order_by=-id to order by ID in reverse.

  • project_id (int | None) – The ID of a dbt Cloud project.

Returns

List of request responses.

Return type

list[requests.models.Response]

get_job(job_id, account_id=None)[source]

Retrieve metadata for a specific job.

Parameters
  • job_id (int) – The ID of a dbt Cloud job.

  • account_id (int | None) – Optional. The ID of a dbt Cloud account.

Returns

The request response.

Return type

requests.models.Response

trigger_job_run(job_id, cause, account_id=None, steps_override=None, schema_override=None, retry_from_failure=False, additional_run_config=None)[source]

Triggers a run of a dbt Cloud job.

Parameters
  • job_id (int) – The ID of a dbt Cloud job.

  • cause (str) – Description of the reason to trigger the job.

  • account_id (int | None) – Optional. The ID of a dbt Cloud account.

  • steps_override (list[str] | None) – Optional. List of dbt commands to execute when triggering the job instead of those configured in dbt Cloud.

  • schema_override (str | None) – Optional. Override the destination schema in the configured target for this job.

  • retry_from_failure (bool) – Optional. If set to True and the previous job run has failed, the job will be triggered using the “rerun” endpoint. This parameter cannot be used alongside steps_override, schema_override, or additional_run_config.

  • additional_run_config (dict[str, Any] | None) – Optional. Any additional parameters that should be included in the API request when triggering the job.

Returns

The request response.

Return type

requests.models.Response
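
The parameter rules above can be illustrated with a small payload builder. This is a sketch under stated assumptions: `build_trigger_payload` is hypothetical, rejecting the conflicting combination with a ValueError is an assumption (the hook may handle it differently), and truncating an over-long cause to DBT_CAUSE_MAX_LENGTH is likewise assumed rather than documented here.

```python
DBT_CAUSE_MAX_LENGTH = 255  # value documented for this module


def build_trigger_payload(
    cause,
    steps_override=None,
    schema_override=None,
    retry_from_failure=False,
    additional_run_config=None,
):
    """Assemble a hypothetical request body for triggering a job run."""
    overrides = (steps_override, schema_override, additional_run_config)
    if retry_from_failure and any(value is not None for value in overrides):
        # Assumption: the conflicting combination is rejected outright.
        raise ValueError("retry_from_failure cannot be combined with overrides")
    # Assumption: an over-long cause is truncated rather than rejected.
    payload = {"cause": cause[:DBT_CAUSE_MAX_LENGTH]}
    if steps_override is not None:
        payload["steps_override"] = steps_override
    if schema_override is not None:
        payload["schema_override"] = schema_override
    # Extra options are merged last so they can add further request fields.
    payload.update(additional_run_config or {})
    return payload


payload = build_trigger_payload("Triggered via Airflow", steps_override=["dbt run"])
print(payload)
```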

list_job_runs(account_id=None, include_related=None, job_definition_id=None, order_by=None)[source]

Retrieve metadata for all dbt Cloud job runs for an account.

If a job_definition_id is supplied, only metadata for runs of that specific job are pulled.

Parameters
  • account_id (int | None) – Optional. The ID of a dbt Cloud account.

  • include_related (list[str] | None) – Optional. List of related fields to pull with the run. Valid values are “trigger”, “job”, “repository”, and “environment”.

  • job_definition_id (int | None) – Optional. The dbt Cloud job ID to retrieve run metadata.

  • order_by (str | None) – Optional. Field to order the result by. Prefix the field with '-' to reverse the order. For example, use order_by=-id to order by the run ID in reverse.

Returns

List of request responses.

Return type

list[requests.models.Response]

get_job_runs(account_id=None, payload=None)[source]

Retrieve metadata for dbt Cloud job runs, filtered by the supplied query parameters.

Parameters
  • account_id (int | None) – Optional. The ID of a dbt Cloud account.

  • payload – Optional. Query parameters.

Returns

The request response.

Return type

requests.models.Response

get_job_run(run_id, account_id=None, include_related=None)[source]

Retrieve metadata for a specific run of a dbt Cloud job.

Parameters
  • run_id (int) – The ID of a dbt Cloud job run.

  • account_id (int | None) – Optional. The ID of a dbt Cloud account.

  • include_related (list[str] | None) – Optional. List of related fields to pull with the run. Valid values are “trigger”, “job”, “repository”, and “environment”.

Returns

The request response.

Return type

requests.models.Response

get_job_run_status(run_id, account_id=None)[source]

Retrieve the status for a specific run of a dbt Cloud job.

Parameters
  • run_id (int) – The ID of a dbt Cloud job run.

  • account_id (int | None) – Optional. The ID of a dbt Cloud account.

Returns

The status of a dbt Cloud job run.

Return type

int

wait_for_job_run_status(run_id, account_id=None, expected_statuses=DbtCloudJobRunStatus.SUCCESS.value, check_interval=60, timeout=60 * 60 * 24 * 7)[source]

Wait for a dbt Cloud job run to match an expected status.

Parameters
  • run_id (int) – The ID of a dbt Cloud job run.

  • account_id (int | None) – Optional. The ID of a dbt Cloud account.

  • expected_statuses (int | Sequence[int] | set[int]) – Optional. The desired status(es) to check against a job run’s current status. Defaults to the success status value.

  • check_interval (int) – Time in seconds between checks of the job run’s status.

  • timeout (int) – Time in seconds to wait for the job run to reach a terminal status or the expected status.

Returns

Boolean indicating if the job run has reached the expected_status.

Return type

bool
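
The polling behaviour described by these parameters can be sketched as a simple loop. `wait_for_status` and its `get_status` callable are hypothetical; the set of terminal codes and the use of TimeoutError when the deadline passes are assumptions, and the real hook may raise a different exception.

```python
import time


def wait_for_status(
    get_status,
    expected_statuses=10,
    check_interval=60,
    timeout=60 * 60 * 24 * 7,
    terminal_statuses=(10, 20, 30),
):
    """Poll get_status() until it returns an expected or terminal status."""
    if isinstance(expected_statuses, int):
        expected_statuses = {expected_statuses}
    expected = set(expected_statuses)
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status in expected:
            return True
        if status in terminal_statuses:
            # Terminal but not expected, so the run can never match.
            return False
        if time.monotonic() >= deadline:
            # Assumption: a timeout is reported as an exception.
            raise TimeoutError("job run did not reach an expected status in time")
        time.sleep(check_interval)


# Simulated sequence of status codes: queued, running, success.
codes = iter([1, 3, 10])
print(wait_for_status(lambda: next(codes), check_interval=0))
```

Passing the status lookup as a callable keeps the loop independent of any HTTP client, which makes it easy to exercise with canned values as shown.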

cancel_job_run(run_id, account_id=None)[source]

Cancel a specific dbt Cloud job run.

Parameters
  • run_id (int) – The ID of a dbt Cloud job run.

  • account_id (int | None) – Optional. The ID of a dbt Cloud account.

list_job_run_artifacts(run_id, account_id=None, step=None)[source]

Retrieve a list of the available artifact files generated for a completed run of a dbt Cloud job.

By default, this returns artifacts from the last step in the run. To list artifacts from other steps in the run, use the step parameter.

Parameters
  • run_id (int) – The ID of a dbt Cloud job run.

  • account_id (int | None) – Optional. The ID of a dbt Cloud account.

  • step (int | None) – Optional. The index of the Step in the Run to query for artifacts. The first step in the run has the index 1. If the step parameter is omitted, artifacts for the last step in the run will be returned.

Returns

List of request responses.

Return type

list[requests.models.Response]

get_job_run_artifact(run_id, path, account_id=None, step=None)[source]

Retrieve a single artifact file generated for a completed run of a dbt Cloud job.

By default, this returns the artifact from the last step in the run. To retrieve an artifact from another step in the run, use the step parameter.

Parameters
  • run_id (int) – The ID of a dbt Cloud job run.

  • path (str) – The file path related to the artifact file. Paths are rooted at the target/ directory. Use “manifest.json”, “catalog.json”, or “run_results.json” to download dbt-generated artifacts for the run.

  • account_id (int | None) – Optional. The ID of a dbt Cloud account.

  • step (int | None) – Optional. The index of the Step in the Run to query for artifacts. The first step in the run has the index 1. If the step parameter is omitted, artifacts for the last step in the run will be returned.

Returns

The request response.

Return type

requests.models.Response

async get_job_run_artifacts_concurrently(run_id, artifacts, account_id=None, step=None)[source]

Retrieve a list of chosen artifact files generated for a step in a completed run of a dbt Cloud job.

By default, this returns artifacts from the last step in the run. This takes advantage of the asynchronous calls to speed up the retrieval.

Parameters
  • run_id (int) – The ID of a dbt Cloud job run.

  • step (int | None) – The index of the Step in the Run to query for artifacts. The first step in the run has the index 1. If the step parameter is omitted, artifacts for the last step in the run will be returned.

  • artifacts – The file paths of the artifact files to retrieve. Paths are rooted at the target/ directory. Use “manifest.json”, “catalog.json”, or “run_results.json” to download dbt-generated artifacts for the run.

  • account_id (int | None) – Optional. The ID of a dbt Cloud account.

Returns

The request response.
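
To show why asynchronous calls speed up retrieval, the sketch below fetches several artifact paths at once with asyncio.gather. `fetch_artifact` is a placeholder that returns fake content instead of making an HTTP request, and `fetch_artifacts_concurrently` is a hypothetical name; neither reflects the hook's internals.

```python
import asyncio


async def fetch_artifact(path):
    """Placeholder for an HTTP request; returns fake content for the path."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return {"path": path}


async def fetch_artifacts_concurrently(paths):
    """Start one fetch per path and collect the results keyed by path."""
    # gather() runs the coroutines concurrently and preserves input order.
    results = await asyncio.gather(*(fetch_artifact(p) for p in paths))
    return dict(zip(paths, results))


artifacts = asyncio.run(
    fetch_artifacts_concurrently(["manifest.json", "run_results.json"])
)
print(artifacts)
```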

retry_failed_job_run(job_id, account_id=None)[source]

Retry a failed run for a job from the point of failure, if the run failed. Otherwise, trigger a new run.

Parameters
  • job_id (int) – The ID of a dbt Cloud job.

  • account_id (int | None) – Optional. The ID of a dbt Cloud account.

Returns

The request response.

Return type

requests.models.Response

test_connection()[source]

Test dbt Cloud connection.
