airflow.providers.databricks.operators.databricks_repos
This module contains Databricks operators.
Classes
| DatabricksReposCreateOperator | Creates, and optionally checks out, a Databricks Repo using the POST api/2.0/repos API endpoint. |
| DatabricksReposUpdateOperator | Updates specified repository to a given branch or tag using the PATCH api/2.0/repos API endpoint. |
| DatabricksReposDeleteOperator | Deletes specified repository using the DELETE api/2.0/repos API endpoint. |
Module Contents
- class airflow.providers.databricks.operators.databricks_repos.DatabricksReposCreateOperator(*, git_url, git_provider=None, branch=None, tag=None, repo_path=None, ignore_existing_repo=False, databricks_conn_id='databricks_default', databricks_retry_limit=3, databricks_retry_delay=1, **kwargs)
- Bases: airflow.models.BaseOperator
- Creates, and optionally checks out, a Databricks Repo using the POST api/2.0/repos API endpoint.
- Parameters:
- git_url (str) – Required HTTPS URL of a Git repository.
- git_provider (str | None) – Optional name of the Git provider. Must be provided if its name cannot be inferred from the URL.
- repo_path (str | None) – Optional path for the repository. Must be in the format /Repos/{folder}/{repo-name}. If not specified, the repository is created in the user’s directory.
- branch (str | None) – Optional name of the branch to check out.
- tag (str | None) – Optional name of the tag to check out.
- ignore_existing_repo (bool) – Do not raise an exception if a repository with the given path already exists.
- databricks_conn_id (str) – Reference to the Databricks connection. By default and in the common case this will be databricks_default. To use token-based authentication, provide the key token in the extra field for the connection, create the key host, and leave the host field empty. (templated)
- databricks_retry_limit (int) – Number of times to retry if the Databricks backend is unreachable. Its value must be greater than or equal to 1.
- databricks_retry_delay (int) – Number of seconds to wait between retries (it may be a floating-point number).
 
 - template_fields: collections.abc.Sequence[str] = ('repo_path', 'tag', 'branch', 'databricks_conn_id')
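
The parameters of DatabricksReposCreateOperator can be assembled as in the following minimal sketch. The helper names `build_create_kwargs` and `make_create_task`, the regular expression, the Git URL, and the repo path are hypothetical and exist only for illustration. The rule that branch and tag are not given together is an assumption, not something this page states. The operator import is deferred into a function so the helper can be tried without Airflow installed.

```python
import re

# Documented format for repo_path: /Repos/{folder}/{repo-name}.
# This pattern is an illustrative assumption, not the provider's own check.
REPO_PATH_PATTERN = re.compile(r"^/Repos/[^/]+/[^/]+/?$")


def build_create_kwargs(git_url, repo_path=None, branch=None, tag=None):
    """Hypothetical helper: collect keyword arguments for the create operator."""
    if repo_path is not None and not REPO_PATH_PATTERN.match(repo_path):
        raise ValueError(
            f"repo_path must look like /Repos/{{folder}}/{{repo-name}}, got {repo_path!r}"
        )
    if branch is not None and tag is not None:
        # Assumption: a checkout targets either a branch or a tag, not both.
        raise ValueError("Specify at most one of branch or tag")
    kwargs = {"git_url": git_url, "databricks_conn_id": "databricks_default"}
    # Only pass the optional arguments that were actually supplied.
    for key, value in (("repo_path", repo_path), ("branch", branch), ("tag", tag)):
        if value is not None:
            kwargs[key] = value
    return kwargs


def make_create_task(task_id="create_repo"):
    """Build the operator; requires the Databricks provider to be installed."""
    from airflow.providers.databricks.operators.databricks_repos import (
        DatabricksReposCreateOperator,
    )

    return DatabricksReposCreateOperator(
        task_id=task_id,
        ignore_existing_repo=True,  # do not fail if the repo already exists
        **build_create_kwargs(
            git_url="https://github.com/example-org/example-repo",  # placeholder
            repo_path="/Repos/example-folder/example-repo",  # placeholder
            branch="main",
        ),
    )
```

Calling `make_create_task()` inside a DAG definition would register the task. The helper alone only prepares and checks the arguments.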
 
- class airflow.providers.databricks.operators.databricks_repos.DatabricksReposUpdateOperator(*, branch=None, tag=None, repo_id=None, repo_path=None, databricks_conn_id='databricks_default', databricks_retry_limit=3, databricks_retry_delay=1, **kwargs)
- Bases: airflow.models.BaseOperator
- Updates specified repository to a given branch or tag using the PATCH api/2.0/repos API endpoint.
- See: https://docs.databricks.com/dev-tools/api/latest/repos.html#operation/update-repo
- Parameters:
- branch (str | None) – Optional name of the branch to update to. Should be specified if tag is omitted.
- tag (str | None) – Optional name of the tag to update to. Should be specified if branch is omitted.
- repo_id (str | None) – Optional ID of an existing repository. Should be specified if repo_path is omitted.
- repo_path (str | None) – Optional path of an existing repository. Should be specified if repo_id is omitted.
- databricks_conn_id (str) – Reference to the Databricks connection. By default and in the common case this will be databricks_default. To use token-based authentication, provide the key token in the extra field for the connection, create the key host, and leave the host field empty. (templated)
- databricks_retry_limit (int) – Number of times to retry if the Databricks backend is unreachable. Its value must be greater than or equal to 1.
- databricks_retry_delay (int) – Number of seconds to wait between retries (it may be a floating-point number).
 
 - template_fields: collections.abc.Sequence[str] = ('repo_path', 'tag', 'branch', 'databricks_conn_id')
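
The pairing rules described for DatabricksReposUpdateOperator (branch or tag, repo_id or repo_path) can be sketched as below. The helpers `exactly_one`, `build_update_kwargs`, and `make_update_task` are hypothetical, as are the example values. Treating each pair as "exactly one" is an assumption drawn from the parameter descriptions, not a statement about how the provider validates its input.

```python
def exactly_one(**options):
    """Return the (name, value) pair of the single option that is not None."""
    given = [(name, value) for name, value in options.items() if value is not None]
    if len(given) != 1:
        raise ValueError(f"Specify exactly one of {' or '.join(options)}")
    return given[0]


def build_update_kwargs(branch=None, tag=None, repo_id=None, repo_path=None):
    """Hypothetical helper: pick one Git reference and one repository locator."""
    ref_name, ref_value = exactly_one(branch=branch, tag=tag)
    target_name, target_value = exactly_one(repo_id=repo_id, repo_path=repo_path)
    return {ref_name: ref_value, target_name: target_value}


def make_update_task(task_id="update_repo"):
    """Build the operator; requires the Databricks provider to be installed."""
    from airflow.providers.databricks.operators.databricks_repos import (
        DatabricksReposUpdateOperator,
    )

    return DatabricksReposUpdateOperator(
        task_id=task_id,
        **build_update_kwargs(
            tag="v1.2.0",  # placeholder tag
            repo_path="/Repos/example-folder/example-repo",  # placeholder path
        ),
    )
```

Failing early in plain Python keeps a misconfigured pair from reaching the scheduler, at the cost of duplicating a check the operator may already perform.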
 
- class airflow.providers.databricks.operators.databricks_repos.DatabricksReposDeleteOperator(*, repo_id=None, repo_path=None, databricks_conn_id='databricks_default', databricks_retry_limit=3, databricks_retry_delay=1, **kwargs)
- Bases: airflow.models.BaseOperator
- Deletes specified repository using the DELETE api/2.0/repos API endpoint.
- See: https://docs.databricks.com/dev-tools/api/latest/repos.html#operation/delete-repo
- Parameters:
- repo_id (str | None) – Optional ID of an existing repository. Should be specified if repo_path is omitted.
- repo_path (str | None) – Optional path of an existing repository. Should be specified if repo_id is omitted.
- databricks_conn_id (str) – Reference to the Databricks connection. By default and in the common case this will be databricks_default. To use token-based authentication, provide the key token in the extra field for the connection, create the key host, and leave the host field empty. (templated)
- databricks_retry_limit (int) – Number of times to retry if the Databricks backend is unreachable. Its value must be greater than or equal to 1.
- databricks_retry_delay (int) – Number of seconds to wait between retries (it may be a floating-point number).
 
 - template_fields: collections.abc.Sequence[str] = ('repo_path', 'databricks_conn_id')
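
Because template_fields for DatabricksReposDeleteOperator lists repo_path but not repo_id, a Jinja expression is only expected to be rendered in the former. The sketch below illustrates that distinction. The helper `build_delete_kwargs`, the constant name, and the example path are hypothetical, and requiring exactly one of repo_id or repo_path is an assumption based on the parameter descriptions.

```python
# Copied from the template_fields attribute documented for the delete operator.
DELETE_TEMPLATE_FIELDS = ("repo_path", "databricks_conn_id")


def build_delete_kwargs(repo_id=None, repo_path=None,
                        databricks_conn_id="databricks_default"):
    """Hypothetical helper: collect keyword arguments for the delete operator."""
    if (repo_id is None) == (repo_path is None):
        # Assumption: the repository is located by ID or by path, not both.
        raise ValueError("Specify exactly one of repo_id or repo_path")
    kwargs = {"databricks_conn_id": databricks_conn_id}
    if repo_id is not None:
        kwargs["repo_id"] = repo_id
    else:
        kwargs["repo_path"] = repo_path
    # Flag Jinja markers placed in arguments that are not templated.
    for name, value in kwargs.items():
        if "{{" in str(value) and name not in DELETE_TEMPLATE_FIELDS:
            raise ValueError(f"{name} is not a templated field")
    return kwargs


def make_delete_task(task_id="delete_repo"):
    """Build the operator; requires the Databricks provider to be installed."""
    from airflow.providers.databricks.operators.databricks_repos import (
        DatabricksReposDeleteOperator,
    )

    # ds_nodash is a standard Airflow template variable; the path is a placeholder.
    return DatabricksReposDeleteOperator(
        task_id=task_id,
        **build_delete_kwargs(repo_path="/Repos/example-folder/build-{{ ds_nodash }}"),
    )
```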