DatabricksReposUpdateOperator

Use the DatabricksReposUpdateOperator to update the code in an existing Databricks repo to a given Git branch or tag via the api/2.0/repos/ API endpoint.

Using the Operator

This operator is typically used to update the source code of a Databricks job before the job runs. To use it, you must provide either branch or tag, and either repo_path or repo_id.
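The argument rule above can be sketched as a small standalone check. This is an illustration only, not the operator's own code: build_repo_update_request is a hypothetical helper, the sketch assumes "either ... or" means exactly one of each pair, and the PATCH method and JSON body shape are assumptions based on the Databricks Repos REST API rather than anything stated on this page.

```python
from typing import Optional


def build_repo_update_request(
    branch: Optional[str] = None,
    tag: Optional[str] = None,
    repo_path: Optional[str] = None,
    repo_id: Optional[str] = None,
) -> dict:
    """Hypothetical helper: validate the arguments and describe the update call."""
    # Assumption: exactly one of branch / tag must be given.
    if (branch is None) == (tag is None):
        raise ValueError("Provide exactly one of branch or tag")
    # Assumption: exactly one of repo_path / repo_id must be given.
    if (repo_path is None) == (repo_id is None):
        raise ValueError("Provide exactly one of repo_path or repo_id")

    payload = {"branch": branch} if branch is not None else {"tag": tag}
    # A repo given by path would first have to be resolved to its ID;
    # that lookup is left out of this sketch, so the endpoint stays None.
    endpoint = f"api/2.0/repos/{repo_id}" if repo_id is not None else None
    return {
        "method": "PATCH",  # assumed HTTP method for a repo update
        "endpoint": endpoint,
        "repo_path": repo_path,
        "json": payload,
    }


if __name__ == "__main__":
    print(build_repo_update_request(branch="releases", repo_id="123"))
```

Rejecting ambiguous input up front, before any request is built, keeps a misconfigured task from failing only once it reaches the Databricks backend.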

Parameter

Input

branch: str

Name of the existing Git branch to update to (required if tag isn't provided).

tag: str

Name of the existing Git tag to update to (required if branch isn't provided).

repo_path: str

Path to an existing Databricks repo, for example /Repos/<user_email>/repo_name (required if repo_id isn't provided).

repo_id: str

ID of an existing Databricks repo (required if repo_path isn't provided).

databricks_conn_id: string

Name of the Airflow connection to use.

databricks_retry_limit: integer

Number of times to retry if the Databricks backend is unreachable.

databricks_retry_delay: decimal

Number of seconds to wait between retries.
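The two retry parameters work together: one bounds how often the call is attempted, the other sets the pause between attempts. The following is a minimal sketch of that interaction, not the provider's implementation. call_with_retries is a hypothetical helper, the default values are illustrative, and the sketch assumes the limit counts total attempts and that an unreachable backend surfaces as a ConnectionError.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    request: Callable[[], T],
    retry_limit: int = 3,
    retry_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call request() up to retry_limit times, pausing retry_delay seconds between attempts."""
    if retry_limit < 1:
        raise ValueError("retry_limit must be at least 1")
    for attempt in range(1, retry_limit + 1):
        try:
            return request()
        except ConnectionError:
            # Assumption: an unreachable backend shows up as ConnectionError.
            if attempt == retry_limit:
                raise  # attempts exhausted, surface the last error
            sleep(retry_delay)
    # Not reached: the loop above always returns or raises.
    raise AssertionError("unreachable")
```

Passing the sleep function as a parameter is a design choice for this sketch only; it lets the waiting behaviour be checked without actually pausing.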

Examples

Updating Databricks Repo by specifying path

An example usage of the DatabricksReposUpdateOperator is as follows:

airflow/providers/databricks/example_dags/example_databricks_repos.py

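    from airflow.providers.databricks.operators.databricks_repos import DatabricksReposUpdateOperator

    # Update the Databricks repo at the given path to the latest code on the "releases" branch
    repo_path = "/Repos/user@domain.com/demo-repo"
    update_repo = DatabricksReposUpdateOperator(task_id="update_repo", repo_path=repo_path, branch="releases")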
