airflow.providers.databricks.hooks.databricks_sql

Module Contents

Classes

DatabricksSqlHook

Interact with Databricks SQL.

Attributes

LIST_SQL_ENDPOINTS_ENDPOINT

airflow.providers.databricks.hooks.databricks_sql.LIST_SQL_ENDPOINTS_ENDPOINT = ['GET', 'api/2.0/sql/endpoints'][source]
class airflow.providers.databricks.hooks.databricks_sql.DatabricksSqlHook(databricks_conn_id=BaseDatabricksHook.default_conn_name, http_path=None, sql_endpoint_name=None, session_configuration=None)[source]

Bases: airflow.providers.databricks.hooks.databricks_base.BaseDatabricksHook, airflow.hooks.dbapi.DbApiHook

Interact with Databricks SQL.

Parameters
  • databricks_conn_id (str) -- Reference to the Databricks connection.

  • http_path (Optional[str]) -- Optional string specifying HTTP path of Databricks SQL Endpoint or cluster. If not specified, it should be either specified in the Databricks connection's extra parameters, or sql_endpoint_name must be specified.

  • sql_endpoint_name (Optional[str]) -- Optional name of Databricks SQL Endpoint. If not specified, http_path must be provided as described above.

  • session_configuration (Optional[Dict[str, str]]) -- An optional dictionary of Spark session parameters. Defaults to None. If not specified, it could be specified in the Databricks connection's extra parameters.
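For example, a minimal instantiation sketch; the endpoint name and session setting below are hypothetical placeholders for your own environment, while "databricks_default" is the hook's default connection name:

    from airflow.providers.databricks.hooks.databricks_sql import DatabricksSqlHook

    # Endpoint name and session setting are illustrative values only.
    hook = DatabricksSqlHook(
        databricks_conn_id="databricks_default",
        sql_endpoint_name="my-sql-endpoint",  # alternatively, pass http_path
        session_configuration={"spark.sql.shuffle.partitions": "8"},
    )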

hook_name = 'Databricks SQL'[source]
get_conn(self)[source]

Returns a Databricks SQL connection object.
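The returned object is expected to follow the DB-API 2.0 interface exposed by the databricks-sql-connector package, so a plain cursor workflow such as the following sketch should apply (the connection id is assumed to point at a configured Databricks connection):

    from airflow.providers.databricks.hooks.databricks_sql import DatabricksSqlHook

    hook = DatabricksSqlHook(databricks_conn_id="databricks_default")
    conn = hook.get_conn()
    cursor = conn.cursor()
    cursor.execute("SELECT 1")
    print(cursor.fetchall())
    cursor.close()
    conn.close()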

static maybe_split_sql_string(sql)[source]

Splits a string consisting of multiple SQL expressions into a list of individual expressions.

Parameters

sql (str) -- SQL string potentially consisting of multiple expressions

Returns

list of individual expressions

Return type

List[str]
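A usage sketch; the exact splitting rules (for example, whether trailing semicolons are kept) are not specified here, so treat the printed output as indicative only:

    from airflow.providers.databricks.hooks.databricks_sql import DatabricksSqlHook

    sql = """
    CREATE TABLE IF NOT EXISTS demo (id INT);
    INSERT INTO demo VALUES (1);
    """
    for statement in DatabricksSqlHook.maybe_split_sql_string(sql):
        print(statement)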

run(self, sql, autocommit=True, parameters=None, handler=None)[source]

Runs a command or a list of commands. Pass a list of SQL statements to the sql parameter to have them executed sequentially.

Parameters
  • sql (Union[str, List[str]]) -- the sql statement to be executed (str) or a list of sql statements to execute

  • autocommit -- What to set the connection's autocommit setting to before executing the query.

  • parameters -- The parameters to render the SQL query with.

  • handler -- The result handler which is called with the result of each statement.

Returns

query results.
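A sketch of running several statements in one call; the handler is assumed to receive the live DB-API cursor, as is common for DbApiHook implementations, and the connection id, endpoint name, and table name are placeholders:

    from airflow.providers.databricks.hooks.databricks_sql import DatabricksSqlHook

    hook = DatabricksSqlHook(
        databricks_conn_id="databricks_default",
        sql_endpoint_name="my-sql-endpoint",  # hypothetical endpoint name
    )
    results = hook.run(
        sql=[
            "CREATE TABLE IF NOT EXISTS demo (id INT)",
            "SELECT COUNT(*) FROM demo",
        ],
        handler=lambda cursor: cursor.fetchall(),
    )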

test_connection(self)[source]

Test the Databricks SQL connection by running a simple query.
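Assuming the method follows Airflow's usual connection-test convention of returning a (status, message) tuple, it can be called directly:

    from airflow.providers.databricks.hooks.databricks_sql import DatabricksSqlHook

    hook = DatabricksSqlHook(databricks_conn_id="databricks_default")
    status, message = hook.test_connection()
    print(status, message)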

abstract bulk_dump(self, table, tmp_file)[source]

Dumps a database table into a tab-delimited file.

Parameters
  • table -- The name of the source table

  • tmp_file -- The path of the target file

abstract bulk_load(self, table, tmp_file)[source]

Loads a tab-delimited file into a database table.

Parameters
  • table -- The name of the target table

  • tmp_file -- The path of the file to load into the table
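Because both methods are abstract on this hook, they must be provided by a subclass before use. A hypothetical subclass sketch for bulk_load, reusing DbApiHook.insert_rows (the class name and file handling are illustrative only; bulk_dump would be implemented analogously):

    import csv

    from airflow.providers.databricks.hooks.databricks_sql import DatabricksSqlHook

    class TabFileDatabricksSqlHook(DatabricksSqlHook):
        # Hypothetical subclass; DatabricksSqlHook itself leaves bulk_load abstract.
        def bulk_load(self, table, tmp_file):
            # Read the tab-delimited file and reuse DbApiHook.insert_rows to write it.
            with open(tmp_file, newline="") as f:
                rows = list(csv.reader(f, delimiter="\t"))
            if rows:
                self.insert_rows(table=table, rows=rows)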
