airflow.providers.common.sql.hooks.sql

Module Contents

Classes

ConnectorProtocol

A protocol where you can connect to a database.

DbApiHook

Abstract base class for sql hooks.

Functions

fetch_all_handler(cursor)

Handler for DbApiHook.run() that returns the query results.

Attributes

BaseForDbApiHook

airflow.providers.common.sql.hooks.sql.fetch_all_handler(cursor)[source]

Handler for DbApiHook.run() that returns the query results.
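A handler of this kind receives a DB-API cursor after a statement has run and returns whatever should be handed back to the caller. The sketch below is a stand-in, not the provider's code: `fetch_all_sketch` is a hypothetical name, and the standard-library `sqlite3` module is used only to obtain a real cursor.

```python
import sqlite3


def fetch_all_sketch(cursor):
    """Hypothetical handler: return every remaining row from a DB-API cursor."""
    return cursor.fetchall()


# Demonstration against an in-memory SQLite database (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute("SELECT 1 UNION ALL SELECT 2")
rows = fetch_all_sketch(cursor)
conn.close()
```

Any callable with the same one-argument shape, for example one that calls `cursor.fetchone()`, could be passed as a handler in the same way.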

class airflow.providers.common.sql.hooks.sql.ConnectorProtocol[source]

Bases: typing_extensions.Protocol

A protocol where you can connect to a database.

connect(host, port, username, schema)[source]

Connect to a database.

Parameters
  • host (str) – The database host to connect to.

  • port (int) – The database port to connect to.

  • username (str) – The database username used for the authentication.

  • schema (str) – The database schema to connect to.

Returns

the authorized connection object.

Return type

Any
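As an illustration of structural typing, the following sketch declares a protocol with the same `connect` signature and a class that satisfies it without inheriting from it. `ConnectorSketch` and `InMemoryConnector` are hypothetical names; `typing.Protocol` is used here in place of `typing_extensions.Protocol`, and the SQLite connection is only a placeholder for a real database client.

```python
import sqlite3
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class ConnectorSketch(Protocol):
    """Hypothetical protocol mirroring the documented connect() signature."""

    def connect(self, host: str, port: int, username: str, schema: str) -> Any:
        ...


class InMemoryConnector:
    """Satisfies the protocol structurally; it does not subclass it."""

    def connect(self, host: str, port: int, username: str, schema: str) -> Any:
        # The arguments are ignored in this sketch; a real connector would use them.
        return sqlite3.connect(":memory:")


connector = InMemoryConnector()
connection = connector.connect("localhost", 5432, "airflow", "public")
satisfies_protocol = isinstance(connector, ConnectorSketch)
connection.close()
```

The runtime `isinstance` check only verifies that a `connect` attribute exists; argument types are checked by static type checkers, not at runtime.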

airflow.providers.common.sql.hooks.sql.BaseForDbApiHook :Type[airflow.hooks.base.BaseHook][source]
class airflow.providers.common.sql.hooks.sql.DbApiHook(*args, schema=None, log_sql=True, **kwargs)[source]

Bases: airflow.hooks.dbapi.DbApiHook

Abstract base class for sql hooks.

Parameters
  • schema (Optional[str]) – Optional DB schema that overrides the schema specified in the connection. If a derived hook changes the schema value in its constructor, it must do so before calling DbApiHook.__init__().

  • log_sql (bool) – Whether to log SQL query when it’s executed. Defaults to True.
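The ordering requirement on schema can be pictured with plain classes. This is a sketch under assumptions: `BaseHookSketch` stands in for the real base class and simply stores its keyword arguments, and `ReportingHookSketch` and the "reporting" default are hypothetical.

```python
from typing import Optional


class BaseHookSketch:
    """Stand-in for the base hook; it only records the constructor arguments."""

    def __init__(self, *args, schema: Optional[str] = None, log_sql: bool = True, **kwargs):
        self.schema = schema
        self.log_sql = log_sql


class ReportingHookSketch(BaseHookSketch):
    """Derived hook that adjusts schema before delegating to the base constructor."""

    def __init__(self, *args, **kwargs):
        # Change the schema value first, then call the base __init__.
        kwargs.setdefault("schema", "reporting")
        super().__init__(*args, **kwargs)


default_hook = ReportingHookSketch(log_sql=False)
explicit_hook = ReportingHookSketch(schema="analytics")
```

Adjusting `kwargs` before `super().__init__()` means the base class sees the final value, which is the ordering the parameter description asks for.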

conn_name_attr :str[source]
default_conn_name = default_conn_id[source]
supports_autocommit = False[source]
connector :Optional[ConnectorProtocol][source]
get_conn()[source]

Returns a connection object

get_uri()[source]

Extract the URI from the connection.

Returns

the extracted uri.

Return type

str

get_sqlalchemy_engine(engine_kwargs=None)[source]

Get an sqlalchemy_engine object.

Parameters

engine_kwargs – Kwargs used in create_engine().

Returns

the created engine.

get_pandas_df(sql, parameters=None, **kwargs)[source]

Executes the sql and returns a pandas dataframe

Parameters
  • sql – the sql statement to be executed (str) or a list of sql statements to execute

  • parameters – The parameters to render the SQL query with.

  • kwargs – (optional) passed into pandas.io.sql.read_sql method

get_pandas_df_by_chunks(sql, parameters=None, *, chunksize, **kwargs)[source]

Executes the sql and returns a generator that yields pandas dataframes of up to chunksize rows each.

Parameters
  • sql – the sql statement to be executed (str) or a list of sql statements to execute

  • parameters – The parameters to render the SQL query with

  • chunksize – number of rows to include in each chunk

  • kwargs – (optional) passed into pandas.io.sql.read_sql method
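The chunked pattern can be sketched without pandas. The generator below uses the DB-API `cursor.fetchmany()` call instead of `pandas.io.sql.read_sql`, so it yields lists of tuples rather than dataframes; `iter_row_chunks` and the table `t` are hypothetical names.

```python
import sqlite3
from typing import Iterator, List, Tuple


def iter_row_chunks(conn, sql: str, parameters=(), *, chunksize: int) -> Iterator[List[Tuple]]:
    """Yield lists of at most `chunksize` rows (plain tuples, not dataframes)."""
    if chunksize <= 0:
        raise ValueError("chunksize must be a positive integer")
    cursor = conn.execute(sql, parameters)
    while True:
        chunk = cursor.fetchmany(chunksize)
        if not chunk:
            break
        yield chunk


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (n INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(5)])
chunk_sizes = [len(chunk) for chunk in iter_row_chunks(conn, "SELECT n FROM t", chunksize=2)]
conn.close()
```

As in the documented signature, `chunksize` is keyword-only, so callers have to name it explicitly.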

get_records(sql, parameters=None, **kwargs)[source]

Executes the sql and returns a set of records.

Parameters
  • sql (Union[str, List[str]]) – the sql statement to be executed (str) or a list of sql statements to execute

  • parameters (Optional[Union[Iterable, Mapping]]) – The parameters to render the SQL query with.

get_first(sql, parameters=None)[source]

Executes the sql and returns the first resulting row.

Parameters
  • sql (Union[str, List[str]]) – the sql statement to be executed (str) or a list of sql statements to execute

  • parameters – The parameters to render the SQL query with.
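The difference between returning all records and returning the first row maps onto the DB-API `fetchall()` and `fetchone()` calls. The sketch below assumes a SQLite connection and uses hypothetical helper names; it is not the hook's implementation.

```python
import sqlite3


def get_records_sketch(conn, sql, parameters=()):
    """Run one statement and return all resulting rows."""
    return conn.execute(sql, parameters).fetchall()


def get_first_sketch(conn, sql, parameters=()):
    """Run one statement and return only the first resulting row."""
    return conn.execute(sql, parameters).fetchone()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "ada"), (2, "bob")])

# Parameters are bound by the driver rather than formatted into the SQL string.
records = get_records_sketch(conn, "SELECT id, name FROM users WHERE id >= ? ORDER BY id", (1,))
first = get_first_sketch(conn, "SELECT id, name FROM users ORDER BY id")
conn.close()
```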

static strip_sql_string(sql)[source]
static split_sql_string(sql)[source]

Splits string into multiple SQL expressions

Parameters

sql (str) – SQL string potentially consisting of multiple expressions

Returns

list of individual expressions

Return type

List[str]
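To make the input and output shapes concrete, here is a deliberately naive splitter. It is an assumption-laden sketch: `split_sql_naive` is a hypothetical name, and splitting on every semicolon will break on semicolons inside string literals or comments, which a real SQL parser would handle.

```python
from typing import List


def split_sql_naive(sql: str) -> List[str]:
    """Split on ';' and drop empty pieces. Not safe for quoted semicolons."""
    return [part.strip() for part in sql.split(";") if part.strip()]


statements = split_sql_naive("CREATE TABLE t (n INT); INSERT INTO t VALUES (1);")
```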

run(sql, autocommit=False, parameters=None, handler=None, split_statements=False, return_last=True)[source]

Runs a command or a list of commands. To execute several sql statements sequentially, pass them as a list to the sql parameter.

Parameters
  • sql (Union[str, Iterable[str]]) – the sql statement to be executed (str) or a list of sql statements to execute

  • autocommit (bool) – What to set the connection’s autocommit setting to before executing the query.

  • parameters (Optional[Union[Iterable, Mapping]]) – The parameters to render the SQL query with.

  • handler (Optional[Callable]) – The result handler which is called with the result of each statement.

  • split_statements (bool) – Whether to split a single SQL string into statements and run separately

  • return_last (bool) – Whether to return result for only last statement or for all after split

Returns

The result of the last SQL statement if return_last is True, otherwise the results of all statements. Results are returned only if a handler was provided.

Return type

Optional[Union[Any, List[Any]]]
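How the handler, split_statements, and return_last parameters interact can be sketched with a small stand-alone function. This is a simplified model under assumptions, not the hook's code: `run_sketch` is a hypothetical name, statement splitting is a naive split on semicolons, and autocommit and parameters are left out.

```python
import sqlite3
from typing import Callable, Iterable, Optional, Union


def run_sketch(conn, sql: Union[str, Iterable[str]], handler: Optional[Callable] = None,
               split_statements: bool = False, return_last: bool = True):
    """Execute statements in order and optionally collect handler results."""
    if isinstance(sql, str):
        # Naive split for illustration only; quoted semicolons are not handled.
        statements = [s.strip() for s in sql.split(";") if s.strip()] if split_statements else [sql]
    else:
        statements = list(sql)
    if not statements:
        raise ValueError("No SQL statements to run")

    results = []
    for statement in statements:
        cursor = conn.execute(statement)
        if handler is not None:
            results.append(handler(cursor))
    conn.commit()

    if handler is None:
        return None  # Without a handler nothing is returned.
    return results[-1] if return_last else results


conn = sqlite3.connect(":memory:")
setup = run_sketch(
    conn,
    "CREATE TABLE t (n INTEGER); INSERT INTO t VALUES (1); INSERT INTO t VALUES (2)",
    split_statements=True,
)
queries = ["SELECT count(*) FROM t", "SELECT min(n) FROM t"]
last = run_sketch(conn, queries, handler=lambda cur: cur.fetchall())
every = run_sketch(conn, queries, handler=lambda cur: cur.fetchall(), return_last=False)
conn.close()
```

In this model the handler is called once per statement, so with `return_last=False` the caller receives one result per statement in execution order.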

set_autocommit(conn, autocommit)[source]

Sets the autocommit flag on the connection

get_autocommit(conn)[source]

Get autocommit setting for the provided connection. Return True if conn.autocommit is set to True. Return False if conn.autocommit is not set or set to False or conn does not support autocommit.

Parameters

conn – Connection to get autocommit setting from.

Returns

connection autocommit setting.

Return type

bool
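The three documented cases (set to True, unset or False, unsupported) can be modeled as follows. This is a sketch: `get_autocommit_sketch` is a hypothetical name, the `supports_autocommit` argument stands in for the hook attribute of the same name, and `SimpleNamespace` objects stand in for real connections.

```python
from types import SimpleNamespace


def get_autocommit_sketch(conn, supports_autocommit: bool = True) -> bool:
    """Return True only if autocommit is supported and conn.autocommit is True."""
    return supports_autocommit and getattr(conn, "autocommit", False) is True


enabled = get_autocommit_sketch(SimpleNamespace(autocommit=True))
unset = get_autocommit_sketch(SimpleNamespace())
disabled = get_autocommit_sketch(SimpleNamespace(autocommit=False))
unsupported = get_autocommit_sketch(SimpleNamespace(autocommit=True), supports_autocommit=False)
```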

get_cursor()[source]

Returns a cursor

insert_rows(table, rows, target_fields=None, commit_every=1000, replace=False, **kwargs)[source]

A generic way to insert a set of tuples into a table. A new transaction is created every commit_every rows.

Parameters
  • table – Name of the target table

  • rows – The rows to insert into the table

  • target_fields – The names of the columns to fill in the table

  • commit_every – The maximum number of rows to insert in one transaction. Set to 0 to insert all rows in one transaction.

  • replace – Whether to replace instead of insert
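The batching behaviour described by commit_every can be sketched as a loop that commits after every N inserted rows. The function below is an illustration under assumptions, not the hook's implementation: `insert_rows_sketch` is a hypothetical name, it targets SQLite with `?` placeholders, it omits the replace option, and it returns the number of commits only so the batching is observable.

```python
import sqlite3
from typing import Iterable, Optional, Sequence


def insert_rows_sketch(conn, table: str, rows: Iterable[Sequence],
                       target_fields: Optional[Sequence[str]] = None,
                       commit_every: int = 1000) -> int:
    """Insert rows one by one, committing every `commit_every` rows."""
    commits = 0
    pending = 0
    # Table and column names are interpolated, so they must come from trusted code.
    columns = f" ({', '.join(target_fields)})" if target_fields else ""
    for row in rows:
        placeholders = ", ".join("?" for _ in row)
        conn.execute(f"INSERT INTO {table}{columns} VALUES ({placeholders})", tuple(row))
        pending += 1
        if commit_every and pending >= commit_every:
            conn.commit()
            commits += 1
            pending = 0
    if pending:
        # Commit the final partial batch (or everything, when commit_every is 0).
        conn.commit()
        commits += 1
    return commits


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, label TEXT)")
data = [(i, f"event-{i}") for i in range(5)]
batched_commits = insert_rows_sketch(conn, "events", data, target_fields=["id", "label"], commit_every=2)
single_commit = insert_rows_sketch(conn, "events", data, commit_every=0)
row_count = conn.execute("SELECT count(*) FROM events").fetchone()[0]
conn.close()
```

With five rows and `commit_every=2`, the sketch commits after rows two and four and once more for the remaining row; with `commit_every=0` all rows share one transaction.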

abstract bulk_dump(table, tmp_file)[source]

Dumps a database table into a tab-delimited file

Parameters
  • table – The name of the source table

  • tmp_file – The path of the target file

abstract bulk_load(table, tmp_file)[source]

Loads a tab-delimited file into a database table

Parameters
  • table – The name of the target table

  • tmp_file – The path of the file to load into the table
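Since bulk_dump and bulk_load are abstract, each concrete hook supplies its own implementation, usually through a database-specific bulk facility. The sketch below only illustrates the tab-delimited round trip using the standard `csv` and `sqlite3` modules; the function names and the `src` and `dst` tables are hypothetical.

```python
import csv
import os
import sqlite3
import tempfile


def bulk_dump_sketch(conn, table: str, tmp_file: str) -> None:
    """Write every row of `table` to a tab-delimited file."""
    cursor = conn.execute(f"SELECT * FROM {table}")  # table name must be trusted
    with open(tmp_file, "w", newline="") as handle:
        csv.writer(handle, delimiter="\t").writerows(cursor)


def bulk_load_sketch(conn, table: str, tmp_file: str) -> None:
    """Insert every line of a tab-delimited file into `table`."""
    with open(tmp_file, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            placeholders = ", ".join("?" for _ in row)
            conn.execute(f"INSERT INTO {table} VALUES ({placeholders})", row)
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE src (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE dst (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO src VALUES (?, ?)", [(1, "ada"), (2, "bob")])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "src.tsv")
    bulk_dump_sketch(conn, "src", path)
    bulk_load_sketch(conn, "dst", path)

copied = conn.execute("SELECT id, name FROM dst ORDER BY id").fetchall()
conn.close()
```

Values read back from the file are strings; in this sketch SQLite's column affinity converts the id values back to integers on insert, which other databases may not do.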

test_connection()[source]

Tests the connection using db-specific query
