airflow.providers.common.sql.hooks.sql
Module Contents
Classes
ConnectorProtocol | A protocol where you can connect to a database.
DbApiHook | Abstract base class for sql hooks.
Functions
return_single_query_results(sql, return_last, split_statements) | Determines when results of single query only should be returned.
fetch_all_handler(cursor) | Handler for DbApiHook.run() to return results
fetch_one_handler(cursor) | Handler for DbApiHook.run() to return first result
Attributes
BaseForDbApiHook
- airflow.providers.common.sql.hooks.sql.return_single_query_results(sql, return_last, split_statements)
Determines whether the results of a single query should be returned without wrapping them in a list.
For compatibility reasons, the behaviour of DbApiHook is somewhat confusing. In some cases, when multiple queries are run, the return value is an iterable (list) of results, one for each query. In other cases, when a single query is run, the return value is just the result of that query, not wrapped in a list.
The results of a single query are returned without being wrapped in a list in the following cases:
- sql is a string and return_last is True (regardless of the value of split_statements)
- sql is a string and split_statements is False
In all other cases, the results are wrapped in a list, even if there is only one statement to process. In particular, the return value will be a list of query results in the following circumstances:
- when sql is an iterable of string statements (regardless of the value of return_last)
- when sql is a string, split_statements is True and return_last is False
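The rules above reduce to a single boolean expression. The following is a minimal sketch of that decision, not the library's source; the function name returns_single_result is hypothetical and chosen only for this illustration.

```python
def returns_single_result(sql, return_last, split_statements):
    """Return True when the result should NOT be wrapped in a list.

    Hypothetical helper that restates the documented rules:
    a plain string is unwrapped unless it is split AND all results are kept.
    """
    return isinstance(sql, str) and (return_last or not split_statements)


# Enumerate the documented cases.
single_a = returns_single_result("SELECT 1", return_last=True, split_statements=True)
single_b = returns_single_result("SELECT 1", return_last=False, split_statements=False)
wrapped_a = returns_single_result(["SELECT 1"], return_last=True, split_statements=False)
wrapped_b = returns_single_result("SELECT 1; SELECT 2", return_last=False, split_statements=True)

print(single_a, single_b, wrapped_a, wrapped_b)
```

Note that a one-element list of statements still yields a list of results, because only the type of sql is inspected, not the number of statements.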
- airflow.providers.common.sql.hooks.sql.fetch_all_handler(cursor)
Handler for DbApiHook.run() to return results
- airflow.providers.common.sql.hooks.sql.fetch_one_handler(cursor)
Handler for DbApiHook.run() to return first result
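A handler is simply a callable that receives a DB-API cursor and returns a value. The sketch below uses the standard-library sqlite3 module as a stand-in database, since it exposes the same cursor interface. The functions fetch_all_sketch, fetch_one_sketch and first_column_handler are hypothetical illustrations, not the functions shipped with the provider, and may differ from them in edge cases.

```python
import sqlite3


def fetch_all_sketch(cursor):
    # Illustrative equivalent of a "return all rows" handler.
    return cursor.fetchall()


def fetch_one_sketch(cursor):
    # Illustrative equivalent of a "return first row" handler.
    return cursor.fetchone()


def first_column_handler(cursor):
    # A custom handler: flatten single-column rows into a plain list.
    return [row[0] for row in cursor.fetchall()]


conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute("CREATE TABLE t (x INTEGER)")
cursor.executemany("INSERT INTO t VALUES (?)", [(1,), (2,), (3,)])

query = "SELECT x FROM t ORDER BY x"
cursor.execute(query)
all_rows = fetch_all_sketch(cursor)

cursor.execute(query)
first_row = fetch_one_sketch(cursor)

cursor.execute(query)
xs = first_column_handler(cursor)

conn.close()
```

A custom handler is useful when the raw rows are larger or more nested than what should be passed on to downstream tasks.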
- class airflow.providers.common.sql.hooks.sql.ConnectorProtocol
Bases: typing_extensions.Protocol
A protocol where you can connect to a database.
- airflow.providers.common.sql.hooks.sql.BaseForDbApiHook: type[airflow.hooks.base.BaseHook]
- class airflow.providers.common.sql.hooks.sql.DbApiHook(*args, schema=None, log_sql=True, **kwargs)
Bases: airflow.hooks.dbapi.DbApiHook
Abstract base class for sql hooks.
- Parameters
schema (str | None) – Optional DB schema that overrides the schema specified in the connection. If you change the schema parameter value in the constructor of a derived hook, make that change before calling DbApiHook.__init__().
log_sql (bool) – Whether to log the SQL query when it is executed. Defaults to True.
- connector: ConnectorProtocol | None
- get_sqlalchemy_engine(engine_kwargs=None)
Get an sqlalchemy_engine object.
- Parameters
engine_kwargs – Kwargs used in create_engine().
- Returns
the created engine.
- get_pandas_df(sql, parameters=None, **kwargs)
Executes the sql and returns a pandas dataframe
- Parameters
sql – the sql statement to be executed (str) or a list of sql statements to execute
parameters – The parameters to render the SQL query with.
kwargs – (optional) passed into pandas.io.sql.read_sql method
- get_pandas_df_by_chunks(sql, parameters=None, *, chunksize, **kwargs)
Executes the sql and returns a generator of pandas dataframes, one per chunk
- Parameters
sql – the sql statement to be executed (str) or a list of sql statements to execute
parameters – The parameters to render the SQL query with
chunksize – number of rows to include in each chunk
kwargs – (optional) passed into pandas.io.sql.read_sql method
- run(sql, autocommit=False, parameters=None, handler=None, split_statements=False, return_last=True)
Runs a command or a list of commands. Pass a list of sql statements to the sql parameter to get them to execute sequentially.
The method returns either the results of a single query (typically a list of rows) or a list of such results, where each element holds the results of one of the queries (typically a list of lists of rows).
For compatibility reasons, the behaviour of DbApiHook is somewhat confusing. In some cases, when multiple queries are run, the return value is an iterable (list) of results, one for each query. In other cases, when a single query is run, the return value is the result of that query, not wrapped in a list.
The results of a single query are returned without being wrapped in a list in the following cases:
- sql is a string and return_last is True (regardless of the value of split_statements)
- sql is a string and split_statements is False
In all other cases, the results are wrapped in a list, even if there is only one statement to process. In particular, the return value will be a list of query results in the following circumstances:
- when sql is an iterable of string statements (regardless of the value of return_last)
- when sql is a string, split_statements is True and return_last is False
After run is called, you may access the following properties on the hook object:
- descriptions: an array of cursor descriptions. If return_last is True, this will be a one-element array containing the cursor description for the last statement. Otherwise, it will contain the cursor description for each statement executed.
- last_description: the description for the last statement executed
Note that query results are ONLY returned when a handler is provided; if handler is None, this method returns None.
A handler is a way to process the rows from the cursor (an iterator) into a value that is suitable to be returned to XCom and that generally fits in memory.
You can use the predefined handlers (fetch_all_handler, fetch_one_handler) or implement your own handler.
- Parameters
sql (str | Iterable[str]) – the sql statement to be executed (str) or a list of sql statements to execute
autocommit (bool) – What to set the connection’s autocommit setting to before executing the query.
parameters (Iterable | Mapping | None) – The parameters to render the SQL query with.
handler (Callable | None) – The result handler which is called with the result of each statement.
split_statements (bool) – Whether to split a single SQL string into statements and run separately
return_last (bool) – Whether to return result for only last statement or for all after split
- Returns
if handler provided, returns query results (may be list of results depending on params)
- Return type
Any | list[Any] | None
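To make the return shapes concrete, here is a minimal sketch that imitates the documented behaviour against an in-memory sqlite3 database. It is a hypothetical stand-in, not the hook's implementation: run_sketch and fetch_all are names invented for this example, parameters and autocommit are omitted, and statements are split naively on ";" purely for illustration.

```python
import sqlite3


def fetch_all(cursor):
    # Stand-in result handler: pull every row into memory.
    return cursor.fetchall()


def run_sketch(conn, sql, handler=None, split_statements=False, return_last=True):
    """Hypothetical stand-in that imitates the documented return shapes."""
    if isinstance(sql, str):
        if split_statements:
            # Naive split on ';' (assumption: no ';' inside string literals).
            statements = [s.strip() for s in sql.split(";") if s.strip()]
        else:
            statements = [sql]
    else:
        statements = list(sql)

    results = []
    cursor = conn.cursor()
    for statement in statements:
        cursor.execute(statement)
        if handler is not None:
            results.append(handler(cursor))
    cursor.close()

    if handler is None:
        # Without a handler nothing is returned, whatever was executed.
        return None
    single = isinstance(sql, str) and (return_last or not split_statements)
    return results[-1] if single else results


conn = sqlite3.connect(":memory:")
run_sketch(conn, "CREATE TABLE t (x INTEGER)")
run_sketch(conn, "INSERT INTO t VALUES (1), (2)")

query = "SELECT x FROM t ORDER BY x"
no_handler = run_sketch(conn, query)
single = run_sketch(conn, query, handler=fetch_all)
wrapped = run_sketch(conn, [query], handler=fetch_all)
split_all = run_sketch(
    conn, "SELECT 1; SELECT 2", handler=fetch_all,
    split_statements=True, return_last=False,
)
conn.close()
```

Here single is a list of rows, while wrapped and split_all are lists of such lists, which is the difference callers need to account for when switching between a string and a list for sql.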
- get_autocommit(conn)
Get autocommit setting for the provided connection. Return True if conn.autocommit is set to True. Return False if conn.autocommit is not set or set to False or conn does not support autocommit.
- Parameters
conn – Connection to get autocommit setting from.
- Returns
connection autocommit setting.
- Return type
bool
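The described behaviour (True only when the attribute exists and is set to True) can be sketched with a defaulted attribute lookup. This is an assumption-laden illustration, not the hook's code: get_autocommit_sketch is a hypothetical name, and the connection objects are simple stand-ins rather than real driver connections.

```python
from types import SimpleNamespace


def get_autocommit_sketch(conn):
    # Missing attribute and False both map to False; only a set flag gives True.
    return bool(getattr(conn, "autocommit", False))


# Stand-in connection objects (hypothetical, for illustration only).
flag_on = get_autocommit_sketch(SimpleNamespace(autocommit=True))
flag_off = get_autocommit_sketch(SimpleNamespace(autocommit=False))
unsupported = get_autocommit_sketch(SimpleNamespace())
```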
- insert_rows(table, rows, target_fields=None, commit_every=1000, replace=False, **kwargs)
A generic way to insert a set of tuples into a table. A new transaction is created every commit_every rows.
- Parameters
table – Name of the target table
rows – The rows to insert into the table
target_fields – The names of the columns to fill in the table
commit_every – The maximum number of rows to insert in one transaction. Set to 0 to insert all rows in one transaction.
replace – Whether to replace instead of insert
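The effect of commit_every can be illustrated with a small batching loop. The sketch below is a hypothetical stand-in written against sqlite3, not the hook's implementation: insert_rows_sketch is an invented name, the replace option is omitted, and the table and column names are interpolated directly into the statement, which is acceptable only for trusted identifiers.

```python
import sqlite3


def insert_rows_sketch(conn, table, rows, target_fields=None, commit_every=1000):
    """Insert rows one by one, committing every `commit_every` rows.

    Returns the number of commit calls so the batching is observable.
    """
    columns = f" ({', '.join(target_fields)})" if target_fields else ""
    commits = 0
    cursor = conn.cursor()
    for i, row in enumerate(rows, 1):
        placeholders = ", ".join("?" for _ in row)
        cursor.execute(
            f"INSERT INTO {table}{columns} VALUES ({placeholders})", tuple(row)
        )
        # commit_every=0 disables intermediate commits: one transaction overall.
        if commit_every and i % commit_every == 0:
            conn.commit()
            commits += 1
    conn.commit()  # commit any remaining rows
    commits += 1
    cursor.close()
    return commits


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b TEXT)")
rows = [(i, str(i)) for i in range(5)]

commits_batched = insert_rows_sketch(conn, "t", rows, target_fields=["a", "b"], commit_every=2)
commits_single = insert_rows_sketch(conn, "t", rows, commit_every=0)
row_count = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
conn.close()
```

Smaller commit_every values limit how much work is lost if an insert fails midway, at the cost of more commits.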
- abstract bulk_dump(table, tmp_file)
Dumps a database table into a tab-delimited file
- Parameters
table – The name of the source table
tmp_file – The path of the target file