Module Contents


IngestionType – Druid ingestion type: native batch ingestion or SQL-based ingestion.

DruidHook – Connection to the Druid overlord for ingestion.

DruidDbApiHook – Interact with the Druid broker.

class airflow.providers.apache.druid.hooks.druid.IngestionType[source]

Bases: enum.Enum

Druid ingestion type: either native batch ingestion or SQL-based (MSQ) ingestion.

BATCH = 1[source]
MSQ = 2[source]
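As a rough sketch of what the two members distinguish: a native batch spec and an MSQ (SQL-based) job are submitted to different overlord endpoints. The endpoint paths below are the standard Druid task APIs and are illustrative assumptions, not taken from this page; the hook builds its actual URL from the Airflow connection.

```python
from enum import Enum


class IngestionType(Enum):
    """Mirrors the provider's enum: native batch vs. SQL-based (MSQ) ingestion."""

    BATCH = 1
    MSQ = 2


def ingestion_endpoint(ingestion_type: IngestionType) -> str:
    # Hypothetical helper: maps the ingestion type to the standard Druid
    # task-submission path. The real hook derives host/port/endpoint from
    # the Airflow connection, so these paths are illustrative defaults.
    if ingestion_type is IngestionType.MSQ:
        return "druid/v2/sql/task"
    return "druid/indexer/v1/task"
```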
class airflow.providers.apache.druid.hooks.druid.DruidHook(druid_ingest_conn_id='druid_ingest_default', timeout=1, max_ingestion_time=None, verify_ssl=True)[source]

Bases: airflow.hooks.base.BaseHook

Connection to Druid overlord for ingestion.

To connect to a Druid cluster that is secured with the druid-basic-security extension, add the username and password to the druid ingestion connection.

Parameters:

  • druid_ingest_conn_id (str) – The connection id to the Druid overlord machine which accepts index jobs

  • timeout (int) – The interval between polling the Druid job for the status of the ingestion job. Must be greater than or equal to 1

  • max_ingestion_time (int | None) – The maximum ingestion time before assuming the job failed

  • verify_ssl (bool) – Whether to use SSL encryption to submit indexing job. If set to False then checks connection information for path to a CA bundle to use. Defaults to True


Get the Druid connection URL.


Return the username and password from the connection as a requests.auth.HTTPBasicAuth object.

If these details have not been set, return None.
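A minimal, dependency-free sketch of this behaviour: credentials are only built when both the login and the password are present on the connection. `BasicAuth` here is a hypothetical stand-in for requests.auth.HTTPBasicAuth so the sketch runs without the requests library.

```python
from __future__ import annotations

from typing import NamedTuple


class BasicAuth(NamedTuple):
    """Stand-in for requests.auth.HTTPBasicAuth, to keep the sketch dependency-free."""

    username: str
    password: str


def get_auth(login: str | None, password: str | None) -> BasicAuth | None:
    # Mirrors the documented behaviour: build credentials only when both
    # fields are set on the connection; otherwise return None.
    if login and password:
        return BasicAuth(login, password)
    return None
```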

submit_indexing_job(json_index_spec, ingestion_type=IngestionType.BATCH)[source]

Submit Druid ingestion job.
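The `timeout` and `max_ingestion_time` parameters govern the polling loop that runs after the spec is submitted. The sketch below shows that loop's semantics under stated assumptions: `poll_status` is a hypothetical stand-in for the hook's overlord status request (the real method first POSTs `json_index_spec` and then polls the returned task id), and the `sleep` parameter is injected only so the sketch is testable.

```python
from __future__ import annotations

import time
from typing import Callable


def wait_for_ingestion(
    poll_status: Callable[[], str],
    timeout: int = 1,
    max_ingestion_time: int | None = None,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Sketch of the post-submission polling loop, not the hook's actual code."""
    sec = 0
    while True:
        status = poll_status()
        if status == "SUCCESS":
            return status  # ingestion finished
        if status == "FAILED":
            raise RuntimeError("Druid indexing job failed")
        if max_ingestion_time is not None and sec > max_ingestion_time:
            # The real hook also asks the overlord to shut the task down here.
            raise RuntimeError("Druid ingestion exceeded max_ingestion_time")
        sleep(timeout)  # poll every `timeout` seconds
        sec += timeout
```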

class airflow.providers.apache.druid.hooks.druid.DruidDbApiHook(context=None, *args, **kwargs)[source]

Bases: airflow.providers.common.sql.hooks.sql.DbApiHook

Interact with Druid broker.

This hook is purely for querying the Druid broker. For ingestion, please use DruidHook.


context (dict | None) – Optional query context parameters to pass to the SQL endpoint, e.g. {"sqlFinalizeOuterSketches": True}.
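For illustration, Druid's SQL API accepts an optional "context" object alongside the query; the hook forwards its `context` argument to the broker in the same spirit (via its SQL client rather than raw JSON, so this is a sketch of the wire shape, not the hook's code).

```python
from __future__ import annotations

import json


def build_sql_payload(query: str, context: dict | None = None) -> str:
    # Hypothetical helper: assembles a Druid SQL request body with an
    # optional query-context object, matching the broker's SQL API shape.
    body: dict = {"query": query}
    if context:
        body["context"] = context
    return json.dumps(body)
```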

conn_name_attr = 'druid_broker_conn_id'[source]
default_conn_name = 'druid_broker_default'[source]
conn_type = 'druid'[source]
hook_name = 'Druid'[source]
supports_autocommit = False[source]

Establish a connection to the Druid broker.


Get the connection URI for the Druid broker.

e.g.: druid://localhost:8082/druid/v2/sql/
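A small sketch of how such a URI is assembled from the connection's host, port, and endpoint. `broker_uri` is a hypothetical helper mirroring the shape of the example above; the real method reads these values from the Airflow connection.

```python
from __future__ import annotations


def broker_uri(host: str, port: int | None = None, endpoint: str = "druid/v2/sql/") -> str:
    # Hypothetical helper: builds a druid:// URI of the form shown in the
    # docs, e.g. druid://localhost:8082/druid/v2/sql/
    netloc = f"{host}:{port}" if port else host
    return f"druid://{netloc}/{endpoint}"
```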

abstract set_autocommit(conn, autocommit)[source]

Set the autocommit flag on the connection.

abstract insert_rows(table, rows, target_fields=None, commit_every=1000, replace=False, **kwargs)[source]

Insert a collection of tuples into a table.

Rows are inserted in chunks, each chunk (of size commit_every) is done in a new transaction.

Parameters:

  • table (str) – Name of the target table

  • rows (Iterable[tuple[str]]) – The rows to insert into the table

  • target_fields (Iterable[str] | None) – The names of the columns to fill in the table

  • commit_every (int) – The maximum number of rows to insert in one transaction. Set to 0 to insert all rows in one transaction.

  • replace (bool) – Whether to replace instead of insert

  • executemany – (Deprecated) If True, all rows are inserted at once in chunks defined by the commit_every parameter. This only works if all rows have the same number of columns, but gives better performance.
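The chunking semantics described above (batches of commit_every rows, with 0 meaning one all-in-one transaction) can be sketched as follows. This is a standalone illustration of the batching rule, not the hook's implementation; note that on DruidDbApiHook this method is abstract.

```python
from __future__ import annotations

from itertools import islice
from typing import Iterable, Iterator


def chunked(rows: Iterable[tuple], commit_every: int = 1000) -> Iterator[list[tuple]]:
    # Yields batches of at most `commit_every` rows; each batch would be
    # committed in its own transaction. commit_every <= 0 yields one batch
    # containing all rows (a single transaction).
    it = iter(rows)
    if commit_every <= 0:
        yield list(it)
        return
    while True:
        batch = list(islice(it, commit_every))
        if not batch:
            return
        yield batch
```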
