SQLExecuteQueryOperator to connect to Apache Hive¶
Use the SQLExecuteQueryOperator to execute
Hive commands in an Apache Hive database.
Note
Previously, HiveOperator was used to perform this kind of operation. It has since been
deprecated and removed. Please use SQLExecuteQueryOperator instead.
Note
Make sure you have installed the apache-airflow-providers-apache-hive package
to enable Hive support.
Using the Operator¶
Use the conn_id argument to connect to your Apache Hive instance, where
the connection metadata is structured as follows:
| Parameter | Input |
|---|---|
| Host: string | HiveServer2 hostname or IP address |
| Schema: string | Default database name (optional) |
| Login: string | Hive username (if applicable) |
| Password: string | Hive password (if applicable) |
| Port: int | HiveServer2 port (default: 10000) |
| Extra: JSON | Additional connection configuration, such as the authentication method |
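For local testing, one way to provide such a connection is through an environment variable in Airflow's AIRFLOW_CONN_<CONN_ID> URI format. The sketch below is illustrative only: the connection ID my_hive_conn, the host, credentials, and database are placeholders, and it assumes the hiveserver2 connection type registered by the Apache Hive provider. Authentication options would go in the Extra field (or URI query parameters) as required by your HiveServer2 deployment.

import os

# Minimal sketch: register a HiveServer2 connection via an environment variable.
# The connection ID "my_hive_conn", host, credentials and database are placeholders.
os.environ["AIRFLOW_CONN_MY_HIVE_CONN"] = (
    "hiveserver2://hive_user:hive_password@hiveserver2.example.com:10000/default"
)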
An example usage of the SQLExecuteQueryOperator to connect to Apache Hive is as follows:
from airflow.providers.common.sql.operators.sql import SQLExecuteQueryOperator

create_table_hive_task = SQLExecuteQueryOperator(
    task_id="create_table_hive",
    conn_id="my_hive_conn",  # example ID of the Hive connection described above
    sql="create table hive_example(a string, b int) partitioned by(c int)",
)
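Within a DAG, several Hive statements can be chained as separate tasks. The following is a sketch, not a prescribed pattern: the DAG ID, schedule, connection ID my_hive_conn, and the Hive statements themselves are placeholders chosen for illustration.

from datetime import datetime

from airflow import DAG
from airflow.providers.common.sql.operators.sql import SQLExecuteQueryOperator

with DAG(
    dag_id="example_hive",  # placeholder DAG ID
    start_date=datetime(2024, 1, 1),
    schedule=None,  # trigger manually
    catchup=False,
):
    create_table = SQLExecuteQueryOperator(
        task_id="create_table_hive",
        conn_id="my_hive_conn",  # placeholder connection ID
        sql="create table if not exists hive_example(a string, b int) partitioned by(c int)",
    )

    load_partition = SQLExecuteQueryOperator(
        task_id="load_partition",
        conn_id="my_hive_conn",
        sql="insert into table hive_example partition (c=1) values ('row1', 1), ('row2', 2)",
    )

    create_table >> load_partition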
Reference¶
For further information, look at:
Note
Parameters provided directly via SQLExecuteQueryOperator() take precedence
over those specified in the Airflow connection metadata (such as schema, login, password, etc.).
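For example, one way to override the connection's default database for a single task is to pass hook parameters on the operator. This is a sketch under the assumption that the underlying Hive hook accepts a schema argument via hook_params; the database name analytics and connection ID my_hive_conn are placeholders.

from airflow.providers.common.sql.operators.sql import SQLExecuteQueryOperator

# Sketch: override the connection's default schema for this task only.
# Assumes the underlying hook accepts a "schema" constructor argument;
# "analytics" and "my_hive_conn" are placeholder values.
show_tables = SQLExecuteQueryOperator(
    task_id="show_tables_in_analytics",
    conn_id="my_hive_conn",
    hook_params={"schema": "analytics"},
    sql="show tables",
)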