SQLExecuteQueryOperator to connect to Apache Impala
Use the SQLExecuteQueryOperator to execute SQL queries against an
Apache Impala cluster.
Note
A dedicated operator for Impala may have been used previously.
Following its deprecation, use the SQLExecuteQueryOperator instead.
Note
Make sure you have installed the apache-airflow-providers-apache-impala package to enable Impala support.
Using the Operator
Use the conn_id argument to connect to your Apache Impala instance.
The connection metadata is structured as follows:
| Parameter | Input |
|---|---|
| Host: string | Impala daemon hostname or IP address |
| Schema: string | The default database name (optional) |
| Login: string | Username for authentication (if applicable) |
| Password: string | Password for authentication (if applicable) |
| Port: int | Impala service port (default: 21050) |
| Extra: JSON | Additional connection configuration, such as: |
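One way to supply these fields is through an environment variable, which Airflow reads when it is named AIRFLOW_CONN_ followed by the upper-cased connection ID. The following is a minimal sketch, assuming Airflow 2.3 or later (which accepts a JSON-serialized connection in that variable), a connection ID of impala_default, and the connection type impala. The host, login, and password values are hypothetical placeholders.

```python
import json
import os

# Hypothetical values for illustration only; replace with your cluster's details.
impala_connection = {
    "conn_type": "impala",          # assumed connection type for the Impala provider
    "host": "impala.example.com",   # Impala daemon hostname or IP address
    "schema": "default",            # default database name (optional)
    "login": "airflow_user",        # username, if authentication applies
    "password": "change-me",        # password, if authentication applies
    "port": 21050,                  # Impala service port
}

# Airflow looks up connections in variables named AIRFLOW_CONN_<CONN_ID>,
# with the connection ID in upper case. "impala_default" is an assumed ID.
env_var_name = "AIRFLOW_CONN_" + "impala_default".upper()
os.environ[env_var_name] = json.dumps(impala_connection)

print(env_var_name)
```

Setting the variable from Python only affects the current process. In a real deployment the same JSON string would normally be exported in the environment of the scheduler and workers, or the connection would be created through the Airflow UI or CLI.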
An example usage of the SQLExecuteQueryOperator to connect to Apache Impala is as follows:
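    from airflow.providers.common.sql.operators.sql import SQLExecuteQueryOperator

    # Create a partitioned table if it does not already exist.
    create_table_impala_task = SQLExecuteQueryOperator(
        task_id="create_table_impala",
        # Assumes an Airflow connection with this ID has been configured.
        conn_id="impala_default",
        sql="""
            CREATE TABLE IF NOT EXISTS impala_example (
                a STRING,
                b INT
            )
            PARTITIONED BY (c INT)
        """,
    )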
Reference
For further information, see:
Note
Parameters provided directly via SQLExecuteQueryOperator() take precedence over those specified
in the Airflow connection metadata (such as schema, login, and password).