Hive CLI Connection

The Hive CLI connection type enables the Hive CLI integrations.

Authenticating to Hive CLI

There are two ways to connect to Hive using Airflow.

  1. Use Hive Beeline, i.e. build a JDBC connection string from the host, port, and schema. Optionally, you can connect as a proxy user and specify a login and password.

  2. Use the Hive CLI, i.e. specify Hive CLI parameters in the extras field.

Only one authentication method can be used at a time. If you need to manage multiple credentials or keys, configure multiple connections.
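
To make the Beeline option more concrete, the following is a minimal sketch of how a HiveServer2 JDBC connection string can be assembled from the host, port, and schema. The function name build_beeline_jdbc_url is hypothetical and the sample port 10000 is an assumption; this illustrates the general jdbc:hive2:// URL shape and is not the code the Hive CLI hook runs.

```python
from typing import Optional


def build_beeline_jdbc_url(
    host: str,
    port: int,
    schema: str = "default",
    principal: Optional[str] = None,
    auth: Optional[str] = None,
) -> str:
    """Assemble a HiveServer2 JDBC URL from connection fields (illustrative only)."""
    url = f"jdbc:hive2://{host}:{port}/{schema}"
    # Session variables follow the database name, separated by semicolons.
    # This sketch adds either a Kerberos principal or an auth mode, not both.
    if principal:
        url += f";principal={principal}"
    elif auth:
        url += f";auth={auth}"
    return url


# Hypothetical values: host and schema mirror the placeholders used on this page.
jdbc_url = build_beeline_jdbc_url("jdbc-hive-host", 10000, "hive-database", auth="noSasl")
print(jdbc_url)  # jdbc:hive2://jdbc-hive-host:10000/hive-database;auth=noSasl
```

A string of this shape is what Beeline accepts through its -u option, with the login and password passed separately through -n and -p.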

Default Connection IDs

All hooks and operators related to the Hive CLI use hive_cli_default by default.

Configuring the Connection

Login (optional)

Specify your username for a proxy user or for the Beeline CLI.

Password (optional)

Specify your Beeline CLI password.

Host (optional)

Specify your JDBC Hive host that is used for Hive Beeline.

Port (optional)

Specify your JDBC Hive port that is used for Hive Beeline.

Schema (optional)

Specify the JDBC Hive database to connect to with Beeline, or the schema in which an HQL statement runs when using the Hive CLI.

Use Beeline (optional)

Set to True if using the Beeline CLI. The default is False.

Proxy User (optional)

Specify a proxy user. HQL code is run as this user.

Principal (optional)

Specify the JDBC Hive principal to be used with Hive Beeline.

When specifying the connection in an environment variable, use URI syntax.

Note that all components of the URI should be URL-encoded.

For example:

export AIRFLOW_CONN_HIVE_CLI_DEFAULT='hive-cli://beeline-username:beeline-password@jdbc-hive-host:80/hive-database?hive_cli_params=params&use_beeline=True&auth=noSasl&principal=hive%2F_HOST%40EXAMPLE.COM'
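
Because every URI component must be URL-encoded, it can be safer to generate the value than to write it by hand. The sketch below uses only the Python standard library (urllib.parse) to rebuild the example above. The helper name build_hive_cli_uri is hypothetical, and the field values are the placeholders from this page.

```python
from urllib.parse import quote, urlencode


def build_hive_cli_uri(login, password, host, port, schema, extras):
    """Return a hive-cli:// URI with every component percent-encoded."""
    # safe="" makes quote() also encode "/" and "@", which would otherwise
    # be mistaken for URI delimiters when they appear in credentials.
    userinfo = f"{quote(login, safe='')}:{quote(password, safe='')}"
    # quote_via=quote encodes spaces as %20 rather than "+".
    query = urlencode(extras, quote_via=quote, safe="")
    return (
        f"hive-cli://{userinfo}@{quote(host, safe='')}:{port}"
        f"/{quote(schema, safe='')}?{query}"
    )


# Extra parameters, in the same order as the example on this page.
extras = {
    "hive_cli_params": "params",
    "use_beeline": "True",
    "auth": "noSasl",
    "principal": "hive/_HOST@EXAMPLE.COM",
}

uri = build_hive_cli_uri(
    "beeline-username", "beeline-password", "jdbc-hive-host", 80, "hive-database", extras
)
print(uri)
```

The printed value matches the URI in the export line above, including the encoded principal hive%2F_HOST%40EXAMPLE.COM, and can be assigned to AIRFLOW_CONN_HIVE_CLI_DEFAULT.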
