Apache Pinot Hooks

Apache Pinot is a column-oriented, open-source, distributed data store written in Java. Pinot is designed to execute OLAP queries with low latency. It is suited to contexts where fast analytics, such as aggregations, are needed on immutable data, possibly with real-time data ingestion.

Prerequisite

PinotAdminHook

This hook is a wrapper around the pinot-admin.sh script, which is provided by the Apache Pinot distribution and used to administer a Pinot cluster. For now, only a small subset of its subcommands is implemented: those required to ingest offline data into Apache Pinot (AddSchema, AddTable, CreateSegment, and UploadSegment). Their command options are based on Pinot v0.1.0.
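To make the wrapping concrete, the sketch below assembles the argument lists that the four subcommands named above might receive, in the order an offline ingestion usually runs. It is not the hook's implementation. The helper name `build_ingest_commands`, the file paths, the controller host and port, and the flag names (`-controllerHost`, `-controllerPort`, `-schemaFile`, `-filePath`, `-dataDir`, `-tableName`, `-outDir`, `-segmentDir`, `-exec`) are assumptions for illustration and should be checked against the help output of your Pinot version.

```python
from typing import List


def build_ingest_commands(
    cmd_path: str = "pinot-admin.sh",
    host: str = "localhost",            # hypothetical controller host
    port: int = 9000,                   # hypothetical controller port
    schema_file: str = "schema.json",   # hypothetical schema definition
    table_config: str = "table.json",   # hypothetical table config
    table_name: str = "example_table",  # hypothetical table name
    data_dir: str = "data",             # hypothetical raw-data directory
    segment_dir: str = "segments",      # hypothetical segment output directory
) -> List[List[str]]:
    """Return one argv list per subcommand, in offline-ingestion order."""
    # Flags shared by the subcommands that talk to the controller (assumed names).
    controller = ["-controllerHost", host, "-controllerPort", str(port)]
    return [
        [cmd_path, "AddSchema", *controller, "-schemaFile", schema_file, "-exec"],
        [cmd_path, "AddTable", *controller, "-filePath", table_config, "-exec"],
        [cmd_path, "CreateSegment", "-dataDir", data_dir,
         "-tableName", table_name, "-outDir", segment_dir],
        [cmd_path, "UploadSegment", *controller, "-segmentDir", segment_dir],
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so no Pinot cluster is needed.
    for argv in build_ingest_commands():
        print(" ".join(argv))
```

Each list could be passed to `subprocess.run(argv, check=True)` on a machine where pinot-admin.sh is on the PATH. Building the commands as lists rather than shell strings avoids quoting problems with paths that contain spaces.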

Parameters

For parameter definitions, see PinotAdminHook.

tests/system/apache/pinot/example_pinot_dag.py

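from airflow.decorators import task
from airflow.providers.apache.pinot.hooks.pinot import PinotAdminHook

@task
def pinot_admin():
    PinotAdminHook(conn_id="pinot_admin_default", cmd_path="pinot-admin.sh", pinot_admin_system_exit=True)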

Reference

For more information, see the Apache Pinot website <https://pinot.apache.org/>

PinotDbApiHook

This hook uses the standard SQL endpoint, since the PQL endpoint is soon to be deprecated.
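As a rough illustration of what querying a standard SQL endpoint involves, the sketch below builds (but does not send) an HTTP request to a Pinot broker. The `/query/sql` path, the JSON body shape, the broker host and port, the helper name `build_sql_request`, and the table name `example_table` are assumptions for illustration. The hook itself obtains its connection details from the configured Airflow connection.

```python
import json
from urllib.request import Request


def build_sql_request(sql: str, host: str = "localhost", port: int = 8099) -> Request:
    """Build a POST request carrying a SQL query for a Pinot broker.

    Host, port, and the /query/sql path are assumptions for this sketch.
    """
    url = f"http://{host}:{port}/query/sql"
    # The query travels as a JSON document with a single "sql" field (assumed shape).
    body = json.dumps({"sql": sql}).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Only inspect the request; nothing is sent over the network.
    req = build_sql_request("SELECT COUNT(*) FROM example_table")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` would send it to a running broker. Inside a DAG, the hook's inherited query methods are the more natural route.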

Parameters

For parameter definitions, see PinotDbApiHook.

tests/system/apache/pinot/example_pinot_dag.py

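from airflow.decorators import task
from airflow.providers.apache.pinot.hooks.pinot import PinotDbApiHook

@task
def pinot_dbi_api():
    # Instantiate the hook with the arguments used in the example DAG.
    PinotDbApiHook(
        task_id="run_example_pinot_script",
        pinot="ls /;",
        pinot_options="-x local",
    )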

Reference

For more information, see the Pinot documentation on querying data <https://docs.pinot.apache.org/users/api/querying-pinot-using-standard-sql>
