Apache Livy Operators

Apache Livy is a service that enables easy interaction with a Spark cluster over a REST interface. It enables easy submission of Spark jobs or snippets of Spark code, synchronous or asynchronous result retrieval, as well as Spark Context management, all via a simple REST interface or an RPC client library.

LivyOperator

This operator wraps the Apache Livy batch REST API, allowing to submit a Spark application to the underlying cluster.

tests/system/providers/apache/livy/example_livy.py[source]

    livy_java_task = LivyOperator(
        task_id="pi_java_task",
        file="/spark-examples.jar",
        num_executors=1,
        conf={
            "spark.shuffle.compress": "false",
        },
        class_name="org.apache.spark.examples.SparkPi",
    )

    livy_python_task = LivyOperator(task_id="pi_python_task", file="/pi.py", polling_interval=60)

    livy_java_task >> livy_python_task

Reference

For further information, look at Apache Livy.

Was this entry helpful?