airflow.providers.apache.livy.hooks.livy
¶
This module contains the Apache Livy hook.
Module Contents¶
Classes¶
Batch session states. |
|
Hook for Apache Livy through the REST API. |
|
Hook for Apache Livy through the REST API asynchronously. |
- class airflow.providers.apache.livy.hooks.livy.BatchState[source]¶
Bases:
enum.Enum
Batch session states.
- class airflow.providers.apache.livy.hooks.livy.LivyHook(livy_conn_id=default_conn_name, extra_options=None, extra_headers=None, auth_type=None)[source]¶
Bases:
airflow.providers.http.hooks.http.HttpHook
,airflow.utils.log.logging_mixin.LoggingMixin
Hook for Apache Livy through the REST API.
- Parameters
livy_conn_id (str) – reference to a pre-defined Livy Connection.
extra_options (dict[str, Any] | None) – A dictionary of options passed to Livy.
extra_headers (dict[str, Any] | None) – A dictionary of headers passed to the HTTP request to livy.
auth_type (Any | None) – The auth type for the service.
See also
For more details refer to the Apache Livy API reference: https://livy.apache.org/docs/latest/rest-api.html
- run_method(endpoint, method='GET', data=None, headers=None, retry_args=None)[source]¶
Wrap HttpHook; allows to change method on the same HttpHook.
- Parameters
- Returns
http response
- Return type
Any
- post_batch(*args, **kwargs)[source]¶
Perform request to submit batch.
- Returns
batch session id
- Return type
- get_batch_state(session_id, retry_args=None)[source]¶
Fetch the state of the specified batch.
- Parameters
retry_args (dict[str, Any] | None) – Arguments which define the retry behaviour. See Tenacity documentation at https://github.com/jd/tenacity
- Returns
batch state
- Return type
- get_batch_logs(session_id, log_start_position, log_batch_size)[source]¶
Get the session logs for a specified batch.
- static build_post_batch_body(file, args=None, class_name=None, jars=None, py_files=None, files=None, archives=None, name=None, driver_memory=None, driver_cores=None, executor_memory=None, executor_cores=None, num_executors=None, queue=None, proxy_user=None, conf=None)[source]¶
Build the post batch request body.
See also
For more information about the format refer to https://livy.apache.org/docs/latest/rest-api.html
- Parameters
file (str) – Path of the file containing the application to execute (required).
proxy_user (str | None) – User to impersonate when running the job.
class_name (str | None) – Application Java/Spark main class string.
args (Sequence[str | int | float] | None) – Command line arguments for the application s.
py_files (list[str] | None) – Python files to be used in this session.
files (list[str] | None) – files to be used in this session.
driver_memory (str | None) – Amount of memory to use for the driver process string.
driver_cores (int | str | None) – Number of cores to use for the driver process int.
executor_memory (str | None) – Amount of memory to use per executor process string.
executor_cores (int | None) – Number of cores to use for each executor int.
num_executors (int | str | None) – Number of executors to launch for this session int.
archives (list[str] | None) – Archives to be used in this session.
queue (str | None) – The name of the YARN queue to which submitted string.
name (str | None) – The name of this session string.
conf (dict[Any, Any] | None) – Spark configuration properties.
- Returns
request body
- Return type
- class airflow.providers.apache.livy.hooks.livy.LivyAsyncHook(livy_conn_id=default_conn_name, extra_options=None, extra_headers=None)[source]¶
Bases:
airflow.providers.http.hooks.http.HttpAsyncHook
,airflow.utils.log.logging_mixin.LoggingMixin
Hook for Apache Livy through the REST API asynchronously.
- Parameters
See also
For more details refer to the Apache Livy API reference: https://livy.apache.org/docs/latest/rest-api.html
- async run_method(endpoint, method='GET', data=None, headers=None)[source]¶
Wrap HttpAsyncHook; allows to change method on the same HttpAsyncHook.
- async get_batch_logs(session_id, log_start_position, log_batch_size)[source]¶
Get the session logs for a specified batch asynchronously.
- async dump_batch_logs(session_id)[source]¶
Dump the session logs for a specified batch asynchronously.
- static build_post_batch_body(file, args=None, class_name=None, jars=None, py_files=None, files=None, archives=None, name=None, driver_memory=None, driver_cores=None, executor_memory=None, executor_cores=None, num_executors=None, queue=None, proxy_user=None, conf=None)[source]¶
Build the post batch request body.
- Parameters
file (str) – Path of the file containing the application to execute (required).
proxy_user (str | None) – User to impersonate when running the job.
class_name (str | None) – Application Java/Spark main class string.
args (Sequence[str | int | float] | None) – Command line arguments for the application s.
py_files (list[str] | None) – Python files to be used in this session.
files (list[str] | None) – files to be used in this session.
driver_memory (str | None) – Amount of memory to use for the driver process string.
driver_cores (int | str | None) – Number of cores to use for the driver process int.
executor_memory (str | None) – Amount of memory to use per executor process string.
executor_cores (int | None) – Number of cores to use for each executor int.
num_executors (int | str | None) – Number of executors to launch for this session int.
archives (list[str] | None) – Archives to be used in this session.
queue (str | None) – The name of the YARN queue to which submitted string.
name (str | None) – The name of this session string.
conf (dict[Any, Any] | None) – Spark configuration properties.
- Returns
request body
- Return type