airflow.providers.apache.livy.hooks.livy
This module contains the Apache Livy hook.
Module Contents
Classes
BatchState | Batch session states
LivyHook | Hook for Apache Livy through the REST API.
- class airflow.providers.apache.livy.hooks.livy.BatchState
Bases: enum.Enum
Batch session states
- class airflow.providers.apache.livy.hooks.livy.LivyHook(livy_conn_id=default_conn_name, extra_options=None, extra_headers=None)
Bases: airflow.providers.http.hooks.http.HttpHook, airflow.utils.log.logging_mixin.LoggingMixin
Hook for Apache Livy through the REST API.
- Parameters
See also
For more details refer to the Apache Livy API reference: https://livy.apache.org/docs/latest/rest-api.html
- get_conn(self, headers=None)
Returns an HTTP session for use with requests.
- Parameters
headers (Optional[Dict[str, Any]]) -- additional headers to be passed through as a dictionary
- Returns
requests session
- Return type
- run_method(self, endpoint, method='GET', data=None, headers=None, retry_args=None)
Wrapper for HttpHook that allows changing the HTTP method on the same HttpHook instance.
- Parameters
method (str) -- http method
endpoint (str) -- endpoint
data (Optional[Any]) -- request payload
headers (Optional[Dict[str, Any]]) -- headers
retry_args (Optional[Dict[str, Any]]) -- Arguments which define the retry behaviour. See Tenacity documentation at https://github.com/jd/tenacity
- Returns
http response
- Return type
- post_batch(self, *args, **kwargs)
Performs a request to submit a batch.
- Returns
batch session id
- Return type
- get_batch_state(self, session_id, retry_args=None)
Fetch the state of the specified batch
- Parameters
session_id (Union[int, str]) -- identifier of the batch session
retry_args (Optional[Dict[str, Any]]) -- Arguments which define the retry behaviour. See Tenacity documentation at https://github.com/jd/tenacity
- Returns
batch state
- Return type
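A common use of a state-fetching call like get_batch_state is to poll until the batch reaches a final state. The following is a minimal sketch of that pattern, not the provider's implementation. The enum SketchBatchState, its members, the set DEFAULT_TERMINAL_STATES, and the helper wait_for_batch are all hypothetical names introduced for illustration; the member values are assumed to mirror the session states described in the Livy REST API reference.

```python
import enum
import time
from typing import Callable, Iterable, Union


class SketchBatchState(enum.Enum):
    """Hypothetical stand-in for BatchState (values assumed from the Livy docs)."""

    STARTING = "starting"
    RUNNING = "running"
    SUCCESS = "success"
    DEAD = "dead"
    KILLED = "killed"
    ERROR = "error"


# Assumption: a batch in one of these states will not change state again.
DEFAULT_TERMINAL_STATES = frozenset(
    {
        SketchBatchState.SUCCESS,
        SketchBatchState.DEAD,
        SketchBatchState.KILLED,
        SketchBatchState.ERROR,
    }
)


def wait_for_batch(
    get_state: Callable[[Union[int, str]], enum.Enum],
    session_id: Union[int, str],
    terminal_states: Iterable[enum.Enum] = DEFAULT_TERMINAL_STATES,
    poll_interval: float = 5.0,
    max_polls: int = 60,
) -> enum.Enum:
    """Call get_state(session_id) until it returns a terminal state.

    Raises TimeoutError if no terminal state is seen within max_polls calls.
    """
    terminal = set(terminal_states)
    for _ in range(max_polls):
        state = get_state(session_id)
        if state in terminal:
            return state
        time.sleep(poll_interval)
    raise TimeoutError(f"batch {session_id} did not finish after {max_polls} polls")
```

With a real hook, one would presumably pass the bound method hook.get_batch_state as get_state and supply terminal_states built from the real BatchState members, since the stand-in enum above does not compare equal to them.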
- get_batch_logs(self, session_id, log_start_position, log_batch_size)
Gets the session logs for a specified batch.
- Parameters
session_id -- identifier of the batch session
log_start_position -- Position from where to pull the logs
log_batch_size -- Number of lines to pull in one batch
- Returns
response body
- Return type
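Because get_batch_logs takes a start position and a batch size, reading a complete log means calling it repeatedly with an advancing position. The sketch below shows one way to do that paging. The helper iter_batch_logs is hypothetical, and it assumes the response body is a dict whose "log" key holds a list of lines; check that assumption against the actual response before relying on it.

```python
from typing import Any, Callable, Dict, Iterator, Union


def iter_batch_logs(
    fetch_logs: Callable[[Union[int, str], int, int], Dict[str, Any]],
    session_id: Union[int, str],
    batch_size: int = 100,
) -> Iterator[str]:
    """Yield log lines page by page until a short (or empty) page is returned.

    fetch_logs is called as fetch_logs(session_id, start_position, batch_size).
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    position = 0
    while True:
        body = fetch_logs(session_id, position, batch_size)
        # Assumption: the log lines are under the "log" key of the response body.
        lines = body.get("log") or []
        yield from lines
        if len(lines) < batch_size:
            break  # fewer lines than requested: nothing more to read for now
        position += len(lines)
```

With a real hook, fetch_logs would presumably be the bound method hook.get_batch_logs. For a batch that is still running, a short page only means the log has not grown yet, so a caller may want to remember the last position and resume later.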
- static build_post_batch_body(file, args=None, class_name=None, jars=None, py_files=None, files=None, archives=None, name=None, driver_memory=None, driver_cores=None, executor_memory=None, executor_cores=None, num_executors=None, queue=None, proxy_user=None, conf=None)
Build the post batch request body.
See also
For more information about the format refer to https://livy.apache.org/docs/latest/rest-api.html
- Parameters
file -- Path of the file containing the application to execute (required).
proxy_user -- User to impersonate when running the job.
class_name -- Application Java/Spark main class (string).
args -- Command line arguments for the application.
jars -- Jars to be used in this session.
py_files -- Python files to be used in this session.
files -- Files to be used in this session.
driver_memory -- Amount of memory to use for the driver process (string).
driver_cores -- Number of cores to use for the driver process (int).
executor_memory -- Amount of memory to use per executor process (string).
executor_cores -- Number of cores to use for each executor (int).
num_executors -- Number of executors to launch for this session (int).
archives -- Archives to be used in this session.
queue -- The name of the YARN queue to which the batch is submitted (string).
name -- The name of this session (string).
conf -- Spark configuration properties.
- Returns
request body
- Return type
dict
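To make the shape of the request body concrete, here is a minimal sketch of how snake_case arguments like those above could map onto the camelCase fields of the Livy POST /batches request. It is not the hook's implementation and skips any validation the hook may perform. The function name sketch_post_batch_body is hypothetical, it covers only a subset of the parameters, and the choice to drop unset (None) values is an assumption.

```python
from typing import Any, Dict, Optional, Sequence


def sketch_post_batch_body(
    file: str,
    args: Optional[Sequence[str]] = None,
    class_name: Optional[str] = None,
    py_files: Optional[Sequence[str]] = None,
    executor_memory: Optional[str] = None,
    num_executors: Optional[int] = None,
    queue: Optional[str] = None,
    proxy_user: Optional[str] = None,
    conf: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build a dict using the field names from the Livy REST API reference."""
    # "file" is the only required field.
    body: Dict[str, Any] = {"file": file}
    optional_fields = {
        "args": args,
        "className": class_name,
        "pyFiles": py_files,
        "executorMemory": executor_memory,
        "numExecutors": num_executors,
        "queue": queue,
        "proxyUser": proxy_user,
        "conf": conf,
    }
    # Assumption: fields left as None are omitted rather than sent as null.
    body.update({key: value for key, value in optional_fields.items() if value is not None})
    return body


example = sketch_post_batch_body(
    "local:///opt/jobs/etl.py",  # hypothetical path
    args=["--date", "2021-01-01"],
    executor_memory="2g",
    num_executors=4,
)
```

The resulting dict can be serialized with json.dumps before being sent as the payload of a POST request.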