airflow.providers.apache.livy.hooks.livy

This module contains the Apache Livy hook.

Module Contents

class airflow.providers.apache.livy.hooks.livy.BatchState[source]

Bases: enum.Enum

Batch session states

NOT_STARTED = not_started[source]
STARTING = starting[source]
RUNNING = running[source]
IDLE = idle[source]
BUSY = busy[source]
SHUTTING_DOWN = shutting_down[source]
ERROR = error[source]
DEAD = dead[source]
KILLED = killed[source]
SUCCESS = success[source]
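
As a rough illustration, the members listed above behave like an ordinary string-valued enum. The class below is a local stand-in written for this example, not the provider's own definition, and it assumes each value is the plain lowercase string shown in the listing.

```python
from enum import Enum


class BatchState(Enum):
    """Local stand-in that mirrors the members listed above (illustrative only)."""

    NOT_STARTED = "not_started"
    STARTING = "starting"
    RUNNING = "running"
    IDLE = "idle"
    BUSY = "busy"
    SHUTTING_DOWN = "shutting_down"
    ERROR = "error"
    DEAD = "dead"
    KILLED = "killed"
    SUCCESS = "success"


# A state string, such as one read from a Livy response, can be turned
# into a member by value lookup.
state = BatchState("running")
print(state.name, state.value)
```

An unknown string raises ValueError on lookup, which is one way to notice a state that the enum does not cover.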
class airflow.providers.apache.livy.hooks.livy.LivyHook(livy_conn_id: str = default_conn_name, extra_options: Optional[Dict[str, Any]] = None)[source]

Bases: airflow.providers.http.hooks.http.HttpHook, airflow.utils.log.logging_mixin.LoggingMixin

Hook for Apache Livy through the REST API.

Parameters

livy_conn_id (str) -- reference to a pre-defined Livy Connection.

See also

For more details refer to the Apache Livy API reference: https://livy.apache.org/docs/latest/rest-api.html

TERMINAL_STATES[source]
_def_headers[source]
conn_name_attr = livy_conn_id[source]
default_conn_name = livy_default[source]
conn_type = livy[source]
hook_name = Apache Livy[source]
get_conn(self, headers: Optional[Dict[str, Any]] = None)[source]

Returns http session for use with requests

Parameters

headers (dict) -- additional headers to be passed through as a dictionary

Returns

requests session

Return type

requests.Session

run_method(self, endpoint: str, method: str = 'GET', data: Optional[Any] = None, headers: Optional[Dict[str, Any]] = None)[source]

Wrapper around HttpHook that allows the HTTP method to be changed on the same hook instance.

Parameters
  • method (str) -- http method

  • endpoint (str) -- endpoint

  • data (dict) -- request payload

  • headers (dict) -- headers

Returns

http response

Return type

requests.Response

post_batch(self, *args, **kwargs)[source]

Perform request to submit batch

Returns

batch session id

Return type

int

get_batch(self, session_id: Union[int, str])[source]

Fetch info about the specified batch

Parameters

session_id (Union[int, str]) -- identifier of the batch session

Returns

response body

Return type

dict

get_batch_state(self, session_id: Union[int, str])[source]

Fetch the state of the specified batch

Parameters

session_id (Union[int, str]) -- identifier of the batch session

Returns

batch state

Return type

BatchState
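
A common use of get_batch_state is to poll until the batch finishes. The sketch below shows that loop against a fake state source so it runs on its own. The names wait_for_batch and ASSUMED_TERMINAL_STATES are hypothetical, and the set of terminal states is an assumption, because this page does not list the contents of TERMINAL_STATES.

```python
import time
from enum import Enum
from typing import Callable, Union


class BatchState(Enum):
    """Trimmed local stand-in for the enum documented above."""

    RUNNING = "running"
    ERROR = "error"
    DEAD = "dead"
    KILLED = "killed"
    SUCCESS = "success"


# Assumption: these are the states after which a batch no longer changes.
ASSUMED_TERMINAL_STATES = {
    BatchState.SUCCESS,
    BatchState.DEAD,
    BatchState.KILLED,
    BatchState.ERROR,
}


def wait_for_batch(
    get_state: Callable[[Union[int, str]], BatchState],
    session_id: Union[int, str],
    poll_interval: float = 1.0,
    max_polls: int = 60,
) -> BatchState:
    """Call get_state repeatedly until it returns an assumed terminal state."""
    for _ in range(max_polls):
        state = get_state(session_id)
        if state in ASSUMED_TERMINAL_STATES:
            return state
        time.sleep(poll_interval)
    raise TimeoutError(f"batch {session_id} did not finish after {max_polls} polls")


# Fake state source standing in for a real hook's get_batch_state method.
_fake_states = iter([BatchState.RUNNING, BatchState.RUNNING, BatchState.SUCCESS])
final_state = wait_for_batch(lambda _sid: next(_fake_states), session_id=7, poll_interval=0.0)
print(final_state)
```

With a real hook, the get_state argument would be the hook's get_batch_state method, and the hook's own TERMINAL_STATES attribute would presumably replace the assumed set.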

delete_batch(self, session_id: Union[int, str])[source]

Delete the specified batch

Parameters

session_id (Union[int, str]) -- identifier of the batch session

Returns

response body

Return type

dict

static _validate_session_id(session_id: Union[int, str])[source]

Validate that the session id is an int

Parameters

session_id (Union[int, str]) -- session id
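
One way such a check could work is sketched below. The function name check_session_id is hypothetical, and raising TypeError on failure is an assumption, since this page does not say how an invalid id is reported.

```python
from typing import Union


def check_session_id(session_id: Union[int, str]) -> None:
    """Accept ints and int-like strings; reject everything else (illustrative)."""
    try:
        int(session_id)
    except (TypeError, ValueError):
        # Assumption: an invalid id is signalled with TypeError.
        raise TypeError("'session_id' must be an integer") from None


check_session_id(12)    # passes
check_session_id("12")  # passes, the string converts cleanly
```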

static _parse_post_response(response: Dict[Any, Any])[source]

Parse batch response for batch id

Parameters

response (dict) -- response body

Returns

session id

Return type

int

static build_post_batch_body(file: str, args: Optional[Sequence[Union[str, int, float]]] = None, class_name: Optional[str] = None, jars: Optional[List[str]] = None, py_files: Optional[List[str]] = None, files: Optional[List[str]] = None, archives: Optional[List[str]] = None, name: Optional[str] = None, driver_memory: Optional[str] = None, driver_cores: Optional[Union[int, str]] = None, executor_memory: Optional[str] = None, executor_cores: Optional[int] = None, num_executors: Optional[Union[int, str]] = None, queue: Optional[str] = None, proxy_user: Optional[str] = None, conf: Optional[Dict[Any, Any]] = None)[source]

Build the post batch request body. For more information about the format, refer to the Apache Livy REST API reference: https://livy.apache.org/docs/latest/rest-api.html

Parameters
  • file (str) -- Path of the file containing the application to execute (required).

  • proxy_user (str) -- User to impersonate when running the job.

  • class_name (str) -- Application Java/Spark main class.

  • args (Sequence[Union[str, int, float]]) -- Command line arguments for the application.

  • jars (Sequence[str]) -- Jars to be used in this session.

  • py_files (Sequence[str]) -- Python files to be used in this session.

  • files (Sequence[str]) -- Files to be used in this session.

  • driver_memory (str) -- Amount of memory to use for the driver process.

  • driver_cores (Union[str, int]) -- Number of cores to use for the driver process.

  • executor_memory (str) -- Amount of memory to use per executor process.

  • executor_cores (Union[int, str]) -- Number of cores to use for each executor.

  • num_executors (Union[str, int]) -- Number of executors to launch for this session.

  • archives (Sequence[str]) -- Archives to be used in this session.

  • queue (str) -- The name of the YARN queue to which the job is submitted.

  • name (str) -- The name of this session.

  • conf (dict) -- Spark configuration properties.

Returns

request body

Return type

dict
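
To make the shape of the request body concrete, the sketch below maps the Python parameter names above to the camelCase field names used by the Livy REST API for POST /batches. The function sketch_post_batch_body and the table _FIELD_NAMES are hypothetical and written for this example. They are not the hook's implementation, and they skip the validation the hook may perform.

```python
from typing import Any, Dict

# Python parameter name -> Livy REST field name.
_FIELD_NAMES = {
    "proxy_user": "proxyUser",
    "class_name": "className",
    "args": "args",
    "jars": "jars",
    "py_files": "pyFiles",
    "files": "files",
    "driver_memory": "driverMemory",
    "driver_cores": "driverCores",
    "executor_memory": "executorMemory",
    "executor_cores": "executorCores",
    "num_executors": "numExecutors",
    "archives": "archives",
    "queue": "queue",
    "name": "name",
    "conf": "conf",
}


def sketch_post_batch_body(file: str, **options: Any) -> Dict[str, Any]:
    """Build a batch request body, leaving out options that are None."""
    body: Dict[str, Any] = {"file": file}
    for param, value in options.items():
        if param not in _FIELD_NAMES:
            raise TypeError(f"unexpected option: {param}")
        if value is not None:
            body[_FIELD_NAMES[param]] = value
    return body


# The file path and option values here are made up for the example.
body = sketch_post_batch_body(
    "local:///opt/app.py",
    py_files=["local:///opt/deps.zip"],
    executor_memory="2g",
    num_executors=4,
    queue=None,  # omitted from the body because it is None
)
print(body)
```

Leaving out unset options keeps the payload small and lets the Livy server apply its own defaults.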

static _validate_size_format(size: str)[source]

Validate size format.

Parameters

size (str) -- size value

Returns

true if valid format

Return type

bool
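
This page does not spell out which size strings are accepted. As a hedged sketch, the check below assumes JVM-style memory strings such as 512m or 2g: digits followed by a unit letter and an optional trailing "b". The name looks_like_size and the exact pattern are assumptions.

```python
import re

# Assumed pattern: one or more digits, a unit in k/m/g/t, optional "b".
_SIZE_PATTERN = re.compile(r"^\d+[kmgt]b?$", re.IGNORECASE)


def looks_like_size(size: str) -> bool:
    """Return True when size matches the assumed memory-size pattern."""
    return isinstance(size, str) and _SIZE_PATTERN.match(size) is not None


for candidate in ("2g", "512MB", "2 gigs", "g2"):
    print(candidate, looks_like_size(candidate))
```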

static _validate_list_of_stringables(vals: Sequence[Union[str, int, float]])[source]

Check the values in the provided list can be converted to strings.

Parameters

vals (Sequence[Union[str, int, float]]) -- list to validate

Returns

true if valid

Return type

bool
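
A check of this kind might look like the sketch below. The name check_stringables is hypothetical, and raising ValueError for invalid input is an assumption, since this page only documents the True return value.

```python
from typing import Sequence, Union


def check_stringables(vals: Sequence[Union[str, int, float]]) -> bool:
    """Return True when vals is a list or tuple of str, int, or float values."""
    if not isinstance(vals, (list, tuple)) or not all(
        isinstance(val, (str, int, float)) for val in vals
    ):
        # Assumption: invalid input is reported with ValueError.
        raise ValueError("expected a list or tuple of str, int, or float values")
    return True


print(check_stringables(["--date", 2024, 0.5]))
```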

static _validate_extra_conf(conf: Dict[Any, Any])[source]

Check configuration values are either strings or ints.

Parameters

conf (dict) -- configuration variable

Returns

true if valid

Return type

bool
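
The description above can be sketched as follows. The name check_extra_conf is hypothetical, and raising ValueError for a bad value is an assumption. Note that in this sketch bool values pass, because bool is a subclass of int in Python.

```python
from typing import Any, Dict


def check_extra_conf(conf: Dict[Any, Any]) -> bool:
    """Return True when conf is a dict whose values are all str or int."""
    if not isinstance(conf, dict):
        raise ValueError("'conf' must be a dict")
    for key, value in conf.items():
        if not isinstance(value, (str, int)):
            # Assumption: invalid values are reported with ValueError.
            raise ValueError(f"'conf' value for {key!r} must be a str or an int")
    return True


print(check_extra_conf({"spark.executor.instances": 4, "spark.app.name": "demo"}))
```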
