This module contains an operator to move data from Hive to DynamoDB.

Module Contents



class*, sql: str, table_name: str, table_keys: list, pre_process: Optional[Callable] = None, pre_process_args: Optional[list] = None, pre_process_kwargs: Optional[list] = None, region_name: Optional[str] = None, schema: str = 'default', hiveserver2_conn_id: str = 'hiveserver2_default', aws_conn_id: str = 'aws_default', **kwargs)

Bases: airflow.models.BaseOperator

Moves data from Hive to DynamoDB. Note that, for now, the data is loaded into memory before being pushed to DynamoDB, so this operator should only be used for small amounts of data.

  • sql (str) -- SQL query to execute against the Hive database. (templated)

  • table_name (str) -- target DynamoDB table

  • table_keys (list) -- partition key and sort key

  • pre_process (function) -- callable that pre-processes the source data

  • pre_process_args (list) -- list of pre_process function arguments

  • pre_process_kwargs (dict) -- dict of pre_process function keyword arguments

  • region_name (str) -- AWS region name (example: us-east-1)

  • schema (str) -- Hive database schema

  • hiveserver2_conn_id (str) -- Reference to the Hive Server2 thrift service connection id.

  • aws_conn_id (str) -- AWS connection
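
The parameters above can be illustrated with a minimal sketch of a pre_process callable and a set of keyword arguments for the operator. Several things here are assumptions not stated on this page: that the operator invokes the callable with the keyword arguments data, args and kwargs; that data is a pandas DataFrame (only its to_json method is used); and that the callable should return a list of dicts, one per DynamoDB item. The function name add_source_tag, the table name, and the column names are hypothetical placeholders.

```python
import json
from typing import Any, Dict, List, Optional


def add_source_tag(
    data: Any,
    args: Optional[list] = None,
    kwargs: Optional[dict] = None,
) -> List[Dict[str, Any]]:
    """Hypothetical pre_process callable: tag every row with its source."""
    # Assumption: `data` is a pandas DataFrame, so to_json(orient="records")
    # yields a JSON array with one object per row.
    rows = json.loads(data.to_json(orient="records"))
    source = (kwargs or {}).get("source", "hive")
    # Return plain dicts, one per row, each with an extra "source" attribute.
    return [{**row, "source": source} for row in rows]


# Keyword arguments one might pass to the operator.
# Every value below is a placeholder, not taken from the documentation.
operator_kwargs = {
    "sql": "SELECT user_id, event_ts, score FROM events WHERE ds = '{{ ds }}'",
    "table_name": "user_events",
    "table_keys": ["user_id", "event_ts"],  # partition key, then sort key
    "pre_process": add_source_tag,
    "pre_process_kwargs": {"source": "hive_events"},
    "region_name": "us-east-1",
    "schema": "default",
    "hiveserver2_conn_id": "hiveserver2_default",
    "aws_conn_id": "aws_default",
}
```

Because sql is a templated field, the {{ ds }} placeholder would be rendered by Airflow before the query runs. The dict would be unpacked into the operator's constructor together with a task_id.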

template_fields :Sequence[str] = ['sql']
template_ext :Sequence[str] = ['.sql']
ui_color = #a0e08c
execute(self, context: airflow.utils.context.Context)

This is the main method to derive when creating an operator. Context is the same dictionary that is used when rendering Jinja templates.

Refer to get_template_context for more context.
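
For this operator, the class description implies that execute reads the query result into memory, optionally pre-processes it, and then pushes it to DynamoDB. The following is a simplified sketch of that flow, not the operator's source code. The names transfer, fetch_rows and write_batch are hypothetical stand-ins for the Hive query and the DynamoDB batch write, and the data/args/kwargs calling convention for pre_process is an assumption.

```python
from typing import Any, Callable, Dict, List, Optional

Row = Dict[str, Any]


def transfer(
    fetch_rows: Callable[[str], List[Row]],
    write_batch: Callable[[List[Row]], Any],
    sql: str,
    pre_process: Optional[Callable[..., List[Row]]] = None,
    pre_process_args: Optional[list] = None,
    pre_process_kwargs: Optional[dict] = None,
) -> int:
    """Read, optionally pre-process, then write; returns the item count."""
    # Step 1: the whole query result is held in memory at once,
    # which is why the operator suits only small result sets.
    rows = fetch_rows(sql)
    # Step 2: optionally reshape the rows before writing.
    if pre_process is not None:
        rows = pre_process(
            data=rows, args=pre_process_args, kwargs=pre_process_kwargs
        )
    # Step 3: push every item to the target table.
    write_batch(rows)
    return len(rows)


# Usage with in-memory stand-ins instead of real Hive and DynamoDB clients.
written: List[Row] = []
count = transfer(
    fetch_rows=lambda sql: [{"id": 1}, {"id": 2}],
    write_batch=written.extend,
    sql="SELECT id FROM t",
)
```

Injecting the reader and writer as callables keeps the sketch runnable without any Airflow, Hive or AWS dependency.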
