airflow.providers.amazon.aws.transfers.hive_to_dynamodb

This module contains an operator to move data from Hive to DynamoDB.

Module Contents

class airflow.providers.amazon.aws.transfers.hive_to_dynamodb.HiveToDynamoDBOperator(*, sql: str, table_name: str, table_keys: list, pre_process: Optional[Callable] = None, pre_process_args: Optional[list] = None, pre_process_kwargs: Optional[list] = None, region_name: Optional[str] = None, schema: str = 'default', hiveserver2_conn_id: str = 'hiveserver2_default', aws_conn_id: str = 'aws_default', **kwargs)

Bases: airflow.models.BaseOperator

Moves data from Hive to DynamoDB. Note that, for now, the data is loaded into memory before being pushed to DynamoDB, so this operator should only be used for small amounts of data.

Parameters
  • sql (str) -- SQL query to execute against the Hive database. (templated)

  • table_name (str) -- target DynamoDB table

  • table_keys (list) -- partition key and sort key

  • pre_process (function) -- function that pre-processes the source data before it is pushed to DynamoDB

  • pre_process_args (list) -- list of pre_process function arguments

  • pre_process_kwargs (dict) -- dict of pre_process function arguments

  • region_name (str) -- AWS region name (example: us-east-1)

  • schema (str) -- Hive database schema

  • hiveserver2_conn_id (str) -- Reference to the Hive Server2 thrift service connection id.

  • aws_conn_id (str) -- AWS connection id
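
The pre_process, pre_process_args, and pre_process_kwargs parameters can be illustrated with a minimal sketch. It assumes, based on a recollection of the provider source rather than anything on this page, that the operator calls pre_process(data=..., args=..., kwargs=...) with a pandas DataFrame and expects a list of dicts back. The function name drop_columns_and_nulls and the "drop_columns" option are hypothetical, chosen only for illustration.

```python
import json


def drop_columns_and_nulls(data, args=None, kwargs=None):
    """Hypothetical pre_process callable.

    Assumption: `data` is a pandas DataFrame (any object exposing
    `to_json(orient="records")` works here), and the operator forwards
    pre_process_args / pre_process_kwargs as `args` / `kwargs`.
    """
    kwargs = kwargs or {}
    # "drop_columns" is an illustrative option, not part of the operator's API.
    drop = set(kwargs.get("drop_columns", []))

    # Convert the rows into plain JSON-compatible dicts, one per row.
    records = json.loads(data.to_json(orient="records"))

    # Remove unwanted columns and attributes whose value is null.
    return [
        {key: value for key, value in record.items()
         if key not in drop and value is not None}
        for record in records
    ]


# Wiring it into the operator (not executed here; the task_id, query,
# table, and key names below are hypothetical):
#
# HiveToDynamoDBOperator(
#     task_id="hive_to_dynamodb",
#     sql="SELECT id, name, note FROM users",
#     table_name="users",
#     table_keys=["id"],
#     pre_process=drop_columns_and_nulls,
#     pre_process_kwargs={"drop_columns": ["note"]},
# )
```

Because the whole result set is held in memory, a pre_process function like this one also materializes every row at once, so it is only suitable for small result sets.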

template_fields = ['sql']
template_ext = ['.sql']
ui_color = #a0e08c
execute(self, context)
