airflow.providers.amazon.aws.transfers.hive_to_dynamodb

This module contains an operator that moves data from Hive to DynamoDB.

Module Contents

Classes

HiveToDynamoDBOperator

Moves data from Hive to DynamoDB.

class airflow.providers.amazon.aws.transfers.hive_to_dynamodb.HiveToDynamoDBOperator(*, sql, table_name, table_keys, pre_process=None, pre_process_args=None, pre_process_kwargs=None, region_name=None, schema='default', hiveserver2_conn_id='hiveserver2_default', aws_conn_id='aws_default', **kwargs)

Bases: airflow.models.BaseOperator

Moves data from Hive to DynamoDB.

Note that the data is currently loaded into memory before being pushed to DynamoDB, so this operator should only be used for small amounts of data.

See also

For more information on how to use this operator, take a look at the guide: Apache Hive to Amazon DynamoDB transfer operator
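As a quick orientation, the constructor signature above can be exercised as in the following minimal sketch. The task id, SQL query, table name, and key name are placeholders chosen for illustration, not values from this documentation, and the import is deferred into a helper so the snippet can be read without the provider packages installed.

```python
# Keyword arguments for HiveToDynamoDBOperator.
# Every value below is a hypothetical placeholder; adjust to your environment.
operator_kwargs = {
    "task_id": "hive_to_dynamodb",         # standard BaseOperator argument
    "sql": "SELECT id, name FROM users",   # templated; a .sql file path also works
    "table_name": "users",                 # target DynamoDB table
    "table_keys": ["id"],                  # partition key (and optional sort key)
    "region_name": "us-east-1",
    "schema": "default",
    "hiveserver2_conn_id": "hiveserver2_default",
    "aws_conn_id": "aws_default",
}


def build_task():
    """Create the operator; requires the Amazon and Apache Hive providers."""
    # Imported lazily so the sketch stays importable without Airflow installed.
    from airflow.providers.amazon.aws.transfers.hive_to_dynamodb import (
        HiveToDynamoDBOperator,
    )

    return HiveToDynamoDBOperator(**operator_kwargs)
```

In a real DAG file, `build_task()` would typically be called inside a `with DAG(...)` block so the task is attached to that DAG.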

Parameters
  • sql (str) – SQL query to execute against the Hive database. (templated)

  • table_name (str) – target DynamoDB table

  • table_keys (list) – partition key and sort key

  • pre_process (Callable | None) – callable that pre-processes the source data before it is written to DynamoDB

  • pre_process_args (list | None) – list of pre_process function arguments

  • pre_process_kwargs (list | None) – dict of pre_process function arguments

  • region_name (str | None) – AWS region name (example: us-east-1)

  • schema (str) – Hive database schema

  • hiveserver2_conn_id (str) – Reference to the Hive Server2 thrift service connection id.

  • aws_conn_id (str) – AWS connection id
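The pre_process parameters can be illustrated with a small sketch. It assumes, based on a reading of the provider source rather than on this page, that the operator calls the function as `pre_process(data=..., args=pre_process_args, kwargs=pre_process_kwargs)`, that `data` is a pandas DataFrame, and that the return value is a list of dicts ready to be written as DynamoDB items. The function name `select_columns` and the `columns` key are hypothetical; verify the calling convention against your installed provider version.

```python
import json


def select_columns(data, args=None, kwargs=None):
    """Hypothetical pre_process callable: keep only selected columns.

    Assumes `data` exposes `to_json(orient="records")` (as a pandas
    DataFrame does) and that `kwargs` is the dict given to the operator
    as `pre_process_kwargs`.
    """
    kwargs = kwargs or {}
    # Convert the frame into a list of plain dicts, one per row.
    records = json.loads(data.to_json(orient="records"))
    keep = kwargs.get("columns")
    if keep is None:
        return records
    # Drop every field that is not in the requested column list.
    return [{k: v for k, v in row.items() if k in keep} for row in records]
```

Under these assumptions, the function would be wired in with `pre_process=select_columns` and `pre_process_kwargs={"columns": ["id", "name"]}`.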

template_fields: Sequence[str] = ('sql',)
template_ext: Sequence[str] = ('.sql',)
template_fields_renderers
ui_color = '#a0e08c'
execute(context)

This is the main method to derive when creating an operator.

Context is the same dictionary that is used when rendering Jinja templates.

Refer to get_template_context for more context.
