airflow.providers.amazon.aws.example_dags.example_hive_to_dynamodb

This DAG requires an Amazon EMR cluster running Apache Hive with the sample data already copied into it. Before running the DAG, complete steps 1-4 (inclusive) of this tutorial: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/EMRforDynamoDB.Tutorial.html

Module Contents

Functions

create_dynamodb_table()

get_dynamodb_item_count()

A DynamoDB table has an ItemCount value, but it is only updated every six hours.

delete_dynamodb_table()

configure_hive_connection()

Attributes

DYNAMODB_TABLE_NAME

HIVE_CONNECTION_ID

HIVE_HOSTNAME

DYNAMODB_TABLE_HASH_KEY

HIVE_SQL

doc_md

airflow.providers.amazon.aws.example_dags.example_hive_to_dynamodb.DYNAMODB_TABLE_NAME = 'example_hive_to_dynamodb_table'
airflow.providers.amazon.aws.example_dags.example_hive_to_dynamodb.HIVE_CONNECTION_ID
airflow.providers.amazon.aws.example_dags.example_hive_to_dynamodb.HIVE_HOSTNAME
airflow.providers.amazon.aws.example_dags.example_hive_to_dynamodb.DYNAMODB_TABLE_HASH_KEY = 'feature_id'
airflow.providers.amazon.aws.example_dags.example_hive_to_dynamodb.HIVE_SQL = 'SELECT feature_id, feature_name, feature_class, state_alpha FROM hive_features'
airflow.providers.amazon.aws.example_dags.example_hive_to_dynamodb.create_dynamodb_table()
airflow.providers.amazon.aws.example_dags.example_hive_to_dynamodb.get_dynamodb_item_count()

A DynamoDB table has an ItemCount value, but it is only updated every six hours. To verify this DAG worked, we will scan the table and count the items manually.
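The manual count described above can be sketched as a paginated Scan. This is a minimal illustration, not necessarily the module's own implementation: the helper name `count_items_by_scan` and the `FakeTable` stand-in are hypothetical, and the function assumes it receives an object that behaves like a boto3 DynamoDB `Table` resource (a `scan` method returning `Count` and, when more pages remain, `LastEvaluatedKey`).

```python
from typing import Any, Dict, List


def count_items_by_scan(table: Any) -> int:
    """Count items by scanning every page of a DynamoDB-style table.

    `table` is assumed to expose scan(**kwargs) like a boto3 Table resource.
    Select="COUNT" asks for item counts only, not the items themselves.
    """
    scan_kwargs: Dict[str, Any] = {"Select": "COUNT"}
    total = 0
    while True:
        response = table.scan(**scan_kwargs)
        total += response["Count"]
        last_key = response.get("LastEvaluatedKey")
        if last_key is None:
            # No further pages: the scan is complete.
            return total
        # Resume the scan where the previous page stopped.
        scan_kwargs["ExclusiveStartKey"] = last_key


class FakeTable:
    """Hypothetical offline stand-in that mimics paginated scan responses."""

    def __init__(self, page_counts: List[int]) -> None:
        self._page_counts = page_counts

    def scan(self, **kwargs: Any) -> Dict[str, Any]:
        index = kwargs.get("ExclusiveStartKey", {"page": 0})["page"]
        response: Dict[str, Any] = {"Count": self._page_counts[index]}
        if index + 1 < len(self._page_counts):
            response["LastEvaluatedKey"] = {"page": index + 1}
        return response


if __name__ == "__main__":
    # Three pages holding 3, 2, and 4 items respectively.
    print(count_items_by_scan(FakeTable([3, 2, 4])))
```

Against a real table, the same helper could be called with `boto3.resource("dynamodb").Table("example_hive_to_dynamodb_table")`, assuming boto3 is installed and AWS credentials are configured. A full Scan reads every item, so it suits a small example table better than a large production one.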

airflow.providers.amazon.aws.example_dags.example_hive_to_dynamodb.delete_dynamodb_table()
airflow.providers.amazon.aws.example_dags.example_hive_to_dynamodb.configure_hive_connection()
airflow.providers.amazon.aws.example_dags.example_hive_to_dynamodb.doc_md
