MongoDB to Amazon S3

Use the MongoToS3Operator transfer operator to copy data from a MongoDB collection into an Amazon Simple Storage Service (S3) file.

Prerequisite Tasks

To use these operators, you must install the Amazon and MongoDB provider packages and configure the corresponding Airflow connections (AWS credentials and a MongoDB connection).

Operators

MongoDB To Amazon S3 transfer operator

This operator copies a set of data from a MongoDB collection to an Amazon S3 file. To select the data you want to copy, use the mongo_query parameter.

For more information about this operator, see: MongoToS3Operator

Example usage:

airflow/providers/amazon/aws/example_dags/example_mongo_to_s3.py[source]

from airflow.providers.amazon.aws.transfers.mongo_to_s3 import MongoToS3Operator

# MONGO_COLLECTION, MONGO_DATABASE, S3_BUCKET and S3_KEY are defined elsewhere
# in the example DAG (typically read from environment variables).
create_mongo_to_s3_job = MongoToS3Operator(
    task_id="create_mongo_to_s3_job",
    mongo_collection=MONGO_COLLECTION,
    # Mongo query by matching values
    # Here returns all documents which have "OK" as value for the key "status"
    mongo_query={"status": "OK"},
    s3_bucket=S3_BUCKET,
    s3_key=S3_KEY,
    mongo_db=MONGO_DATABASE,
    replace=True,
)
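
For more involved selections, mongo_query can also be given as an aggregation pipeline (a list of stages) rather than a plain filter document; check the MongoToS3Operator reference for your provider version to confirm support. The sketch below illustrates that usage under this assumption. The collection, bucket, key and database names are hypothetical placeholders, not values from the example DAG.

from airflow.providers.amazon.aws.transfers.mongo_to_s3 import MongoToS3Operator

# A minimal sketch: passing an aggregation pipeline (a list of stages) as
# mongo_query instead of a filter document. All names below are hypothetical.
export_recent_ok_orders = MongoToS3Operator(
    task_id="export_recent_ok_orders",
    mongo_collection="orders",
    mongo_query=[
        {"$match": {"status": "OK"}},  # keep only documents with status "OK"
        {"$project": {"_id": 0, "order_id": 1, "total": 1}},  # trim the output fields
    ],
    s3_bucket="my-example-bucket",
    s3_key="exports/recent_ok_orders.json",
    mongo_db="shop",
    replace=True,
)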

Airflow uses PyMongo to communicate with MongoDB; you can find more information in the PyMongo documentation.
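
As a rough illustration of the query the operator issues, here is a minimal standalone PyMongo sketch that runs the same filter directly. The connection URI, database and collection names are assumptions for the example, not values used by Airflow.

from pymongo import MongoClient

# Standalone sketch of the same selection using PyMongo directly.
# The URI, database and collection names below are example assumptions.
client = MongoClient("mongodb://localhost:27017/")
collection = client["example_db"]["example_collection"]

# Equivalent of mongo_query={"status": "OK"}: return every document whose
# "status" field equals "OK".
for document in collection.find({"status": "OK"}):
    print(document)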
