MongoDB to Amazon S3

Use the MongoToS3Operator transfer to copy data from a MongoDB collection into an Amazon Simple Storage Service (S3) file.

Prerequisite Tasks

To use these operators, you must do a few things:

Operators

MongoDB To Amazon S3 transfer operator

This operator copies a set of documents from a MongoDB collection to a file in Amazon S3. Use the mongo_query parameter to select the documents you want to copy.

For more information about this operator, see MongoToS3Operator.

Example usage:

tests/system/providers/amazon/aws/example_mongo_to_s3.py

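from airflow.providers.amazon.aws.transfers.mongo_to_s3 import MongoToS3Operator

# mongo_collection, mongo_database, s3_bucket and s3_key are defined
# elsewhere in the example file.
mongo_to_s3_job = MongoToS3Operator(
    task_id="mongo_to_s3_job",
    mongo_collection=mongo_collection,
    # Mongo query that matches on values.
    # This one returns all documents whose "status" key has the value "OK".
    mongo_query={"status": "OK"},
    s3_bucket=s3_bucket,
    s3_key=s3_key,
    mongo_db=mongo_database,
    # Overwrite the S3 key if it already exists.
    replace=True,
)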
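
To make the effect of a value-matching mongo_query such as {"status": "OK"} more concrete, here is a minimal, standalone sketch in plain Python. It does not use Airflow, MongoDB, or S3, and it is not the operator's implementation. The sample documents and the helpers matches, select_documents, and to_json_lines are hypothetical names introduced only for illustration. The one-JSON-object-per-line layout of the payload is likewise an assumption about what the resulting file might look like, not something this page states.

```python
import json

# Hypothetical sample documents standing in for a MongoDB collection.
documents = [
    {"_id": 1, "status": "OK", "value": 10},
    {"_id": 2, "status": "FAILED", "value": 20},
    {"_id": 3, "status": "OK", "value": 30},
]


def matches(document: dict, query: dict) -> bool:
    """Return True if every key in the query equals the document's value.

    This only imitates simple equality matching. Real MongoDB queries
    also support operators such as $gt or $in, which are not handled here.
    """
    return all(
        key in document and document[key] == expected
        for key, expected in query.items()
    )


def select_documents(docs: list, query: dict) -> list:
    """Keep only the documents that satisfy the equality query."""
    return [doc for doc in docs if matches(doc, query)]


def to_json_lines(docs: list) -> str:
    """Serialize documents as one JSON object per line (assumed layout)."""
    return "\n".join(json.dumps(doc) for doc in docs)


# Same query shape as in the operator example: match on "status".
selected = select_documents(documents, {"status": "OK"})
payload = to_json_lines(selected)
print(payload)
```

In this sketch only the documents with _id 1 and 3 are selected, and an empty query ({}) selects every document, which mirrors how an empty filter behaves in MongoDB.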

For more information about PyMongo, the driver Airflow uses to communicate with MongoDB, see the PyMongo documentation.
