MongoDB to Amazon S3
Use the MongoToS3Operator transfer to copy data from a MongoDB collection into an Amazon Simple Storage Service
(S3) file.
Prerequisite Tasks
To use this operator, you must do a few things:
Create the necessary resources using the AWS Console or AWS CLI.
Install the API libraries via pip:
pip install 'apache-airflow[amazon]'
Detailed information is available in the Installation documentation.
Operators
MongoDB To Amazon S3 transfer operator
This operator copies a set of data from a MongoDB collection to an Amazon S3 file.
To select the data you want to copy, use the mongo_query parameter.
For more information about this operator, see:
MongoToS3Operator
Example usage:
create_mongo_to_s3_job = MongoToS3Operator(
    task_id="create_mongo_to_s3_job",
    mongo_collection=MONGO_COLLECTION,
    # Mongo query that matches by value:
    # selects every document whose "status" key has the value "OK"
    mongo_query={"status": "OK"},
    s3_bucket=S3_BUCKET,
    s3_key=S3_KEY,
    mongo_db=MONGO_DATABASE,
    replace=True,
)
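To make the effect of a value-matching mongo_query concrete, the following is a minimal sketch in plain Python. It does not use Airflow or PyMongo and is not the operator's implementation. The sample documents, the matches helper, and the select_and_serialize helper are all hypothetical names introduced for illustration, and the one-JSON-document-per-line output layout is an assumption about what ends up in the S3 file.

```python
import json

# Hypothetical sample documents standing in for a MongoDB collection.
documents = [
    {"_id": 1, "status": "OK", "value": 10},
    {"_id": 2, "status": "FAILED", "value": 20},
    {"_id": 3, "status": "OK", "value": 30},
]


def matches(document: dict, query: dict) -> bool:
    """Return True if every key in the query equals the document's value.

    Covers only simple equality matching; MongoDB operators such as
    "$gt" or "$in" are deliberately not handled in this sketch.
    """
    return all(
        key in document and document[key] == expected
        for key, expected in query.items()
    )


def select_and_serialize(docs: list, query: dict) -> str:
    """Filter documents by the query and join them as one JSON object per line."""
    selected = [doc for doc in docs if matches(doc, query)]
    return "\n".join(json.dumps(doc) for doc in selected)


# Same query shape as in the operator example: keep documents with status "OK".
payload = select_and_serialize(documents, {"status": "OK"})
print(payload)
```

Only the documents with "status" equal to "OK" survive the filter. In the real operator the filtering is done by MongoDB itself, so richer query operators are available there.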
For more information about PyMongo, which Airflow uses to communicate with MongoDB,
see the PyMongo documentation.
