airflow.providers.mongo.hooks.mongo
¶
Hook for Mongo DB.
Module Contents¶
Classes¶
Interact with Mongo. This hook uses the Mongo conn_id. |
- class airflow.providers.mongo.hooks.mongo.MongoHook(conn_id=default_conn_name, *args, **kwargs)[source]¶
Bases:
airflow.hooks.base.BaseHook
Interact with Mongo. This hook uses the Mongo conn_id.
PyMongo Wrapper to Interact With Mongo Database Mongo Connection Documentation https://docs.mongodb.com/manual/reference/connection-string/index.html You can specify connection string options in extra field of your connection https://docs.mongodb.com/manual/reference/connection-string/index.html#connection-string-options
If you want use DNS seedlist, set srv to True.
- ex.
{“srv”: true, “replicaSet”: “test”, “ssl”: true, “connectTimeoutMS”: 30000}
- Parameters
mongo_conn_id – The Mongo connection id to use when connecting to MongoDB.
- get_collection(mongo_collection, mongo_db=None)[source]¶
Fetches a mongo collection object for querying.
Uses connection schema as DB unless specified.
- aggregate(mongo_collection, aggregate_query, mongo_db=None, **kwargs)[source]¶
Runs an aggregation pipeline and returns the results.
https://pymongo.readthedocs.io/en/stable/api/pymongo/collection.html#pymongo.collection.Collection.aggregate https://pymongo.readthedocs.io/en/stable/examples/aggregation.html
- find(mongo_collection: str, query: dict, find_one: typing_extensions.Literal[False], mongo_db: str | None = None, projection: list | dict | None = None, **kwargs) pymongo.cursor.Cursor [source]¶
- find(mongo_collection: str, query: dict, find_one: typing_extensions.Literal[True], mongo_db: str | None = None, projection: list | dict | None = None, **kwargs) Any | None
Runs a mongo find query and returns the results.
- insert_one(mongo_collection, doc, mongo_db=None, **kwargs)[source]¶
Inserts a single document into a mongo collection.
- insert_many(mongo_collection, docs, mongo_db=None, **kwargs)[source]¶
Inserts many docs into a mongo collection.
- update_one(mongo_collection, filter_doc, update_doc, mongo_db=None, **kwargs)[source]¶
Updates a single document in a mongo collection.
- Parameters
mongo_collection (str) – The name of the collection to update.
filter_doc (dict) – A query that matches the documents to update.
update_doc (dict) – The modifications to apply.
mongo_db (str | None) – The name of the database to use. Can be omitted; then the database from the connection string is used.
- update_many(mongo_collection, filter_doc, update_doc, mongo_db=None, **kwargs)[source]¶
Updates one or more documents in a mongo collection.
- Parameters
mongo_collection (str) – The name of the collection to update.
filter_doc (dict) – A query that matches the documents to update.
update_doc (dict) – The modifications to apply.
mongo_db (str | None) – The name of the database to use. Can be omitted; then the database from the connection string is used.
- replace_one(mongo_collection, doc, filter_doc=None, mongo_db=None, **kwargs)[source]¶
Replaces a single document in a mongo collection.
Note
If no
filter_doc
is given, it is assumed that the replacement document contain the_id
field which is then used as filters.- Parameters
mongo_collection (str) – The name of the collection to update.
doc (dict) – The new document.
filter_doc (dict | None) – A query that matches the documents to replace. Can be omitted; then the _id field from doc will be used.
mongo_db (str | None) – The name of the database to use. Can be omitted; then the database from the connection string is used.
- replace_many(mongo_collection, docs, filter_docs=None, mongo_db=None, upsert=False, collation=None, **kwargs)[source]¶
Replaces many documents in a mongo collection.
Uses bulk_write with multiple ReplaceOne operations https://pymongo.readthedocs.io/en/stable/api/pymongo/collection.html#pymongo.collection.Collection.bulk_write
Note
If no
filter_docs``are given, it is assumed that all replacement documents contain the ``_id
field which are then used as filters.- Parameters
mongo_collection (str) – The name of the collection to update.
filter_docs (list[dict] | None) – A list of queries that match the documents to replace. Can be omitted; then the _id fields from airflow.docs will be used.
mongo_db (str | None) – The name of the database to use. Can be omitted; then the database from the connection string is used.
upsert (bool) – If
True
, perform an insert if no documents match the filters for the replace operation.collation (pymongo.collation.Collation | None) – An instance of
Collation
. This option is only supported on MongoDB 3.4 and above.
- delete_one(mongo_collection, filter_doc, mongo_db=None, **kwargs)[source]¶
Deletes a single document in a mongo collection.
- delete_many(mongo_collection, filter_doc, mongo_db=None, **kwargs)[source]¶
Deletes one or more documents in a mongo collection.
- distinct(mongo_collection, distinct_key, filter_doc=None, mongo_db=None, **kwargs)[source]¶
Returns a list of distinct values for the given key across a collection.
- Parameters
mongo_collection (str) – The name of the collection to perform distinct on.
distinct_key (str) – The field to return distinct values from.
filter_doc (dict | None) – A query that matches the documents get distinct values from. Can be omitted; then will cover the entire collection.
mongo_db (str | None) – The name of the database to use. Can be omitted; then the database from the connection string is used.