airflow.providers.pinecone.operators.pinecone
Module Contents
Classes
PineconeIngestOperator | Ingest vector embeddings into Pinecone.
CreatePodIndexOperator | Create a pod-based index in Pinecone.
CreateServerlessIndexOperator | Create a serverless index in Pinecone.
- class airflow.providers.pinecone.operators.pinecone.PineconeIngestOperator(*, conn_id=PineconeHook.default_conn_name, index_name, input_vectors, namespace='', batch_size=None, upsert_kwargs=None, **kwargs)
Bases: airflow.models.BaseOperator
Ingest vector embeddings into Pinecone.
See also
For more information on how to use this operator, take a look at the guide: Ingest data into a pinecone index
- Parameters
conn_id (str) – The connection id to use when connecting to Pinecone.
index_name (str) – Name of the Pinecone index.
input_vectors (list[pinecone.Vector] | list[tuple] | list[dict]) – Data to be ingested, in the form of a list of vectors, list of tuples, or list of dictionaries.
namespace (str) – The namespace to write to. If not specified, the default namespace is used.
batch_size (int | None) – The number of vectors to upsert in each batch.
upsert_kwargs (dict | None) – Additional keyword arguments passed through to the Pinecone upsert call.
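The parameters above can be sketched as follows. This is a minimal illustration, not the provider's own example: the task id, index name, namespace, and sample vectors are hypothetical, and `split_into_batches` only illustrates what `batch_size` means conceptually (it is not how the provider or the Pinecone client implements batching). The operator import is placed inside the function so that the rest of the snippet runs even where the provider package is not installed.

```python
from __future__ import annotations

# Hypothetical sample data. Pinecone accepts (id, values) tuples or
# dicts with "id" and "values" keys (plus an optional "metadata" key).
vectors_as_tuples = [
    ("doc-1", [0.1, 0.2, 0.3]),
    ("doc-2", [0.4, 0.5, 0.6]),
    ("doc-3", [0.7, 0.8, 0.9]),
]
vectors_as_dicts = [{"id": vid, "values": values} for vid, values in vectors_as_tuples]


def split_into_batches(vectors: list, batch_size: int | None) -> list[list]:
    """Illustrate the meaning of batch_size; NOT the provider's implementation."""
    if batch_size is None:
        return [vectors]
    return [vectors[i : i + batch_size] for i in range(0, len(vectors), batch_size)]


def build_ingest_task():
    """Build the operator; requires apache-airflow-providers-pinecone."""
    from airflow.providers.pinecone.operators.pinecone import PineconeIngestOperator

    return PineconeIngestOperator(
        task_id="pinecone_ingest",  # hypothetical task id
        index_name="example-index",  # hypothetical index name
        input_vectors=vectors_as_dicts,
        namespace="example-namespace",  # hypothetical namespace
        batch_size=2,
    )
```

With `batch_size=2`, the three sample vectors would conceptually be sent as one batch of two followed by one batch of one.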
- class airflow.providers.pinecone.operators.pinecone.CreatePodIndexOperator(*, conn_id=PineconeHook.default_conn_name, index_name, dimension, environment=None, replicas=None, shards=None, pods=None, pod_type='p1.x1', metadata_config=None, source_collection=None, metric='cosine', timeout=None, **kwargs)
Bases: airflow.models.BaseOperator
Create a pod-based index in Pinecone.
See also
For more information on how to use this operator, take a look at the guide: Create a Pod based Index
- Parameters
conn_id (str) – The connection id to use when connecting to Pinecone.
index_name (str) – Name of the Pinecone index.
dimension (int) – The dimension of the vectors to be indexed.
environment (str | None) – The environment to use when creating the index.
replicas (int | None) – The number of replicas to use.
shards (int | None) – The number of shards to use.
pods (int | None) – The number of pods to use.
pod_type (str) – The type of pod to use. Defaults to p1.x1.
metadata_config (dict | None) – The metadata configuration to use.
source_collection (str | None) – The source collection to use.
metric (str) – The similarity metric to use, for example cosine, euclidean, or dotproduct. Defaults to cosine.
timeout (int | None) – The timeout, in seconds, to wait for the index to be created.
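A minimal sketch of how these parameters might be passed to the operator. Only the parameter names come from the signature above; every value (index name, environment, dimension, timeout, task id) is a hypothetical placeholder to be replaced with the settings of your own Pinecone project. The import sits inside the function so the keyword arguments can be inspected without the provider installed.

```python
# Hypothetical values; only the parameter names come from the signature above.
POD_INDEX_KWARGS = {
    "index_name": "example-pod-index",
    "dimension": 128,
    "environment": "gcp-starter",  # placeholder environment name
    "replicas": 1,
    "shards": 1,
    "pods": 1,
    "pod_type": "p1.x1",  # the documented default
    "metric": "cosine",  # the documented default
    "timeout": 300,
}


def build_create_pod_index_task():
    """Build the operator; requires apache-airflow-providers-pinecone."""
    from airflow.providers.pinecone.operators.pinecone import CreatePodIndexOperator

    # task_id is a hypothetical name for this task within a DAG.
    return CreatePodIndexOperator(task_id="create_pod_index", **POD_INDEX_KWARGS)
```

Keeping the settings in a plain dictionary is just one design choice; it makes the configuration easy to review or reuse across DAGs.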
- class airflow.providers.pinecone.operators.pinecone.CreateServerlessIndexOperator(*, conn_id=PineconeHook.default_conn_name, index_name, dimension, cloud, region=None, metric=None, timeout=None, **kwargs)
Bases: airflow.models.BaseOperator
Create a serverless index in Pinecone.
See also
For more information on how to use this operator, take a look at the guide: Create a Serverless Index
- Parameters
conn_id (str) – The connection id to use when connecting to Pinecone.
index_name (str) – Name of the Pinecone index.
dimension (int) – The dimension of the vectors to be indexed.
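cloud (str) – The cloud provider to use when creating the index, for example aws.
region (str | None) – The region to use when creating the index.
metric (str | None) – The similarity metric to use, for example cosine, euclidean, or dotproduct.
timeout (int | None) – The timeout, in seconds, to wait for the index to be created.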
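A minimal sketch of a serverless index definition, under stated assumptions: the index name, cloud, region, and task id are hypothetical placeholders, and `vectors_match_dimension` is an illustrative helper (not part of the provider) that checks that vectors intended for the index have the same length as the index `dimension`.

```python
# Hypothetical values; only the parameter names come from the signature above.
SERVERLESS_INDEX_KWARGS = {
    "index_name": "example-serverless-index",
    "dimension": 3,
    "cloud": "aws",  # placeholder cloud provider
    "region": "us-west-2",  # placeholder region
    "metric": "cosine",
}


def vectors_match_dimension(vectors, dimension):
    """Illustrative helper: True if every (id, values) pair has `dimension` values."""
    return all(len(values) == dimension for _id, values in vectors)


def build_create_serverless_index_task():
    """Build the operator; requires apache-airflow-providers-pinecone."""
    from airflow.providers.pinecone.operators.pinecone import (
        CreateServerlessIndexOperator,
    )

    # task_id is a hypothetical name for this task within a DAG.
    return CreateServerlessIndexOperator(
        task_id="create_serverless_index", **SERVERLESS_INDEX_KWARGS
    )
```

A check like `vectors_match_dimension(sample_vectors, SERVERLESS_INDEX_KWARGS["dimension"])` can be run before ingestion to catch a mismatch between the embedding size and the index dimension early.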