`langchain.embeddings.cache`.CacheBackedEmbeddings

class langchain.embeddings.cache.CacheBackedEmbeddings(underlying_embeddings: Embeddings, document_embedding_store: BaseStore[str, List[float]], *, batch_size: Optional[int] = None)

Interface for caching results from embedding models.

The interface works with any store that implements the abstract store interface, accepting keys of type str and values that are lists of floats.

If need be, the interface can be extended to accept other implementations of the value serializer and deserializer, as well as the key encoder.

Examples

Initialize the embedder.

Parameters

underlying_embeddings (Embeddings) – The embedder to use for computing embeddings.
document_embedding_store (BaseStore[str, List[float]]) – The store to use for caching document embeddings.
batch_size (Optional[int]) – The number of documents to embed between store updates.

Methods

`__init__`(underlying_embeddings, ...[, ...])	Initialize the embedder.
`aembed_documents`(texts)	Embed a list of texts.
`aembed_query`(text)	Embed query text.
`embed_documents`(texts)	Embed a list of texts.
`embed_query`(text)	Embed query text.
`from_bytes_store`(underlying_embeddings, ...)	On-ramp that adds the necessary serialization and encoding to the store.

__init__(underlying_embeddings: Embeddings, document_embedding_store: BaseStore[str, List[float]], *, batch_size: Optional[int] = None) → None

Initialize the embedder.

Parameters

underlying_embeddings (Embeddings) – The embedder to use for computing embeddings.
document_embedding_store (BaseStore[str, List[float]]) – The store to use for caching document embeddings.
batch_size (Optional[int]) – The number of documents to embed between store updates.

Return type

None
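
The effect of batch_size can be illustrated with a small, library-independent sketch. It is not the class's actual implementation: `batched`, `embed_in_batches`, the plain dict used as a store, and the toy `embed_fn` are all hypothetical names introduced here, and the sketch assumes that a batch_size of None means "embed everything in one batch".

```python
from typing import Callable, Dict, Iterator, List, Optional


def batched(items: List[str], size: Optional[int]) -> Iterator[List[str]]:
    """Yield consecutive chunks of `items`; a size of None yields one chunk."""
    if not items:
        return
    step = size if size is not None else len(items)
    for start in range(0, len(items), step):
        yield items[start:start + step]


def embed_in_batches(
    texts: List[str],
    embed_fn: Callable[[List[str]], List[List[float]]],
    store: Dict[str, List[float]],
    batch_size: Optional[int] = None,
) -> int:
    """Embed `texts` chunk by chunk, updating `store` after every chunk.

    Returns the number of store updates performed.
    """
    updates = 0
    for chunk in batched(texts, batch_size):
        vectors = embed_fn(chunk)
        # One store update per batch, so progress survives a later failure.
        store.update(zip(chunk, vectors))
        updates += 1
    return updates
```

With five texts and a batch size of 2, the store in this sketch is written three times; a smaller batch size trades more store writes for less lost work if embedding is interrupted.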

async aembed_documents(texts: List[str]) → List[List[float]]

Embed a list of texts.

The method first checks the cache for the embeddings. If the embeddings are not found, the method uses the underlying embedder to embed the documents and stores the results in the cache.

Parameters: texts (List[str]) – A list of texts to embed.
Returns: A list of embeddings for the given texts.
Return type: List[List[float]]

async aembed_query(text: str) → List[float]

Embed query text.

This method does not support caching at the moment.

Support for caching queries is easy to implement, but it may make sense to hold off until the most common usage patterns are clear.

If the cache has an eviction policy, more care may be needed when sharing the cache between documents and queries. Generally, evicting cached query embeddings is acceptable, but cached document embeddings should be kept.

Parameters: text (str) – The text to embed.
Returns: The embedding for the given text.
Return type: List[float]

embed_documents(texts: List[str]) → List[List[float]]

Embed a list of texts.

The method first checks the cache for the embeddings. If the embeddings are not found, the method uses the underlying embedder to embed the documents and stores the results in the cache.

Parameters: texts (List[str]) – A list of texts to embed.
Returns: A list of embeddings for the given texts.
Return type: List[List[float]]
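
The cache-first behaviour described above can be sketched without the library. This is an illustrative model only, not the class's source: `CountingEmbedder` and `cached_embed_documents` are hypothetical names, the store is a plain dict keyed by the raw text, and the toy vectors carry no meaning.

```python
from typing import Dict, List, Optional


class CountingEmbedder:
    """Hypothetical stand-in for an embedding model; counts texts it embeds."""

    def __init__(self) -> None:
        self.texts_embedded = 0

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        self.texts_embedded += len(texts)
        # Toy deterministic vectors, used only so results can be compared.
        return [[float(len(t)), float(sum(map(ord, t)) % 97)] for t in texts]


def cached_embed_documents(
    texts: List[str],
    embedder: CountingEmbedder,
    store: Dict[str, List[float]],
) -> List[List[float]]:
    # Step 1: look every text up in the cache; None marks a miss.
    vectors: List[Optional[List[float]]] = [store.get(t) for t in texts]
    missing = [i for i, v in enumerate(vectors) if v is None]

    # Step 2: embed only the misses with the underlying embedder.
    if missing:
        fresh = embedder.embed_documents([texts[i] for i in missing])
        # Step 3: write the new vectors back and fill the gaps in order.
        for i, vec in zip(missing, fresh):
            store[texts[i]] = vec
            vectors[i] = vec

    return [v for v in vectors if v is not None]
```

Calling the function twice with overlapping inputs sends only the previously unseen texts to the embedder, while the returned list keeps the order of the input texts.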

embed_query(text: str) → List[float]

Embed query text.

This method does not support caching at the moment.

Support for caching queries is easy to implement, but it may make sense to hold off until the most common usage patterns are clear.

If the cache has an eviction policy, more care may be needed when sharing the cache between documents and queries. Generally, evicting cached query embeddings is acceptable, but cached document embeddings should be kept.

Parameters: text (str) – The text to embed.
Returns: The embedding for the given text.
Return type: List[float]
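
To make the difference between the document and query paths concrete, here is a minimal sketch under stated assumptions. `PassThroughQueryCache` is a hypothetical wrapper written for illustration, with a dict as the store and a one-text-at-a-time `embed_fn`; it only mirrors the documented behaviour that queries bypass the cache.

```python
from typing import Callable, Dict, List


class PassThroughQueryCache:
    """Hypothetical wrapper: documents are cached, queries are not."""

    def __init__(
        self,
        embed_fn: Callable[[str], List[float]],
        store: Dict[str, List[float]],
    ) -> None:
        self.embed_fn = embed_fn  # maps one text to one vector
        self.store = store
        self.calls = 0  # number of calls made to the underlying model

    def _embed(self, text: str) -> List[float]:
        self.calls += 1
        return self.embed_fn(text)

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        out = []
        for t in texts:
            if t not in self.store:
                self.store[t] = self._embed(t)
            out.append(self.store[t])
        return out

    def embed_query(self, text: str) -> List[float]:
        # No lookup and no write: every query reaches the underlying model.
        return self._embed(text)
```

In this sketch, repeating the same query costs one model call each time and leaves the store untouched, so an eviction policy on the store can never be triggered by query traffic.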

classmethod from_bytes_store(underlying_embeddings: Embeddings, document_embedding_cache: BaseStore[str, bytes], *, namespace: str = '', batch_size: Optional[int] = None) → CacheBackedEmbeddings

On-ramp that adds the necessary serialization and encoding to the store.

Parameters

underlying_embeddings (Embeddings) – The embedder to use for embedding.
document_embedding_cache (BaseStore[str, bytes]) – The cache to use for storing document embeddings.
namespace (str) – The namespace to use for the document cache. This namespace is used to avoid collisions with other caches. For example, set it to the name of the embedding model used.
batch_size (Optional[int]) – The number of documents to embed between store updates.

Return type

CacheBackedEmbeddings
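
The kind of adaptation this on-ramp performs (turning texts into namespaced string keys and float vectors into bytes) can be sketched as follows. The details are assumptions made for illustration: SHA-1 hashing of the text, a simple namespace prefix, and JSON serialization are choices made here, and the library's actual key encoder and serializer may differ. `encode_key`, `serialize`, and `deserialize` are hypothetical helper names.

```python
import hashlib
import json
from typing import Dict, List


def encode_key(namespace: str, text: str) -> str:
    """Build a store key from a namespace and a hash of the text (assumed scheme)."""
    digest = hashlib.sha1(text.encode("utf-8")).hexdigest()
    return f"{namespace}{digest}"


def serialize(vector: List[float]) -> bytes:
    """Turn an embedding into bytes (JSON is an assumption of this sketch)."""
    return json.dumps(vector).encode("utf-8")


def deserialize(blob: bytes) -> List[float]:
    """Inverse of `serialize`."""
    return json.loads(blob.decode("utf-8"))


# A plain dict stands in for a BaseStore[str, bytes].
byte_store: Dict[str, bytes] = {}

text = "hello world"
# The same text cached under two namespaces, e.g. two embedding models.
byte_store[encode_key("model-a", text)] = serialize([0.25, -1.5, 3.0])
byte_store[encode_key("model-b", text)] = serialize([9.0])
```

Because the namespace is part of the key, the two entries do not overwrite each other, which is why using the embedding model's name as the namespace avoids mixing vectors produced by different models.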
