Using the Faiss connector (Preview)

Warning

The Semantic Kernel Vector Store functionality is in preview, and improvements that require breaking changes may still occur in limited circumstances before release.

Not supported at this time

Overview

The Faiss Vector Store connector is a Vector Store implementation provided by Semantic Kernel that uses no external database and stores data in memory and vectors in a Faiss Index. It uses the InMemoryVectorCollection for the other parts of the records, while using the Faiss indexes for search. This Vector Store is useful for prototyping scenarios or where high-speed in-memory operations are required.

The connector has the following characteristics.

Feature Area: Support
Collection maps to: In-memory and Faiss indexes dictionary
Supported key property types: Any type that can be used as a dict key (see the Python documentation for details)
Supported data property types: Any type
Supported vector property types:
  • list[float]
  • list[int]
  • numpy array
Supported index types: Flat (see custom indexes)
Supported distance functions:
  • Dot Product Similarity
  • Euclidean Squared Distance
Supports multiple vectors in a record: Yes
is_filterable supported: Yes
is_full_text_searchable supported: Yes
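
To make the index type and distance functions above more concrete, the following is a minimal pure-Python sketch of what a flat (brute-force) search computes: the query is compared against every stored vector, and the best matches are kept. This is an illustration only, not the connector's or Faiss's implementation. The function names and the metric labels "dot_product" and "euclidean_squared" are hypothetical.

```python
import heapq


def dot_product(a, b):
    # Higher values mean more similar vectors.
    return sum(x * y for x, y in zip(a, b))


def squared_euclidean(a, b):
    # Lower values mean closer vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b))


def flat_search(query, vectors, k, metric="dot_product"):
    """Return the k best (key, score) pairs by scoring every stored vector.

    The metric labels are hypothetical names chosen for this sketch.
    """
    if metric == "dot_product":
        scored = [(key, dot_product(query, vec)) for key, vec in vectors.items()]
        return heapq.nlargest(k, scored, key=lambda pair: pair[1])
    if metric == "euclidean_squared":
        scored = [(key, squared_euclidean(query, vec)) for key, vec in vectors.items()]
        return heapq.nsmallest(k, scored, key=lambda pair: pair[1])
    raise ValueError(f"Unknown metric: {metric}")


vectors = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.7, 0.7]}
query = [1.0, 0.2]

by_dot = flat_search(query, vectors, k=2, metric="dot_product")
by_l2 = flat_search(query, vectors, k=2, metric="euclidean_squared")
```

Note that the two metrics sort in opposite directions: a dot product ranks the largest scores first, while a squared Euclidean distance ranks the smallest first.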

Getting started

Add the Semantic Kernel package to your project.

pip install semantic-kernel[faiss]

In the snippets below, it is assumed that you have a data model class defined named 'DataModel'.

from semantic_kernel.connectors.memory.faiss import FaissStore

vector_store = FaissStore()
vector_collection = vector_store.get_collection("collection_name", DataModel)

It is possible to construct a direct reference to a named collection.

from semantic_kernel.connectors.memory.faiss import FaissCollection

vector_collection = FaissCollection("collection_name", DataModel)

Custom indexes

Built-in index creation in the Faiss connector is limited to the Flat index type.

Given the complexity of Faiss indexes, you are free to create your own index or indexes, including building the faiss-gpu package and using indexes from it. When you do this, any metric defined on the corresponding vector field is ignored. If you have multiple vectors in your data model, you can pass in custom indexes for only the ones you want. For the remaining vector fields, the built-in Flat index is created with the metric defined in the model.
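
The per-field behavior described above can be pictured with a small sketch. It is a conceptual illustration under stated assumptions, not the connector's internal logic: the helper `plan_indexes`, the field names, and the metric labels are all hypothetical.

```python
def plan_indexes(vector_field_metrics, custom_indexes=None):
    """Decide, per vector field, whether a custom or a built-in index applies.

    vector_field_metrics: maps each vector field name to the metric declared
        in the data model (hypothetical labels in this sketch).
    custom_indexes: maps a subset of field names to caller-supplied indexes.
    """
    custom_indexes = custom_indexes or {}
    plan = {}
    for field, metric in vector_field_metrics.items():
        if field in custom_indexes:
            # A custom index takes precedence; the field's metric is ignored.
            plan[field] = ("custom", custom_indexes[field])
        else:
            # No custom index: fall back to a flat index with the model's metric.
            plan[field] = ("built-in flat", metric)
    return plan


my_index = object()  # placeholder for a real Faiss index
plan = plan_indexes(
    {"title_vector": "dot_product", "body_vector": "euclidean_squared"},
    custom_indexes={"body_vector": my_index},
)
```

Here `title_vector` would get a built-in flat index using its declared metric, while `body_vector` would use the supplied index regardless of the metric declared on the field.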

Note that if the index requires training, you must train it yourself. Whenever the index is used, its is_trained attribute is checked.

The index, whether custom or built-in, is always available through the indexes property of the collection. You can use this property to retrieve the index and perform any operation on it, including training it after the collection has been constructed. Just make sure the index is trained before you run any CRUD operations against it.
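
As a minimal sketch of the training step, the helper below trains an index only when its is_trained attribute is false. Faiss indexes expose `is_trained` and `train()`. The helper `ensure_trained` and the `StandInIndex` class are hypothetical and exist only so the example runs without Faiss installed. With a real Faiss index, the training vectors would be a float32 numpy array of shape (n, d), and the index would presumably be looked up by vector field name in the collection's indexes property.

```python
def ensure_trained(index, training_vectors):
    """Train the index if it reports that it is not trained yet."""
    if not index.is_trained:
        index.train(training_vectors)
    return index


class StandInIndex:
    """Hypothetical stand-in that mimics only is_trained and train()."""

    def __init__(self):
        self.is_trained = False
        self.trained_on = 0

    def train(self, vectors):
        # A real index would learn its internal structure here.
        self.trained_on = len(vectors)
        self.is_trained = True


index = StandInIndex()
sample_vectors = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]

ensure_trained(index, sample_vectors)  # trains, because is_trained was False
ensure_trained(index, [[0.0, 0.0]])    # no-op, the index is already trained
```

Index types that need no training, such as a flat index, report is_trained as true from the start, so a helper like this leaves them untouched.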

To pass in your custom index, use either:


import faiss

from semantic_kernel.connectors.memory.faiss import FaissCollection

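# IndexHNSWFlat takes the dimension (d) and the neighbor count (M) positionally
index = faiss.IndexHNSWFlat(768, 16)  # or some other index
index.hnsw.efConstruction = 200  # search depth used while building the graph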
vector_collection = FaissCollection(
    collection_name="collection_name", 
    data_model_type=DataModel, 
    indexes={"vector_field_name": index}
)

or:


import faiss

from semantic_kernel.connectors.memory.faiss import FaissCollection

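# IndexHNSWFlat takes the dimension (d) and the neighbor count (M) positionally
index = faiss.IndexHNSWFlat(768, 16)  # or some other index
index.hnsw.efConstruction = 200  # search depth used while building the graph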
vector_collection = FaissCollection(
    collection_name="collection_name", 
    data_model_type=DataModel,
)
await vector_collection.create_collection(
    indexes={"vector_field_name": index}
)
# or when you have only one vector field:
await vector_collection.create_collection(
    index=index
)
