Using the Weaviate Vector Store connector (Preview)

Warning

The Weaviate Vector Store functionality is in preview, and improvements that require breaking changes may still occur in limited circumstances before release.

Warning

The Semantic Kernel Vector Store functionality is in preview, and improvements that require breaking changes may still occur in limited circumstances before release.

Overview

The Weaviate Vector Store connector can be used to access and manage data in Weaviate. The connector has the following characteristics.

Feature Area Support
Collection maps to Weaviate Collection
Supported key property types Guid
Supported data property types
  • string
  • byte
  • short
  • int
  • long
  • double
  • float
  • decimal
  • bool
  • DateTime
  • DateTimeOffset
  • Guid
  • and enumerables of each of these types
Supported vector property types
  • ReadOnlyMemory<float>
  • Embedding<float>
  • float[]
Supported index types
  • Hnsw
  • Flat
  • Dynamic
Supported distance functions
  • CosineDistance
  • NegativeDotProductSimilarity
  • EuclideanSquaredDistance
  • HammingDistance
  • ManhattanDistance
Supported filter clauses
  • AnyTagEqualTo
  • EqualTo
Supports multiple vectors in a record Yes
IsIndexed supported? Yes
IsFullTextIndexed supported? Yes
StorageName supported? No, use JsonSerializerOptions and JsonPropertyNameAttribute instead. See here for more info.
HybridSearch supported? Yes
Feature Area Support
Collection maps to Weaviate Collection
Supported key property types Guid
Supported data property types
  • string
  • byte
  • short
  • int
  • long
  • double
  • float
  • decimal
  • bool
  • and iterables of each of these types
Supported vector property types
  • list[float]
  • list[int]
  • ndarray
Supported index types
  • Hnsw
  • Flat
  • Dynamic
Supported distance functions
  • CosineDistance
  • NegativeDotProductSimilarity
  • EuclideanSquaredDistance
  • Hamming
  • ManhattanDistance
Supported filter clauses
  • AnyTagEqualTo
  • EqualTo
Supports multiple vectors in a record Yes
IsFilterable supported? Yes
IsFullTextSearchable supported? Yes

Coming soon.

Limitations

Notable Weaviate connector functionality limitations.

Feature Area Workaround
Using the 'vector' property for single vector objects is not supported Use of the 'vectors' property is supported instead.

Warning

Weaviate requires collection names to start with an uppercase letter. If you provide a collection name that does not start with an uppercase letter, Weaviate returns an error when you try to create the collection. The error is Cannot query field "mycollection" on type "GetObjectsObj". Did you mean "Mycollection"? where mycollection is your collection name. In this example, changing the collection name to Mycollection fixes the error.
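
One way to avoid this error is to normalize collection names before they reach the connector. The following is a minimal sketch in Python; to_weaviate_collection_name is a hypothetical helper written for this illustration and is not part of Semantic Kernel or the Weaviate client.

```python
def to_weaviate_collection_name(name: str) -> str:
    """Return `name` with its first character upper-cased.

    Hypothetical helper for illustration; not part of the connector.
    """
    if not name:
        raise ValueError("collection name must not be empty")
    # Only the first character is changed; the rest is preserved as-is.
    return name[0].upper() + name[1:]


print(to_weaviate_collection_name("mycollection"))  # Mycollection
print(to_weaviate_collection_name("Skhotels"))      # Skhotels
```

Names that already start with an uppercase letter pass through unchanged, so the helper is safe to apply to every name.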

Getting started

Add the Weaviate Vector Store connector NuGet package to your project.

dotnet add package Microsoft.SemanticKernel.Connectors.Weaviate --prerelease

You can add the vector store to the dependency injection container available on the KernelBuilder or to the IServiceCollection dependency injection container using extension methods provided by Semantic Kernel. The Weaviate vector store uses an HttpClient to communicate with the Weaviate service. There are two options for providing the URL/endpoint for the Weaviate service. It can be provided via options or by setting the base address of the HttpClient.

This first example shows how to set the service URL via options. Also note that these methods will retrieve an HttpClient instance for making calls to the Weaviate service from the dependency injection service provider.

using Microsoft.Extensions.DependencyInjection;
using Microsoft.SemanticKernel;

// Using Kernel Builder.
var kernelBuilder = Kernel
    .CreateBuilder();
kernelBuilder.Services
    .AddWeaviateVectorStore(new Uri("http://localhost:8080/v1/"), apiKey: null);

using Microsoft.Extensions.DependencyInjection;

// Using IServiceCollection with ASP.NET Core.
var builder = WebApplication.CreateBuilder(args);
builder.Services.AddWeaviateVectorStore(new Uri("http://localhost:8080/v1/"), apiKey: null);

Overloads where you can specify your own HttpClient are also provided. In this case, you can set the service URL via the HttpClient BaseAddress property.

using System.Net.Http;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.SemanticKernel;

// Using Kernel Builder.
var kernelBuilder = Kernel.CreateBuilder();
using HttpClient client = new HttpClient { BaseAddress = new Uri("http://localhost:8080/v1/") };
kernelBuilder.Services.AddWeaviateVectorStore(_ => client);

using System.Net.Http;
using Microsoft.Extensions.DependencyInjection;

// Using IServiceCollection with ASP.NET Core.
var builder = WebApplication.CreateBuilder(args);
using HttpClient client = new HttpClient { BaseAddress = new Uri("http://localhost:8080/v1/") };
builder.Services.AddWeaviateVectorStore(_ => client);

You can construct a Weaviate Vector Store instance directly as well.

using System.Net.Http;
using Microsoft.SemanticKernel.Connectors.Weaviate;

var vectorStore = new WeaviateVectorStore(
    new HttpClient { BaseAddress = new Uri("http://localhost:8080/v1/") });

It is possible to construct a direct reference to a named collection.

using System.Net.Http;
using Microsoft.SemanticKernel.Connectors.Weaviate;

var collection = new WeaviateCollection<Guid, Hotel>(
    new HttpClient { BaseAddress = new Uri("http://localhost:8080/v1/") },
    "Skhotels");

If needed, you can pass an API key as an option when using any of the above-mentioned mechanisms, for example:

using Microsoft.SemanticKernel;

var kernelBuilder = Kernel
    .CreateBuilder();
kernelBuilder.Services
    .AddWeaviateVectorStore(new Uri("http://localhost:8080/v1/"), secretVar);

Data mapping

The Weaviate Vector Store connector provides a default mapper when mapping from the data model to storage. Weaviate requires properties to be mapped into id, properties and vectors groupings. The default mapper uses the model annotations or record definition to determine the type of each property and to do this mapping.

  • The data model property annotated as a key will be mapped to the Weaviate id property.
  • The data model properties annotated as data will be mapped to the Weaviate properties object.
  • The data model properties annotated as vectors will be mapped to the Weaviate vectors object.

The default mapper uses System.Text.Json.JsonSerializer to convert to the storage schema. This means that usage of the JsonPropertyNameAttribute is supported if a different storage name to the data model property name is required.

Here is an example of a data model with JsonPropertyNameAttribute set and how that will be represented in Weaviate.

using System.Text.Json.Serialization;
using Microsoft.Extensions.VectorData;

public class Hotel
{
    [VectorStoreKey]
    public Guid HotelId { get; set; }

    [VectorStoreData(IsIndexed = true)]
    public string HotelName { get; set; }

    [VectorStoreData(IsFullTextIndexed = true)]
    public string Description { get; set; }

    [JsonPropertyName("HOTEL_DESCRIPTION_EMBEDDING")]
    [VectorStoreVector(4, DistanceFunction = DistanceFunction.CosineDistance, IndexKind = IndexKind.Hnsw)]
    public ReadOnlyMemory<float>? DescriptionEmbedding { get; set; }
}

{
    "id": "11111111-1111-1111-1111-111111111111",
    "properties": { "HotelName": "Hotel Happy", "Description": "A place where everyone can be happy." },
    "vectors": {
        "HOTEL_DESCRIPTION_EMBEDDING": [0.9, 0.1, 0.1, 0.1]
    }
}
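
The grouping described above can also be shown as a small, language-neutral sketch. The Python below is only an illustration of the id / properties / vectors split, not the connector's mapper; to_weaviate_object and its parameters are hypothetical names, and the record reuses the values from the example above.

```python
import json
from uuid import UUID


def to_weaviate_object(record, key, data, vectors):
    """Group a flat record into id / properties / vectors.

    Hypothetical helper for illustration only.
    `vectors` maps a record field name to the storage name of that vector.
    """
    return {
        "id": str(record[key]),
        "properties": {name: record[name] for name in data},
        "vectors": {storage: record[field] for field, storage in vectors.items()},
    }


hotel = {
    "HotelId": UUID("11111111-1111-1111-1111-111111111111"),
    "HotelName": "Hotel Happy",
    "Description": "A place where everyone can be happy.",
    "DescriptionEmbedding": [0.9, 0.1, 0.1, 0.1],
}

obj = to_weaviate_object(
    hotel,
    key="HotelId",
    data=["HotelName", "Description"],
    # The storage name plays the role of JsonPropertyNameAttribute here.
    vectors={"DescriptionEmbedding": "HOTEL_DESCRIPTION_EMBEDDING"},
)
print(json.dumps(obj, indent=4))
```

The key becomes the id, data fields land under properties, and each vector is stored under its storage name inside vectors.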

Getting started

Add the Weaviate Vector Store connector dependencies to your project.

pip install semantic-kernel[weaviate]

You can then create the vector store. It uses environment settings to connect:

For using Weaviate Cloud:

  • url: WEAVIATE_URL
  • api_key: WEAVIATE_API_KEY

For using Weaviate Local (i.e. Weaviate in a Docker container):

  • local_host: WEAVIATE_LOCAL_HOST
  • local_port: WEAVIATE_LOCAL_PORT
  • local_grpc_port: WEAVIATE_LOCAL_GRPC_PORT

If you want to use embedded:

  • use_embed: WEAVIATE_USE_EMBED

Set only one of these groups of settings. If more than one group is present, an exception is raised.
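
To check a configuration before constructing the store, the exclusivity rule can be sketched as follows. This is not the connector's own validation code; detect_connection_mode is a hypothetical helper, the URL and key values are placeholders, and treating "no settings at all" as an error is an assumption of this sketch.

```python
from typing import Mapping

# Environment variable names as listed above, grouped by connection mode.
SETTING_GROUPS = {
    "cloud": ("WEAVIATE_URL", "WEAVIATE_API_KEY"),
    "local": ("WEAVIATE_LOCAL_HOST", "WEAVIATE_LOCAL_PORT", "WEAVIATE_LOCAL_GRPC_PORT"),
    "embedded": ("WEAVIATE_USE_EMBED",),
}


def detect_connection_mode(env: Mapping[str, str]) -> str:
    """Return the single connection mode configured in `env`.

    Hypothetical helper; raises ValueError unless exactly one group is set.
    """
    present = [
        mode
        for mode, names in SETTING_GROUPS.items()
        if any(env.get(name) for name in names)
    ]
    if len(present) != 1:
        raise ValueError(f"expected exactly one settings group, found: {present}")
    return present[0]


# Placeholder values for illustration only.
print(detect_connection_mode({"WEAVIATE_URL": "https://example.invalid", "WEAVIATE_API_KEY": "placeholder"}))
print(detect_connection_mode({"WEAVIATE_LOCAL_HOST": "localhost", "WEAVIATE_LOCAL_PORT": "8080"}))
```

In an application, pass os.environ as the argument to inspect the real process environment.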

from semantic_kernel.connectors.memory.weaviate import WeaviateStore

store = WeaviateStore()

Alternatively, you can pass in your own Weaviate client if you want more control over the client construction:

import weaviate
from semantic_kernel.connectors.memory.weaviate import WeaviateStore

client = weaviate.WeaviateAsyncClient(...)
store = WeaviateStore(async_client=client)

You can also create a collection directly, without the store.

from semantic_kernel.connectors.memory.weaviate import WeaviateCollection

# `hotel` is a class created with the @vectorstoremodel decorator
collection = WeaviateCollection(
    collection_name="my_collection",
    data_model_type=hotel
)

Serialization

The Weaviate client returns its own objects, which are parsed and turned into dicts in the regular flow. For more details on this concept, see the serialization documentation.
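
As a rough illustration of that flow, the sketch below flattens a returned object into a plain dict. FakeWeaviateObject is a hypothetical stand-in with uuid, properties and vector fields, assumed here for illustration; it is not the Weaviate client's class, and to_flat_dict is not the connector's implementation.

```python
from dataclasses import dataclass, field
from typing import Any
from uuid import UUID


@dataclass
class FakeWeaviateObject:
    """Hypothetical stand-in for an object returned by the Weaviate client."""

    uuid: UUID
    properties: dict[str, Any]
    vector: dict[str, list[float]] = field(default_factory=dict)


def to_flat_dict(obj: FakeWeaviateObject, key_field: str = "id") -> dict[str, Any]:
    """Merge id, properties and named vectors into one flat dict."""
    record: dict[str, Any] = {key_field: str(obj.uuid)}
    record.update(obj.properties)  # data fields
    record.update(obj.vector)      # named vectors, keyed by vector name
    return record


returned = FakeWeaviateObject(
    uuid=UUID("11111111-1111-1111-1111-111111111111"),
    properties={"HotelName": "Hotel Happy"},
    vector={"DescriptionEmbedding": [0.9, 0.1, 0.1, 0.1]},
)
flat = to_flat_dict(returned)
print(flat)
```

A flat dict like this is the kind of intermediate form that can then be turned back into the data model type.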

Coming soon

More info coming soon.