Using the Azure CosmosDB MongoDB Vector Store connector (Preview)

Warning

The Semantic Kernel Vector Store functionality is in preview, and improvements that require breaking changes may still occur in limited circumstances before release.

Overview

The Azure CosmosDB MongoDB Vector Store connector can be used to access and manage data in Azure CosmosDB MongoDB. The connector has the following characteristics.

Feature Area Support
Collection maps to Azure Cosmos DB MongoDB Collection + Index
Supported key property types string
Supported data property types
  • string
  • int
  • long
  • double
  • float
  • decimal
  • bool
  • DateTime
  • and enumerables of each of these types
Supported vector property types
  • ReadOnlyMemory<float>
  • ReadOnlyMemory<double>
Supported index types
  • Hnsw
  • IvfFlat
Supported distance functions
  • CosineDistance
  • DotProductSimilarity
  • EuclideanDistance
Supports multiple vectors in a record Yes
IsFilterable supported? Yes
IsFullTextSearchable supported? No
StoragePropertyName supported? No, use BsonElementAttribute instead. See here for more info.

Getting started

Add the Azure CosmosDB MongoDB Vector Store connector NuGet package to your project.

dotnet add package Microsoft.SemanticKernel.Connectors.AzureCosmosDBMongoDB --prerelease

You can add the vector store to the dependency injection container available on the KernelBuilder or to the IServiceCollection dependency injection container using extension methods provided by Semantic Kernel.

using Microsoft.SemanticKernel;

// Using Kernel Builder.
var kernelBuilder = Kernel
    .CreateBuilder()
    .AddAzureCosmosDBMongoDBVectorStore(connectionString, databaseName);
using Microsoft.SemanticKernel;

// Using IServiceCollection with ASP.NET Core.
var builder = WebApplication.CreateBuilder(args);
builder.Services.AddAzureCosmosDBMongoDBVectorStore(connectionString, databaseName);

Extension methods that take no parameters are also provided. These require an instance of MongoDB.Driver.IMongoDatabase to be separately registered with the dependency injection container.

using Microsoft.Extensions.DependencyInjection;
using Microsoft.SemanticKernel;
using MongoDB.Driver;

// Using Kernel Builder.
var kernelBuilder = Kernel.CreateBuilder();
kernelBuilder.Services.AddSingleton<IMongoDatabase>(
    sp =>
    {
        var mongoClient = new MongoClient(connectionString);
        return mongoClient.GetDatabase(databaseName);
    });
kernelBuilder.AddAzureCosmosDBMongoDBVectorStore();
using Microsoft.Extensions.DependencyInjection;
using Microsoft.SemanticKernel;
using MongoDB.Driver;

// Using IServiceCollection with ASP.NET Core.
var builder = WebApplication.CreateBuilder(args);
builder.Services.AddSingleton<IMongoDatabase>(
    sp =>
    {
        var mongoClient = new MongoClient(connectionString);
        return mongoClient.GetDatabase(databaseName);
    });
builder.Services.AddAzureCosmosDBMongoDBVectorStore();

You can construct an Azure CosmosDB MongoDB Vector Store instance directly.

using Microsoft.SemanticKernel.Connectors.AzureCosmosDBMongoDB;
using MongoDB.Driver;

var mongoClient = new MongoClient(connectionString);
var database = mongoClient.GetDatabase(databaseName);
var vectorStore = new AzureCosmosDBMongoDBVectorStore(database);

It is possible to construct a direct reference to a named collection.

using Microsoft.SemanticKernel.Connectors.AzureCosmosDBMongoDB;
using MongoDB.Driver;

var mongoClient = new MongoClient(connectionString);
var database = mongoClient.GetDatabase(databaseName);
var collection = new AzureCosmosDBMongoDBVectorStoreRecordCollection<Hotel>(
    database,
    "skhotels");

Data mapping

The Azure CosmosDB MongoDB Vector Store connector provides a default mapper when mapping data from the data model to storage.

This mapper does a direct conversion of the list of properties on the data model to the fields in Azure CosmosDB MongoDB and uses MongoDB.Bson.Serialization to convert to the storage schema. This means that usage of the MongoDB.Bson.Serialization.Attributes.BsonElement is supported if a different storage name to the data model property name is required. The only exception is the key of the record which is mapped to a database field named _id, since all CosmosDB MongoDB records must use this name for ids.

Property name override

For data properties and vector properties, you can provide override field names to use in storage that is different to the property names on the data model. This is not supported for keys, since a key has a fixed name in MongoDB.

The property name override is done by setting the BsonElement attribute on the data model properties.

Here is an example of a data model with BsonElement set.

using Microsoft.Extensions.VectorData;

public class Hotel
{
    [VectorStoreRecordKey]
    public ulong HotelId { get; set; }

    [BsonElement("hotel_name")]
    [VectorStoreRecordData(IsFilterable = true)]
    public string HotelName { get; set; }

    [BsonElement("hotel_description")]
    [VectorStoreRecordData(IsFullTextSearchable = true)]
    public string Description { get; set; }

    [BsonElement("hotel_description_embedding")]
    [VectorStoreRecordVector(4, DistanceFunction.CosineDistance, IndexKind.Hnsw)]
    public ReadOnlyMemory<float>? DescriptionEmbedding { get; set; }
}

Coming soon

More info coming soon.

Coming soon

More info coming soon.