VECTOR_DISTANCE (Transact-SQL) (Preview)

Applies to: Azure SQL Database Azure SQL Managed Instance

Note

This data type is in preview and is subject to change. Make sure to read preview usage terms in the Service Level Agreements (SLA) for Online Services document.

Calculates the distance between two vectors using a specified distance metric.

Syntax

Transact-SQL syntax conventions

VECTOR_DISTANCE ( distance_metric, vector1, vector2 )

Arguments

distance_metric

A string with the name of the distance metric to use to calculate the distance between the two given vectors. The following distance metrics are supported:

  • cosine - Cosine distance
  • euclidean - Euclidean distance
  • dot - (Negative) Dot product

vector1

An expression that evaluates to a vector in binary format.

vector2

An expression that evaluates to a vector in binary format.

Distance Metrics

Metric Description Range Examples
cosine Cosine (angular) distance [0, 2] 0: identical vectors
2: opposing vectors
euclidean Euclidean distance [-∞, +∞] 0: identical vectors
dot Dot product-based indication of distance,
obtained by calculating the negative dot product
[-∞, +∞] Smaller numbers indicate more similar vectors

Return value

The function returns a scalar float value that represents the distance between the two vectors using the specified distance metric.

An error is returned if distance_metric isn't a valid metric and if the provided binary values don't represent valid vectors.

Examples

Review the Azure SQL Database Vector Search Samples GitHub repo to get end-to-end working samples.

Examples assume the existence of a table named wikipedia_articles with a column title_vector of type vector that stores title's embeddings of Wikipedia articles. title_vector is assumed to be an embedding generated with an embedding model like text-embedding-ada-002 or text-embedding-3-small , which returns vectors with 1,536 dimensions.

Example 1

The following example creates a vector with three dimensions from a string with a JSON array.

DECLARE @v1 VECTOR(2) = '[1,1]';
DECLARE @v2 VECTOR(2) = '[-1,-1]';

SELECT 
    VECTOR_DISTANCE('euclidean', @v1, @v2) AS euclidean,
    VECTOR_DISTANCE('cosine', @v1, @v2) AS cosine,
    VECTOR_DISTANCE('dot', @v1, @v2) AS negative_dot_product;

Example 2

The following example returns the top 10 most similar articles to a given article, based on the cosine distance between their title vectors. Details of the database used in the sample can be found here: Download and import the Wikipedia Article with Vector Embeddings

DECLARE @v AS VECTOR(1536);
SELECT @v = title_vector FROM [dbo].[wikipedia_articles] WHERE title = 'Alan Turing';

SELECT TOP(10) 
  id, 
  title, 
  VECTOR_DISTANCE('cosine', @v, title_vector) AS distance 
FROM 
  [dbo].[wikipedia_articles] 
ORDER BY
  distance

Example 3

The following example returns all the similar articles to a given article, based on the cosine distance between their title vectors, selecting only those with a distance less than 0.3.

DECLARE @v AS VECTOR(1536);
SELECT @v = title_vector FROM [dbo].[wikipedia_articles] WHERE title = 'Alan Turing';

SELECT  
  id, 
  title,
  VECTOR_DISTANCE('cosine', @v, title_vector) AS distance
FROM 
  [dbo].[wikipedia_articles] 
WHERE
  VECTOR_DISTANCE('cosine', @v, title_vector) < 0.3
ORDER BY
  distance