Indexes - Get

Reference

Service:: Search Service

API Version:: 2024-07-01

Retrieves an index definition.

GET {endpoint}/indexes('{indexName}')?api-version=2024-07-01

URI Parameters

Name	In	Required	Type	Description
endpoint	path	True	string	The endpoint URL of the search service.
indexName	path	True	string	The name of the index to retrieve.
api-version	query	True	string	Client Api Version.

Request Header

Name	Required	Type	Description
x-ms-client-request-id		string uuid	The tracking ID sent with the request to help with debugging.

Responses

Name	Type	Description
200 OK	SearchIndex
Other Status Codes	ErrorResponse	Error response.

Examples

SearchServiceGetIndex

Sample request

HTTP

GET https://myservice.search.windows.net/indexes('hotels')?api-version=2024-07-01

Sample response

Status code:: 200

{
  "name": "hotels",
  "fields": [
    {
      "name": "hotelId",
      "type": "Edm.String",
      "searchable": false,
      "filterable": true,
      "retrievable": true,
      "sortable": true,
      "facetable": true,
      "key": true,
      "indexAnalyzer": null,
      "searchAnalyzer": null,
      "analyzer": null,
      "dimensions": null,
      "vectorSearchProfile": null,
      "synonymMaps": []
    },
    {
      "name": "baseRate",
      "type": "Edm.Double",
      "searchable": false,
      "filterable": true,
      "retrievable": true,
      "sortable": true,
      "facetable": true,
      "key": false,
      "indexAnalyzer": null,
      "searchAnalyzer": null,
      "analyzer": null,
      "dimensions": null,
      "vectorSearchProfile": null,
      "synonymMaps": []
    },
    {
      "name": "description",
      "type": "Edm.String",
      "searchable": true,
      "filterable": false,
      "retrievable": true,
      "sortable": false,
      "facetable": false,
      "key": false,
      "indexAnalyzer": null,
      "searchAnalyzer": null,
      "analyzer": null,
      "dimensions": null,
      "vectorSearchProfile": null,
      "synonymMaps": []
    },
    {
      "name": "descriptionEmbedding",
      "type": "Collection(Edm.Single)",
      "searchable": true,
      "filterable": false,
      "retrievable": true,
      "sortable": false,
      "facetable": false,
      "key": false,
      "indexAnalyzer": null,
      "searchAnalyzer": null,
      "analyzer": null,
      "dimensions": 1536,
      "vectorSearchProfile": "myHnswProfile",
      "synonymMaps": []
    },
    {
      "name": "description_fr",
      "type": "Edm.String",
      "searchable": true,
      "filterable": false,
      "retrievable": true,
      "sortable": false,
      "facetable": false,
      "key": false,
      "indexAnalyzer": null,
      "searchAnalyzer": null,
      "analyzer": "fr.lucene",
      "dimensions": null,
      "vectorSearchProfile": null,
      "synonymMaps": []
    },
    {
      "name": "hotelName",
      "type": "Edm.String",
      "searchable": true,
      "filterable": true,
      "retrievable": true,
      "sortable": true,
      "facetable": true,
      "key": false,
      "indexAnalyzer": null,
      "searchAnalyzer": null,
      "analyzer": null,
      "dimensions": null,
      "vectorSearchProfile": null,
      "synonymMaps": []
    },
    {
      "name": "category",
      "type": "Edm.String",
      "searchable": true,
      "filterable": true,
      "retrievable": true,
      "sortable": true,
      "facetable": true,
      "key": false,
      "indexAnalyzer": null,
      "searchAnalyzer": null,
      "analyzer": null,
      "dimensions": null,
      "vectorSearchProfile": null,
      "synonymMaps": []
    },
    {
      "name": "tags",
      "type": "Collection(Edm.String)",
      "searchable": true,
      "filterable": true,
      "retrievable": true,
      "sortable": false,
      "facetable": true,
      "key": false,
      "indexAnalyzer": null,
      "searchAnalyzer": null,
      "analyzer": "tagsAnalyzer",
      "dimensions": null,
      "vectorSearchProfile": null,
      "synonymMaps": []
    },
    {
      "name": "parkingIncluded",
      "type": "Edm.Boolean",
      "searchable": false,
      "filterable": true,
      "retrievable": true,
      "sortable": true,
      "facetable": true,
      "key": false,
      "indexAnalyzer": null,
      "searchAnalyzer": null,
      "analyzer": null,
      "dimensions": null,
      "vectorSearchProfile": null,
      "synonymMaps": []
    },
    {
      "name": "smokingAllowed",
      "type": "Edm.Boolean",
      "searchable": false,
      "filterable": true,
      "retrievable": true,
      "sortable": true,
      "facetable": true,
      "key": false,
      "indexAnalyzer": null,
      "searchAnalyzer": null,
      "analyzer": null,
      "dimensions": null,
      "vectorSearchProfile": null,
      "synonymMaps": []
    },
    {
      "name": "lastRenovationDate",
      "type": "Edm.DateTimeOffset",
      "searchable": false,
      "filterable": true,
      "retrievable": true,
      "sortable": true,
      "facetable": true,
      "key": false,
      "indexAnalyzer": null,
      "searchAnalyzer": null,
      "analyzer": null,
      "dimensions": null,
      "vectorSearchProfile": null,
      "synonymMaps": []
    },
    {
      "name": "rating",
      "type": "Edm.Int32",
      "searchable": false,
      "filterable": true,
      "retrievable": true,
      "sortable": true,
      "facetable": true,
      "key": false,
      "indexAnalyzer": null,
      "searchAnalyzer": null,
      "analyzer": null,
      "dimensions": null,
      "vectorSearchProfile": null,
      "synonymMaps": []
    },
    {
      "name": "location",
      "type": "Edm.GeographyPoint",
      "searchable": false,
      "filterable": true,
      "retrievable": true,
      "sortable": true,
      "facetable": false,
      "key": false,
      "indexAnalyzer": null,
      "searchAnalyzer": null,
      "analyzer": null,
      "dimensions": null,
      "vectorSearchProfile": null,
      "synonymMaps": []
    }
  ],
  "scoringProfiles": [
    {
      "name": "geo",
      "functionAggregation": "sum",
      "text": {
        "weights": {
          "hotelName": 5
        }
      },
      "functions": [
        {
          "type": "distance",
          "boost": 5,
          "fieldName": "location",
          "interpolation": "logarithmic",
          "distance": {
            "referencePointParameter": "currentLocation",
            "boostingDistance": 10
          }
        }
      ]
    }
  ],
  "defaultScoringProfile": "geo",
  "suggesters": [
    {
      "name": "sg",
      "searchMode": "analyzingInfixMatching",
      "sourceFields": [
        "hotelName"
      ]
    }
  ],
  "analyzers": [
    {
      "name": "tagsAnalyzer",
      "@odata.type": "#Microsoft.Azure.Search.CustomAnalyzer",
      "charFilters": [
        "html_strip"
      ],
      "tokenizer": "standard_v2"
    }
  ],
  "tokenizers": [],
  "tokenFilters": [],
  "charFilters": [],
  "corsOptions": {
    "allowedOrigins": [
      "tempuri.org"
    ],
    "maxAgeInSeconds": 60
  },
  "encryptionKey": {
    "keyVaultKeyName": "myKeyName",
    "keyVaultKeyVersion": "myKeyVersion",
    "keyVaultUri": "https://myKeyVault.vault.azure.net",
    "accessCredentials": {
      "applicationId": "00000000-0000-0000-0000-000000000000",
      "applicationSecret": null
    }
  },
  "semantic": {
    "configurations": [
      {
        "name": "semanticHotels",
        "prioritizedFields": {
          "titleField": {
            "fieldName": "hotelName"
          },
          "prioritizedContentFields": [
            {
              "fieldName": "description"
            },
            {
              "fieldName": "description_fr"
            }
          ],
          "prioritizedKeywordsFields": [
            {
              "fieldName": "tags"
            },
            {
              "fieldName": "category"
            }
          ]
        }
      }
    ]
  },
  "vectorSearch": {
    "algorithms": [
      {
        "name": "myHnsw",
        "kind": "hnsw",
        "hnswParameters": {
          "metric": "cosine",
          "m": 4,
          "efConstruction": 400,
          "efSearch": 500
        }
      },
      {
        "name": "myExhaustive",
        "kind": "exhaustiveKnn",
        "exhaustiveKnnParameters": {
          "metric": "cosine"
        }
      }
    ],
    "profiles": [
      {
        "name": "myHnswProfile",
        "algorithm": "myHnsw"
      },
      {
        "name": "myAlgorithm",
        "algorithm": "myExhaustive"
      }
    ]
  }
}

Definitions

Name	Description
AsciiFoldingTokenFilter	Converts alphabetic, numeric, and symbolic Unicode characters which are not in the first 127 ASCII characters (the "Basic Latin" Unicode block) into their ASCII equivalents, if such equivalents exist. This token filter is implemented using Apache Lucene.
AzureActiveDirectoryApplicationCredentials	Credentials of a registered application created for your search service, used for authenticated access to the encryption keys stored in Azure Key Vault.
AzureOpenAIEmbeddingSkill	Allows you to generate a vector embedding for a given text input using the Azure OpenAI resource.
AzureOpenAIModelName	The Azure Open AI model name that will be called.
AzureOpenAIParameters	Specifies the parameters for connecting to the Azure OpenAI resource.
AzureOpenAIVectorizer	Specifies the Azure OpenAI resource used to vectorize a query string.
BinaryQuantizationVectorSearchCompressionConfiguration	Contains configuration options specific to the binary quantization compression method used during indexing and querying.
BM25Similarity	Ranking function based on the Okapi BM25 similarity algorithm. BM25 is a TF-IDF-like algorithm that includes length normalization (controlled by the 'b' parameter) as well as term frequency saturation (controlled by the 'k1' parameter).
CharFilterName	Defines the names of all character filters supported by the search engine.
CjkBigramTokenFilter	Forms bigrams of CJK terms that are generated from the standard tokenizer. This token filter is implemented using Apache Lucene.
CjkBigramTokenFilterScripts	Scripts that can be ignored by CjkBigramTokenFilter.
ClassicSimilarity	Legacy similarity algorithm which uses the Lucene TFIDFSimilarity implementation of TF-IDF. This variation of TF-IDF introduces static document length normalization as well as coordinating factors that penalize documents that only partially match the searched queries.
ClassicTokenizer	Grammar-based tokenizer that is suitable for processing most European-language documents. This tokenizer is implemented using Apache Lucene.
CommonGramTokenFilter	Construct bigrams for frequently occurring terms while indexing. Single terms are still indexed too, with bigrams overlaid. This token filter is implemented using Apache Lucene.
CorsOptions	Defines options to control Cross-Origin Resource Sharing (CORS) for an index.
CustomAnalyzer	Allows you to take control over the process of converting text into indexable/searchable tokens. It's a user-defined configuration consisting of a single predefined tokenizer and one or more filters. The tokenizer is responsible for breaking text into tokens, and the filters for modifying tokens emitted by the tokenizer.
DictionaryDecompounderTokenFilter	Decomposes compound words found in many Germanic languages. This token filter is implemented using Apache Lucene.
DistanceScoringFunction	Defines a function that boosts scores based on distance from a geographic location.
DistanceScoringParameters	Provides parameter values to a distance scoring function.
EdgeNGramTokenFilter	Generates n-grams of the given size(s) starting from the front or the back of an input token. This token filter is implemented using Apache Lucene.
EdgeNGramTokenFilterSide	Specifies which side of the input an n-gram should be generated from.
EdgeNGramTokenFilterV2	Generates n-grams of the given size(s) starting from the front or the back of an input token. This token filter is implemented using Apache Lucene.
EdgeNGramTokenizer	Tokenizes the input from an edge into n-grams of the given size(s). This tokenizer is implemented using Apache Lucene.
ElisionTokenFilter	Removes elisions. For example, "l'avion" (the plane) will be converted to "avion" (plane). This token filter is implemented using Apache Lucene.
ErrorAdditionalInfo	The resource management error additional info.
ErrorDetail	The error detail.
ErrorResponse	Error response
ExhaustiveKnnParameters	Contains the parameters specific to exhaustive KNN algorithm.
ExhaustiveKnnVectorSearchAlgorithmConfiguration	Contains configuration options specific to the exhaustive KNN algorithm used during querying, which will perform brute-force search across the entire vector index.
FreshnessScoringFunction	Defines a function that boosts scores based on the value of a date-time field.
FreshnessScoringParameters	Provides parameter values to a freshness scoring function.
HnswParameters	Contains the parameters specific to the HNSW algorithm.
HnswVectorSearchAlgorithmConfiguration	Contains configuration options specific to the HNSW approximate nearest neighbors algorithm used during indexing and querying. The HNSW algorithm offers a tunable trade-off between search speed and accuracy.
InputFieldMappingEntry	Input field mapping for a skill.
KeepTokenFilter	A token filter that only keeps tokens with text contained in a specified list of words. This token filter is implemented using Apache Lucene.
KeywordMarkerTokenFilter	Marks terms as keywords. This token filter is implemented using Apache Lucene.
KeywordTokenizer	Emits the entire input as a single token. This tokenizer is implemented using Apache Lucene.
KeywordTokenizerV2	Emits the entire input as a single token. This tokenizer is implemented using Apache Lucene.
LengthTokenFilter	Removes words that are too long or too short. This token filter is implemented using Apache Lucene.
LexicalAnalyzerName	Defines the names of all text analyzers supported by the search engine.
LexicalTokenizerName	Defines the names of all tokenizers supported by the search engine.
LimitTokenFilter	Limits the number of tokens while indexing. This token filter is implemented using Apache Lucene.
LuceneStandardAnalyzer	Standard Apache Lucene analyzer; Composed of the standard tokenizer, lowercase filter and stop filter.
LuceneStandardTokenizer	Breaks text following the Unicode Text Segmentation rules. This tokenizer is implemented using Apache Lucene.
LuceneStandardTokenizerV2	Breaks text following the Unicode Text Segmentation rules. This tokenizer is implemented using Apache Lucene.
MagnitudeScoringFunction	Defines a function that boosts scores based on the magnitude of a numeric field.
MagnitudeScoringParameters	Provides parameter values to a magnitude scoring function.
MappingCharFilter	A character filter that applies mappings defined with the mappings option. Matching is greedy (longest pattern matching at a given point wins). Replacement is allowed to be the empty string. This character filter is implemented using Apache Lucene.
MicrosoftLanguageStemmingTokenizer	Divides text using language-specific rules and reduces words to their base forms.
MicrosoftLanguageTokenizer	Divides text using language-specific rules.
MicrosoftStemmingTokenizerLanguage	Lists the languages supported by the Microsoft language stemming tokenizer.
MicrosoftTokenizerLanguage	Lists the languages supported by the Microsoft language tokenizer.
NGramTokenFilter	Generates n-grams of the given size(s). This token filter is implemented using Apache Lucene.
NGramTokenFilterV2	Generates n-grams of the given size(s). This token filter is implemented using Apache Lucene.
NGramTokenizer	Tokenizes the input into n-grams of the given size(s). This tokenizer is implemented using Apache Lucene.
OutputFieldMappingEntry	Output field mapping for a skill.
PathHierarchyTokenizerV2	Tokenizer for path-like hierarchies. This tokenizer is implemented using Apache Lucene.
PatternAnalyzer	Flexibly separates text into terms via a regular expression pattern. This analyzer is implemented using Apache Lucene.
PatternCaptureTokenFilter	Uses Java regexes to emit multiple tokens - one for each capture group in one or more patterns. This token filter is implemented using Apache Lucene.
PatternReplaceCharFilter	A character filter that replaces characters in the input string. It uses a regular expression to identify character sequences to preserve and a replacement pattern to identify characters to replace. For example, given the input text "aa bb aa bb", pattern "(aa)\s+(bb)", and replacement "$1#$2", the result would be "aa#bb aa#bb". This character filter is implemented using Apache Lucene.
PatternReplaceTokenFilter	A character filter that replaces characters in the input string. It uses a regular expression to identify character sequences to preserve and a replacement pattern to identify characters to replace. For example, given the input text "aa bb aa bb", pattern "(aa)\s+(bb)", and replacement "$1#$2", the result would be "aa#bb aa#bb". This token filter is implemented using Apache Lucene.
PatternTokenizer	Tokenizer that uses regex pattern matching to construct distinct tokens. This tokenizer is implemented using Apache Lucene.
PhoneticEncoder	Identifies the type of phonetic encoder to use with a PhoneticTokenFilter.
PhoneticTokenFilter	Create tokens for phonetic matches. This token filter is implemented using Apache Lucene.
PrioritizedFields	Describes the title, content, and keywords fields to be used for semantic ranking, captions, highlights, and answers.
RegexFlags	Defines flags that can be combined to control how regular expressions are used in the pattern analyzer and pattern tokenizer.
ScalarQuantizationParameters	Contains the parameters specific to Scalar Quantization.
ScalarQuantizationVectorSearchCompressionConfiguration	Contains configuration options specific to the scalar quantization compression method used during indexing and querying.
ScoringFunctionAggregation	Defines the aggregation function used to combine the results of all the scoring functions in a scoring profile.
ScoringFunctionInterpolation	Defines the function used to interpolate score boosting across a range of documents.
ScoringProfile	Defines parameters for a search index that influence scoring in search queries.
SearchField	Represents a field in an index definition, which describes the name, data type, and search behavior of a field.
SearchFieldDataType	Defines the data type of a field in a search index.
SearchIndex	Represents a search index definition, which describes the fields and search behavior of an index.
SearchIndexerDataNoneIdentity	Clears the identity property of a datasource.
SearchIndexerDataUserAssignedIdentity	Specifies the identity for a datasource to use.
SearchResourceEncryptionKey	A customer-managed encryption key in Azure Key Vault. Keys that you create and manage can be used to encrypt or decrypt data-at-rest, such as indexes and synonym maps.
SemanticConfiguration	Defines a specific configuration to be used in the context of semantic capabilities.
SemanticField	A field that is used as part of the semantic configuration.
SemanticSettings	Defines parameters for a search index that influence semantic capabilities.
ShingleTokenFilter	Creates combinations of tokens as a single token. This token filter is implemented using Apache Lucene.
SnowballTokenFilter	A filter that stems words using a Snowball-generated stemmer. This token filter is implemented using Apache Lucene.
SnowballTokenFilterLanguage	The language to use for a Snowball token filter.
StemmerOverrideTokenFilter	Provides the ability to override other stemming filters with custom dictionary-based stemming. Any dictionary-stemmed terms will be marked as keywords so that they will not be stemmed with stemmers down the chain. Must be placed before any stemming filters. This token filter is implemented using Apache Lucene.
StemmerTokenFilter	Language specific stemming filter. This token filter is implemented using Apache Lucene.
StemmerTokenFilterLanguage	The language to use for a stemmer token filter.
StopAnalyzer	Divides text at non-letters; Applies the lowercase and stopword token filters. This analyzer is implemented using Apache Lucene.
StopwordsList	Identifies a predefined list of language-specific stopwords.
StopwordsTokenFilter	Removes stop words from a token stream. This token filter is implemented using Apache Lucene.
Suggester	Defines how the Suggest API should apply to a group of fields in the index.
SuggesterSearchMode	A value indicating the capabilities of the suggester.
SynonymTokenFilter	Matches single or multi-word synonyms in a token stream. This token filter is implemented using Apache Lucene.
TagScoringFunction	Defines a function that boosts scores of documents with string values matching a given list of tags.
TagScoringParameters	Provides parameter values to a tag scoring function.
TextWeights	Defines weights on index fields for which matches should boost scoring in search queries.
TokenCharacterKind	Represents classes of characters on which a token filter can operate.
TokenFilterName	Defines the names of all token filters supported by the search engine.
TruncateTokenFilter	Truncates the terms to a specific length. This token filter is implemented using Apache Lucene.
UaxUrlEmailTokenizer	Tokenizes urls and emails as one token. This tokenizer is implemented using Apache Lucene.
UniqueTokenFilter	Filters out tokens with same text as the previous token. This token filter is implemented using Apache Lucene.
VectorEncodingFormat	The encoding format for interpreting vector field contents.
VectorSearch	Contains configuration options related to vector search.
VectorSearchAlgorithmKind	The algorithm used for indexing and querying.
VectorSearchAlgorithmMetric	The similarity metric to use for vector comparisons. It is recommended to choose the same similarity metric as the embedding model was trained on.
VectorSearchCompressionKind	The compression method used for indexing and querying.
VectorSearchCompressionTargetDataType	The quantized data type of compressed vector values.
VectorSearchProfile	Defines a combination of configurations to use with vector search.
VectorSearchVectorizerKind	The vectorization method to be used during query time.
WebApiParameters	Specifies the properties for connecting to a user-defined vectorizer.
WebApiVectorizer	Specifies a user-defined vectorizer for generating the vector embedding of a query string. Integration of an external vectorizer is achieved using the custom Web API interface of a skillset.
WordDelimiterTokenFilter	Splits words into subwords and performs optional transformations on subword groups. This token filter is implemented using Apache Lucene.

AsciiFoldingTokenFilter

Converts alphabetic, numeric, and symbolic Unicode characters which are not in the first 127 ASCII characters (the "Basic Latin" Unicode block) into their ASCII equivalents, if such equivalents exist. This token filter is implemented using Apache Lucene.

Name	Type	Default value	Description
@odata.type	string: #Microsoft.Azure.Search.AsciiFoldingTokenFilter		A URI fragment specifying the type of token filter.
name	string		The name of the token filter. It must only contain letters, digits, spaces, dashes or underscores, can only start and end with alphanumeric characters, and is limited to 128 characters.
preserveOriginal	boolean	False	A value indicating whether the original token will be kept. Default is false.

AzureActiveDirectoryApplicationCredentials

Credentials of a registered application created for your search service, used for authenticated access to the encryption keys stored in Azure Key Vault.

Name	Type	Description
applicationId	string	An AAD Application ID that was granted the required access permissions to the Azure Key Vault that is to be used when encrypting your data at rest. The Application ID should not be confused with the Object ID for your AAD Application.
applicationSecret	string	The authentication key of the specified AAD application.

AzureOpenAIEmbeddingSkill

Allows you to generate a vector embedding for a given text input using the Azure OpenAI resource.

Name	Type	Description
@odata.type	string: #Microsoft.Skills.Text.AzureOpenAIEmbeddingSkill	A URI fragment specifying the type of skill.
apiKey	string	API key of the designated Azure OpenAI resource.
authIdentity	SearchIndexerDataIdentity: SearchIndexerDataNoneIdentity SearchIndexerDataUserAssignedIdentity	The user-assigned managed identity used for outbound connections.
context	string	Represents the level at which operations take place, such as the document root or document content (for example, /document or /document/content). The default is /document.
deploymentId	string	ID of the Azure OpenAI model deployment on the designated resource.
description	string	The description of the skill which describes the inputs, outputs, and usage of the skill.
dimensions	integer	The number of dimensions the resulting output embeddings should have. Only supported in text-embedding-3 and later models.
inputs	InputFieldMappingEntry[]	Inputs of the skills could be a column in the source data set, or the output of an upstream skill.
modelName	AzureOpenAIModelName	The name of the embedding model that is deployed at the provided deploymentId path.
name	string	The name of the skill which uniquely identifies it within the skillset. A skill with no name defined will be given a default name of its 1-based index in the skills array, prefixed with the character '#'.
outputs	OutputFieldMappingEntry[]	The output of a skill is either a field in a search index, or a value that can be consumed as an input by another skill.
resourceUri	string	The resource URI of the Azure OpenAI resource.

AzureOpenAIModelName

The Azure Open AI model name that will be called.

Name	Type	Description
text-embedding-3-large	string
text-embedding-3-small	string
text-embedding-ada-002	string

AzureOpenAIParameters

Specifies the parameters for connecting to the Azure OpenAI resource.

Name	Type	Description
apiKey	string	API key of the designated Azure OpenAI resource.
authIdentity	SearchIndexerDataIdentity: SearchIndexerDataNoneIdentity SearchIndexerDataUserAssignedIdentity	The user-assigned managed identity used for outbound connections.
deploymentId	string	ID of the Azure OpenAI model deployment on the designated resource.
modelName	AzureOpenAIModelName	The name of the embedding model that is deployed at the provided deploymentId path.
resourceUri	string	The resource URI of the Azure OpenAI resource.

AzureOpenAIVectorizer

Specifies the Azure OpenAI resource used to vectorize a query string.

Name	Type	Description
azureOpenAIParameters	AzureOpenAIParameters: AzureOpenAIEmbeddingSkill	Contains the parameters specific to Azure OpenAI embedding vectorization.
kind	string: azureOpenAI	The name of the kind of vectorization method being configured for use with vector search.
name	string	The name to associate with this particular vectorization method.

BinaryQuantizationVectorSearchCompressionConfiguration

Contains configuration options specific to the binary quantization compression method used during indexing and querying.

Name	Type	Default value	Description
defaultOversampling	number		Default oversampling factor. Oversampling will internally request more documents (specified by this multiplier) in the initial search. This increases the set of results that will be reranked using recomputed similarity scores from full-precision vectors. Minimum value is 1, meaning no oversampling (1x). This parameter can only be set when rerankWithOriginalVectors is true. Higher values improve recall at the expense of latency.
kind	string: binaryQuantization		The name of the kind of compression method being configured for use with vector search.
name	string		The name to associate with this particular configuration.
rerankWithOriginalVectors	boolean	True	If set to true, once the ordered set of results calculated using compressed vectors are obtained, they will be reranked again by recalculating the full-precision similarity scores. This will improve recall at the expense of latency.

BM25Similarity

Ranking function based on the Okapi BM25 similarity algorithm. BM25 is a TF-IDF-like algorithm that includes length normalization (controlled by the 'b' parameter) as well as term frequency saturation (controlled by the 'k1' parameter).

Name	Type	Description
@odata.type	string: #Microsoft.Azure.Search.BM25Similarity
b	number	This property controls how the length of a document affects the relevance score. By default, a value of 0.75 is used. A value of 0.0 means no length normalization is applied, while a value of 1.0 means the score is fully normalized by the length of the document.
k1	number	This property controls the scaling function between the term frequency of each matching terms and the final relevance score of a document-query pair. By default, a value of 1.2 is used. A value of 0.0 means the score does not scale with an increase in term frequency.

CharFilterName

Defines the names of all character filters supported by the search engine.

Name	Type	Description
html_strip	string	A character filter that attempts to strip out HTML constructs. See https://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/charfilter/HTMLStripCharFilter.html

CjkBigramTokenFilter

Forms bigrams of CJK terms that are generated from the standard tokenizer. This token filter is implemented using Apache Lucene.

Name	Type	Default value	Description
@odata.type	string: #Microsoft.Azure.Search.CjkBigramTokenFilter		A URI fragment specifying the type of token filter.
ignoreScripts	CjkBigramTokenFilterScripts[]		The scripts to ignore.
name	string		The name of the token filter. It must only contain letters, digits, spaces, dashes or underscores, can only start and end with alphanumeric characters, and is limited to 128 characters.
outputUnigrams	boolean	False	A value indicating whether to output both unigrams and bigrams (if true), or just bigrams (if false). Default is false.

CjkBigramTokenFilterScripts

Scripts that can be ignored by CjkBigramTokenFilter.

Name	Type	Description
han	string	Ignore Han script when forming bigrams of CJK terms.
hangul	string	Ignore Hangul script when forming bigrams of CJK terms.
hiragana	string	Ignore Hiragana script when forming bigrams of CJK terms.
katakana	string	Ignore Katakana script when forming bigrams of CJK terms.

ClassicSimilarity

Legacy similarity algorithm which uses the Lucene TFIDFSimilarity implementation of TF-IDF. This variation of TF-IDF introduces static document length normalization as well as coordinating factors that penalize documents that only partially match the searched queries.

Name	Type	Description
@odata.type	string: #Microsoft.Azure.Search.ClassicSimilarity

ClassicTokenizer

Grammar-based tokenizer that is suitable for processing most European-language documents. This tokenizer is implemented using Apache Lucene.

Name	Type	Default value	Description
@odata.type	string: #Microsoft.Azure.Search.ClassicTokenizer		A URI fragment specifying the type of tokenizer.
maxTokenLength	integer	255	The maximum token length. Default is 255. Tokens longer than the maximum length are split. The maximum token length that can be used is 300 characters.
name	string		The name of the tokenizer. It must only contain letters, digits, spaces, dashes or underscores, can only start and end with alphanumeric characters, and is limited to 128 characters.

CommonGramTokenFilter

Construct bigrams for frequently occurring terms while indexing. Single terms are still indexed too, with bigrams overlaid. This token filter is implemented using Apache Lucene.

Name	Type	Default value	Description
@odata.type	string: #Microsoft.Azure.Search.CommonGramTokenFilter		A URI fragment specifying the type of token filter.
commonWords	string[]		The set of common words.
ignoreCase	boolean	False	A value indicating whether common words matching will be case insensitive. Default is false.
name	string		The name of the token filter. It must only contain letters, digits, spaces, dashes or underscores, can only start and end with alphanumeric characters, and is limited to 128 characters.
queryMode	boolean	False	A value that indicates whether the token filter is in query mode. When in query mode, the token filter generates bigrams and then removes common words and single terms followed by a common word. Default is false.

CorsOptions

Defines options to control Cross-Origin Resource Sharing (CORS) for an index.

Name	Type	Description
allowedOrigins	string[]	The list of origins from which JavaScript code will be granted access to your index. Can contain a list of hosts of the form {protocol}://{fully-qualified-domain-name}[:{port#}], or a single '*' to allow all origins (not recommended).
maxAgeInSeconds	integer	The duration for which browsers should cache CORS preflight responses. Defaults to 5 minutes.

CustomAnalyzer

Allows you to take control over the process of converting text into indexable/searchable tokens. It's a user-defined configuration consisting of a single predefined tokenizer and one or more filters. The tokenizer is responsible for breaking text into tokens, and the filters for modifying tokens emitted by the tokenizer.

Name	Type	Description
@odata.type	string: #Microsoft.Azure.Search.CustomAnalyzer	A URI fragment specifying the type of analyzer.
charFilters	CharFilterName[]	A list of character filters used to prepare input text before it is processed by the tokenizer. For instance, they can replace certain characters or symbols. The filters are run in the order in which they are listed.
name	string	The name of the analyzer. It must only contain letters, digits, spaces, dashes or underscores, can only start and end with alphanumeric characters, and is limited to 128 characters.
tokenFilters	TokenFilterName[]	A list of token filters used to filter out or modify the tokens generated by a tokenizer. For example, you can specify a lowercase filter that converts all characters to lowercase. The filters are run in the order in which they are listed.
tokenizer	LexicalTokenizerName	The name of the tokenizer to use to divide continuous text into a sequence of tokens, such as breaking a sentence into words.

DictionaryDecompounderTokenFilter

Decomposes compound words found in many Germanic languages. This token filter is implemented using Apache Lucene.

Name	Type	Default value	Description
@odata.type	string: #Microsoft.Azure.Search.DictionaryDecompounderTokenFilter		A URI fragment specifying the type of token filter.
maxSubwordSize	integer	15	The maximum subword size. Only subwords shorter than this are outputted. Default is 15. Maximum is 300.
minSubwordSize	integer	2	The minimum subword size. Only subwords longer than this are outputted. Default is 2. Maximum is 300.
minWordSize	integer	5	The minimum word size. Only words longer than this get processed. Default is 5. Maximum is 300.
name	string		The name of the token filter. It must only contain letters, digits, spaces, dashes or underscores, can only start and end with alphanumeric characters, and is limited to 128 characters.
onlyLongestMatch	boolean	False	A value indicating whether to add only the longest matching subword to the output. Default is false.
wordList	string[]		The list of words to match against.

DistanceScoringFunction

Defines a function that boosts scores based on distance from a geographic location.

Name	Type	Description
boost	number	A multiplier for the raw score. Must be a positive number not equal to 1.0.
distance	DistanceScoringParameters	Parameter values for the distance scoring function.
fieldName	string	The name of the field used as input to the scoring function.
interpolation	ScoringFunctionInterpolation	A value indicating how boosting will be interpolated across document scores; defaults to "Linear".
type	string: distance	Indicates the type of function to use. Valid values include magnitude, freshness, distance, and tag. The function type must be lower case.

DistanceScoringParameters

Provides parameter values to a distance scoring function.

Name	Type	Description
boostingDistance	number	The distance in kilometers from the reference location where the boosting range ends.
referencePointParameter	string	The name of the parameter passed in search queries to specify the reference location.

EdgeNGramTokenFilter

Generates n-grams of the given size(s) starting from the front or the back of an input token. This token filter is implemented using Apache Lucene.

Name	Type	Default value	Description
@odata.type	string: #Microsoft.Azure.Search.EdgeNGramTokenFilter		A URI fragment specifying the type of token filter.
maxGram	integer	2	The maximum n-gram length. Default is 2.
minGram	integer	1	The minimum n-gram length. Default is 1. Must be less than the value of maxGram.
name	string		The name of the token filter. It must only contain letters, digits, spaces, dashes or underscores, can only start and end with alphanumeric characters, and is limited to 128 characters.
side	EdgeNGramTokenFilterSide	front	Specifies which side of the input the n-gram should be generated from. Default is "front".

EdgeNGramTokenFilterSide

Specifies which side of the input an n-gram should be generated from.

Name	Type	Description
back	string	Specifies that the n-gram should be generated from the back of the input.
front	string	Specifies that the n-gram should be generated from the front of the input.

EdgeNGramTokenFilterV2

Generates n-grams of the given size(s) starting from the front or the back of an input token. This token filter is implemented using Apache Lucene.

Name	Type	Default value	Description
@odata.type	string: #Microsoft.Azure.Search.EdgeNGramTokenFilterV2		A URI fragment specifying the type of token filter.
maxGram	integer	2	The maximum n-gram length. Default is 2. Maximum is 300.
minGram	integer	1	The minimum n-gram length. Default is 1. Maximum is 300. Must be less than the value of maxGram.
name	string		The name of the token filter. It must only contain letters, digits, spaces, dashes or underscores, can only start and end with alphanumeric characters, and is limited to 128 characters.
side	EdgeNGramTokenFilterSide	front	Specifies which side of the input the n-gram should be generated from. Default is "front".

EdgeNGramTokenizer

Tokenizes the input from an edge into n-grams of the given size(s). This tokenizer is implemented using Apache Lucene.

Name	Type	Default value	Description
@odata.type	string: #Microsoft.Azure.Search.EdgeNGramTokenizer		A URI fragment specifying the type of tokenizer.
maxGram	integer	2	The maximum n-gram length. Default is 2. Maximum is 300.
minGram	integer	1	The minimum n-gram length. Default is 1. Maximum is 300. Must be less than the value of maxGram.
name	string		The name of the tokenizer. It must only contain letters, digits, spaces, dashes or underscores, can only start and end with alphanumeric characters, and is limited to 128 characters.
tokenChars	TokenCharacterKind[]		Character classes to keep in the tokens.

ElisionTokenFilter

Removes elisions. For example, "l'avion" (the plane) will be converted to "avion" (plane). This token filter is implemented using Apache Lucene.

Name	Type	Description
@odata.type	string: #Microsoft.Azure.Search.ElisionTokenFilter	A URI fragment specifying the type of token filter.
articles	string[]	The set of articles to remove.
name	string	The name of the token filter. It must only contain letters, digits, spaces, dashes or underscores, can only start and end with alphanumeric characters, and is limited to 128 characters.

ErrorAdditionalInfo

The resource management error additional info.

Name	Type	Description
info	object	The additional info.
type	string	The additional info type.

ErrorDetail

The error detail.

Name	Type	Description
additionalInfo	ErrorAdditionalInfo[]	The error additional info.
code	string	The error code.
details	ErrorDetail[]	The error details.
message	string	The error message.
target	string	The error target.

ErrorResponse

Error response

Name	Type	Description
error	ErrorDetail	The error object.

ExhaustiveKnnParameters

Contains the parameters specific to exhaustive KNN algorithm.

Name	Type	Description
metric	VectorSearchAlgorithmMetric	The similarity metric to use for vector comparisons.

ExhaustiveKnnVectorSearchAlgorithmConfiguration

Contains configuration options specific to the exhaustive KNN algorithm used during querying, which will perform brute-force search across the entire vector index.

Name	Type	Description
exhaustiveKnnParameters	ExhaustiveKnnParameters	Contains the parameters specific to exhaustive KNN algorithm.
kind	string: exhaustiveKnn	The name of the kind of algorithm being configured for use with vector search.
name	string	The name to associate with this particular configuration.

FreshnessScoringFunction

Defines a function that boosts scores based on the value of a date-time field.

Name	Type	Description
boost	number	A multiplier for the raw score. Must be a positive number not equal to 1.0.
fieldName	string	The name of the field used as input to the scoring function.
freshness	FreshnessScoringParameters	Parameter values for the freshness scoring function.
interpolation	ScoringFunctionInterpolation	A value indicating how boosting will be interpolated across document scores; defaults to "Linear".
type	string: freshness	Indicates the type of function to use. Valid values include magnitude, freshness, distance, and tag. The function type must be lower case.

FreshnessScoringParameters

Provides parameter values to a freshness scoring function.

Name	Type	Description
boostingDuration	string	The expiration period after which boosting will stop for a particular document.

HnswParameters

Contains the parameters specific to the HNSW algorithm.

Name	Type	Default value	Description
efConstruction	integer	400	The size of the dynamic list containing the nearest neighbors, which is used during index time. Increasing this parameter may improve index quality, at the expense of increased indexing time. At a certain point, increasing this parameter leads to diminishing returns.
efSearch	integer	500	The size of the dynamic list containing the nearest neighbors, which is used during search time. Increasing this parameter may improve search results, at the expense of slower search. At a certain point, increasing this parameter leads to diminishing returns.
m	integer	4	The number of bi-directional links created for every new element during construction. Increasing this parameter value may improve recall and reduce retrieval times for datasets with high intrinsic dimensionality at the expense of increased memory consumption and longer indexing time.
metric	VectorSearchAlgorithmMetric		The similarity metric to use for vector comparisons.

HnswVectorSearchAlgorithmConfiguration

Contains configuration options specific to the HNSW approximate nearest neighbors algorithm used during indexing and querying. The HNSW algorithm offers a tunable trade-off between search speed and accuracy.

Name	Type	Description
hnswParameters	HnswParameters	Contains the parameters specific to HNSW algorithm.
kind	string: hnsw	The name of the kind of algorithm being configured for use with vector search.
name	string	The name to associate with this particular configuration.

InputFieldMappingEntry

Input field mapping for a skill.

Name	Type	Description
inputs	InputFieldMappingEntry[]	The recursive inputs used when creating a complex type.
name	string	The name of the input.
source	string	The source of the input.
sourceContext	string	The source context used for selecting recursive inputs.

KeepTokenFilter

A token filter that only keeps tokens with text contained in a specified list of words. This token filter is implemented using Apache Lucene.

Name	Type	Default value	Description
@odata.type	string: #Microsoft.Azure.Search.KeepTokenFilter		A URI fragment specifying the type of token filter.
keepWords	string[]		The list of words to keep.
keepWordsCase	boolean	False	A value indicating whether to lower case all words first. Default is false.
name	string		The name of the token filter. It must only contain letters, digits, spaces, dashes or underscores, can only start and end with alphanumeric characters, and is limited to 128 characters.

KeywordMarkerTokenFilter

Marks terms as keywords. This token filter is implemented using Apache Lucene.

Name	Type	Default value	Description
@odata.type	string: #Microsoft.Azure.Search.KeywordMarkerTokenFilter		A URI fragment specifying the type of token filter.
ignoreCase	boolean	False	A value indicating whether to ignore case. If true, all words are converted to lower case first. Default is false.
keywords	string[]		A list of words to mark as keywords.
name	string		The name of the token filter. It must only contain letters, digits, spaces, dashes or underscores, can only start and end with alphanumeric characters, and is limited to 128 characters.

KeywordTokenizer

Emits the entire input as a single token. This tokenizer is implemented using Apache Lucene.

Name	Type	Default value	Description
@odata.type	string: #Microsoft.Azure.Search.KeywordTokenizer		A URI fragment specifying the type of tokenizer.
bufferSize	integer	256	The read buffer size in bytes. Default is 256.
name	string		The name of the tokenizer. It must only contain letters, digits, spaces, dashes or underscores, can only start and end with alphanumeric characters, and is limited to 128 characters.

KeywordTokenizerV2

Emits the entire input as a single token. This tokenizer is implemented using Apache Lucene.

Name	Type	Default value	Description
@odata.type	string: #Microsoft.Azure.Search.KeywordTokenizerV2		A URI fragment specifying the type of tokenizer.
maxTokenLength	integer	256	The maximum token length. Default is 256. Tokens longer than the maximum length are split. The maximum token length that can be used is 300 characters.
name	string		The name of the tokenizer. It must only contain letters, digits, spaces, dashes or underscores, can only start and end with alphanumeric characters, and is limited to 128 characters.

LengthTokenFilter

Removes words that are too long or too short. This token filter is implemented using Apache Lucene.

Name	Type	Default value	Description
@odata.type	string: #Microsoft.Azure.Search.LengthTokenFilter		A URI fragment specifying the type of token filter.
max	integer	300	The maximum length in characters. Default and maximum is 300.
min	integer	0	The minimum length in characters. Default is 0. Maximum is 300. Must be less than the value of max.
name	string		The name of the token filter. It must only contain letters, digits, spaces, dashes or underscores, can only start and end with alphanumeric characters, and is limited to 128 characters.

LexicalAnalyzerName

Defines the names of all text analyzers supported by the search engine.

Name	Type	Description
ar.lucene	string	Lucene analyzer for Arabic.
ar.microsoft	string	Microsoft analyzer for Arabic.
bg.lucene	string	Lucene analyzer for Bulgarian.
bg.microsoft	string	Microsoft analyzer for Bulgarian.
bn.microsoft	string	Microsoft analyzer for Bangla.
ca.lucene	string	Lucene analyzer for Catalan.
ca.microsoft	string	Microsoft analyzer for Catalan.
cs.lucene	string	Lucene analyzer for Czech.
cs.microsoft	string	Microsoft analyzer for Czech.
da.lucene	string	Lucene analyzer for Danish.
da.microsoft	string	Microsoft analyzer for Danish.
de.lucene	string	Lucene analyzer for German.
de.microsoft	string	Microsoft analyzer for German.
el.lucene	string	Lucene analyzer for Greek.
el.microsoft	string	Microsoft analyzer for Greek.
en.lucene	string	Lucene analyzer for English.
en.microsoft	string	Microsoft analyzer for English.
es.lucene	string	Lucene analyzer for Spanish.
es.microsoft	string	Microsoft analyzer for Spanish.
et.microsoft	string	Microsoft analyzer for Estonian.
eu.lucene	string	Lucene analyzer for Basque.
fa.lucene	string	Lucene analyzer for Persian.
fi.lucene	string	Lucene analyzer for Finnish.
fi.microsoft	string	Microsoft analyzer for Finnish.
fr.lucene	string	Lucene analyzer for French.
fr.microsoft	string	Microsoft analyzer for French.
ga.lucene	string	Lucene analyzer for Irish.
gl.lucene	string	Lucene analyzer for Galician.
gu.microsoft	string	Microsoft analyzer for Gujarati.
he.microsoft	string	Microsoft analyzer for Hebrew.
hi.lucene	string	Lucene analyzer for Hindi.
hi.microsoft	string	Microsoft analyzer for Hindi.
hr.microsoft	string	Microsoft analyzer for Croatian.
hu.lucene	string	Lucene analyzer for Hungarian.
hu.microsoft	string	Microsoft analyzer for Hungarian.
hy.lucene	string	Lucene analyzer for Armenian.
id.lucene	string	Lucene analyzer for Indonesian.
id.microsoft	string	Microsoft analyzer for Indonesian (Bahasa).
is.microsoft	string	Microsoft analyzer for Icelandic.
it.lucene	string	Lucene analyzer for Italian.
it.microsoft	string	Microsoft analyzer for Italian.
ja.lucene	string	Lucene analyzer for Japanese.
ja.microsoft	string	Microsoft analyzer for Japanese.
keyword	string	Treats the entire content of a field as a single token. This is useful for data like zip codes, ids, and some product names. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/core/KeywordAnalyzer.html
kn.microsoft	string	Microsoft analyzer for Kannada.
ko.lucene	string	Lucene analyzer for Korean.
ko.microsoft	string	Microsoft analyzer for Korean.
lt.microsoft	string	Microsoft analyzer for Lithuanian.
lv.lucene	string	Lucene analyzer for Latvian.
lv.microsoft	string	Microsoft analyzer for Latvian.
ml.microsoft	string	Microsoft analyzer for Malayalam.
mr.microsoft	string	Microsoft analyzer for Marathi.
ms.microsoft	string	Microsoft analyzer for Malay (Latin).
nb.microsoft	string	Microsoft analyzer for Norwegian (Bokmål).
nl.lucene	string	Lucene analyzer for Dutch.
nl.microsoft	string	Microsoft analyzer for Dutch.
no.lucene	string	Lucene analyzer for Norwegian.
pa.microsoft	string	Microsoft analyzer for Punjabi.
pattern	string	Flexibly separates text into terms via a regular expression pattern. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/miscellaneous/PatternAnalyzer.html
pl.lucene	string	Lucene analyzer for Polish.
pl.microsoft	string	Microsoft analyzer for Polish.
pt-BR.lucene	string	Lucene analyzer for Portuguese (Brazil).
pt-BR.microsoft	string	Microsoft analyzer for Portuguese (Brazil).
pt-PT.lucene	string	Lucene analyzer for Portuguese (Portugal).
pt-PT.microsoft	string	Microsoft analyzer for Portuguese (Portugal).
ro.lucene	string	Lucene analyzer for Romanian.
ro.microsoft	string	Microsoft analyzer for Romanian.
ru.lucene	string	Lucene analyzer for Russian.
ru.microsoft	string	Microsoft analyzer for Russian.
simple	string	Divides text at non-letters and converts them to lower case. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/core/SimpleAnalyzer.html
sk.microsoft	string	Microsoft analyzer for Slovak.
sl.microsoft	string	Microsoft analyzer for Slovenian.
sr-cyrillic.microsoft	string	Microsoft analyzer for Serbian (Cyrillic).
sr-latin.microsoft	string	Microsoft analyzer for Serbian (Latin).
standard.lucene	string	Standard Lucene analyzer.
standardasciifolding.lucene	string	Standard ASCII Folding Lucene analyzer. See https://zcusa.951200.xyz/rest/api/searchservice/Custom-analyzers-in-Azure-Search#Analyzers
stop	string	Divides text at non-letters; Applies the lowercase and stopword token filters. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/core/StopAnalyzer.html
sv.lucene	string	Lucene analyzer for Swedish.
sv.microsoft	string	Microsoft analyzer for Swedish.
ta.microsoft	string	Microsoft analyzer for Tamil.
te.microsoft	string	Microsoft analyzer for Telugu.
th.lucene	string	Lucene analyzer for Thai.
th.microsoft	string	Microsoft analyzer for Thai.
tr.lucene	string	Lucene analyzer for Turkish.
tr.microsoft	string	Microsoft analyzer for Turkish.
uk.microsoft	string	Microsoft analyzer for Ukrainian.
ur.microsoft	string	Microsoft analyzer for Urdu.
vi.microsoft	string	Microsoft analyzer for Vietnamese.
whitespace	string	An analyzer that uses the whitespace tokenizer. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/core/WhitespaceAnalyzer.html
zh-Hans.lucene	string	Lucene analyzer for Chinese (Simplified).
zh-Hans.microsoft	string	Microsoft analyzer for Chinese (Simplified).
zh-Hant.lucene	string	Lucene analyzer for Chinese (Traditional).
zh-Hant.microsoft	string	Microsoft analyzer for Chinese (Traditional).

LexicalTokenizerName

Defines the names of all tokenizers supported by the search engine.

Name	Type	Description
classic	string	Grammar-based tokenizer that is suitable for processing most European-language documents. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/standard/ClassicTokenizer.html
edgeNGram	string	Tokenizes the input from an edge into n-grams of the given size(s). See https://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/ngram/EdgeNGramTokenizer.html
keyword_v2	string	Emits the entire input as a single token. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/core/KeywordTokenizer.html
letter	string	Divides text at non-letters. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/core/LetterTokenizer.html
lowercase	string	Divides text at non-letters and converts them to lower case. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/core/LowerCaseTokenizer.html
microsoft_language_stemming_tokenizer	string	Divides text using language-specific rules and reduces words to their base forms.
microsoft_language_tokenizer	string	Divides text using language-specific rules.
nGram	string	Tokenizes the input into n-grams of the given size(s). See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/ngram/NGramTokenizer.html
path_hierarchy_v2	string	Tokenizer for path-like hierarchies. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/path/PathHierarchyTokenizer.html
pattern	string	Tokenizer that uses regex pattern matching to construct distinct tokens. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/pattern/PatternTokenizer.html
standard_v2	string	Standard Lucene analyzer; Composed of the standard tokenizer, lowercase filter and stop filter. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/standard/StandardTokenizer.html
uax_url_email	string	Tokenizes urls and emails as one token. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/standard/UAX29URLEmailTokenizer.html
whitespace	string	Divides text at whitespace. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/core/WhitespaceTokenizer.html

LimitTokenFilter

Limits the number of tokens while indexing. This token filter is implemented using Apache Lucene.

Name	Type	Default value	Description
@odata.type	string: #Microsoft.Azure.Search.LimitTokenFilter		A URI fragment specifying the type of token filter.
consumeAllTokens	boolean	False	A value indicating whether all tokens from the input must be consumed even if maxTokenCount is reached. Default is false.
maxTokenCount	integer	1	The maximum number of tokens to produce. Default is 1.
name	string		The name of the token filter. It must only contain letters, digits, spaces, dashes or underscores, can only start and end with alphanumeric characters, and is limited to 128 characters.

LuceneStandardAnalyzer

Standard Apache Lucene analyzer; Composed of the standard tokenizer, lowercase filter and stop filter.

Name	Type	Default value	Description
@odata.type	string: #Microsoft.Azure.Search.StandardAnalyzer		A URI fragment specifying the type of analyzer.
maxTokenLength	integer	255	The maximum token length. Default is 255. Tokens longer than the maximum length are split. The maximum token length that can be used is 300 characters.
name	string		The name of the analyzer. It must only contain letters, digits, spaces, dashes or underscores, can only start and end with alphanumeric characters, and is limited to 128 characters.
stopwords	string[]		A list of stopwords.

LuceneStandardTokenizer

Breaks text following the Unicode Text Segmentation rules. This tokenizer is implemented using Apache Lucene.

Name	Type	Default value	Description
@odata.type	string: #Microsoft.Azure.Search.StandardTokenizer		A URI fragment specifying the type of tokenizer.
maxTokenLength	integer	255	The maximum token length. Default is 255. Tokens longer than the maximum length are split.
name	string		The name of the tokenizer. It must only contain letters, digits, spaces, dashes or underscores, can only start and end with alphanumeric characters, and is limited to 128 characters.

LuceneStandardTokenizerV2

Breaks text following the Unicode Text Segmentation rules. This tokenizer is implemented using Apache Lucene.

Name	Type	Default value	Description
@odata.type	string: #Microsoft.Azure.Search.StandardTokenizerV2		A URI fragment specifying the type of tokenizer.
maxTokenLength	integer	255	The maximum token length. Default is 255. Tokens longer than the maximum length are split. The maximum token length that can be used is 300 characters.
name	string		The name of the tokenizer. It must only contain letters, digits, spaces, dashes or underscores, can only start and end with alphanumeric characters, and is limited to 128 characters.

MagnitudeScoringFunction

Defines a function that boosts scores based on the magnitude of a numeric field.

Name	Type	Description
boost	number	A multiplier for the raw score. Must be a positive number not equal to 1.0.
fieldName	string	The name of the field used as input to the scoring function.
interpolation	ScoringFunctionInterpolation	A value indicating how boosting will be interpolated across document scores; defaults to "Linear".
magnitude	MagnitudeScoringParameters	Parameter values for the magnitude scoring function.
type	string: magnitude	Indicates the type of function to use. Valid values include magnitude, freshness, distance, and tag. The function type must be lower case.

MagnitudeScoringParameters

Provides parameter values to a magnitude scoring function.

Name	Type	Description
boostingRangeEnd	number	The field value at which boosting ends.
boostingRangeStart	number	The field value at which boosting starts.
constantBoostBeyondRange	boolean	A value indicating whether to apply a constant boost for field values beyond the range end value; default is false.

MappingCharFilter

A character filter that applies mappings defined with the mappings option. Matching is greedy (longest pattern matching at a given point wins). Replacement is allowed to be the empty string. This character filter is implemented using Apache Lucene.

Name	Type	Description
@odata.type	string: #Microsoft.Azure.Search.MappingCharFilter	A URI fragment specifying the type of char filter.
mappings	string[]	A list of mappings of the following format: "a=>b" (all occurrences of the character "a" will be replaced with character "b").
name	string	The name of the char filter. It must only contain letters, digits, spaces, dashes or underscores, can only start and end with alphanumeric characters, and is limited to 128 characters.

MicrosoftLanguageStemmingTokenizer

Divides text using language-specific rules and reduces words to their base forms.

Name	Type	Default value	Description
@odata.type	string: #Microsoft.Azure.Search.MicrosoftLanguageStemmingTokenizer		A URI fragment specifying the type of tokenizer.
isSearchTokenizer	boolean	False	A value indicating how the tokenizer is used. Set to true if used as the search tokenizer, set to false if used as the indexing tokenizer. Default is false.
language	MicrosoftStemmingTokenizerLanguage		The language to use. The default is English.
maxTokenLength	integer	255	The maximum token length. Tokens longer than the maximum length are split. Maximum token length that can be used is 300 characters. Tokens longer than 300 characters are first split into tokens of length 300 and then each of those tokens is split based on the max token length set. Default is 255.
name	string		The name of the tokenizer. It must only contain letters, digits, spaces, dashes or underscores, can only start and end with alphanumeric characters, and is limited to 128 characters.

MicrosoftLanguageTokenizer

Divides text using language-specific rules.

Name	Type	Default value	Description
@odata.type	string: #Microsoft.Azure.Search.MicrosoftLanguageTokenizer		A URI fragment specifying the type of tokenizer.
isSearchTokenizer	boolean	False	A value indicating how the tokenizer is used. Set to true if used as the search tokenizer, set to false if used as the indexing tokenizer. Default is false.
language	MicrosoftTokenizerLanguage		The language to use. The default is English.
maxTokenLength	integer	255	The maximum token length. Tokens longer than the maximum length are split. Maximum token length that can be used is 300 characters. Tokens longer than 300 characters are first split into tokens of length 300 and then each of those tokens is split based on the max token length set. Default is 255.
name	string		The name of the tokenizer. It must only contain letters, digits, spaces, dashes or underscores, can only start and end with alphanumeric characters, and is limited to 128 characters.

MicrosoftStemmingTokenizerLanguage

Lists the languages supported by the Microsoft language stemming tokenizer.

Name	Type	Description
arabic	string	Selects the Microsoft stemming tokenizer for Arabic.
bangla	string	Selects the Microsoft stemming tokenizer for Bangla.
bulgarian	string	Selects the Microsoft stemming tokenizer for Bulgarian.
catalan	string	Selects the Microsoft stemming tokenizer for Catalan.
croatian	string	Selects the Microsoft stemming tokenizer for Croatian.
czech	string	Selects the Microsoft stemming tokenizer for Czech.
danish	string	Selects the Microsoft stemming tokenizer for Danish.
dutch	string	Selects the Microsoft stemming tokenizer for Dutch.
english	string	Selects the Microsoft stemming tokenizer for English.
estonian	string	Selects the Microsoft stemming tokenizer for Estonian.
finnish	string	Selects the Microsoft stemming tokenizer for Finnish.
french	string	Selects the Microsoft stemming tokenizer for French.
german	string	Selects the Microsoft stemming tokenizer for German.
greek	string	Selects the Microsoft stemming tokenizer for Greek.
gujarati	string	Selects the Microsoft stemming tokenizer for Gujarati.
hebrew	string	Selects the Microsoft stemming tokenizer for Hebrew.
hindi	string	Selects the Microsoft stemming tokenizer for Hindi.
hungarian	string	Selects the Microsoft stemming tokenizer for Hungarian.
icelandic	string	Selects the Microsoft stemming tokenizer for Icelandic.
indonesian	string	Selects the Microsoft stemming tokenizer for Indonesian.
italian	string	Selects the Microsoft stemming tokenizer for Italian.
kannada	string	Selects the Microsoft stemming tokenizer for Kannada.
latvian	string	Selects the Microsoft stemming tokenizer for Latvian.
lithuanian	string	Selects the Microsoft stemming tokenizer for Lithuanian.
malay	string	Selects the Microsoft stemming tokenizer for Malay.
malayalam	string	Selects the Microsoft stemming tokenizer for Malayalam.
marathi	string	Selects the Microsoft stemming tokenizer for Marathi.
norwegianBokmaal	string	Selects the Microsoft stemming tokenizer for Norwegian (Bokmål).
polish	string	Selects the Microsoft stemming tokenizer for Polish.
portuguese	string	Selects the Microsoft stemming tokenizer for Portuguese.
portugueseBrazilian	string	Selects the Microsoft stemming tokenizer for Portuguese (Brazil).
punjabi	string	Selects the Microsoft stemming tokenizer for Punjabi.
romanian	string	Selects the Microsoft stemming tokenizer for Romanian.
russian	string	Selects the Microsoft stemming tokenizer for Russian.
serbianCyrillic	string	Selects the Microsoft stemming tokenizer for Serbian (Cyrillic).
serbianLatin	string	Selects the Microsoft stemming tokenizer for Serbian (Latin).
slovak	string	Selects the Microsoft stemming tokenizer for Slovak.
slovenian	string	Selects the Microsoft stemming tokenizer for Slovenian.
spanish	string	Selects the Microsoft stemming tokenizer for Spanish.
swedish	string	Selects the Microsoft stemming tokenizer for Swedish.
tamil	string	Selects the Microsoft stemming tokenizer for Tamil.
telugu	string	Selects the Microsoft stemming tokenizer for Telugu.
turkish	string	Selects the Microsoft stemming tokenizer for Turkish.
ukrainian	string	Selects the Microsoft stemming tokenizer for Ukrainian.
urdu	string	Selects the Microsoft stemming tokenizer for Urdu.

MicrosoftTokenizerLanguage

Lists the languages supported by the Microsoft language tokenizer.

Name	Type	Description
bangla	string	Selects the Microsoft tokenizer for Bangla.
bulgarian	string	Selects the Microsoft tokenizer for Bulgarian.
catalan	string	Selects the Microsoft tokenizer for Catalan.
chineseSimplified	string	Selects the Microsoft tokenizer for Chinese (Simplified).
chineseTraditional	string	Selects the Microsoft tokenizer for Chinese (Traditional).
croatian	string	Selects the Microsoft tokenizer for Croatian.
czech	string	Selects the Microsoft tokenizer for Czech.
danish	string	Selects the Microsoft tokenizer for Danish.
dutch	string	Selects the Microsoft tokenizer for Dutch.
english	string	Selects the Microsoft tokenizer for English.
french	string	Selects the Microsoft tokenizer for French.
german	string	Selects the Microsoft tokenizer for German.
greek	string	Selects the Microsoft tokenizer for Greek.
gujarati	string	Selects the Microsoft tokenizer for Gujarati.
hindi	string	Selects the Microsoft tokenizer for Hindi.
icelandic	string	Selects the Microsoft tokenizer for Icelandic.
indonesian	string	Selects the Microsoft tokenizer for Indonesian.
italian	string	Selects the Microsoft tokenizer for Italian.
japanese	string	Selects the Microsoft tokenizer for Japanese.
kannada	string	Selects the Microsoft tokenizer for Kannada.
korean	string	Selects the Microsoft tokenizer for Korean.
malay	string	Selects the Microsoft tokenizer for Malay.
malayalam	string	Selects the Microsoft tokenizer for Malayalam.
marathi	string	Selects the Microsoft tokenizer for Marathi.
norwegianBokmaal	string	Selects the Microsoft tokenizer for Norwegian (Bokmål).
polish	string	Selects the Microsoft tokenizer for Polish.
portuguese	string	Selects the Microsoft tokenizer for Portuguese.
portugueseBrazilian	string	Selects the Microsoft tokenizer for Portuguese (Brazil).
punjabi	string	Selects the Microsoft tokenizer for Punjabi.
romanian	string	Selects the Microsoft tokenizer for Romanian.
russian	string	Selects the Microsoft tokenizer for Russian.
serbianCyrillic	string	Selects the Microsoft tokenizer for Serbian (Cyrillic).
serbianLatin	string	Selects the Microsoft tokenizer for Serbian (Latin).
slovenian	string	Selects the Microsoft tokenizer for Slovenian.
spanish	string	Selects the Microsoft tokenizer for Spanish.
swedish	string	Selects the Microsoft tokenizer for Swedish.
tamil	string	Selects the Microsoft tokenizer for Tamil.
telugu	string	Selects the Microsoft tokenizer for Telugu.
thai	string	Selects the Microsoft tokenizer for Thai.
ukrainian	string	Selects the Microsoft tokenizer for Ukrainian.
urdu	string	Selects the Microsoft tokenizer for Urdu.
vietnamese	string	Selects the Microsoft tokenizer for Vietnamese.

NGramTokenFilter

Generates n-grams of the given size(s). This token filter is implemented using Apache Lucene.

Name	Type	Default value	Description
@odata.type	string: #Microsoft.Azure.Search.NGramTokenFilter		A URI fragment specifying the type of token filter.
maxGram	integer	2	The maximum n-gram length. Default is 2.
minGram	integer	1	The minimum n-gram length. Default is 1. Must be less than the value of maxGram.
name	string		The name of the token filter. It must only contain letters, digits, spaces, dashes or underscores, can only start and end with alphanumeric characters, and is limited to 128 characters.

NGramTokenFilterV2

Generates n-grams of the given size(s). This token filter is implemented using Apache Lucene.

Name	Type	Default value	Description
@odata.type	string: #Microsoft.Azure.Search.NGramTokenFilterV2		A URI fragment specifying the type of token filter.
maxGram	integer	2	The maximum n-gram length. Default is 2. Maximum is 300.
minGram	integer	1	The minimum n-gram length. Default is 1. Maximum is 300. Must be less than the value of maxGram.
name	string		The name of the token filter. It must only contain letters, digits, spaces, dashes or underscores, can only start and end with alphanumeric characters, and is limited to 128 characters.

NGramTokenizer

Tokenizes the input into n-grams of the given size(s). This tokenizer is implemented using Apache Lucene.

Name	Type	Default value	Description
@odata.type	string: #Microsoft.Azure.Search.NGramTokenizer		A URI fragment specifying the type of tokenizer.
maxGram	integer	2	The maximum n-gram length. Default is 2. Maximum is 300.
minGram	integer	1	The minimum n-gram length. Default is 1. Maximum is 300. Must be less than the value of maxGram.
name	string		The name of the tokenizer. It must only contain letters, digits, spaces, dashes or underscores, can only start and end with alphanumeric characters, and is limited to 128 characters.
tokenChars	TokenCharacterKind[]		Character classes to keep in the tokens.

OutputFieldMappingEntry

Output field mapping for a skill.

Name	Type	Description
name	string	The name of the output defined by the skill.
targetName	string	The target name of the output. It is optional and default to name.

PathHierarchyTokenizerV2

Tokenizer for path-like hierarchies. This tokenizer is implemented using Apache Lucene.

Name	Type	Default value	Description
@odata.type	string: #Microsoft.Azure.Search.PathHierarchyTokenizerV2		A URI fragment specifying the type of tokenizer.
delimiter	string	/	The delimiter character to use. Default is "/".
maxTokenLength	integer	300	The maximum token length. Default and maximum is 300.
name	string		The name of the tokenizer. It must only contain letters, digits, spaces, dashes or underscores, can only start and end with alphanumeric characters, and is limited to 128 characters.
replacement	string	/	A value that, if set, replaces the delimiter character. Default is "/".
reverse	boolean	False	A value indicating whether to generate tokens in reverse order. Default is false.
skip	integer	0	The number of initial tokens to skip. Default is 0.

PatternAnalyzer

Flexibly separates text into terms via a regular expression pattern. This analyzer is implemented using Apache Lucene.

Name	Type	Default value	Description
@odata.type	string: #Microsoft.Azure.Search.PatternAnalyzer		A URI fragment specifying the type of analyzer.
flags	RegexFlags		Regular expression flags.
lowercase	boolean	True	A value indicating whether terms should be lower-cased. Default is true.
name	string		The name of the analyzer. It must only contain letters, digits, spaces, dashes or underscores, can only start and end with alphanumeric characters, and is limited to 128 characters.
pattern	string	\W+	A regular expression pattern to match token separators. Default is an expression that matches one or more non-word characters.
stopwords	string[]		A list of stopwords.

PatternCaptureTokenFilter

Uses Java regexes to emit multiple tokens - one for each capture group in one or more patterns. This token filter is implemented using Apache Lucene.

Name	Type	Default value	Description
@odata.type	string: #Microsoft.Azure.Search.PatternCaptureTokenFilter		A URI fragment specifying the type of token filter.
name	string		The name of the token filter. It must only contain letters, digits, spaces, dashes or underscores, can only start and end with alphanumeric characters, and is limited to 128 characters.
patterns	string[]		A list of patterns to match against each token.
preserveOriginal	boolean	True	A value indicating whether to return the original token even if one of the patterns matches. Default is true.

PatternReplaceCharFilter

A character filter that replaces characters in the input string. It uses a regular expression to identify character sequences to preserve and a replacement pattern to identify characters to replace. For example, given the input text "aa bb aa bb", pattern "(aa)\s+(bb)", and replacement "$1#$2", the result would be "aa#bb aa#bb". This character filter is implemented using Apache Lucene.

Name	Type	Description
@odata.type	string: #Microsoft.Azure.Search.PatternReplaceCharFilter	A URI fragment specifying the type of char filter.
name	string	The name of the char filter. It must only contain letters, digits, spaces, dashes or underscores, can only start and end with alphanumeric characters, and is limited to 128 characters.
pattern	string	A regular expression pattern.
replacement	string	The replacement text.

PatternReplaceTokenFilter

Name	Type	Description
@odata.type	string: #Microsoft.Azure.Search.PatternReplaceTokenFilter	A URI fragment specifying the type of token filter.
name	string	The name of the token filter. It must only contain letters, digits, spaces, dashes or underscores, can only start and end with alphanumeric characters, and is limited to 128 characters.
pattern	string	A regular expression pattern.
replacement	string	The replacement text.

PatternTokenizer

Tokenizer that uses regex pattern matching to construct distinct tokens. This tokenizer is implemented using Apache Lucene.

Name	Type	Default value	Description
@odata.type	string: #Microsoft.Azure.Search.PatternTokenizer		A URI fragment specifying the type of tokenizer.
flags	RegexFlags		Regular expression flags.
group	integer	-1	The zero-based ordinal of the matching group in the regular expression pattern to extract into tokens. Use -1 if you want to use the entire pattern to split the input into tokens, irrespective of matching groups. Default is -1.
name	string		The name of the tokenizer. It must only contain letters, digits, spaces, dashes or underscores, can only start and end with alphanumeric characters, and is limited to 128 characters.
pattern	string	\W+	A regular expression pattern to match token separators. Default is an expression that matches one or more non-word characters.

PhoneticEncoder

Identifies the type of phonetic encoder to use with a PhoneticTokenFilter.

Name	Type	Description
beiderMorse	string	Encodes a token into a Beider-Morse value.
caverphone1	string	Encodes a token into a Caverphone 1.0 value.
caverphone2	string	Encodes a token into a Caverphone 2.0 value.
cologne	string	Encodes a token into a Cologne Phonetic value.
doubleMetaphone	string	Encodes a token into a double metaphone value.
haasePhonetik	string	Encodes a token using the Haase refinement of the Kölner Phonetik algorithm.
koelnerPhonetik	string	Encodes a token using the Kölner Phonetik algorithm.
metaphone	string	Encodes a token into a Metaphone value.
nysiis	string	Encodes a token into a NYSIIS value.
refinedSoundex	string	Encodes a token into a Refined Soundex value.
soundex	string	Encodes a token into a Soundex value.

PhoneticTokenFilter

Create tokens for phonetic matches. This token filter is implemented using Apache Lucene.

Name	Type	Default value	Description
@odata.type	string: #Microsoft.Azure.Search.PhoneticTokenFilter		A URI fragment specifying the type of token filter.
encoder	PhoneticEncoder	metaphone	The phonetic encoder to use. Default is "metaphone".
name	string		The name of the token filter. It must only contain letters, digits, spaces, dashes or underscores, can only start and end with alphanumeric characters, and is limited to 128 characters.
replace	boolean	True	A value indicating whether encoded tokens should replace original tokens. If false, encoded tokens are added as synonyms. Default is true.

PrioritizedFields

Describes the title, content, and keywords fields to be used for semantic ranking, captions, highlights, and answers.

Name	Type	Description
prioritizedContentFields	SemanticField[]	Defines the content fields to be used for semantic ranking, captions, highlights, and answers. For the best result, the selected fields should contain text in natural language form. The order of the fields in the array represents their priority. Fields with lower priority may get truncated if the content is long.
prioritizedKeywordsFields	SemanticField[]	Defines the keyword fields to be used for semantic ranking, captions, highlights, and answers. For the best result, the selected fields should contain a list of keywords. The order of the fields in the array represents their priority. Fields with lower priority may get truncated if the content is long.
titleField	SemanticField	Defines the title field to be used for semantic ranking, captions, highlights, and answers. If you don't have a title field in your index, leave this blank.

RegexFlags

Defines flags that can be combined to control how regular expressions are used in the pattern analyzer and pattern tokenizer.

Name	Type	Description
CANON_EQ	string	Enables canonical equivalence.
CASE_INSENSITIVE	string	Enables case-insensitive matching.
COMMENTS	string	Permits whitespace and comments in the pattern.
DOTALL	string	Enables dotall mode.
LITERAL	string	Enables literal parsing of the pattern.
MULTILINE	string	Enables multiline mode.
UNICODE_CASE	string	Enables Unicode-aware case folding.
UNIX_LINES	string	Enables Unix lines mode.

ScalarQuantizationParameters

Contains the parameters specific to Scalar Quantization.

Name	Type	Description
quantizedDataType	VectorSearchCompressionTargetDataType	The quantized data type of compressed vector values.

ScalarQuantizationVectorSearchCompressionConfiguration

Contains configuration options specific to the scalar quantization compression method used during indexing and querying.

Name	Type	Default value	Description
defaultOversampling	number		Default oversampling factor. Oversampling will internally request more documents (specified by this multiplier) in the initial search. This increases the set of results that will be reranked using recomputed similarity scores from full-precision vectors. Minimum value is 1, meaning no oversampling (1x). This parameter can only be set when rerankWithOriginalVectors is true. Higher values improve recall at the expense of latency.
kind	string: scalarQuantization		The name of the kind of compression method being configured for use with vector search.
name	string		The name to associate with this particular configuration.
rerankWithOriginalVectors	boolean	True	If set to true, once the ordered set of results calculated using compressed vectors are obtained, they will be reranked again by recalculating the full-precision similarity scores. This will improve recall at the expense of latency.
scalarQuantizationParameters	ScalarQuantizationParameters		Contains the parameters specific to Scalar Quantization.

ScoringFunctionAggregation

Defines the aggregation function used to combine the results of all the scoring functions in a scoring profile.

Name	Type	Description
average	string	Boost scores by the average of all scoring function results.
firstMatching	string	Boost scores using the first applicable scoring function in the scoring profile.
maximum	string	Boost scores by the maximum of all scoring function results.
minimum	string	Boost scores by the minimum of all scoring function results.
sum	string	Boost scores by the sum of all scoring function results.

ScoringFunctionInterpolation

Defines the function used to interpolate score boosting across a range of documents.

Name	Type	Description
constant	string	Boosts scores by a constant factor.
linear	string	Boosts scores by a linearly decreasing amount. This is the default interpolation for scoring functions.
logarithmic	string	Boosts scores by an amount that decreases logarithmically. Boosts decrease quickly for higher scores, and more slowly as the scores decrease. This interpolation option is not allowed in tag scoring functions.
quadratic	string	Boosts scores by an amount that decreases quadratically. Boosts decrease slowly for higher scores, and more quickly as the scores decrease. This interpolation option is not allowed in tag scoring functions.

ScoringProfile

Defines parameters for a search index that influence scoring in search queries.

Name	Type	Description
functionAggregation	ScoringFunctionAggregation	A value indicating how the results of individual scoring functions should be combined. Defaults to "Sum". Ignored if there are no scoring functions.
functions	ScoringFunction[]: DistanceScoringFunction[] FreshnessScoringFunction[] MagnitudeScoringFunction[] TagScoringFunction[]	The collection of functions that influence the scoring of documents.
name	string	The name of the scoring profile.
text	TextWeights	Parameters that boost scoring based on text matches in certain index fields.

SearchField

Represents a field in an index definition, which describes the name, data type, and search behavior of a field.

Name	Type	Description
analyzer	LexicalAnalyzerName	The name of the analyzer to use for the field. This option can be used only with searchable fields and it can't be set together with either searchAnalyzer or indexAnalyzer. Once the analyzer is chosen, it cannot be changed for the field. Must be null for complex fields.
dimensions	integer	The dimensionality of the vector field.
facetable	boolean	A value indicating whether to enable the field to be referenced in facet queries. Typically used in a presentation of search results that includes hit count by category (for example, search for digital cameras and see hits by brand, by megapixels, by price, and so on). This property must be null for complex fields. Fields of type Edm.GeographyPoint or Collection(Edm.GeographyPoint) cannot be facetable. Default is true for all other simple fields.
fields	SearchField[]	A list of sub-fields if this is a field of type Edm.ComplexType or Collection(Edm.ComplexType). Must be null or empty for simple fields.
filterable	boolean	A value indicating whether to enable the field to be referenced in $filter queries. filterable differs from searchable in how strings are handled. Fields of type Edm.String or Collection(Edm.String) that are filterable do not undergo word-breaking, so comparisons are for exact matches only. For example, if you set such a field f to "sunny day", $filter=f eq 'sunny' will find no matches, but $filter=f eq 'sunny day' will. This property must be null for complex fields. Default is true for simple fields and null for complex fields.
indexAnalyzer	LexicalAnalyzerName	The name of the analyzer used at indexing time for the field. This option can be used only with searchable fields. It must be set together with searchAnalyzer and it cannot be set together with the analyzer option. This property cannot be set to the name of a language analyzer; use the analyzer property instead if you need a language analyzer. Once the analyzer is chosen, it cannot be changed for the field. Must be null for complex fields.
key	boolean	A value indicating whether the field uniquely identifies documents in the index. Exactly one top-level field in each index must be chosen as the key field and it must be of type Edm.String. Key fields can be used to look up documents directly and update or delete specific documents. Default is false for simple fields and null for complex fields.
name	string	The name of the field, which must be unique within the fields collection of the index or parent field.
retrievable	boolean	A value indicating whether the field can be returned in a search result. You can disable this option if you want to use a field (for example, margin) as a filter, sorting, or scoring mechanism but do not want the field to be visible to the end user. This property must be true for key fields, and it must be null for complex fields. This property can be changed on existing fields. Enabling this property does not cause any increase in index storage requirements. Default is true for simple fields, false for vector fields, and null for complex fields.
searchAnalyzer	LexicalAnalyzerName	The name of the analyzer used at search time for the field. This option can be used only with searchable fields. It must be set together with indexAnalyzer and it cannot be set together with the analyzer option. This property cannot be set to the name of a language analyzer; use the analyzer property instead if you need a language analyzer. This analyzer can be updated on an existing field. Must be null for complex fields.
searchable	boolean	A value indicating whether the field is full-text searchable. This means it will undergo analysis such as word-breaking during indexing. If you set a searchable field to a value like "sunny day", internally it will be split into the individual tokens "sunny" and "day". This enables full-text searches for these terms. Fields of type Edm.String or Collection(Edm.String) are searchable by default. This property must be false for simple fields of other non-string data types, and it must be null for complex fields. Note: searchable fields consume extra space in your index to accommodate additional tokenized versions of the field value for full-text searches. If you want to save space in your index and you don't need a field to be included in searches, set searchable to false.
sortable	boolean	A value indicating whether to enable the field to be referenced in $orderby expressions. By default, the search engine sorts results by score, but in many experiences users will want to sort by fields in the documents. A simple field can be sortable only if it is single-valued (it has a single value in the scope of the parent document). Simple collection fields cannot be sortable, since they are multi-valued. Simple sub-fields of complex collections are also multi-valued, and therefore cannot be sortable. This is true whether it's an immediate parent field, or an ancestor field, that's the complex collection. Complex fields cannot be sortable and the sortable property must be null for such fields. The default for sortable is true for single-valued simple fields, false for multi-valued simple fields, and null for complex fields.
stored	boolean	An immutable value indicating whether the field will be persisted separately on disk to be returned in a search result. You can disable this option if you don't plan to return the field contents in a search response to save on storage overhead. This can only be set during index creation and only for vector fields. This property cannot be changed for existing fields or set as false for new fields. If this property is set as false, the property 'retrievable' must also be set to false. This property must be true or unset for key fields, for new fields, and for non-vector fields, and it must be null for complex fields. Disabling this property will reduce index storage requirements. The default is true for vector fields.
synonymMaps	string[]	A list of the names of synonym maps to associate with this field. This option can be used only with searchable fields. Currently only one synonym map per field is supported. Assigning a synonym map to a field ensures that query terms targeting that field are expanded at query-time using the rules in the synonym map. This attribute can be changed on existing fields. Must be null or an empty collection for complex fields.
type	SearchFieldDataType	The data type of the field.
vectorEncoding	VectorEncodingFormat	The encoding format to interpret the field contents.
vectorSearchProfile	string	The name of the vector search profile that specifies the algorithm and vectorizer to use when searching the vector field.

SearchFieldDataType

Defines the data type of a field in a search index.

Name	Type	Description
Edm.Boolean	string	Indicates that a field contains a Boolean value (true or false).
Edm.Byte	string	Indicates that a field contains a 8-bit unsigned integer. This is only valid when used with Collection(Edm.Byte).
Edm.ComplexType	string	Indicates that a field contains one or more complex objects that in turn have sub-fields of other types.
Edm.DateTimeOffset	string	Indicates that a field contains a date/time value, including timezone information.
Edm.Double	string	Indicates that a field contains an IEEE double-precision floating point number.
Edm.GeographyPoint	string	Indicates that a field contains a geo-location in terms of longitude and latitude.
Edm.Half	string	Indicates that a field contains a half-precision floating point number. This is only valid when used with Collection(Edm.Half).
Edm.Int16	string	Indicates that a field contains a 16-bit signed integer. This is only valid when used with Collection(Edm.Int16).
Edm.Int32	string	Indicates that a field contains a 32-bit signed integer.
Edm.Int64	string	Indicates that a field contains a 64-bit signed integer.
Edm.SByte	string	Indicates that a field contains a 8-bit signed integer. This is only valid when used with Collection(Edm.SByte).
Edm.Single	string	Indicates that a field contains a single-precision floating point number. This is only valid when used with Collection(Edm.Single).
Edm.String	string	Indicates that a field contains a string.

SearchIndex

Represents a search index definition, which describes the fields and search behavior of an index.

Name	Type	Description
@odata.etag	string	The ETag of the index.
analyzers	LexicalAnalyzer[]: CustomAnalyzer[] LuceneStandardAnalyzer[] PatternAnalyzer[] StopAnalyzer[]	The analyzers for the index.
charFilters	CharFilter[]: MappingCharFilter[] PatternReplaceCharFilter[]	The character filters for the index.
corsOptions	CorsOptions	Options to control Cross-Origin Resource Sharing (CORS) for the index.
defaultScoringProfile	string	The name of the scoring profile to use if none is specified in the query. If this property is not set and no scoring profile is specified in the query, then default scoring (tf-idf) will be used.
encryptionKey	SearchResourceEncryptionKey	A description of an encryption key that you create in Azure Key Vault. This key is used to provide an additional level of encryption-at-rest for your data when you want full assurance that no one, not even Microsoft, can decrypt your data. Once you have encrypted your data, it will always remain encrypted. The search service will ignore attempts to set this property to null. You can change this property as needed if you want to rotate your encryption key; Your data will be unaffected. Encryption with customer-managed keys is not available for free search services, and is only available for paid services created on or after January 1, 2019.
fields	SearchField[]	The fields of the index.
name	string	The name of the index.
scoringProfiles	ScoringProfile[]	The scoring profiles for the index.
semantic	SemanticSettings	Defines parameters for a search index that influence semantic capabilities.
similarity	Similarity: BM25Similarity ClassicSimilarity	The type of similarity algorithm to be used when scoring and ranking the documents matching a search query. The similarity algorithm can only be defined at index creation time and cannot be modified on existing indexes. If null, the ClassicSimilarity algorithm is used.
suggesters	Suggester[]	The suggesters for the index.
tokenFilters	TokenFilter[]: AsciiFoldingTokenFilter[] CjkBigramTokenFilter[] CommonGramTokenFilter[] DictionaryDecompounderTokenFilter[] EdgeNGramTokenFilter[] EdgeNGramTokenFilterV2[] ElisionTokenFilter[] KeepTokenFilter[] KeywordMarkerTokenFilter[] LengthTokenFilter[] LimitTokenFilter[] NGramTokenFilter[] NGramTokenFilterV2[] PatternCaptureTokenFilter[] PatternReplaceTokenFilter[] PhoneticTokenFilter[] ShingleTokenFilter[] SnowballTokenFilter[] StemmerOverrideTokenFilter[] StemmerTokenFilter[] StopwordsTokenFilter[] SynonymTokenFilter[] TruncateTokenFilter[] UniqueTokenFilter[] WordDelimiterTokenFilter[]	The token filters for the index.
tokenizers	LexicalTokenizer[]: ClassicTokenizer[] EdgeNGramTokenizer[] KeywordTokenizer[] KeywordTokenizerV2[] LuceneStandardTokenizer[] LuceneStandardTokenizerV2[] MicrosoftLanguageStemmingTokenizer[] MicrosoftLanguageTokenizer[] NGramTokenizer[] PathHierarchyTokenizerV2[] PatternTokenizer[] UaxUrlEmailTokenizer[]	The tokenizers for the index.
vectorSearch	VectorSearch	Contains configuration options related to vector search.

SearchIndexerDataNoneIdentity

Clears the identity property of a datasource.

Name	Type	Description
@odata.type	string: #Microsoft.Azure.Search.DataNoneIdentity	A URI fragment specifying the type of identity.

SearchIndexerDataUserAssignedIdentity

Specifies the identity for a datasource to use.

Name	Type	Description
@odata.type	string: #Microsoft.Azure.Search.DataUserAssignedIdentity	A URI fragment specifying the type of identity.
userAssignedIdentity	string	The fully qualified Azure resource Id of a user assigned managed identity typically in the form "/subscriptions/12345678-1234-1234-1234-1234567890ab/resourceGroups/rg/providers/Microsoft.ManagedIdentity/userAssignedIdentities/myId" that should have been assigned to the search service.

SearchResourceEncryptionKey

A customer-managed encryption key in Azure Key Vault. Keys that you create and manage can be used to encrypt or decrypt data-at-rest, such as indexes and synonym maps.

Name	Type	Description
accessCredentials	AzureActiveDirectoryApplicationCredentials	Optional Azure Active Directory credentials used for accessing your Azure Key Vault. Not required if using managed identity instead.
keyVaultKeyName	string	The name of your Azure Key Vault key to be used to encrypt your data at rest.
keyVaultKeyVersion	string	The version of your Azure Key Vault key to be used to encrypt your data at rest.
keyVaultUri	string	The URI of your Azure Key Vault, also referred to as DNS name, that contains the key to be used to encrypt your data at rest. An example URI might be `https://my-keyvault-name.vault.azure.net`.

SemanticConfiguration

Defines a specific configuration to be used in the context of semantic capabilities.

Name	Type	Description
name	string	The name of the semantic configuration.
prioritizedFields	PrioritizedFields	Describes the title, content, and keyword fields to be used for semantic ranking, captions, highlights, and answers. At least one of the three sub properties (titleField, prioritizedKeywordsFields and prioritizedContentFields) need to be set.

SemanticField

A field that is used as part of the semantic configuration.

Name	Type	Description
fieldName	string

SemanticSettings

Defines parameters for a search index that influence semantic capabilities.

Name	Type	Description
configurations	SemanticConfiguration[]	The semantic configurations for the index.
defaultConfiguration	string	Allows you to set the name of a default semantic configuration in your index, making it optional to pass it on as a query parameter every time.

ShingleTokenFilter

Creates combinations of tokens as a single token. This token filter is implemented using Apache Lucene.

Name	Type	Default value	Description
@odata.type	string: #Microsoft.Azure.Search.ShingleTokenFilter		A URI fragment specifying the type of token filter.
filterToken	string	_	The string to insert for each position at which there is no token. Default is an underscore ("_").
maxShingleSize	integer	2	The maximum shingle size. Default and minimum value is 2.
minShingleSize	integer	2	The minimum shingle size. Default and minimum value is 2. Must be less than the value of maxShingleSize.
name	string		The name of the token filter. It must only contain letters, digits, spaces, dashes or underscores, can only start and end with alphanumeric characters, and is limited to 128 characters.
outputUnigrams	boolean	True	A value indicating whether the output stream will contain the input tokens (unigrams) as well as shingles. Default is true.
outputUnigramsIfNoShingles	boolean	False	A value indicating whether to output unigrams for those times when no shingles are available. This property takes precedence when outputUnigrams is set to false. Default is false.
tokenSeparator	string		The string to use when joining adjacent tokens to form a shingle. Default is a single space (" ").

SnowballTokenFilter

A filter that stems words using a Snowball-generated stemmer. This token filter is implemented using Apache Lucene.

Name	Type	Description
@odata.type	string: #Microsoft.Azure.Search.SnowballTokenFilter	A URI fragment specifying the type of token filter.
language	SnowballTokenFilterLanguage	The language to use.
name	string	The name of the token filter. It must only contain letters, digits, spaces, dashes or underscores, can only start and end with alphanumeric characters, and is limited to 128 characters.

SnowballTokenFilterLanguage

The language to use for a Snowball token filter.

Name	Type	Description
armenian	string	Selects the Lucene Snowball stemming tokenizer for Armenian.
basque	string	Selects the Lucene Snowball stemming tokenizer for Basque.
catalan	string	Selects the Lucene Snowball stemming tokenizer for Catalan.
danish	string	Selects the Lucene Snowball stemming tokenizer for Danish.
dutch	string	Selects the Lucene Snowball stemming tokenizer for Dutch.
english	string	Selects the Lucene Snowball stemming tokenizer for English.
finnish	string	Selects the Lucene Snowball stemming tokenizer for Finnish.
french	string	Selects the Lucene Snowball stemming tokenizer for French.
german	string	Selects the Lucene Snowball stemming tokenizer for German.
german2	string	Selects the Lucene Snowball stemming tokenizer that uses the German variant algorithm.
hungarian	string	Selects the Lucene Snowball stemming tokenizer for Hungarian.
italian	string	Selects the Lucene Snowball stemming tokenizer for Italian.
kp	string	Selects the Lucene Snowball stemming tokenizer for Dutch that uses the Kraaij-Pohlmann stemming algorithm.
lovins	string	Selects the Lucene Snowball stemming tokenizer for English that uses the Lovins stemming algorithm.
norwegian	string	Selects the Lucene Snowball stemming tokenizer for Norwegian.
porter	string	Selects the Lucene Snowball stemming tokenizer for English that uses the Porter stemming algorithm.
portuguese	string	Selects the Lucene Snowball stemming tokenizer for Portuguese.
romanian	string	Selects the Lucene Snowball stemming tokenizer for Romanian.
russian	string	Selects the Lucene Snowball stemming tokenizer for Russian.
spanish	string	Selects the Lucene Snowball stemming tokenizer for Spanish.
swedish	string	Selects the Lucene Snowball stemming tokenizer for Swedish.
turkish	string	Selects the Lucene Snowball stemming tokenizer for Turkish.

StemmerOverrideTokenFilter

Provides the ability to override other stemming filters with custom dictionary-based stemming. Any dictionary-stemmed terms will be marked as keywords so that they will not be stemmed with stemmers down the chain. Must be placed before any stemming filters. This token filter is implemented using Apache Lucene.

Name	Type	Description
@odata.type	string: #Microsoft.Azure.Search.StemmerOverrideTokenFilter	A URI fragment specifying the type of token filter.
name	string	The name of the token filter. It must only contain letters, digits, spaces, dashes or underscores, can only start and end with alphanumeric characters, and is limited to 128 characters.
rules	string[]	A list of stemming rules in the following format: "word => stem", for example: "ran => run".

StemmerTokenFilter

Language specific stemming filter. This token filter is implemented using Apache Lucene.

Name	Type	Description
@odata.type	string: #Microsoft.Azure.Search.StemmerTokenFilter	A URI fragment specifying the type of token filter.
language	StemmerTokenFilterLanguage	The language to use.
name	string	The name of the token filter. It must only contain letters, digits, spaces, dashes or underscores, can only start and end with alphanumeric characters, and is limited to 128 characters.

StemmerTokenFilterLanguage

The language to use for a stemmer token filter.

Name	Type	Description
arabic	string	Selects the Lucene stemming tokenizer for Arabic.
armenian	string	Selects the Lucene stemming tokenizer for Armenian.
basque	string	Selects the Lucene stemming tokenizer for Basque.
brazilian	string	Selects the Lucene stemming tokenizer for Portuguese (Brazil).
bulgarian	string	Selects the Lucene stemming tokenizer for Bulgarian.
catalan	string	Selects the Lucene stemming tokenizer for Catalan.
czech	string	Selects the Lucene stemming tokenizer for Czech.
danish	string	Selects the Lucene stemming tokenizer for Danish.
dutch	string	Selects the Lucene stemming tokenizer for Dutch.
dutchKp	string	Selects the Lucene stemming tokenizer for Dutch that uses the Kraaij-Pohlmann stemming algorithm.
english	string	Selects the Lucene stemming tokenizer for English.
finnish	string	Selects the Lucene stemming tokenizer for Finnish.
french	string	Selects the Lucene stemming tokenizer for French.
galician	string	Selects the Lucene stemming tokenizer for Galician.
german	string	Selects the Lucene stemming tokenizer for German.
german2	string	Selects the Lucene stemming tokenizer that uses the German variant algorithm.
greek	string	Selects the Lucene stemming tokenizer for Greek.
hindi	string	Selects the Lucene stemming tokenizer for Hindi.
hungarian	string	Selects the Lucene stemming tokenizer for Hungarian.
indonesian	string	Selects the Lucene stemming tokenizer for Indonesian.
irish	string	Selects the Lucene stemming tokenizer for Irish.
italian	string	Selects the Lucene stemming tokenizer for Italian.
latvian	string	Selects the Lucene stemming tokenizer for Latvian.
lightEnglish	string	Selects the Lucene stemming tokenizer for English that does light stemming.
lightFinnish	string	Selects the Lucene stemming tokenizer for Finnish that does light stemming.
lightFrench	string	Selects the Lucene stemming tokenizer for French that does light stemming.
lightGerman	string	Selects the Lucene stemming tokenizer for German that does light stemming.
lightHungarian	string	Selects the Lucene stemming tokenizer for Hungarian that does light stemming.
lightItalian	string	Selects the Lucene stemming tokenizer for Italian that does light stemming.
lightNorwegian	string	Selects the Lucene stemming tokenizer for Norwegian (Bokmål) that does light stemming.
lightNynorsk	string	Selects the Lucene stemming tokenizer for Norwegian (Nynorsk) that does light stemming.
lightPortuguese	string	Selects the Lucene stemming tokenizer for Portuguese that does light stemming.
lightRussian	string	Selects the Lucene stemming tokenizer for Russian that does light stemming.
lightSpanish	string	Selects the Lucene stemming tokenizer for Spanish that does light stemming.
lightSwedish	string	Selects the Lucene stemming tokenizer for Swedish that does light stemming.
lovins	string	Selects the Lucene stemming tokenizer for English that uses the Lovins stemming algorithm.
minimalEnglish	string	Selects the Lucene stemming tokenizer for English that does minimal stemming.
minimalFrench	string	Selects the Lucene stemming tokenizer for French that does minimal stemming.
minimalGalician	string	Selects the Lucene stemming tokenizer for Galician that does minimal stemming.
minimalGerman	string	Selects the Lucene stemming tokenizer for German that does minimal stemming.
minimalNorwegian	string	Selects the Lucene stemming tokenizer for Norwegian (Bokmål) that does minimal stemming.
minimalNynorsk	string	Selects the Lucene stemming tokenizer for Norwegian (Nynorsk) that does minimal stemming.
minimalPortuguese	string	Selects the Lucene stemming tokenizer for Portuguese that does minimal stemming.
norwegian	string	Selects the Lucene stemming tokenizer for Norwegian (Bokmål).
porter2	string	Selects the Lucene stemming tokenizer for English that uses the Porter2 stemming algorithm.
portuguese	string	Selects the Lucene stemming tokenizer for Portuguese.
portugueseRslp	string	Selects the Lucene stemming tokenizer for Portuguese that uses the RSLP stemming algorithm.
possessiveEnglish	string	Selects the Lucene stemming tokenizer for English that removes trailing possessives from words.
romanian	string	Selects the Lucene stemming tokenizer for Romanian.
russian	string	Selects the Lucene stemming tokenizer for Russian.
sorani	string	Selects the Lucene stemming tokenizer for Sorani.
spanish	string	Selects the Lucene stemming tokenizer for Spanish.
swedish	string	Selects the Lucene stemming tokenizer for Swedish.
turkish	string	Selects the Lucene stemming tokenizer for Turkish.

StopAnalyzer

Divides text at non-letters; Applies the lowercase and stopword token filters. This analyzer is implemented using Apache Lucene.

Name	Type	Description
@odata.type	string: #Microsoft.Azure.Search.StopAnalyzer	A URI fragment specifying the type of analyzer.
name	string	The name of the analyzer. It must only contain letters, digits, spaces, dashes or underscores, can only start and end with alphanumeric characters, and is limited to 128 characters.
stopwords	string[]	A list of stopwords.

StopwordsList

Identifies a predefined list of language-specific stopwords.

Name	Type	Description
arabic	string	Selects the stopword list for Arabic.
armenian	string	Selects the stopword list for Armenian.
basque	string	Selects the stopword list for Basque.
brazilian	string	Selects the stopword list for Portuguese (Brazil).
bulgarian	string	Selects the stopword list for Bulgarian.
catalan	string	Selects the stopword list for Catalan.
czech	string	Selects the stopword list for Czech.
danish	string	Selects the stopword list for Danish.
dutch	string	Selects the stopword list for Dutch.
english	string	Selects the stopword list for English.
finnish	string	Selects the stopword list for Finnish.
french	string	Selects the stopword list for French.
galician	string	Selects the stopword list for Galician.
german	string	Selects the stopword list for German.
greek	string	Selects the stopword list for Greek.
hindi	string	Selects the stopword list for Hindi.
hungarian	string	Selects the stopword list for Hungarian.
indonesian	string	Selects the stopword list for Indonesian.
irish	string	Selects the stopword list for Irish.
italian	string	Selects the stopword list for Italian.
latvian	string	Selects the stopword list for Latvian.
norwegian	string	Selects the stopword list for Norwegian.
persian	string	Selects the stopword list for Persian.
portuguese	string	Selects the stopword list for Portuguese.
romanian	string	Selects the stopword list for Romanian.
russian	string	Selects the stopword list for Russian.
sorani	string	Selects the stopword list for Sorani.
spanish	string	Selects the stopword list for Spanish.
swedish	string	Selects the stopword list for Swedish.
thai	string	Selects the stopword list for Thai.
turkish	string	Selects the stopword list for Turkish.

StopwordsTokenFilter

Removes stop words from a token stream. This token filter is implemented using Apache Lucene.

Name	Type	Default value	Description
@odata.type	string: #Microsoft.Azure.Search.StopwordsTokenFilter		A URI fragment specifying the type of token filter.
ignoreCase	boolean	False	A value indicating whether to ignore case. If true, all words are converted to lower case first. Default is false.
name	string		The name of the token filter. It must only contain letters, digits, spaces, dashes or underscores, can only start and end with alphanumeric characters, and is limited to 128 characters.
removeTrailing	boolean	True	A value indicating whether to ignore the last search term if it's a stop word. Default is true.
stopwords	string[]		The list of stopwords. This property and the stopwords list property cannot both be set.
stopwordsList	StopwordsList	english	A predefined list of stopwords to use. This property and the stopwords property cannot both be set. Default is English.

Suggester

Defines how the Suggest API should apply to a group of fields in the index.

Name	Type	Description
name	string	The name of the suggester.
searchMode	SuggesterSearchMode	A value indicating the capabilities of the suggester.
sourceFields	string[]	The list of field names to which the suggester applies. Each field must be searchable.

SuggesterSearchMode

A value indicating the capabilities of the suggester.

Name	Type	Description
analyzingInfixMatching	string	Matches consecutive whole terms and prefixes in a field. For example, for the field 'The fastest brown fox', the queries 'fast' and 'fastest brow' would both match.

SynonymTokenFilter

Matches single or multi-word synonyms in a token stream. This token filter is implemented using Apache Lucene.

Name	Type	Default value	Description
@odata.type	string: #Microsoft.Azure.Search.SynonymTokenFilter		A URI fragment specifying the type of token filter.
expand	boolean	True	A value indicating whether all words in the list of synonyms (if => notation is not used) will map to one another. If true, all words in the list of synonyms (if => notation is not used) will map to one another. The following list: incredible, unbelievable, fabulous, amazing is equivalent to: incredible, unbelievable, fabulous, amazing => incredible, unbelievable, fabulous, amazing. If false, the following list: incredible, unbelievable, fabulous, amazing will be equivalent to: incredible, unbelievable, fabulous, amazing => incredible. Default is true.
ignoreCase	boolean	False	A value indicating whether to case-fold input for matching. Default is false.
name	string		The name of the token filter. It must only contain letters, digits, spaces, dashes or underscores, can only start and end with alphanumeric characters, and is limited to 128 characters.
synonyms	string[]		A list of synonyms in following one of two formats: 1. incredible, unbelievable, fabulous => amazing - all terms on the left side of => symbol will be replaced with all terms on its right side; 2. incredible, unbelievable, fabulous, amazing - comma separated list of equivalent words. Set the expand option to change how this list is interpreted.

TagScoringFunction

Defines a function that boosts scores of documents with string values matching a given list of tags.

Name	Type	Description
boost	number	A multiplier for the raw score. Must be a positive number not equal to 1.0.
fieldName	string	The name of the field used as input to the scoring function.
interpolation	ScoringFunctionInterpolation	A value indicating how boosting will be interpolated across document scores; defaults to "Linear".
tag	TagScoringParameters	Parameter values for the tag scoring function.
type	string: tag	Indicates the type of function to use. Valid values include magnitude, freshness, distance, and tag. The function type must be lower case.

TagScoringParameters

Provides parameter values to a tag scoring function.

Name	Type	Description
tagsParameter	string	The name of the parameter passed in search queries to specify the list of tags to compare against the target field.

TextWeights

Defines weights on index fields for which matches should boost scoring in search queries.

Name	Type	Description
weights	object	The dictionary of per-field weights to boost document scoring. The keys are field names and the values are the weights for each field.

TokenCharacterKind

Represents classes of characters on which a token filter can operate.

Name	Type	Description
digit	string	Keeps digits in tokens.
letter	string	Keeps letters in tokens.
punctuation	string	Keeps punctuation in tokens.
symbol	string	Keeps symbols in tokens.
whitespace	string	Keeps whitespace in tokens.

TokenFilterName

Defines the names of all token filters supported by the search engine.

Name	Type	Description
apostrophe	string	Strips all characters after an apostrophe (including the apostrophe itself). See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/tr/ApostropheFilter.html
arabic_normalization	string	A token filter that applies the Arabic normalizer to normalize the orthography. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/ar/ArabicNormalizationFilter.html
asciifolding	string	Converts alphabetic, numeric, and symbolic Unicode characters which are not in the first 127 ASCII characters (the "Basic Latin" Unicode block) into their ASCII equivalents, if such equivalents exist. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/miscellaneous/ASCIIFoldingFilter.html
cjk_bigram	string	Forms bigrams of CJK terms that are generated from the standard tokenizer. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/cjk/CJKBigramFilter.html
cjk_width	string	Normalizes CJK width differences. Folds fullwidth ASCII variants into the equivalent basic Latin, and half-width Katakana variants into the equivalent Kana. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/cjk/CJKWidthFilter.html
classic	string	Removes English possessives, and dots from acronyms. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/standard/ClassicFilter.html
common_grams	string	Construct bigrams for frequently occurring terms while indexing. Single terms are still indexed too, with bigrams overlaid. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/commongrams/CommonGramsFilter.html
edgeNGram_v2	string	Generates n-grams of the given size(s) starting from the front or the back of an input token. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/ngram/EdgeNGramTokenFilter.html
elision	string	Removes elisions. For example, "l'avion" (the plane) will be converted to "avion" (plane). See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/util/ElisionFilter.html
german_normalization	string	Normalizes German characters according to the heuristics of the German2 snowball algorithm. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/de/GermanNormalizationFilter.html
hindi_normalization	string	Normalizes text in Hindi to remove some differences in spelling variations. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/hi/HindiNormalizationFilter.html
indic_normalization	string	Normalizes the Unicode representation of text in Indian languages. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/in/IndicNormalizationFilter.html
keyword_repeat	string	Emits each incoming token twice, once as keyword and once as non-keyword. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/miscellaneous/KeywordRepeatFilter.html
kstem	string	A high-performance kstem filter for English. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/en/KStemFilter.html
length	string	Removes words that are too long or too short. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/miscellaneous/LengthFilter.html
limit	string	Limits the number of tokens while indexing. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/miscellaneous/LimitTokenCountFilter.html
lowercase	string	Normalizes token text to lower case. See https://lucene.apache.org/core/6_6_1/analyzers-common/org/apache/lucene/analysis/core/LowerCaseFilter.html
nGram_v2	string	Generates n-grams of the given size(s). See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/ngram/NGramTokenFilter.html
persian_normalization	string	Applies normalization for Persian. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/fa/PersianNormalizationFilter.html
phonetic	string	Create tokens for phonetic matches. See https://lucene.apache.org/core/4_10_3/analyzers-phonetic/org/apache/lucene/analysis/phonetic/package-tree.html
porter_stem	string	Uses the Porter stemming algorithm to transform the token stream. See http://tartarus.org/~martin/PorterStemmer
reverse	string	Reverses the token string. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/reverse/ReverseStringFilter.html
scandinavian_folding	string	Folds Scandinavian characters åÅäæÄÆ->a and öÖøØ->o. It also discriminates against use of double vowels aa, ae, ao, oe and oo, leaving just the first one. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/miscellaneous/ScandinavianFoldingFilter.html
scandinavian_normalization	string	Normalizes use of the interchangeable Scandinavian characters. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/miscellaneous/ScandinavianNormalizationFilter.html
shingle	string	Creates combinations of tokens as a single token. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/shingle/ShingleFilter.html
snowball	string	A filter that stems words using a Snowball-generated stemmer. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/snowball/SnowballFilter.html
sorani_normalization	string	Normalizes the Unicode representation of Sorani text. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/ckb/SoraniNormalizationFilter.html
stemmer	string	Language specific stemming filter. See https://zcusa.951200.xyz/rest/api/searchservice/Custom-analyzers-in-Azure-Search#TokenFilters
stopwords	string	Removes stop words from a token stream. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/core/StopFilter.html
trim	string	Trims leading and trailing whitespace from tokens. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/miscellaneous/TrimFilter.html
truncate	string	Truncates the terms to a specific length. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/miscellaneous/TruncateTokenFilter.html
unique	string	Filters out tokens with same text as the previous token. See http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/miscellaneous/RemoveDuplicatesTokenFilter.html
uppercase	string	Normalizes token text to upper case. See https://lucene.apache.org/core/6_6_1/analyzers-common/org/apache/lucene/analysis/core/UpperCaseFilter.html
word_delimiter	string	Splits words into subwords and performs optional transformations on subword groups.

TruncateTokenFilter

Truncates the terms to a specific length. This token filter is implemented using Apache Lucene.

Name	Type	Default value	Description
@odata.type	string: #Microsoft.Azure.Search.TruncateTokenFilter		A URI fragment specifying the type of token filter.
length	integer	300	The length at which terms will be truncated. Default and maximum is 300.
name	string		The name of the token filter. It must only contain letters, digits, spaces, dashes or underscores, can only start and end with alphanumeric characters, and is limited to 128 characters.

UaxUrlEmailTokenizer

Tokenizes urls and emails as one token. This tokenizer is implemented using Apache Lucene.

Name	Type	Default value	Description
@odata.type	string: #Microsoft.Azure.Search.UaxUrlEmailTokenizer		A URI fragment specifying the type of tokenizer.
maxTokenLength	integer	255	The maximum token length. Default is 255. Tokens longer than the maximum length are split. The maximum token length that can be used is 300 characters.
name	string		The name of the tokenizer. It must only contain letters, digits, spaces, dashes or underscores, can only start and end with alphanumeric characters, and is limited to 128 characters.

UniqueTokenFilter

Filters out tokens with same text as the previous token. This token filter is implemented using Apache Lucene.

Name	Type	Default value	Description
@odata.type	string: #Microsoft.Azure.Search.UniqueTokenFilter		A URI fragment specifying the type of token filter.
name	string		The name of the token filter. It must only contain letters, digits, spaces, dashes or underscores, can only start and end with alphanumeric characters, and is limited to 128 characters.
onlyOnSamePosition	boolean	False	A value indicating whether to remove duplicates only at the same position. Default is false.

VectorEncodingFormat

The encoding format for interpreting vector field contents.

Name	Type	Description
packedBit	string	Encoding format representing bits packed into a wider data type.

VectorSearch

Contains configuration options related to vector search.

Name	Type	Description
algorithms	VectorSearchAlgorithmConfiguration[]: ExhaustiveKnnVectorSearchAlgorithmConfiguration[] HnswVectorSearchAlgorithmConfiguration[]	Contains configuration options specific to the algorithm used during indexing or querying.
compressions	VectorSearchCompressionConfiguration[]: BinaryQuantizationVectorSearchCompressionConfiguration[] ScalarQuantizationVectorSearchCompressionConfiguration[]	Contains configuration options specific to the compression method used during indexing or querying.
profiles	VectorSearchProfile[]	Defines combinations of configurations to use with vector search.
vectorizers	VectorSearchVectorizer[]: AzureOpenAIVectorizer[] WebApiVectorizer[]	Contains configuration options on how to vectorize text vector queries.

VectorSearchAlgorithmKind

The algorithm used for indexing and querying.

Name	Type	Description
exhaustiveKnn	string	Exhaustive KNN algorithm which will perform brute-force search.
hnsw	string	HNSW (Hierarchical Navigable Small World), a type of approximate nearest neighbors algorithm.

VectorSearchAlgorithmMetric

The similarity metric to use for vector comparisons. It is recommended to choose the same similarity metric as the embedding model was trained on.

Name	Type	Description
cosine	string	Measures the angle between vectors to quantify their similarity, disregarding magnitude. The smaller the angle, the closer the similarity.
dotProduct	string	Calculates the sum of element-wise products to gauge alignment and magnitude similarity. The larger and more positive, the closer the similarity.
euclidean	string	Computes the straight-line distance between vectors in a multi-dimensional space. The smaller the distance, the closer the similarity.
hamming	string	Only applicable to bit-packed binary data types. Determines dissimilarity by counting differing positions in binary vectors. The fewer differences, the closer the similarity.

VectorSearchCompressionKind

The compression method used for indexing and querying.

Name	Type	Description
binaryQuantization	string	Binary Quantization, a type of compression method. In binary quantization, the original vectors values are compressed to the narrower binary type by discretizing and representing each component of a vector using binary values, thereby reducing the overall data size.
scalarQuantization	string	Scalar Quantization, a type of compression method. In scalar quantization, the original vectors values are compressed to a narrower type by discretizing and representing each component of a vector using a reduced set of quantized values, thereby reducing the overall data size.

VectorSearchCompressionTargetDataType

The quantized data type of compressed vector values.

Name	Type	Description
int8	string

VectorSearchProfile

Defines a combination of configurations to use with vector search.

Name	Type	Description
algorithm	string	The name of the vector search algorithm configuration that specifies the algorithm and optional parameters.
compression	string	The name of the compression method configuration that specifies the compression method and optional parameters.
name	string	The name to associate with this particular vector search profile.
vectorizer	string	The name of the vectorization being configured for use with vector search.

VectorSearchVectorizerKind

The vectorization method to be used during query time.

Name	Type	Description
azureOpenAI	string	Generate embeddings using an Azure OpenAI resource at query time.
customWebApi	string	Generate embeddings using a custom web endpoint at query time.

WebApiParameters

Specifies the properties for connecting to a user-defined vectorizer.

Name	Type	Description
authIdentity	SearchIndexerDataIdentity: SearchIndexerDataNoneIdentity SearchIndexerDataUserAssignedIdentity	The user-assigned managed identity used for outbound connections. If an authResourceId is provided and it's not specified, the system-assigned managed identity is used. On updates to the indexer, if the identity is unspecified, the value remains unchanged. If set to "none", the value of this property is cleared.
authResourceId	string	Applies to custom endpoints that connect to external code in an Azure function or some other application that provides the transformations. This value should be the application ID created for the function or app when it was registered with Azure Active Directory. When specified, the vectorization connects to the function or app using a managed ID (either system or user-assigned) of the search service and the access token of the function or app, using this value as the resource id for creating the scope of the access token.
httpHeaders	object	The headers required to make the HTTP request.
httpMethod	string	The method for the HTTP request.
timeout	string	The desired timeout for the request. Default is 30 seconds.
uri	string	The URI of the Web API providing the vectorizer.

WebApiVectorizer

Specifies a user-defined vectorizer for generating the vector embedding of a query string. Integration of an external vectorizer is achieved using the custom Web API interface of a skillset.

Name	Type	Description
customWebApiParameters	WebApiParameters	Specifies the properties of the user-defined vectorizer.
kind	string: customWebApi	The name of the kind of vectorization method being configured for use with vector search.
name	string	The name to associate with this particular vectorization method.

WordDelimiterTokenFilter

Splits words into subwords and performs optional transformations on subword groups. This token filter is implemented using Apache Lucene.

Name	Type	Default value	Description
@odata.type	string: #Microsoft.Azure.Search.WordDelimiterTokenFilter		A URI fragment specifying the type of token filter.
catenateAll	boolean	False	A value indicating whether all subword parts will be catenated. For example, if this is set to true, "Azure-Search-1" becomes "AzureSearch1". Default is false.
catenateNumbers	boolean	False	A value indicating whether maximum runs of number parts will be catenated. For example, if this is set to true, "1-2" becomes "12". Default is false.
catenateWords	boolean	False	A value indicating whether maximum runs of word parts will be catenated. For example, if this is set to true, "Azure-Search" becomes "AzureSearch". Default is false.
generateNumberParts	boolean	True	A value indicating whether to generate number subwords. Default is true.
generateWordParts	boolean	True	A value indicating whether to generate part words. If set, causes parts of words to be generated; for example "AzureSearch" becomes "Azure" "Search". Default is true.
name	string		The name of the token filter. It must only contain letters, digits, spaces, dashes or underscores, can only start and end with alphanumeric characters, and is limited to 128 characters.
preserveOriginal	boolean	False	A value indicating whether original words will be preserved and added to the subword list. Default is false.
protectedWords	string[]		A list of tokens to protect from being delimited.
splitOnCaseChange	boolean	True	A value indicating whether to split words on caseChange. For example, if this is set to true, "AzureSearch" becomes "Azure" "Search". Default is true.
splitOnNumerics	boolean	True	A value indicating whether to split on numbers. For example, if this is set to true, "Azure1Search" becomes "Azure" "1" "Search". Default is true.
stemEnglishPossessive	boolean	True	A value indicating whether to remove trailing "'s" for each subword. Default is true.

Share via