PatternTokenizer interface

Reference

Package: @azure/search-documents

Tokenizer that uses regex pattern matching to construct distinct tokens. This tokenizer is implemented using Apache Lucene.

Properties

flags	Regular expression flags. Possible values include: 'CANON_EQ', 'CASE_INSENSITIVE', 'COMMENTS', 'DOTALL', 'LITERAL', 'MULTILINE', 'UNICODE_CASE', 'UNIX_LINES'
group	The zero-based ordinal of the matching group in the regular expression pattern to extract into tokens. Use -1 to split the input on the entire pattern, irrespective of matching groups. Default value: -1.
name	The name of the tokenizer. It must only contain letters, digits, spaces, dashes or underscores, can only start and end with alphanumeric characters, and is limited to 128 characters.
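odatatype	Polymorphic discriminator that identifies this object as a PatternTokenizer. The value is always "#Microsoft.Azure.Search.PatternTokenizer".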
pattern	A regular expression pattern to match token separators. The default is an expression that matches one or more non-word characters. Default value: `\W+`.

Property Details

flags

Regular expression flags. Possible values include: 'CANON_EQ', 'CASE_INSENSITIVE', 'COMMENTS', 'DOTALL', 'LITERAL', 'MULTILINE', 'UNICODE_CASE', 'UNIX_LINES'

flags?: ("CANON_EQ" | "CASE_INSENSITIVE" | "COMMENTS" | "DOTALL" | "LITERAL" | "MULTILINE" | "UNICODE_CASE" | "UNIX_LINES")[]

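Property Value

("CANON_EQ" | "CASE_INSENSITIVE" | "COMMENTS" | "DOTALL" | "LITERAL" | "MULTILINE" | "UNICODE_CASE" | "UNIX_LINES")[]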

group

The zero-based ordinal of the matching group in the regular expression pattern to extract into tokens. Use -1 to split the input on the entire pattern, irrespective of matching groups. Default value: -1.

group?: number

Property Value

number
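
The difference between splitting (group set to -1) and extracting (group set to 0 or higher) can be illustrated with a small local sketch. It approximates the documented behavior with JavaScript's RegExp and is not the Lucene implementation, so regex syntax and edge cases may differ. The function name `sketchTokenize` is hypothetical and not part of the package.

```typescript
// Hypothetical helper: approximates the documented `group` semantics
// with JavaScript's RegExp. It is NOT the service's Lucene tokenizer.
function sketchTokenize(input: string, pattern: string, group: number = -1): string[] {
  const re = new RegExp(pattern, "g");
  const tokens: string[] = [];
  let last = 0;
  let m: RegExpExecArray | null;

  while ((m = re.exec(input)) !== null) {
    if (group < 0) {
      // group = -1: each match is a separator; keep the text between matches.
      if (m.index > last) {
        tokens.push(input.slice(last, m.index));
      }
      last = m.index + m[0].length;
    } else {
      // group >= 0: each match yields a token taken from the chosen group
      // (group 0 is the entire match).
      const captured = m[group];
      if (captured) {
        tokens.push(captured);
      }
    }
    // Step past zero-length matches so the loop always terminates.
    if (m[0].length === 0) {
      re.lastIndex++;
    }
  }

  // In split mode, keep whatever follows the last separator.
  if (group < 0 && last < input.length) {
    tokens.push(input.slice(last));
  }
  return tokens;
}

// Split mode with the documented default pattern \W+.
console.log(sketchTokenize("a, b;c", "\\W+")); // [ 'a', 'b', 'c' ]

// Extract mode: group 1 captures the text between single quotes.
console.log(sketchTokenize("'apple' 'kiwi'", "'([^']+)'", 1)); // [ 'apple', 'kiwi' ]
```

In split mode the pattern describes what separates tokens; in extract mode it describes the tokens themselves.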

name

The name of the tokenizer. It must only contain letters, digits, spaces, dashes or underscores, can only start and end with alphanumeric characters, and is limited to 128 characters.

name: string

Property Value

string
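
The naming rules above could be checked on the client before a definition is sent to the service. The following is a minimal sketch: the function name `isValidTokenizerName` is hypothetical, and it assumes "letters" means ASCII letters, which the description does not state explicitly.

```typescript
// Hypothetical client-side check of the documented naming rules.
// Assumption: "letters" is read as ASCII A-Z and a-z.
function isValidTokenizerName(name: string): boolean {
  // Limited to 128 characters; an empty name cannot start with an alphanumeric.
  if (name.length === 0 || name.length > 128) {
    return false;
  }
  // Starts and ends with an alphanumeric character; letters, digits,
  // spaces, dashes, and underscores are allowed in between.
  return /^[A-Za-z0-9](?:[A-Za-z0-9 _-]*[A-Za-z0-9])?$/.test(name);
}

console.log(isValidTokenizerName("my_pattern-tokenizer 1")); // true
console.log(isValidTokenizerName("-starts-with-dash"));      // false
```

The service remains the authority on what it accepts; a check like this only catches obvious mistakes early.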

odatatype

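Polymorphic discriminator that identifies this object as a PatternTokenizer. The value is always "#Microsoft.Azure.Search.PatternTokenizer".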

odatatype: "#Microsoft.Azure.Search.PatternTokenizer"

Property Value

"#Microsoft.Azure.Search.PatternTokenizer"

pattern

A regular expression pattern to match token separators. The default is an expression that matches one or more non-word characters. Default value: `\W+`.

pattern?: string

Property Value

string
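
Putting the properties together, a tokenizer definition might look like the sketch below. To stay self-contained it declares a local type, `PatternTokenizerShape`, that mirrors the properties documented on this page; in a real project the `PatternTokenizer` type would come from `@azure/search-documents` instead. The tokenizer name and the pattern are made up for illustration.

```typescript
// Local mirror of the properties documented on this page (an assumption made
// for self-containment; it is not the type exported by the package).
type RegexFlagName =
  | "CANON_EQ" | "CASE_INSENSITIVE" | "COMMENTS" | "DOTALL"
  | "LITERAL" | "MULTILINE" | "UNICODE_CASE" | "UNIX_LINES";

interface PatternTokenizerShape {
  odatatype: "#Microsoft.Azure.Search.PatternTokenizer";
  name: string;
  pattern?: string;
  flags?: RegexFlagName[];
  group?: number;
}

// Hypothetical definition: split text on the words "and" / "or",
// ignoring case, so "red AND blue or green" would be split into three parts.
const andOrTokenizer: PatternTokenizerShape = {
  odatatype: "#Microsoft.Azure.Search.PatternTokenizer",
  name: "and-or-splitter",
  pattern: "\\s+(?:and|or)\\s+", // backslashes are doubled inside a string literal
  flags: ["CASE_INSENSITIVE"],
  group: -1, // -1: treat the whole pattern as the separator
};

console.log(JSON.stringify(andOrTokenizer, null, 2));
```

Only `name` and `odatatype` are required; omitting `pattern`, `flags`, and `group` leaves the documented defaults in effect.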
