WhiteSpace Class

Definition

A pre-tokenizer that splits text at word boundaries. A word is a sequence of alphabetic, numeric, and underscore characters.

public sealed class WhiteSpace : Microsoft.ML.Tokenizers.PreTokenizer
type WhiteSpace = class
    inherit PreTokenizer
Public NotInheritable Class WhiteSpace
Inherits PreTokenizer
Inheritance
Object → PreTokenizer → WhiteSpace

Constructors

WhiteSpace()

Fields

Instance

Gets a singleton instance of the WhiteSpace pre-tokenizer.

Methods

PreTokenize(String)

Splits the given string into multiple substrings at word boundaries, keeping track of each substring's offset in the original string.
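The splitting behavior described above can be illustrated with a minimal, self-contained sketch. It does not call Microsoft.ML.Tokenizers; `WordBoundarySketch`, its `Split` method, and the regular expression are hypothetical stand-ins chosen to match the definition of a word given on this page (alphabetic, numeric, and underscore characters). Treating runs of punctuation as separate pieces is an assumption, not documented behavior.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only; not part of Microsoft.ML.Tokenizers.
public static class WordBoundarySketch
{
    // Assumed pattern: a run of word characters (letters, digits, underscore),
    // or a run of characters that are neither word characters nor whitespace.
    private static readonly Regex Pattern = new Regex(@"\w+|[^\w\s]+");

    // Returns each piece together with its offset and length in the original string.
    public static List<(string Text, int Offset, int Length)> Split(string text)
    {
        var pieces = new List<(string Text, int Offset, int Length)>();
        foreach (Match match in Pattern.Matches(text))
        {
            pieces.Add((match.Value, match.Index, match.Length));
        }
        return pieces;
    }
}
```

Under these assumptions, `WordBoundarySketch.Split("Hello, world_1!")` yields four pieces: `Hello` at offset 0, `,` at offset 5, `world_1` at offset 7, and `!` at offset 14. Whitespace is dropped, and the underscore and digit stay inside the word. The recorded offsets allow each piece to be mapped back to its position in the input.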
