RobertaPreTokenizer.PreTokenize(String) Method

Definition

Splits the given string into multiple substrings at word boundaries, keeping track of each substring's offsets in the original string.

public override System.Collections.Generic.IReadOnlyList<Microsoft.ML.Tokenizers.Split> PreTokenize (string? sentence);
override this.PreTokenize : string -> System.Collections.Generic.IReadOnlyList<Microsoft.ML.Tokenizers.Split>
Public Overrides Function PreTokenize (sentence As String) As IReadOnlyList(Of Split)

Parameters

sentence
String

The string to split into tokens.

Returns

The list of splits, each containing a token and that token's offsets in the original string.
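Example

The following is a minimal, self-contained sketch of the behavior described above: splitting a string at word boundaries while recording each substring's offsets. It does not call `Microsoft.ML.Tokenizers`. The `SplitSketch` type and `PreTokenizeSketch` class are hypothetical stand-ins, the GPT-2 style regular expression is an assumption about how word boundaries are detected, and returning an empty list for a null or empty input is likewise an assumption. The library's actual pattern, `Split` members, and null handling may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical stand-in for the library's Split type: a token plus its
// [Start, End) offsets in the original string.
public readonly record struct SplitSketch(string Token, int Start, int End);

public static class PreTokenizeSketch
{
    // Assumption: a GPT-2 style word-boundary pattern. The pattern used by
    // RobertaPreTokenizer itself is not stated on this page and may differ.
    private static readonly Regex Pattern = new Regex(
        @"'s|'t|'re|'ve|'m|'ll|'d| ?\p{L}+| ?\p{N}+| ?[^\s\p{L}\p{N}]+|\s+(?!\S)|\s+",
        RegexOptions.Compiled);

    public static IReadOnlyList<SplitSketch> PreTokenize(string? sentence)
    {
        var splits = new List<SplitSketch>();

        // Assumption: null or empty input yields an empty list.
        if (string.IsNullOrEmpty(sentence))
        {
            return splits;
        }

        // Each regex match becomes one split; Index and Length give the
        // substring's position in the original string.
        foreach (Match match in Pattern.Matches(sentence))
        {
            splits.Add(new SplitSketch(match.Value, match.Index, match.Index + match.Length));
        }

        return splits;
    }

    public static void Main()
    {
        foreach (SplitSketch split in PreTokenize("Hello world!"))
        {
            Console.WriteLine($"'{split.Token}' [{split.Start}, {split.End})");
        }
        // Prints:
        // 'Hello' [0, 5)
        // ' world' [5, 11)
        // '!' [11, 12)
    }
}
```

In this sketch the leading space stays attached to the word that follows it, so the offsets of consecutive splits cover the input without gaps.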
