RobertaPreTokenizer.PreTokenize(String) Method
Definition
Important
Some information relates to prerelease product that may be substantially modified before it’s released. Microsoft makes no warranties, express or implied, with respect to the information provided here.
Splits the given string into multiple substrings at word boundaries, keeping track of each substring's offsets in the original string.
public override System.Collections.Generic.IReadOnlyList<Microsoft.ML.Tokenizers.Split> PreTokenize (string? sentence);
override this.PreTokenize : string -> System.Collections.Generic.IReadOnlyList<Microsoft.ML.Tokenizers.Split>
Public Overrides Function PreTokenize (sentence As String) As IReadOnlyList(Of Split)
Parameters
- sentence (String)
The string to split into tokens.
Returns
A list of splits, each containing a token and that token's offsets in the original string.
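Example
The idea of splitting at word boundaries while recording offsets can be illustrated with a minimal, self-contained sketch. It does not call Microsoft.ML.Tokenizers and is not the library's implementation: the PreTokenizeSketch class, its tuple return type, and the GPT-2 style regular expression are all assumptions made for illustration, and the pattern the real pre-tokenizer uses may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical stand-in for the pre-tokenizer; not part of Microsoft.ML.Tokenizers.
public static class PreTokenizeSketch
{
    // Assumption: a GPT-2 style word-boundary pattern. It matches common English
    // contractions, runs of letters, runs of digits, runs of other symbols
    // (each optionally preceded by one space), and runs of whitespace.
    private static readonly Regex Pattern = new Regex(
        @"'s|'t|'re|'ve|'m|'ll|'d| ?\p{L}+| ?\p{N}+| ?[^\s\p{L}\p{N}]+|\s+(?!\S)|\s+");

    // Returns each matched substring with its start (inclusive) and end (exclusive)
    // character offsets in the original string.
    public static IReadOnlyList<(string Token, int Start, int End)> PreTokenize(string? sentence)
    {
        var splits = new List<(string Token, int Start, int End)>();

        // A null or empty input yields an empty list rather than an exception.
        if (string.IsNullOrEmpty(sentence))
        {
            return splits;
        }

        foreach (Match match in Pattern.Matches(sentence))
        {
            splits.Add((match.Value, match.Index, match.Index + match.Length));
        }

        return splits;
    }
}
```

With this sketch, PreTokenizeSketch.PreTokenize("Hello world") yields two entries: "Hello" with offsets 0 to 5, and " world" (including the leading space) with offsets 5 to 11. Keeping the leading space attached to the following word means the substrings can be concatenated to reconstruct the input exactly.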