Get started with ONNX models in your WinUI app with ONNX Runtime

Article
12/29/2024

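This article walks you through creating a WinUI 3 app that uses an ONNX model to classify objects in an image and display the confidence of each classification. For more information on using AI and machine learning models in your Windows app, see Get started with AI and machine learning models in your Windows app.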

When using AI features, we recommend that you review: Developing Responsible Generative AI Applications and Features on Windows.

What is the ONNX Runtime

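ONNX Runtime is a cross-platform machine-learning model accelerator with a flexible interface for integrating hardware-specific libraries. ONNX Runtime can be used with models from PyTorch, TensorFlow/Keras, TFLite, scikit-learn, and other frameworks. For more information, see the ONNX Runtime website at https://onnxruntime.ai/docs/.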

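This sample uses the DirectML Execution Provider, which abstracts over and runs on the different hardware options on Windows devices and supports execution on local accelerators such as GPUs and NPUs.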

Prerequisites

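Your device must have developer mode enabled. For more information, see Enable your device for development.
Visual Studio 2022 or later with the .NET desktop development workload.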

Create a new C# WinUI app

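In Visual Studio, create a new project. In the "Create a new project" dialog, set the language filter to "C#" and the project type filter to "winui", then select the "Blank App, Packaged (WinUI 3 in Desktop)" template. Name the new project "ONNXWinUIExample".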

Add references to NuGet packages

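In Solution Explorer, right-click Dependencies and select Manage NuGet packages.... In the NuGet package manager, select the "Browse" tab. Search for each of the following packages, select the latest stable version in the Version drop-down, and then click Install.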

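Package	Description
Microsoft.ML.OnnxRuntime.DirectML	Provides APIs for running ONNX models on the GPU.
SixLabors.ImageSharp	Provides image utilities for processing images for model input.
SharpDX.DXGI	Provides APIs for accessing the DirectX device from C#.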

Add the following using directives to the top of MainWindow.xaml.cs to access the APIs from these libraries.

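// MainWindow.xaml.cs
using Microsoft.ML.OnnxRuntime;
using Microsoft.ML.OnnxRuntime.Tensors;
using SharpDX.DXGI;
using SixLabors.ImageSharp;
using SixLabors.ImageSharp.Formats;
using SixLabors.ImageSharp.PixelFormats;
using SixLabors.ImageSharp.Processing;
// Also used by the code later in this article (file picker, StorageFile, BitmapImage)
using Microsoft.UI.Xaml.Media.Imaging;
using Windows.Storage;
using Windows.Storage.Pickers;
using WinRT.Interop;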

Add the model to your project

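In Solution Explorer, right-click your project and select Add > New Folder. Name the new folder "model". For this example, we will use the resnet50-v2-7.onnx model from https://github.com/onnx/models. Go to the repository view for the model at https://github.com/onnx/models/blob/main/validated/vision/classification/resnet/model/resnet50-v2-7.onnx. Click the "Download raw file" button. Copy this file into the "model" directory you just created.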

In Solution Explorer, click the model file and set "Copy to Output Directory" to "Copy if newer".

Create a simple UI

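For this example, we'll create a simple UI with three elements: a Button that lets the user select an image to evaluate with the model, an Image control that displays the selected image, and a TextBlock that lists the objects detected in the image along with the confidence of each classification.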

In the MainWindow.xaml file, replace the default StackPanel element with the following XAML code.

<!--MainWindow.xaml-->
<Grid Padding="25" >
    <Grid.ColumnDefinitions>
        <ColumnDefinition/>
        <ColumnDefinition/>
        <ColumnDefinition/>
    </Grid.ColumnDefinitions>
    <Button x:Name="myButton" Click="myButton_Click" Grid.Column="0" VerticalAlignment="Top">Select photo</Button>
    <Image x:Name="myImage" MaxWidth="300" Grid.Column="1" VerticalAlignment="Top"/>
    <TextBlock x:Name="featuresTextBlock" Grid.Column="2" VerticalAlignment="Top"/>
</Grid>

Initialize the model

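In the MainWindow.xaml.cs file, inside the MainWindow class, create a helper method called InitModel that initializes the model. This method uses APIs from the SharpDX.DXGI library to select the first available adapter. The selected adapter is set in the SessionOptions object for the DirectML execution provider in this session. Finally, a new InferenceSession is initialized, passing in the path to the model file and the session options.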

// MainWindow.xaml.cs

private InferenceSession _inferenceSession;
private string modelDir = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "model");

private void InitModel()
{
    if (_inferenceSession != null)
    {
        return;
    }

    // Select a graphics device: index 0 is the first adapter that DXGI enumerates
    using var factory1 = new Factory1();
    int deviceId = 0;

    using Adapter1 selectedAdapter = factory1.GetAdapter1(deviceId);

    // Create the inference session
    var sessionOptions = new SessionOptions
    {
        LogSeverityLevel = OrtLoggingLevel.ORT_LOGGING_LEVEL_INFO
    };
    sessionOptions.AppendExecutionProvider_DML(deviceId);
    _inferenceSession = new InferenceSession(Path.Combine(modelDir, "resnet50-v2-7.onnx"), sessionOptions);

}

Load and analyze an image

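For simplicity, in this example all of the steps for loading and formatting the image, invoking the model, and displaying the results are placed in the button click handler. Note that we add the async keyword to the button click handler included in the default template so that we can run asynchronous operations in the handler.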

// MainWindow.xaml.cs

private async void myButton_Click(object sender, RoutedEventArgs e)
{
    ...
}

Use a FileOpenPicker to let the user select an image from their computer to analyze and display in the UI.

    FileOpenPicker fileOpenPicker = new()
    {
        ViewMode = PickerViewMode.Thumbnail,
        FileTypeFilter = { ".jpg", ".jpeg", ".png", ".gif" },
    };
    InitializeWithWindow.Initialize(fileOpenPicker, WinRT.Interop.WindowNative.GetWindowHandle(this));
    StorageFile file = await fileOpenPicker.PickSingleFileAsync();
    if (file == null)
    {
        return;
    }

    // Display the image in the UI
    var bitmap = new BitmapImage();
    bitmap.SetSource(await file.OpenAsync(Windows.Storage.FileAccessMode.Read));
    myImage.Source = bitmap;

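Next, we need to process the input to get it into a format that the model supports. The SixLabors.ImageSharp library is used to load the image in 24-bit RGB format and resize it to 224x224 pixels. The pixel values are then normalized with a mean of 255*[0.485, 0.456, 0.406] and a standard deviation of 255*[0.229, 0.224, 0.225]. The code applies the same normalization by first scaling each value to the range [0, 1] and then using the unscaled mean and standard deviation. The details of the format the model expects can be found on the resnet model's GitHub page.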

    using var fileStream = await file.OpenStreamForReadAsync();

    IImageFormat format = SixLabors.ImageSharp.Image.DetectFormat(fileStream);
    using Image<Rgb24> image = SixLabors.ImageSharp.Image.Load<Rgb24>(fileStream);


    // Resize image
    using Stream imageStream = new MemoryStream();
    image.Mutate(x =>
    {
        x.Resize(new ResizeOptions
        {
            Size = new SixLabors.ImageSharp.Size(224, 224),
            Mode = ResizeMode.Crop
        });
    });

    image.Save(imageStream, format);

    // Preprocess image
    // We use DenseTensor for multi-dimensional access to populate the image data
    var mean = new[] { 0.485f, 0.456f, 0.406f };
    var stddev = new[] { 0.229f, 0.224f, 0.225f };
    DenseTensor<float> processedImage = new(new[] { 1, 3, 224, 224 });
    image.ProcessPixelRows(accessor =>
    {
        for (int y = 0; y < accessor.Height; y++)
        {
            Span<Rgb24> pixelSpan = accessor.GetRowSpan(y);
            for (int x = 0; x < accessor.Width; x++)
            {
                processedImage[0, 0, y, x] = ((pixelSpan[x].R / 255f) - mean[0]) / stddev[0];
                processedImage[0, 1, y, x] = ((pixelSpan[x].G / 255f) - mean[1]) / stddev[1];
                processedImage[0, 2, y, x] = ((pixelSpan[x].B / 255f) - mean[2]) / stddev[2];
            }
        }
    });
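
The per-channel normalization above can be checked in isolation. The following is a minimal sketch, not part of the sample app: Normalize and NormalizeRaw are hypothetical helper names introduced here, and the sample pixel value is arbitrary. It illustrates that scaling to [0, 1] before standardizing gives the same result as applying a mean and standard deviation that are both multiplied by 255 to the raw byte value.

```csharp
using System;

// Per-channel statistics, matching the preprocessing code above.
float[] mean = { 0.485f, 0.456f, 0.406f };
float[] stddev = { 0.229f, 0.224f, 0.225f };

// Hypothetical helper: scale a byte to [0, 1], then standardize it.
static float Normalize(byte value, float channelMean, float channelStd) =>
    ((value / 255f) - channelMean) / channelStd;

// Hypothetical helper: the equivalent form that works on the raw 0-255 value.
static float NormalizeRaw(byte value, float channelMean, float channelStd) =>
    (value - 255f * channelMean) / (255f * channelStd);

byte red = 128; // arbitrary sample value for the red channel
Console.WriteLine(Normalize(red, mean[0], stddev[0]));
Console.WriteLine(NormalizeRaw(red, mean[0], stddev[0]));
```

Both lines should print the same value up to floating-point rounding.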

Next, we set up the inputs by creating an OrtValue of type Tensor on top of the managed image data array.

    // Setup inputs
    // Pin tensor buffer and create a OrtValue with native tensor that makes use of
    // DenseTensor buffer directly. This avoids extra data copy within OnnxRuntime.
    // It will be unpinned on ortValue disposal
    using var inputOrtValue = OrtValue.CreateTensorValueFromMemory(OrtMemoryInfo.DefaultInstance,
        processedImage.Buffer, new long[] { 1, 3, 224, 224 });

    var inputs = new Dictionary<string, OrtValue>
    {
        { "data", inputOrtValue }
    };
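
Passing processedImage.Buffer together with the shape { 1, 3, 224, 224 } works because the tensor data is laid out as one contiguous buffer. The sketch below assumes the default row-major layout of DenseTensor and shows how an index [0, c, y, x] maps to an offset in that buffer. FlatIndex is a hypothetical helper introduced only for illustration.

```csharp
using System;

// Hypothetical helper: offset of element [0, c, y, x] in a row-major
// buffer of shape [1, channels, height, width].
static int FlatIndex(int c, int y, int x, int height, int width) =>
    (c * height + y) * width + x;

const int Height = 224, Width = 224;

// The first green value follows all 224 * 224 red values.
Console.WriteLine(FlatIndex(1, 0, 0, Height, Width)); // prints 50176

// The last blue value is the final element of the 3 * 224 * 224 buffer.
Console.WriteLine(FlatIndex(2, 223, 223, Height, Width)); // prints 150527
```

In other words, the buffer holds a full plane of red values, then a full plane of green values, then a full plane of blue values, matching the channel-first indexing used in the preprocessing loop.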

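Next, if the inference session has not yet been initialized, call the InitModel helper method. Then call the Run method to run the model and retrieve the results.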

    // Run inference
    if (_inferenceSession == null)
    {
        InitModel();
    }
    using var runOptions = new RunOptions();
    using IDisposableReadOnlyCollection<OrtValue> results = _inferenceSession.Run(runOptions, inputs, _inferenceSession.OutputNames);

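The model outputs the results as a native tensor buffer. The following code converts the output to an array of floats. A softmax function is then applied so that the values lie in the range [0,1] and sum to 1.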

    // Postprocess output
    // We copy results to array only to apply algorithms, otherwise data can be accessed directly
    // from the native buffer via ReadOnlySpan<T> or Span<T>
    var output = results[0].GetTensorDataAsSpan<float>().ToArray();
    float sum = output.Sum(x => (float)Math.Exp(x));
    IEnumerable<float> softmax = output.Select(x => (float)Math.Exp(x) / sum);
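
The softmax above exponentiates the raw output values directly. For very large values, (float)Math.Exp(x) would overflow to infinity. A common variant subtracts the largest value before exponentiating, which leaves the result mathematically unchanged. The following is a minimal sketch of that variant, not the code used in the sample: StableSoftmax is a hypothetical helper name and the input values are made up to trigger the overflow case.

```csharp
using System;
using System.Linq;

// Hypothetical helper: softmax that subtracts the largest logit first,
// so Math.Exp never receives a large positive argument.
static float[] StableSoftmax(float[] logits)
{
    float max = logits.Max();
    double[] exps = logits.Select(v => Math.Exp(v - max)).ToArray();
    double sum = exps.Sum();
    return exps.Select(e => (float)(e / sum)).ToArray();
}

// Made-up logits: (float)Math.Exp(1000) would overflow to infinity.
float[] sample = { 1000f, 999f, 998f };
float[] probabilities = StableSoftmax(sample);
Console.WriteLine(string.Join(", ", probabilities));
```

The returned values still sum to 1 and keep the same ordering as the inputs, so they could be passed to the top-10 selection in the same way as the softmax values in the sample.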

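The index of each value in the output array maps to a label that the model was trained on, and the value at that index is the model's confidence that the label represents an object detected in the input image. We choose the 10 results with the highest confidence. This code uses some helper objects that we will define in the next step.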

    // Extract top 10
    IEnumerable<Prediction> top10 = softmax.Select((x, i) => new Prediction { Label = LabelMap.Labels[i], Confidence = x })
        .OrderByDescending(x => x.Confidence)
        .Take(10);

    // Print results
    featuresTextBlock.Text = "Top 10 predictions for ResNet50 v2...\n";
    featuresTextBlock.Text += "-------------------------------------\n";
    foreach (var t in top10)
    {
        featuresTextBlock.Text += $"Label: {t.Label}, Confidence: {t.Confidence}\n";
    }
} // End of myButton_Click

Declare helper objects

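The Prediction class just provides a simple way to associate an object label with a confidence value. In MainWindow.xaml.cs, add this class inside the ONNXWinUIExample namespace block, but outside of the MainWindow class definition.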

internal class Prediction
{
    public object Label { get; set; }
    public float Confidence { get; set; }
}

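Next, add the LabelMap helper class, which lists all of the object labels the model was trained on in a specific order, so that the labels map to the indices of the results returned by the model. The list of labels is too long to present in full here. You can copy the complete LabelMap class from a sample code file in the ONNXRuntime GitHub repository and paste it into the ONNXWinUIExample namespace block.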

public class LabelMap
{
    public static readonly string[] Labels = new[] {
        "tench",
        "goldfish",
        "great white shark",
        ...
        "hen-of-the-woods",
        "bolete",
        "ear",
        "toilet paper"};
}

Run the example

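Build and run the project. Click the Select photo button and pick an image file to analyze. You can look at the LabelMap helper class definition to see what the model can recognize and pick an image that might give interesting results. After the model initializes on its first run and finishes processing the image, you should see a list of the objects detected in the image along with the confidence value of each prediction.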

Top 10 predictions for ResNet50 v2...
-------------------------------------
Label: lakeshore, Confidence: 0.91674984
Label: seashore, Confidence: 0.033412453
Label: promontory, Confidence: 0.008877817
Label: shoal, Confidence: 0.0046836217
Label: container ship, Confidence: 0.001940886
Label: Lakeland Terrier, Confidence: 0.0016400366
Label: maze, Confidence: 0.0012478716
Label: breakwater, Confidence: 0.0012336193
Label: ocean liner, Confidence: 0.0011933135
Label: pier, Confidence: 0.0011284945
