How-To: OpenAI Assistant Agent Code Interpreter (Experimental)

Article
11/03/2024

Warning

The Semantic Kernel Agent Framework is experimental, still in development, and subject to change.

Overview

In this sample, we will explore how to use the Code Interpreter tool of an OpenAI Assistant Agent to complete data-analysis tasks. The approach is broken down step by step to highlight the key parts of the coding process. As part of the task, the agent generates both image and text responses, which demonstrates the versatility of this tool for quantitative analysis.

Streaming is used to deliver the agent's responses, so updates arrive in real time as the task progresses.

Getting Started

Before proceeding with feature coding, make sure your development environment is fully set up and configured.

Start by creating a Console project. Then include the following package references to ensure all required dependencies are available.

To add package dependencies from the command line, use the dotnet command:

dotnet add package Azure.Identity
dotnet add package Microsoft.Extensions.Configuration
dotnet add package Microsoft.Extensions.Configuration.Binder
dotnet add package Microsoft.Extensions.Configuration.UserSecrets
dotnet add package Microsoft.Extensions.Configuration.EnvironmentVariables
dotnet add package Microsoft.SemanticKernel
dotnet add package Microsoft.SemanticKernel.Agents.OpenAI --prerelease

The project file (.csproj) should contain the following PackageReference definitions:

  <ItemGroup>
    <PackageReference Include="Azure.Identity" Version="<stable>" />
    <PackageReference Include="Microsoft.Extensions.Configuration" Version="<stable>" />
    <PackageReference Include="Microsoft.Extensions.Configuration.Binder" Version="<stable>" />
    <PackageReference Include="Microsoft.Extensions.Configuration.UserSecrets" Version="<stable>" />
    <PackageReference Include="Microsoft.Extensions.Configuration.EnvironmentVariables" Version="<stable>" />
    <PackageReference Include="Microsoft.SemanticKernel" Version="<latest>" />
    <PackageReference Include="Microsoft.SemanticKernel.Agents.OpenAI" Version="<latest>" />
  </ItemGroup>

The Agent Framework is experimental and requires warning suppression. This can be addressed as a property in the project file (.csproj):

  <PropertyGroup>
    <NoWarn>$(NoWarn);CA2007;IDE1006;SKEXP0001;SKEXP0110;OPENAI001</NoWarn>
  </PropertyGroup>

Additionally, copy the PopulationByAdmin1.csv and PopulationByCountry.csv data files from the Semantic Kernel LearnResources project. Add these files to your project folder and configure them to be copied to the output directory:

  <ItemGroup>
    <None Include="PopulationByAdmin1.csv">
      <CopyToOutputDirectory>Always</CopyToOutputDirectory>
    </None>
    <None Include="PopulationByCountry.csv">
      <CopyToOutputDirectory>Always</CopyToOutputDirectory>
    </None>
  </ItemGroup>

If you are working in Python, start by creating a folder that will hold your script (.py file) and the sample resources. Include the following imports at the top of your .py file:

import asyncio
import os

from semantic_kernel.agents.open_ai.azure_assistant_agent import AzureAssistantAgent
from semantic_kernel.contents.chat_message_content import ChatMessageContent
from semantic_kernel.contents.streaming_file_reference_content import StreamingFileReferenceContent
from semantic_kernel.contents.utils.author_role import AuthorRole
from semantic_kernel.kernel import Kernel

Configuration

This sample requires configuration settings in order to connect to remote services. You will need to define settings for either OpenAI or Azure OpenAI.

# Open AI
dotnet user-secrets set "OpenAISettings:ApiKey" "<api-key>"
dotnet user-secrets set "OpenAISettings:ChatModel" "gpt-4o"

# Azure Open AI
dotnet user-secrets set "AzureOpenAISettings:ApiKey" "<api-key>" # Not required if using token-credential
dotnet user-secrets set "AzureOpenAISettings:Endpoint" "<model-endpoint>"
dotnet user-secrets set "AzureOpenAISettings:ChatModelDeployment" "gpt-4o"

The following class is used in all of the Agent examples. Be sure to include it in your project to ensure proper functionality. This class serves as a foundational component for the examples that follow.

using System.Reflection;
using Microsoft.Extensions.Configuration;

namespace AgentsSample;

public class Settings
{
    private readonly IConfigurationRoot configRoot;

    private AzureOpenAISettings azureOpenAI;
    private OpenAISettings openAI;

    public AzureOpenAISettings AzureOpenAI => this.azureOpenAI ??= this.GetSettings<Settings.AzureOpenAISettings>();
    public OpenAISettings OpenAI => this.openAI ??= this.GetSettings<Settings.OpenAISettings>();

    public class OpenAISettings
    {
        public string ChatModel { get; set; } = string.Empty;
        public string ApiKey { get; set; } = string.Empty;
    }

    public class AzureOpenAISettings
    {
        public string ChatModelDeployment { get; set; } = string.Empty;
        public string Endpoint { get; set; } = string.Empty;
        public string ApiKey { get; set; } = string.Empty;
    }

    public TSettings GetSettings<TSettings>() =>
        this.configRoot.GetRequiredSection(typeof(TSettings).Name).Get<TSettings>()!;

    public Settings()
    {
        this.configRoot =
            new ConfigurationBuilder()
                .AddEnvironmentVariables()
                .AddUserSecrets(Assembly.GetExecutingAssembly(), optional: true)
                .Build();
    }
}

For Python, the quickest way to get the proper configuration for running the sample code is to create a .env file at the root of your project (where your script is run).

Configure the following settings in your .env file for either Azure OpenAI or OpenAI:

AZURE_OPENAI_API_KEY="..."
AZURE_OPENAI_ENDPOINT="https://..."
AZURE_OPENAI_CHAT_DEPLOYMENT_NAME="..."
AZURE_OPENAI_API_VERSION="..."

OPENAI_API_KEY="sk-..."
OPENAI_ORG_ID=""
OPENAI_CHAT_MODEL_ID=""

Once configured, the respective AI service classes pick up the required variables and use them during instantiation.
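
Because a missing variable typically surfaces only when the agent is created, it can help to check the .env file up front. The following is a minimal sketch using only the standard library; it is not how Semantic Kernel itself loads settings. The helper names (parse_env_text, missing_keys, load_env_file) are hypothetical, and which variables count as required is an assumption based on the listing above.

```python
from pathlib import Path

# Variable names come from the .env listing above; treating these
# particular ones as "required" is an assumption of this sketch.
AZURE_KEYS = (
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_CHAT_DEPLOYMENT_NAME",
)
OPENAI_KEYS = ("OPENAI_API_KEY", "OPENAI_CHAT_MODEL_ID")


def parse_env_text(text: str) -> dict[str, str]:
    """Parse simple KEY="value" lines; blank lines and comments are skipped."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: dict[str, str], required: tuple[str, ...]) -> list[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Read a .env file if it exists; return an empty dict otherwise."""
    env_path = Path(path)
    if not env_path.exists():
        return {}
    return parse_env_text(env_path.read_text(encoding="utf-8"))
```

Calling missing_keys(load_env_file(), AZURE_KEYS) before creating the agent returns the names that still need a value, or an empty list when the Azure OpenAI settings look complete.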

Coding

The coding process for this sample involves:

Setup: initializing settings and the plug-in.
Agent Definition: create the OpenAI_Assistant_Agent with templatized instructions and plug-in.
The Chat Loop: write the loop that drives user/agent interaction.

The full example code is provided in the Final section. Refer to that section for the complete implementation.

Setup

Prior to creating an OpenAI Assistant Agent, ensure the configuration settings are available and prepare the file resources.

Instantiate the Settings class referenced in the previous Configuration section. Use the settings to also create an OpenAIClientProvider, which is used for the Agent Definition as well as for file upload.

Settings settings = new();

OpenAIClientProvider clientProvider =
    OpenAIClientProvider.ForAzureOpenAI(new AzureCliCredential(), new Uri(settings.AzureOpenAI.Endpoint));

Use the OpenAIClientProvider to access an OpenAIFileClient and upload the two data files described in the Getting Started section, preserving the file references for final clean-up.

Console.WriteLine("Uploading files...");
OpenAIFileClient fileClient = clientProvider.Client.GetOpenAIFileClient();
OpenAIFile fileDataCountryDetail = await fileClient.UploadFileAsync("PopulationByAdmin1.csv", FileUploadPurpose.Assistants);
OpenAIFile fileDataCountryList = await fileClient.UploadFileAsync("PopulationByCountry.csv", FileUploadPurpose.Assistants);

# Let's form the file paths that we will later pass to the assistant
csv_file_path_1 = os.path.join(
    os.path.dirname(os.path.dirname(os.path.realpath(__file__))),
    "PopulationByAdmin1.csv",
)

csv_file_path_2 = os.path.join(
    os.path.dirname(os.path.dirname(os.path.realpath(__file__))),
    "PopulationByCountry.csv",
)
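
Note that the two nested os.path.dirname calls above resolve the CSV files relative to the parent of the folder that contains the script, so a misplaced file is easy to miss until the upload fails. A small pre-flight check, sketched here with hypothetical helper names (find_missing_files, require_files), makes the problem visible before any service call is made.

```python
import os


def find_missing_files(paths: list[str]) -> list[str]:
    """Return the subset of paths that do not point to an existing file."""
    return [path for path in paths if not os.path.isfile(path)]


def require_files(paths: list[str]) -> None:
    """Raise FileNotFoundError listing every path that is missing."""
    missing = find_missing_files(paths)
    if missing:
        raise FileNotFoundError("Missing data files: " + ", ".join(missing))
```

In the sample, this would be called as require_files([csv_file_path_1, csv_file_path_2]) right after the paths are formed.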

Agent Definition

We are now ready to instantiate an OpenAI Assistant Agent. The agent is configured with its target model, instructions, and the Code Interpreter tool enabled. Additionally, we explicitly associate the two data files with the Code Interpreter tool.

Console.WriteLine("Defining agent...");
OpenAIAssistantAgent agent =
    await OpenAIAssistantAgent.CreateAsync(
        clientProvider,
        new OpenAIAssistantDefinition(settings.AzureOpenAI.ChatModelDeployment)
        {
            Name = "SampleAssistantAgent",
            Instructions =
                """
                Analyze the available data to provide an answer to the user's question.
                Always format response using markdown.
                Always include a numerical index that starts at 1 for any lists or tables.
                Always sort lists in ascending order.
                """,
            EnableCodeInterpreter = true,
            CodeInterpreterFileIds = [fileDataCountryList.Id, fileDataCountryDetail.Id],
        },
        new Kernel());

agent = await AzureAssistantAgent.create(
        kernel=Kernel(),
        service_id="agent",
        name="SampleAssistantAgent",
        instructions="""
                Analyze the available data to provide an answer to the user's question.
                Always format response using markdown.
                Always include a numerical index that starts at 1 for any lists or tables.
                Always sort lists in ascending order.
                """,
        enable_code_interpreter=True,
        code_interpreter_filenames=[csv_file_path_1, csv_file_path_2],
    )

The Chat Loop

At last, we are able to coordinate the interaction between the user and the Agent. Start by creating an Assistant Thread to maintain the conversation state, and create an empty loop.

Let's also ensure the resources are removed at the end of execution to minimize unnecessary charges.

Console.WriteLine("Creating thread...");
string threadId = await agent.CreateThreadAsync();

Console.WriteLine("Ready!");

try
{
    bool isComplete = false;
    List<string> fileIds = [];
    do
    {

    } while (!isComplete);
}
finally
{
    Console.WriteLine();
    Console.WriteLine("Cleaning-up...");
    await Task.WhenAll(
        [
            agent.DeleteThreadAsync(threadId),
            agent.DeleteAsync(),
            fileClient.DeleteFileAsync(fileDataCountryList.Id),
            fileClient.DeleteFileAsync(fileDataCountryDetail.Id),
        ]);
}

print("Creating thread...")
thread_id = await agent.create_thread()

try:
    is_complete: bool = False
    file_ids: list[str] = []
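    while not is_complete:
        # agent interaction logic here
        pass  # placeholder so the skeleton parses; replaced by the steps below
finally:
    print("Cleaning up resources...")
    if agent is not None:
        # Delete the uploaded files, then the thread, then the agent itself
        for file_id in agent.code_interpreter_file_ids:
            await agent.delete_file(file_id)
        await agent.delete_thread(thread_id)
        await agent.delete()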
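
One thing to keep in mind with sequential clean-up is that a failure in an early step skips the remaining ones. The sketch below shows one possible alternative, assuming the clean-up steps are independent of each other: asyncio.gather with return_exceptions=True runs every step and reports failures afterwards, which is close in spirit to the Task.WhenAll call in the C# version. The delete_ok and delete_fails coroutines are stand-ins for illustration and are not part of Semantic Kernel.

```python
import asyncio


async def cleanup_all(*steps):
    """Await every clean-up coroutine and return the exceptions that occurred."""
    results = await asyncio.gather(*steps, return_exceptions=True)
    return [result for result in results if isinstance(result, Exception)]


# Stand-ins for the real delete calls (hypothetical, for illustration only).
async def delete_ok(name: str, log: list[str]) -> None:
    log.append(name)


async def delete_fails(name: str) -> None:
    raise RuntimeError(f"could not delete {name}")


async def demo() -> tuple[list[str], list[Exception]]:
    log: list[str] = []
    errors = await cleanup_all(
        delete_ok("thread", log),
        delete_fails("file"),
        delete_ok("agent", log),
    )
    return log, errors
```

Running asyncio.run(demo()) completes both successful steps even though the middle one raises, and returns the collected error so it can be logged.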

Now let's capture user input within the previous loop. In this case, empty input is ignored and the term EXIT signals that the conversation is complete. Valid input is added to the Assistant Thread as a User message.

Console.WriteLine();
Console.Write("> ");
string input = Console.ReadLine();
if (string.IsNullOrWhiteSpace(input))
{
    continue;
}
if (input.Trim().Equals("EXIT", StringComparison.OrdinalIgnoreCase))
{
    isComplete = true;
    break;
}

await agent.AddChatMessageAsync(threadId, new ChatMessageContent(AuthorRole.User, input));

Console.WriteLine();

user_input = input("User:> ")
if not user_input:
    continue

if user_input.lower() == "exit":
    is_complete = True
    break

await agent.add_chat_message(thread_id=thread_id, message=ChatMessageContent(role=AuthorRole.USER, content=user_input))
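
The C# snippet treats whitespace-only input as empty and trims the input before comparing it with EXIT, whereas the Python snippet compares the raw string, so an input such as " exit " would be sent to the agent as a message. The following sketch applies the C# rule in Python; InputAction and classify_input are hypothetical names introduced for illustration.

```python
from enum import Enum


class InputAction(Enum):
    SKIP = "skip"  # empty or whitespace-only input: ask again
    EXIT = "exit"  # the user typed EXIT in any letter case
    SEND = "send"  # anything else: forward to the agent


def classify_input(user_input: str) -> InputAction:
    """Decide what the chat loop should do with one line of user input."""
    text = user_input.strip()
    if not text:
        return InputAction.SKIP
    if text.lower() == "exit":
        return InputAction.EXIT
    return InputAction.SEND
```

Inside the loop, SKIP maps to continue, EXIT sets is_complete and breaks, and SEND adds the message to the thread.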

Before invoking the Agent response, let's add some helper methods to download any files that may be produced by the Agent.

Here we place the file content in the system-defined temporary directory and then launch the system-defined viewer application.

private static async Task DownloadResponseImageAsync(OpenAIFileClient client, ICollection<string> fileIds)
{
    if (fileIds.Count > 0)
    {
        Console.WriteLine();
        foreach (string fileId in fileIds)
        {
            await DownloadFileContentAsync(client, fileId, launchViewer: true);
        }
    }
}

private static async Task DownloadFileContentAsync(OpenAIFileClient client, string fileId, bool launchViewer = false)
{
    OpenAIFile fileInfo = client.GetFile(fileId);
    if (fileInfo.Purpose == FilePurpose.AssistantsOutput)
    {
        string filePath =
            Path.Combine(
                Path.GetTempPath(),
                Path.GetFileName(Path.ChangeExtension(fileInfo.Filename, ".png")));

        BinaryData content = await client.DownloadFileAsync(fileId);
        await using FileStream fileStream = new(filePath, FileMode.CreateNew);
        await content.ToStream().CopyToAsync(fileStream);
        Console.WriteLine($"File saved to: {filePath}.");

        if (launchViewer)
        {
            Process.Start(
                new ProcessStartInfo
                {
                    FileName = "cmd.exe",
                    Arguments = $"/C start {filePath}"
                });
        }
    }
}

import os

async def download_file_content(agent, file_id: str):
    try:
        # Fetch the content of the file using the provided method
        response_content = await agent.client.files.content(file_id)

        # Get the directory that contains this script
        current_directory = os.path.dirname(os.path.abspath(__file__))

        # Define the path to save the image in the current directory
        file_path = os.path.join(
            current_directory,  # Use the current directory of the file
            f"{file_id}.png"  # You can modify this to use the actual filename with proper extension
        )

        # Save the downloaded bytes to a file
        with open(file_path, "wb") as file:
            file.write(response_content.content)

        print(f"File saved to: {file_path}")
    except Exception as e:
        print(f"An error occurred while downloading file {file_id}: {str(e)}")

async def download_response_image(agent, file_ids: list[str]):
    if file_ids:
        # Iterate over file_ids and download each one
        for file_id in file_ids:
            await download_file_content(agent, file_id)
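
The two helpers differ in where and how they write: the C# version saves into the system temporary directory with FileMode.CreateNew, which fails if the file already exists, while the Python version saves next to the script and overwrites silently. If the C# behavior is preferred in Python, it could look like the sketch below. The function name save_file_bytes and its parameters are assumptions made for illustration, and the .png default mirrors the samples rather than the actual file type.

```python
import os
import tempfile
from typing import Optional


def save_file_bytes(
    content: bytes,
    file_id: str,
    directory: Optional[str] = None,
    extension: str = ".png",
) -> str:
    """Write content to <directory>/<file_id><extension> and return the path.

    The directory defaults to the system temporary directory.
    """
    target_dir = directory or tempfile.gettempdir()
    file_path = os.path.join(target_dir, f"{file_id}{extension}")
    # Mode "xb" creates the file and raises FileExistsError if it already
    # exists, comparable to FileMode.CreateNew in the C# helper.
    with open(file_path, "xb") as file:
        file.write(content)
    return file_path
```

In download_file_content, the open/write pair could then be replaced with save_file_bytes(response_content.content, file_id).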

To generate an Agent response to user input, invoke the agent by specifying the Assistant Thread. In this example, we choose a streamed response and capture any generated file references for download and review at the end of the response cycle. It's important to note that generated code is identified by the presence of a metadata key in the response message, which distinguishes it from the conversational reply.

bool isCode = false;
await foreach (StreamingChatMessageContent response in agent.InvokeStreamingAsync(threadId))
{
    if (isCode != (response.Metadata?.ContainsKey(OpenAIAssistantAgent.CodeInterpreterMetadataKey) ?? false))
    {
        Console.WriteLine();
        isCode = !isCode;
    }

    // Display response.
    Console.Write($"{response.Content}");

    // Capture file IDs for downloading.
    fileIds.AddRange(response.Items.OfType<StreamingFileReferenceContent>().Select(item => item.FileId));
}
Console.WriteLine();

// Download any files referenced in the response.
await DownloadResponseImageAsync(fileClient, fileIds);
fileIds.Clear();

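is_code: bool = False
async for response in agent.invoke_stream(thread_id=thread_id):
    # Treat a missing "code" metadata entry as False so the comparison stays boolean
    chunk_is_code = bool(response.metadata.get("code", False))
    if is_code != chunk_is_code:
        print()
        is_code = chunk_is_code

    print(f"{response.content}", end="", flush=True)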

    file_ids.extend(
        [item.file_id for item in response.items if isinstance(item, StreamingFileReferenceContent)]
    )

print()

await download_response_image(agent, file_ids)
file_ids.clear()
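
The line-break toggle described above can be exercised without calling the service. The sketch below replays a handful of simulated chunks through the same logic and returns the text that would be printed. FakeChunk is a stand-in defined here for illustration and is not a Semantic Kernel type; the "code" metadata key is the one used in the Python sample.

```python
from dataclasses import dataclass, field


@dataclass
class FakeChunk:
    """Stand-in for one streamed response chunk (hypothetical type)."""

    content: str
    metadata: dict = field(default_factory=dict)


def render_stream(chunks: list[FakeChunk]) -> str:
    """Join chunk contents, inserting a newline whenever the stream
    switches between generated code and conversational text."""
    output: list[str] = []
    is_code = False
    for chunk in chunks:
        chunk_is_code = bool(chunk.metadata.get("code", False))
        if is_code != chunk_is_code:
            output.append("\n")
            is_code = chunk_is_code
        output.append(chunk.content)
    return "".join(output)
```

For example, two text chunks followed by two code chunks and a final text chunk produce three lines: the opening text, the generated code, and the closing text.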

Final

Bringing all the steps together, we have the final code for this example. The complete implementation is provided below.

Try using these suggested inputs:

Compare the files to determine the number of countries that do not have a state or province defined compared to the total count
Create a table for countries with a state or province defined. Include the count of states or provinces and the total population
Provide a bar chart for countries whose names start with the same letter and sort the x axis by highest count to lowest (include all countries)

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Linq;
using System.Threading.Tasks;
using Azure.Identity;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Agents.OpenAI;
using Microsoft.SemanticKernel.ChatCompletion;
using OpenAI.Files;

namespace AgentsSample;

public static class Program
{
    public static async Task Main()
    {
        // Load configuration from environment variables or user secrets.
        Settings settings = new();

        OpenAIClientProvider clientProvider =
            OpenAIClientProvider.ForAzureOpenAI(new AzureCliCredential(), new Uri(settings.AzureOpenAI.Endpoint));

        Console.WriteLine("Uploading files...");
        OpenAIFileClient fileClient = clientProvider.Client.GetOpenAIFileClient();
        OpenAIFile fileDataCountryDetail = await fileClient.UploadFileAsync("PopulationByAdmin1.csv", FileUploadPurpose.Assistants);
        OpenAIFile fileDataCountryList = await fileClient.UploadFileAsync("PopulationByCountry.csv", FileUploadPurpose.Assistants);

        Console.WriteLine("Defining agent...");
        OpenAIAssistantAgent agent =
            await OpenAIAssistantAgent.CreateAsync(
                clientProvider,
                new OpenAIAssistantDefinition(settings.AzureOpenAI.ChatModelDeployment)
                {
                    Name = "SampleAssistantAgent",
                    Instructions =
                        """
                        Analyze the available data to provide an answer to the user's question.
                        Always format response using markdown.
                        Always include a numerical index that starts at 1 for any lists or tables.
                        Always sort lists in ascending order.
                        """,
                    EnableCodeInterpreter = true,
                    CodeInterpreterFileIds = [fileDataCountryList.Id, fileDataCountryDetail.Id],
                },
                new Kernel());

        Console.WriteLine("Creating thread...");
        string threadId = await agent.CreateThreadAsync();

        Console.WriteLine("Ready!");

        try
        {
            bool isComplete = false;
            List<string> fileIds = [];
            do
            {
                Console.WriteLine();
                Console.Write("> ");
                string input = Console.ReadLine();
                if (string.IsNullOrWhiteSpace(input))
                {
                    continue;
                }
                if (input.Trim().Equals("EXIT", StringComparison.OrdinalIgnoreCase))
                {
                    isComplete = true;
                    break;
                }

                await agent.AddChatMessageAsync(threadId, new ChatMessageContent(AuthorRole.User, input));

                Console.WriteLine();

                bool isCode = false;
                await foreach (StreamingChatMessageContent response in agent.InvokeStreamingAsync(threadId))
                {
                    if (isCode != (response.Metadata?.ContainsKey(OpenAIAssistantAgent.CodeInterpreterMetadataKey) ?? false))
                    {
                        Console.WriteLine();
                        isCode = !isCode;
                    }

                    // Display response.
                    Console.Write($"{response.Content}");

                    // Capture file IDs for downloading.
                    fileIds.AddRange(response.Items.OfType<StreamingFileReferenceContent>().Select(item => item.FileId));
                }
                Console.WriteLine();

                // Download any files referenced in the response.
                await DownloadResponseImageAsync(fileClient, fileIds);
                fileIds.Clear();

            } while (!isComplete);
        }
        finally
        {
            Console.WriteLine();
            Console.WriteLine("Cleaning-up...");
            await Task.WhenAll(
                [
                    agent.DeleteThreadAsync(threadId),
                    agent.DeleteAsync(),
                    fileClient.DeleteFileAsync(fileDataCountryList.Id),
                    fileClient.DeleteFileAsync(fileDataCountryDetail.Id),
                ]);
        }
    }

    private static async Task DownloadResponseImageAsync(OpenAIFileClient client, ICollection<string> fileIds)
    {
        if (fileIds.Count > 0)
        {
            Console.WriteLine();
            foreach (string fileId in fileIds)
            {
                await DownloadFileContentAsync(client, fileId, launchViewer: true);
            }
        }
    }

    private static async Task DownloadFileContentAsync(OpenAIFileClient client, string fileId, bool launchViewer = false)
    {
        OpenAIFile fileInfo = client.GetFile(fileId);
        if (fileInfo.Purpose == FilePurpose.AssistantsOutput)
        {
            string filePath =
                Path.Combine(
                    Path.GetTempPath(),
                    Path.GetFileName(Path.ChangeExtension(fileInfo.Filename, ".png")));

            BinaryData content = await client.DownloadFileAsync(fileId);
            await using FileStream fileStream = new(filePath, FileMode.CreateNew);
            await content.ToStream().CopyToAsync(fileStream);
            Console.WriteLine($"File saved to: {filePath}.");

            if (launchViewer)
            {
                Process.Start(
                    new ProcessStartInfo
                    {
                        FileName = "cmd.exe",
                        Arguments = $"/C start {filePath}"
                    });
            }
        }
    }
}

import asyncio
import os

from semantic_kernel.agents.open_ai.azure_assistant_agent import AzureAssistantAgent
from semantic_kernel.contents.chat_message_content import ChatMessageContent
from semantic_kernel.contents.streaming_file_reference_content import StreamingFileReferenceContent
from semantic_kernel.contents.utils.author_role import AuthorRole
from semantic_kernel.kernel import Kernel

# Let's form the file paths that we will later pass to the assistant
csv_file_path_1 = os.path.join(
    os.path.dirname(os.path.dirname(os.path.realpath(__file__))),
    "PopulationByAdmin1.csv",
)

csv_file_path_2 = os.path.join(
    os.path.dirname(os.path.dirname(os.path.realpath(__file__))),
    "PopulationByCountry.csv",
)


async def download_file_content(agent, file_id: str):
    try:
        # Fetch the content of the file using the provided method
        response_content = await agent.client.files.content(file_id)

        # Get the directory that contains this script
        current_directory = os.path.dirname(os.path.abspath(__file__))

        # Define the path to save the image in the current directory
        file_path = os.path.join(
            current_directory,  # Use the current directory of the file
            f"{file_id}.png",  # You can modify this to use the actual filename with proper extension
        )

        # Save the downloaded bytes to a file
        with open(file_path, "wb") as file:
            file.write(response_content.content)

        print(f"File saved to: {file_path}")
    except Exception as e:
        print(f"An error occurred while downloading file {file_id}: {str(e)}")


async def download_response_image(agent, file_ids: list[str]):
    if file_ids:
        # Iterate over file_ids and download each one
        for file_id in file_ids:
            await download_file_content(agent, file_id)


async def main():
    agent = await AzureAssistantAgent.create(
        kernel=Kernel(),
        service_id="agent",
        name="SampleAssistantAgent",
        instructions="""
                    Analyze the available data to provide an answer to the user's question.
                    Always format response using markdown.
                    Always include a numerical index that starts at 1 for any lists or tables.
                    Always sort lists in ascending order.
                    """,
        enable_code_interpreter=True,
        code_interpreter_filenames=[csv_file_path_1, csv_file_path_2],
    )

    print("Creating thread...")
    thread_id = await agent.create_thread()

    try:
        is_complete: bool = False
        file_ids: list[str] = []
        while not is_complete:
            user_input = input("User:> ")
            if not user_input:
                continue

            if user_input.lower() == "exit":
                is_complete = True
                break

            await agent.add_chat_message(
                thread_id=thread_id, message=ChatMessageContent(role=AuthorRole.USER, content=user_input)
            )
            is_code: bool = False
            async for response in agent.invoke_stream(thread_id=thread_id):
                if is_code != bool(response.metadata.get("code", False)):
                    print()
                    is_code = not is_code

                print(f"{response.content}", end="", flush=True)

                file_ids.extend([
                    item.file_id for item in response.items if isinstance(item, StreamingFileReferenceContent)
                ])

            print()

            await download_response_image(agent, file_ids)
            file_ids.clear()

    finally:
        print("Cleaning up resources...")
        if agent is not None:
            for file_id in agent.code_interpreter_file_ids:
                await agent.delete_file(file_id)
            await agent.delete_thread(thread_id)
            await agent.delete()


if __name__ == "__main__":
    asyncio.run(main())
