Learn how to use reproducible output (preview)

Article
09/20/2024

By default, if you ask an Azure OpenAI Chat Completion model the same question multiple times, you are likely to get a different response each time. The responses are therefore considered non-deterministic. Reproducible output is a new preview feature that lets you selectively change the default behavior to help produce more deterministic outputs.

Reproducible output support

Reproducible output is currently supported only with the following:

Supported models

gpt-35-turbo (1106)
gpt-35-turbo (0125)
gpt-4 (1106-Preview)
gpt-4 (0125-Preview)
gpt-4 (turbo-2024-04-09)
gpt-4o-mini (2024-07-18)
gpt-4o (2024-05-13)

Consult the models page for the latest information on regional model availability.

API version

Support for reproducible output was first added in API version 2023-12-01-preview.

Example

First, we'll generate three responses to the same question to demonstrate the variability that is common in Chat Completion responses, even when the other parameters are the same:

Python (the PowerShell equivalent follows)

import os
from openai import AzureOpenAI

client = AzureOpenAI(
  azure_endpoint = os.getenv("AZURE_OPENAI_ENDPOINT"), 
  api_key=os.getenv("AZURE_OPENAI_API_KEY"),  
  api_version="2024-02-01"
)

for i in range(3):
  print(f'Story Version {i + 1}\n---')
    
  response = client.chat.completions.create(
    model="gpt-35-turbo-0125",  # must match the deployment name you chose for your gpt-35-turbo (0125) deployment
    #seed=42,
    temperature=0.7,
    max_tokens=50,
    messages=[
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Tell me a story about how the universe began?"}
    ]
  )
  
  print(response.choices[0].message.content)
  print("---\n")
  
  del response

$openai = @{
   api_key     = $Env:AZURE_OPENAI_API_KEY
   api_base    = $Env:AZURE_OPENAI_ENDPOINT # like the following https://YOUR_RESOURCE_NAME.openai.azure.com/
   api_version = '2024-02-01' # may change in the future
   name        = 'YOUR-DEPLOYMENT-NAME-HERE' # name you chose for your deployment
}

$headers = @{
  'api-key' = $openai.api_key
}

$messages  = @()
$messages += @{
  role     = 'system'
  content  = 'You are a helpful assistant.'
}
$messages += @{
  role     = 'user'
  content  = 'Tell me a story about how the universe began?'
}

$body         = @{
  #seed       = 42
  temperature = 0.7
  max_tokens  = 50
  messages    = $messages
} | ConvertTo-Json

$url = "$($openai.api_base)/openai/deployments/$($openai.name)/chat/completions?api-version=$($openai.api_version)"

for ($i=0; $i -le 2; $i++) {
  $response = Invoke-RestMethod -Uri $url -Headers $headers -Body $body -Method Post -ContentType 'application/json'
  write-host "Story Version $($i+1)`n---`n$($response.choices[0].message.content)`n---`n"
}

Output

Story Version 1
---
Once upon a time, before there was time, there was nothing but a vast emptiness. In this emptiness, there existed a tiny, infinitely dense point of energy. This point contained all the potential for the universe as we know it. And
---

Story Version 2
---
Once upon a time, long before the existence of time itself, there was nothing but darkness and silence. The universe lay dormant, a vast expanse of emptiness waiting to be awakened. And then, in a moment that defies comprehension, there
---

Story Version 3
---
Once upon a time, before time even existed, there was nothing but darkness and stillness. In this vast emptiness, there was a tiny speck of unimaginable energy and potential. This speck held within it all the elements that would come

Notice that while each story might share similar elements and some verbatim repetition, the longer the response goes on, the more the stories tend to diverge.
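One rough way to quantify that divergence is to measure how many leading characters the responses share. The following is a minimal sketch, not part of the original sample: the helper name shared_prefix is hypothetical, and the strings are shortened openings of the three stories above rather than live API results.

```python
import os


def shared_prefix(texts):
    """Return the longest character prefix common to every string in texts."""
    # os.path.commonprefix compares strings character by character,
    # so it works on arbitrary text, not only on file paths.
    return os.path.commonprefix(list(texts))


# Shortened openings of the three unseeded stories shown above.
stories = [
    "Once upon a time, before there was time, there was nothing",
    "Once upon a time, long before the existence of time itself",
    "Once upon a time, before time even existed, there was nothing",
]

prefix = shared_prefix(stories)
print(f"{len(prefix)} shared characters: {prefix!r}")
```

In a real run, you would collect response.choices[0].message.content from each call into the list instead of using fixed strings.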

Now we'll run the same request code as before, but this time uncomment the line for the parameter that says seed=42.

Python (the PowerShell equivalent follows)

import os
from openai import AzureOpenAI

client = AzureOpenAI(
  azure_endpoint = os.getenv("AZURE_OPENAI_ENDPOINT"), 
  api_key=os.getenv("AZURE_OPENAI_API_KEY"),  
  api_version="2024-02-01"
)

for i in range(3):
  print(f'Story Version {i + 1}\n---')
    
  response = client.chat.completions.create(
    model="gpt-35-turbo-0125",  # must match the deployment name you chose for your gpt-35-turbo (0125) deployment
    seed=42,
    temperature=0.7,
    max_tokens=50,
    messages=[
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Tell me a story about how the universe began?"}
    ]
  )
  
  print(response.choices[0].message.content)
  print("---\n")
  
  del response

$openai = @{
   api_key     = $Env:AZURE_OPENAI_API_KEY
   api_base    = $Env:AZURE_OPENAI_ENDPOINT # like the following https://YOUR_RESOURCE_NAME.openai.azure.com/
   api_version = '2024-02-01' # may change in the future
   name        = 'YOUR-DEPLOYMENT-NAME-HERE' # name you chose for your deployment
}

$headers = @{
  'api-key' = $openai.api_key
}

$messages  = @()
$messages += @{
  role     = 'system'
  content  = 'You are a helpful assistant.'
}
$messages += @{
  role     = 'user'
  content  = 'Tell me a story about how the universe began?'
}

$body         = @{
  seed        = 42
  temperature = 0.7
  max_tokens  = 50
  messages    = $messages
} | ConvertTo-Json

$url = "$($openai.api_base)/openai/deployments/$($openai.name)/chat/completions?api-version=$($openai.api_version)"

for ($i=0; $i -le 2; $i++) {
  $response = Invoke-RestMethod -Uri $url -Headers $headers -Body $body -Method Post -ContentType 'application/json'
  write-host "Story Version $($i+1)`n---`n$($response.choices[0].message.content)`n---`n"
}

Output

Story Version 1
---
In the beginning, there was nothing but darkness and silence. Then, suddenly, a tiny point of light appeared. This point of light contained all the energy and matter that would eventually form the entire universe. With a massive explosion known as the Big Bang
---

Story Version 2
---
In the beginning, there was nothing but darkness and silence. Then, suddenly, a tiny point of light appeared. This point of light contained all the energy and matter that would eventually form the entire universe. With a massive explosion known as the Big Bang
---

Story Version 3
---
In the beginning, there was nothing but darkness and silence. Then, suddenly, a tiny point of light appeared. This was the moment when the universe was born.

The point of light began to expand rapidly, creating space and time as it grew.
---

By using the same seed parameter of 42 for each of our three requests, while keeping all other parameters the same, we're able to produce much more consistent results.

Important

Determinism is not guaranteed with reproducible output. Even in cases where the seed parameter and system_fingerprint are the same across API calls, it is currently not uncommon to still observe a degree of variability in responses. Identical API calls with larger max_tokens values will generally result in less deterministic responses, even when the seed parameter is set.
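Because identical outputs are not guaranteed, it can help to measure how consistent a batch of seeded calls actually was instead of assuming it. The following is a minimal sketch under stated assumptions: summarize_runs is a hypothetical helper, and the strings are placeholders for the text returned by each call.

```python
from collections import Counter


def summarize_runs(contents):
    """Summarize how consistent a non-empty list of completion texts is."""
    counts = Counter(contents)
    # most_common(1) returns [(text, count)] for the most frequent text.
    _, hits = counts.most_common(1)[0]
    return {
        "runs": len(contents),
        "distinct": len(counts),
        # Share of runs that match the most frequent text.
        "agreement": hits / len(contents),
    }


# Placeholders standing in for response.choices[0].message.content.
runs = ["story A", "story A", "story B"]
summary = summarize_runs(runs)
print(summary)
```

This mirrors the seeded example above, where two of the three stories matched exactly and the third diverged.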

Parameter details

seed is an optional parameter, which can be set to an integer or null.

This feature is in preview. If specified, the system makes a best effort to sample deterministically, so that repeated requests with the same seed and parameters should return the same result. Determinism is not guaranteed, and you should refer to the system_fingerprint response parameter to monitor changes in the back end.

system_fingerprint is a string and is part of the chat completion object.

This fingerprint represents the back-end configuration that the model runs with.

It can be used with the seed request parameter to understand when back-end changes have been made that might affect determinism.

To view the full chat completion object with system_fingerprint, you can add print(response.model_dump_json(indent=2)) to the previous Python code next to the existing print statement, or $response | convertto-json -depth 5 at the end of the PowerShell example. This change results in the following additional information being part of the output:

Output

{
  "id": "chatcmpl-8LmLRatZxp8wsx07KGLKQF0b8Zez3",
  "choices": [
    {
      "finish_reason": "length",
      "index": 0,
      "message": {
        "content": "In the beginning, there was nothing but a vast emptiness, a void without form or substance. Then, from this nothingness, a singular event occurred that would change the course of existence forever—The Big Bang.\n\nAround 13.8 billion years ago, an infinitely hot and dense point, no larger than a single atom, began to expand at an inconceivable speed. This was the birth of our universe, a moment where time and space came into being. As this primordial fireball grew, it cooled, and the fundamental forces that govern the cosmos—gravity, electromagnetism, and the strong and weak nuclear forces—began to take shape.\n\nMatter coalesced into the simplest elements, hydrogen and helium, which later formed vast clouds in the expanding universe. These clouds, driven by the force of gravity, began to collapse in on themselves, creating the first stars. The stars were crucibles of nuclear fusion, forging heavier elements like carbon, nitrogen, and oxygen",
        "role": "assistant",
        "function_call": null,
        "tool_calls": null
      },
      "content_filter_results": {
        "hate": {
          "filtered": false,
          "severity": "safe"
        },
        "self_harm": {
          "filtered": false,
          "severity": "safe"
        },
        "sexual": {
          "filtered": false,
          "severity": "safe"
        },
        "violence": {
          "filtered": false,
          "severity": "safe"
        }
      }
    }
  ],
  "created": 1700201417,
  "model": "gpt-4",
  "object": "chat.completion",
  "system_fingerprint": "fp_50a4261de5",
  "usage": {
    "completion_tokens": 200,
    "prompt_tokens": 27,
    "total_tokens": 227
  },
  "prompt_filter_results": [
    {
      "prompt_index": 0,
      "content_filter_results": {
        "hate": {
          "filtered": false,
          "severity": "safe"
        },
        "self_harm": {
          "filtered": false,
          "severity": "safe"
        },
        "sexual": {
          "filtered": false,
          "severity": "safe"
        },
        "violence": {
          "filtered": false,
          "severity": "safe"
        }
      }
    }
  ]
}
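To monitor back-end changes, you could record system_fingerprint from each response and flag the calls where it differs from the previous one. The following is a minimal sketch, assuming each response has already been converted to a dictionary shaped like the JSON above. The helper name fingerprint_changes and the value "fp_hypothetical" are illustrative only.

```python
def fingerprint_changes(responses):
    """Return indices where system_fingerprint differs from the previous response."""
    changes = []
    previous = None
    for index, response in enumerate(responses):
        current = response.get("system_fingerprint")
        # The first response has nothing to compare against.
        if index > 0 and current != previous:
            changes.append(index)
        previous = current
    return changes


# The first value comes from the sample output; "fp_hypothetical" is made up.
history = [
    {"system_fingerprint": "fp_50a4261de5"},
    {"system_fingerprint": "fp_50a4261de5"},
    {"system_fingerprint": "fp_hypothetical"},
]

changed_at = fingerprint_changes(history)
print(changed_at)
```

One way to obtain such dictionaries from the Python example is json.loads(response.model_dump_json()). A change in fingerprint between two calls suggests that differing outputs may stem from a back-end change rather than from sampling.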

Additional considerations

When you want to use reproducible outputs, you need to set seed to the same integer across chat completion calls. You should also match any other parameters, such as temperature and max_tokens.
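One way to keep parameters aligned is to define them once and reuse them for every call, then compare two parameter sets when results unexpectedly differ. This is a minimal sketch: REPRODUCIBLE_PARAMS and mismatched_params are hypothetical names, and the values are copied from the example earlier in this article.

```python
# Parameter values copied from the example; defined once so every call reuses them.
REPRODUCIBLE_PARAMS = {"seed": 42, "temperature": 0.7, "max_tokens": 50}


def mismatched_params(first, second):
    """Return the sorted names of parameters whose values differ between two requests."""
    keys = set(first) | set(second)
    # A parameter missing from one side is treated as None (unset).
    return sorted(key for key in keys if first.get(key) != second.get(key))


# A second call that accidentally uses a different max_tokens value.
other_call = {**REPRODUCIBLE_PARAMS, "max_tokens": 200}
print(mismatched_params(REPRODUCIBLE_PARAMS, other_call))
```

In the Python example, the shared dictionary could be passed to client.chat.completions.create with **REPRODUCIBLE_PARAMS alongside the model and messages arguments.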
