Output strutturati

Articolo
12/18/2024

Gli output strutturati fanno in modo che un modello segua una definizione di schema JSON specificata come parte della chiamata API di inferenza. Ciò è in contrasto con la funzionalità modalità JSON precedente, che garantiva la generazione di JSON validi, ma non era in grado di garantire una rigorosa conformità allo schema fornito. Gli output strutturati sono consigliati per chiamare le funzioni, estrarre dati strutturati e creare flussi di lavoro complessi in più passaggi.

Nota

Gli output strutturati non sono attualmente supportati nello scenario bring your own data .

Modelli supportati

o1 Versione: 2024-12-17
gpt-4o-mini Versione: 2024-07-18
gpt-4o Versione: 2024-08-06

Supporto dell'API

Il supporto per gli output strutturati è stato aggiunto per la prima volta nella versione 2024-08-01-preview dell'API. È disponibile nelle API di anteprima più recenti e nell'API ga più recente: 2024-10-21.

Introduzione

È possibile usare Pydantic per definire gli schemi degli oggetti in Python. A seconda della versione delle librerie e Pydantic OpenAI in esecuzione, potrebbe essere necessario eseguire l'aggiornamento a una versione più recente. Questi esempi sono stati testati su openai 1.42.0 e pydantic 2.8.2.

pip install openai pydantic --upgrade

Se non si ha ancora una volta usato Microsoft Entra ID per l'autenticazione, vedere How to configure Azure OpenAI Service with Microsoft Entra ID authentication (Come configurare il servizio Azure OpenAI con l'autenticazione microsoft Entra ID).

from pydantic import BaseModel
from openai import AzureOpenAI
from azure.identity import DefaultAzureCredential, get_bearer_token_provider

token_provider = get_bearer_token_provider(
    DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
)

client = AzureOpenAI(
  azure_endpoint = os.getenv("AZURE_OPENAI_ENDPOINT"), 
  azure_ad_token_provider=token_provider,
  api_version="2024-10-21"
)


class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]

completion = client.beta.chat.completions.parse(
    model="MODEL_DEPLOYMENT_NAME", # replace with the model deployment name of your gpt-4o 2024-08-06 deployment
    messages=[
        {"role": "system", "content": "Extract the event information."},
        {"role": "user", "content": "Alice and Bob are going to a science fair on Friday."},
    ],
    response_format=CalendarEvent,
)

event = completion.choices[0].message.parsed

print(event)
print(completion.model_dump_json(indent=2))

Output

name='Science Fair' date='Friday' participants=['Alice', 'Bob']
{
  "id": "chatcmpl-A1EUP2fAmL4SeB1lVMinwM7I2vcqG",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "message": {
        "content": "{\n  \"name\": \"Science Fair\",\n  \"date\": \"Friday\",\n  \"participants\": [\"Alice\", \"Bob\"]\n}",
        "refusal": null,
        "role": "assistant",
        "function_call": null,
        "tool_calls": [],
        "parsed": {
          "name": "Science Fair",
          "date": "Friday",
          "participants": [
            "Alice",
            "Bob"
          ]
        }
      }
    }
  ],
  "created": 1724857389,
  "model": "gpt-4o-2024-08-06",
  "object": "chat.completion",
  "service_tier": null,
  "system_fingerprint": "fp_1c2eaec9fe",
  "usage": {
    "completion_tokens": 27,
    "prompt_tokens": 32,
    "total_tokens": 59
  }
}

pip install openai pydantic --upgrade

from pydantic import BaseModel
from openai import AzureOpenAI

client = AzureOpenAI(
  azure_endpoint = os.getenv("AZURE_OPENAI_ENDPOINT"), 
  api_key=os.getenv("AZURE_OPENAI_API_KEY"),  
  api_version="2024-10-21"
)


class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]

completion = client.beta.chat.completions.parse(
    model="MODEL_DEPLOYMENT_NAME", # replace with the model deployment name of your gpt-4o 2024-08-06 deployment
    messages=[
        {"role": "system", "content": "Extract the event information."},
        {"role": "user", "content": "Alice and Bob are going to a science fair on Friday."},
    ],
    response_format=CalendarEvent,
)

event = completion.choices[0].message.parsed

print(event)
print(completion.model_dump_json(indent=2))

Output

name='Science Fair' date='Friday' participants=['Alice', 'Bob']
{
  "id": "chatcmpl-A1EUP2fAmL4SeB1lVMinwM7I2vcqG",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "message": {
        "content": "{\n  \"name\": \"Science Fair\",\n  \"date\": \"Friday\",\n  \"participants\": [\"Alice\", \"Bob\"]\n}",
        "refusal": null,
        "role": "assistant",
        "function_call": null,
        "tool_calls": [],
        "parsed": {
          "name": "Science Fair",
          "date": "Friday",
          "participants": [
            "Alice",
            "Bob"
          ]
        }
      }
    }
  ],
  "created": 1724857389,
  "model": "gpt-4o-2024-08-06",
  "object": "chat.completion",
  "service_tier": null,
  "system_fingerprint": "fp_1c2eaec9fe",
  "usage": {
    "completion_tokens": 27,
    "prompt_tokens": 32,
    "total_tokens": 59
  }
}

response_format è impostato su json_schema con strict: true impostato.

curl -X POST  https://YOUR_RESOURCE_NAME.openai.azure.com/openai/deployments/YOUR_MODEL_DEPLOYMENT_NAME/chat/completions?api-version=2024-10-21 \
  -H "api-key: $AZURE_OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
    -d '{
        "messages": [
                {"role": "system", "content": "Extract the event information."},
                {"role": "user", "content": "Alice and Bob are going to a science fair on Friday."}
            ],
            "response_format": {
                "type": "json_schema",
                "json_schema": {
                    "name": "CalendarEventResponse",
                    "strict": true,
                    "schema": {
                        "type": "object",
                        "properties": {
                            "name": {
                              "type": "string"
                            },
                            "date": {
                                "type": "string"
                            },
                            "participants": {
                                "type": "array",
                                "items": {
                                    "type": "string"
                                }
                            }
                        },
                        "required": [
                            "name",
                            "date",
                            "participants"
                        ],
                        "additionalProperties": false
                    }
                }
          }
  }'

Output:

{
  "id": "chatcmpl-A1HKsHAe2hH9MEooYslRn9UmEwsag",
  "object": "chat.completion",
  "created": 1724868330,
  "model": "gpt-4o-2024-08-06",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "{\n  \"name\": \"Science Fair\",\n  \"date\": \"Friday\",\n  \"participants\": [\"Alice\", \"Bob\"]\n}"
      },
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 33,
    "completion_tokens": 27,
    "total_tokens": 60
  },
  "system_fingerprint": "fp_1c2eaec9fe"
}

Chiamata di funzione con output strutturati

Gli output strutturati per le chiamate di funzione possono essere abilitati con un singolo parametro, fornendo strict: true.

Nota

Gli output strutturati non sono supportati con chiamate di funzione parallele. Quando si usano output strutturati, impostare parallel_tool_calls su false.

from enum import Enum
from typing import Union
from pydantic import BaseModel
import openai
from openai import AzureOpenAI

client = AzureOpenAI(
  azure_endpoint = os.getenv("AZURE_OPENAI_ENDPOINT"), 
  api_key=os.getenv("AZURE_OPENAI_API_KEY"),  
  api_version="2024-10-21"
)


class GetDeliveryDate(BaseModel):
    order_id: str

tools = [openai.pydantic_function_tool(GetDeliveryDate)]

messages = []
messages.append({"role": "system", "content": "You are a helpful customer support assistant. Use the supplied tools to assist the user."})
messages.append({"role": "user", "content": "Hi, can you tell me the delivery date for my order #12345?"}) 

response = client.chat.completions.create(
    model="MODEL_DEPLOYMENT_NAME", # replace with the model deployment name of your gpt-4o 2024-08-06 deployment
    messages=messages,
    tools=tools
)

print(response.choices[0].message.tool_calls[0].function)
print(response.model_dump_json(indent=2))

from enum import Enum
from typing import Union
from pydantic import BaseModel
import openai
from openai import AzureOpenAI

client = AzureOpenAI(
  azure_endpoint = os.getenv("AZURE_OPENAI_ENDPOINT"), 
  api_key=os.getenv("AZURE_OPENAI_API_KEY"),  
  api_version="2024-10-21"
)

class GetDeliveryDate(BaseModel):
    order_id: str

tools = [openai.pydantic_function_tool(GetDeliveryDate)]

messages = []
messages.append({"role": "system", "content": "You are a helpful customer support assistant. Use the supplied tools to assist the user."})
messages.append({"role": "user", "content": "Hi, can you tell me the delivery date for my order #12345?"}) 

response = client.chat.completions.create(
    model="MODEL_DEPLOYMENT_NAME", # replace with the model deployment name of your gpt-4o 2024-08-06 deployment
    messages=messages,
    tools=tools
)

print(response.choices[0].message.tool_calls[0].function)
print(response.model_dump_json(indent=2))

curl -X POST  https://YOUR_RESOURCE_NAME.openai.azure.com/openai/deployments/YOUR_MODEL_DEPLOYMENT_NAME/chat/completions?api-version=2024-10-21 \
  -H "api-key: $AZURE_OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant. The current date is August 6, 2024. You help users query for the data they are looking for by calling the query function."
    },
    {
      "role": "user",
      "content": "look up all my orders in may of last year that were fulfilled but not delivered on time"
    }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "query",
        "description": "Execute a query.",
        "strict": true,
        "parameters": {
          "type": "object",
          "properties": {
            "table_name": {
              "type": "string",
              "enum": ["orders"]
            },
            "columns": {
              "type": "array",
              "items": {
                "type": "string",
                "enum": [
                  "id",
                  "status",
                  "expected_delivery_date",
                  "delivered_at",
                  "shipped_at",
                  "ordered_at",
                  "canceled_at"
                ]
              }
            },
            "conditions": {
              "type": "array",
              "items": {
                "type": "object",
                "properties": {
                  "column": {
                    "type": "string"
                  },
                  "operator": {
                    "type": "string",
                    "enum": ["=", ">", "<", ">=", "<=", "!="]
                  },
                  "value": {
                    "anyOf": [
                      {
                        "type": "string"
                      },
                      {
                        "type": "number"
                      },
                      {
                        "type": "object",
                        "properties": {
                          "column_name": {
                            "type": "string"
                          }
                        },
                        "required": ["column_name"],
                        "additionalProperties": false
                      }
                    ]
                  }
                },
                "required": ["column", "operator", "value"],
                "additionalProperties": false
              }
            },
            "order_by": {
              "type": "string",
              "enum": ["asc", "desc"]
            }
          },
          "required": ["table_name", "columns", "conditions", "order_by"],
          "additionalProperties": false
        }
      }
    }
  ]
}'

Schemi supportati e limitazioni

Gli output strutturati di Azure OpenAI supportano lo stesso subset delloschema JSON di OpenAI.

Tipi supportati

String
Numero
Booleano
Intero
Object
Array
Enumerazione
anyOf

Nota

Gli oggetti radice non possono essere di tipo anyOf.

Tutti i campi devono essere obbligatori

Tutti i campi o i parametri delle funzioni devono essere inclusi come richiesto. Nell'esempio seguente location e unit sono entrambi specificati in "required": ["location", "unit"].

{
    "name": "get_weather",
    "description": "Fetches the weather in the given location",
    "strict": true,
    "parameters": {
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "The location to get the weather for"
            },
            "unit": {
                "type": "string",
                "description": "The unit to return the temperature in",
                "enum": ["F", "C"]
            }
        },
        "additionalProperties": false,
        "required": ["location", "unit"]
    }

Se necessario, è possibile emulare un parametro facoltativo usando un tipo di unione con null. In questo esempio questo risultato viene ottenuto con la riga "type": ["string", "null"],.

{
    "name": "get_weather",
    "description": "Fetches the weather in the given location",
    "strict": true,
    "parameters": {
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "The location to get the weather for"
            },
            "unit": {
                "type": ["string", "null"],
                "description": "The unit to return the temperature in",
                "enum": ["F", "C"]
            }
        },
        "additionalProperties": false,
        "required": [
            "location", "unit"
        ]
    }
}

Livello di annidamento

Uno schema può avere fino a 100 proprietà oggetto totali, con un massimo di cinque livelli di annidamento

additionalProperties: false deve essere sempre impostato negli oggetti

Questa proprietà controlla se un oggetto può avere coppie chiave-valore aggiuntive che non sono state definite nello schema JSON. Per usare output strutturati, è necessario impostare questo valore su false.

Ordinamento delle chiavi

Gli output strutturati vengono ordinati allo stesso modo dello schema fornito. Per modificare l'ordine di output, modificare l'ordine dello schema inviato come parte della richiesta di inferenza.

Parole chiave specifiche del tipo non supportate

Type	Parola chiave non supportata
String	minlength maxLength pattern format
Numero	minimum maximum multipleOf
Oggetti	patternProperties unevaluatedProperties propertyNames minProperties maxProperties
Matrici	unevaluatedItems contains minContains maxContains minItems maxItems uniqueItems

Gli schemi annidati che usano anyOf devono rispettare il subset complessivo dello schema JSON

Esempio di schema anyOf supportato:

{
	"type": "object",
	"properties": {
		"item": {
			"anyOf": [
				{
					"type": "object",
					"description": "The user object to insert into the database",
					"properties": {
						"name": {
							"type": "string",
							"description": "The name of the user"
						},
						"age": {
							"type": "number",
							"description": "The age of the user"
						}
					},
					"additionalProperties": false,
					"required": [
						"name",
						"age"
					]
				},
				{
					"type": "object",
					"description": "The address object to insert into the database",
					"properties": {
						"number": {
							"type": "string",
							"description": "The number of the address. Eg. for 123 main st, this would be 123"
						},
						"street": {
							"type": "string",
							"description": "The street name. Eg. for 123 main st, this would be main st"
						},
						"city": {
							"type": "string",
							"description": "The city of the address"
						}
					},
					"additionalProperties": false,
					"required": [
						"number",
						"street",
						"city"
					]
				}
			]
		}
	},
	"additionalProperties": false,
	"required": [
		"item"
	]
}

Le definizioni sono supportate

Esempio supportato:

{
	"type": "object",
	"properties": {
		"steps": {
			"type": "array",
			"items": {
				"$ref": "#/$defs/step"
			}
		},
		"final_answer": {
			"type": "string"
		}
	},
	"$defs": {
		"step": {
			"type": "object",
			"properties": {
				"explanation": {
					"type": "string"
				},
				"output": {
					"type": "string"
				}
			},
			"required": [
				"explanation",
				"output"
			],
			"additionalProperties": false
		}
	},
	"required": [
		"steps",
		"final_answer"
	],
	"additionalProperties": false
}

Gli schemi ricorsivi sono supportati

Esempio di uso di # per la ricorsione radice:

{
        "name": "ui",
        "description": "Dynamically generated UI",
        "strict": true,
        "schema": {
            "type": "object",
            "properties": {
                "type": {
                    "type": "string",
                    "description": "The type of the UI component",
                    "enum": ["div", "button", "header", "section", "field", "form"]
                },
                "label": {
                    "type": "string",
                    "description": "The label of the UI component, used for buttons or form fields"
                },
                "children": {
                    "type": "array",
                    "description": "Nested UI components",
                    "items": {
                        "$ref": "#"
                    }
                },
                "attributes": {
                    "type": "array",
                    "description": "Arbitrary attributes for the UI component, suitable for any element",
                    "items": {
                        "type": "object",
                        "properties": {
                            "name": {
                                "type": "string",
                                "description": "The name of the attribute, for example onClick or className"
                            },
                            "value": {
                                "type": "string",
                                "description": "The value of the attribute"
                            }
                        },
                      "additionalProperties": false,
                      "required": ["name", "value"]
                    }
                }
            },
            "required": ["type", "label", "children", "attributes"],
            "additionalProperties": false
        }
    }

Esempio di ricorsione esplicita:

{
	"type": "object",
	"properties": {
		"linked_list": {
			"$ref": "#/$defs/linked_list_node"
		}
	},
	"$defs": {
		"linked_list_node": {
			"type": "object",
			"properties": {
				"value": {
					"type": "number"
				},
				"next": {
					"anyOf": [
						{
							"$ref": "#/$defs/linked_list_node"
						},
						{
							"type": "null"
						}
					]
				}
			},
			"additionalProperties": false,
			"required": [
				"next",
				"value"
			]
		}
	},
	"additionalProperties": false,
	"required": [
		"linked_list"
	]
}

Condividi tramite

Output strutturati

Modelli supportati

Supporto dell'API

Introduzione

Output

Output

Chiamata di funzione con output strutturati

Schemi supportati e limitazioni

Tipi supportati

Tutti i campi devono essere obbligatori

Livello di annidamento

additionalProperties: false deve essere sempre impostato negli oggetti

Ordinamento delle chiavi

Parole chiave specifiche del tipo non supportate

Gli schemi annidati che usano anyOf devono rispettare il subset complessivo dello schema JSON

Le definizioni sono supportate

Gli schemi ricorsivi sono supportati

Commenti e suggerimenti

Risorse aggiuntive