How to fix Unknown Model error?

Question

I am following the tutorial "Build a Custom Knowledge Retrieval (RAG) chatbot using Azure AI Foundry". I have set up AI Search and deployed two models, gpt-4o-mini and text-embedding-ada-002, but I can't access them from my computer using the Azure SDK. I copied the code from the tutorial, but when I run it I get this error:

azure.core.exceptions.HttpResponseError: (unavailable_model) Unavailable model: gpt-4o-mini

Code: unavailable_model

Message: Unavailable model: gpt-4o-mini.
This happens when running the code for intent_mapping_response. The same thing happens when I run the embedding code.

I am trying to run the following code:

import os
from pathlib import Path
from opentelemetry import trace
from azure.ai.projects import AIProjectClient
from azure.ai.projects.models import ConnectionType
from azure.identity import DefaultAzureCredential
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
from config import ASSET_PATH, get_logger
from azure.ai.inference.prompts import PromptTemplate

from azure.search.documents.models import VectorizedQuery
from dotenv import load_dotenv

load_dotenv()

# initialize logging and tracing objects
logger = get_logger(__name__)
tracer = trace.get_tracer(__name__)

# create a project client using environment variables loaded from the .env file
project = AIProjectClient.from_connection_string(
    conn_str=os.environ["AIPROJECT_CONNECTION_STRING"], credential=DefaultAzureCredential()
)
# print available attributes and methods of the inference operations (debugging aid)
print(dir(project.inference))

# create a chat completions client that will be used to generate the search query
chat = project.inference.get_chat_completions_client()
print("chat completion client created :", chat)
model_info = chat.get_model_info('gpt-4o-mini')  # Replace with your model name
print("Model Info:", model_info)

# create a vector embeddings client that will be used to generate vector embeddings
embeddings = project.inference.get_embeddings_client()
print("embedding client created :", embeddings)

# use the project client to get the default search connection
search_connection = project.connections.get_default(
    connection_type=ConnectionType.AZURE_AI_SEARCH, include_credentials=True
)

# Create a search index client using the search connection
# This client will be used to create and delete search indexes
search_client = SearchClient(
    index_name=os.environ["AISEARCH_INDEX_NAME"],
    endpoint=search_connection.endpoint_url,
    credential=AzureKeyCredential(key=search_connection.key),
)

@tracer.start_as_current_span(name="get_relevant_documents")
def get_relevant_documents(messages: list, context: dict = None) -> list:
    if context is None:
        context = {}

    overrides = context.get("overrides", {})
    top = overrides.get("top", 5)


    # generate a search query from the chat messages
    intent_prompty = PromptTemplate.from_prompty("intent_mapping.prompty")

    intent_mapping_response = chat.complete(
        model=os.environ["INTENT_MAPPING_MODEL"],
        messages=intent_prompty.create_messages(conversation=messages),
        **intent_prompty.parameters,
    )

    search_query = intent_mapping_response.choices[0].message.content
    logger.debug(f"🧠 Intent mapping: {search_query}")

    # generate a vector representation of the search query
    embedding = embeddings.embed(model=os.environ["EMBEDDINGS_MODEL"], input=search_query)
    search_vector = embedding.data[0].embedding

    # search the index for products matching the search query
    vector_query = VectorizedQuery(vector=search_vector, k_nearest_neighbors=top, fields="contentVector")

    search_results = search_client.search(
        search_text=search_query, vector_queries=[vector_query], select=["id", "content", "filepath", "title", "url"]
    )

    documents = [
        {
            "id": result["id"],
            "content": result["content"],
            "filepath": result["filepath"],
            "title": result["title"],
            "url": result["url"],
        }
        for result in search_results
    ]

    # add results to the provided context
    if "thoughts" not in context:
        context["thoughts"] = []

    # add thoughts and documents to the context object so it can be returned to the caller
    context["thoughts"].append(
        {
            "title": "Generated search query",
            "description": search_query,
        }
    )

    if "grounding_data" not in context:
        context["grounding_data"] = []
    context["grounding_data"].append(documents)

    logger.debug(f"📄 {len(documents)} documents retrieved: {documents}")
    return documents

if __name__ == "__main__":
    import logging
    import argparse

    # set logging level to debug when running this module directly
    logger.setLevel(logging.DEBUG)

    # load command line arguments
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--query",
        type=str,
        help="Query to use to search product",
        default="I need a new tent for 4 people, what would you recommend?",
    )

    args = parser.parse_args()
    query = args.query

    result = get_relevant_documents(messages=[{"role": "user", "content": query}])

Answer

Hello Aqib Riaz,

Welcome to the Microsoft Q&A and thank you for posting your questions here.

Regarding the "Unavailable model: gpt-4o-mini" error (code unavailable_model), the steps below should help you identify and resolve the root cause. Each step includes code examples and references to Azure documentation where relevant.

To ensure the models are correctly deployed, log into the Azure Portal:

Navigate to your Azure OpenAI Resource or Azure AI Foundry Resource.
Under the Model Deployments section, locate the models. Verify whether gpt-4o-mini and text-embedding-ada-002 are listed as active deployments.
Azure OpenAI Documentation - Model Deployment - https://zcusa.951200.xyz/en-us/azure/ai-services/openai/concepts/models

Replace the model names in your code with the exact deployment names shown in the Azure Portal. The model argument must match the deployment name, which can differ from the underlying model name, and a mismatch is a common cause of this error. For example:

# chat and embeddings are the clients created earlier in your script;
# replace the placeholders with the exact deployment names shown in the Azure Portal
response = chat.complete(
    model="DEPLOYED_MODEL_NAME",
    messages=[{"role": "user", "content": "Hello"}],
)
embedding = embeddings.embed(model="DEPLOYED_EMBEDDING_MODEL_NAME", input=["Hello"])

This ensures your application uses the correct deployment identifiers.
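
As a minimal sketch of how a name mismatch could be caught before calling the service, the helper below compares the name your code will send against deployment names copied by hand from the portal. The function name check_deployment_name and the sample lists are hypothetical illustrations and are not part of any Azure SDK.

```python
import difflib


def check_deployment_name(requested: str, deployed: list[str]) -> str:
    """Return a short diagnostic for a requested deployment name.

    `deployed` is a list of names copied by hand from the portal (an assumption;
    this helper does not call Azure).
    """
    if requested in deployed:
        return f"'{requested}' matches a deployment exactly."

    # Detect names that differ only by letter case or stray whitespace.
    normalized = {name.strip().lower(): name for name in deployed}
    key = requested.strip().lower()
    if key in normalized:
        return (
            f"'{requested}' differs only in case or whitespace "
            f"from '{normalized[key]}'."
        )

    # Otherwise suggest the most similar deployment name, if any is close.
    close = difflib.get_close_matches(requested, deployed, n=1, cutoff=0.6)
    if close:
        return f"'{requested}' not found; closest deployment is '{close[0]}'."
    return f"'{requested}' not found; no similar deployment name found."


if __name__ == "__main__":
    # Hypothetical deployment names for illustration only.
    portal_names = ["gpt-4o-mini", "text-embedding-ada-002"]
    print(check_deployment_name("gpt-4o-mini", portal_names))
    print(check_deployment_name("gpt4o-mini", portal_names))
```

Running this against the values of INTENT_MAPPING_MODEL and EMBEDDINGS_MODEL makes typos or a swapped name visible without a round trip to the service.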

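To programmatically confirm which models are deployed, you can list the deployments on the underlying resource with the Azure management SDK (the azure-mgmt-cognitiveservices package). This is especially useful to cross-check portal information. For example:

import os
from azure.identity import DefaultAzureCredential
from azure.mgmt.cognitiveservices import CognitiveServicesManagementClient

# Assumed environment variables (not part of the tutorial's .env file):
# the subscription ID, the resource group, and the Azure AI Services / Azure OpenAI resource name
client = CognitiveServicesManagementClient(
    credential=DefaultAzureCredential(),
    subscription_id=os.environ["AZURE_SUBSCRIPTION_ID"],
)
# List the deployments on the resource and print each deployment name with its model
for deployment in client.deployments.list(
    resource_group_name=os.environ["AZURE_RESOURCE_GROUP"],
    account_name=os.environ["AZURE_AI_SERVICES_NAME"],
):
    print("Deployment:", deployment.name, "| model:", deployment.properties.model.name)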

If the desired model (gpt-4o-mini) is not listed, it has not been deployed or may be unavailable in your subscription or region. See the Azure SDK for Python overview for more details - https://zcusa.951200.xyz/en-us/python/api/overview/azure/ai?view=azure-python

Ensure the models are deployed in the correct region:

Check your connection string and confirm it matches the region of your resource.
Verify that the resource type (e.g., Azure OpenAI or Azure AI Foundry) supports the models you intend to use.

Certain models may have regional restrictions or limited availability. For instance, some high-performance models are available only in select regions. Check Azure OpenAI Region Availability here - https://zcusa.951200.xyz/en-us/azure/ai-services/openai/overview#regional-availability
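
To make the region check concrete, here is a rough sketch that splits a project connection string into its parts. It assumes the four-field format host;subscription-id;resource-group;project-name and that the first label of the host is the region (for example eastus2.api.azureml.ms). Both are assumptions about how the preview SDK's connection strings usually look, so compare the result with the value shown on your project's overview page. The names ProjectConnection and parse_connection_string are hypothetical.

```python
from typing import NamedTuple


class ProjectConnection(NamedTuple):
    """Fields of a project connection string (assumed four-part format)."""

    host: str
    subscription_id: str
    resource_group: str
    project_name: str

    @property
    def region(self) -> str:
        # Assumption: the region is the first label of the host name.
        return self.host.split(".", 1)[0]


def parse_connection_string(conn_str: str) -> ProjectConnection:
    """Split 'host;subscription;resource-group;project' into named fields."""
    parts = [part.strip() for part in conn_str.split(";")]
    if len(parts) != 4 or not all(parts):
        raise ValueError(
            f"expected 4 non-empty ';'-separated fields, got {len(parts)}"
        )
    return ProjectConnection(*parts)


if __name__ == "__main__":
    # Placeholder value for illustration; use your own connection string.
    sample = "eastus2.api.azureml.ms;00000000-0000-0000-0000-000000000000;my-rg;my-project"
    conn = parse_connection_string(sample)
    print("Region:", conn.region)
    print("Resource group:", conn.resource_group)
    print("Project:", conn.project_name)
```

If the printed region or project name is not the one where gpt-4o-mini and text-embedding-ada-002 were deployed, the client is pointing at a different project than the one you configured in the portal.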

Incorrect or missing environment variables can cause authentication or connection errors:

Verify the .env file contains:

AIPROJECT_CONNECTION_STRING: The correct connection string for your Azure AI resource.

INTENT_MAPPING_MODEL and EMBEDDINGS_MODEL: The names of the deployed models as shown in the Azure Portal. For example:

import os
# Check environment variable setup
print("Connection String:", os.getenv("AIPROJECT_CONNECTION_STRING"))
print("Intent Model:", os.getenv("INTENT_MAPPING_MODEL"))
print("Embeddings Model:", os.getenv("EMBEDDINGS_MODEL"))
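
A slightly stricter variant, sketched below, reports which settings are missing or blank instead of printing their values. The variable names come from the code in the question; the helper name missing_settings is hypothetical.

```python
import os

# Environment variables read by the script in the question.
REQUIRED_VARS = (
    "AIPROJECT_CONNECTION_STRING",
    "INTENT_MAPPING_MODEL",
    "EMBEDDINGS_MODEL",
    "AISEARCH_INDEX_NAME",
)


def missing_settings(env=None, required=REQUIRED_VARS):
    """Return the required variable names that are unset or blank in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not (env.get(name) or "").strip()]


if __name__ == "__main__":
    # Call load_dotenv() first if the values live in a .env file.
    missing = missing_settings()
    if missing:
        print("Missing or empty settings:", ", ".join(missing))
    else:
        print("All required settings are present.")
```

A blank value is treated as missing because an empty INTENT_MAPPING_MODEL would otherwise be passed to the service as an empty model name.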

If the problem persists:

Use the Azure CLI or Portal to redeploy the models. With the Azure CLI, the command is: az cognitiveservices account deployment create --resource-group --name --deployment-name --model-name --model-version --model-format
Start with models like gpt-3.5-turbo or ada to verify deployment configurations.
Read more about Azure CLI OpenAI Deployment here - https://zcusa.951200.xyz/en-us/cli/azure/openai

NOTE: This is an additional step. If none of these steps resolve the issue:

Ensure your Azure subscription includes access to restricted models, and/or contact Azure support from the Azure Portal.

I hope this is helpful! Do not hesitate to let me know if you have any other questions.

Please don't forget to close the thread by upvoting and accepting this as the answer if it is helpful.
