How to deploy an LLM in a batch endpoint in Azure Machine Learning? Any alternatives?

Question

Hi,

I'm trying to deploy an LLM to a batch endpoint in Azure Machine Learning. I was able to create an online endpoint, but I'm running into issues with the batch endpoint.

The goal of the project is to deploy an LLM that runs every morning and produces an output. The output needs to be written to storage (Blob, SQL, etc.).

In the Model Catalog in the Azure ML Studio GUI, I don't see an option to deploy a model to a batch endpoint; the deployment is automatically created as an online endpoint. With the Python SDK, I get an error:

from azure.ai.ml.constants import BatchDeploymentOutputAction
from azure.ai.ml.entities import (
    BatchRetrySettings,
    ModelBatchDeployment,
    ModelBatchDeploymentSettings,
)

# ml_client, endpoint_name and model_id are defined earlier in the notebook.
deployment = ModelBatchDeployment(
    name=f"{endpoint_name}-deployment",
    description="first deployment",
    model=model_id,
    compute="AmlCluster",
    endpoint_name=endpoint_name,
    settings=ModelBatchDeploymentSettings(
        instance_count=1,
        max_concurrency_per_instance=1,
        mini_batch_size=1,
        output_action=BatchDeploymentOutputAction.APPEND_ROW,
        output_file_name="predictions.csv",
        retry_settings=BatchRetrySettings(max_retries=3, timeout=3000),
        logging_level="info",
    ),
)

ml_client.batch_deployments.begin_create_or_update(deployment).result()

HttpResponseError: Operation returned an invalid status 'OK' Content: { "id": "/subscriptions/.../providers/Microsoft.MachineLearningServices/locations/.../mfeOperationsStatus/bdbes:...", "name": "bdbes:...", "status": "Failed" }
Note: I've replaced the hashes with three dots.

Below is the screenshot showing the failed deployment on the batch endpoint page.

[Image: failed deployment on the batch endpoint page]

I have a few questions:

Is batch deployment possible in Azure Machine Learning with models from the Model Catalog?
Is batch deployment possible in Azure Machine Learning if I use a notebook to download a Hugging Face model and register it manually?
Is there a better alternative for achieving the goal, e.g. using Azure Virtual Machines or similar?

Thanks!

Answer

@Karlos Muradyan I think that, at this point, Hugging Face models from the catalog are not available for batch endpoints. This is documented in the FAQ section, along with other details on the capabilities and limitations of Hugging Face models. See this page.

How to deploy the models for batch inference? Deploying these models to batch endpoints for batch inference is currently not supported.

If you want to register a model manually and then deploy it to a batch endpoint, see this page for setting this up with the studio, SDK, or CLI. If you register the model first and then create a deployment with a custom scoring script for batch endpoints, you should be able to create a batch endpoint. As is, the error you are seeing could be because this model is not supported directly for batch endpoints. I hope this helps!
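
For illustration, here is a minimal sketch of what such a custom scoring script might look like. Batch deployments call init() once per worker and run() once per mini-batch of input file paths. Everything else is an assumption for this example: the file name batch_driver.py, the helpers load_generator and score_files, one prompt per input text file, a registered model in Hugging Face format sitting directly under AZUREML_MODEL_DIR, and an environment that has transformers installed.

```python
# batch_driver.py (hypothetical file name)
import csv
import io
import os
from typing import Callable, List, Optional

# Set by init(); maps a prompt string to generated text.
generate: Optional[Callable[[str], str]] = None


def load_generator(model_dir: str) -> Callable[[str], str]:
    """Hypothetical loader: assumes model_dir holds a Hugging Face
    text-generation model and that transformers is installed."""
    from transformers import pipeline  # imported lazily, only on the cluster

    pipe = pipeline("text-generation", model=model_dir)
    return lambda prompt: pipe(prompt, max_new_tokens=128)[0]["generated_text"]


def score_files(paths: List[str], generate_fn: Callable[[str], str]) -> List[str]:
    """Read one prompt per file and return one CSV-formatted row per file."""
    rows = []
    for path in paths:
        with open(path, encoding="utf-8") as f:
            prompt = f.read().strip()
        buf = io.StringIO()
        csv.writer(buf).writerow([os.path.basename(path), generate_fn(prompt)])
        rows.append(buf.getvalue().rstrip("\r\n"))
    return rows


def init() -> None:
    """Called once per worker process before any mini-batch is scored."""
    global generate
    # Assumption: the model files are directly under AZUREML_MODEL_DIR;
    # depending on how the model was registered, a subfolder may be needed.
    generate = load_generator(os.environ["AZUREML_MODEL_DIR"])


def run(mini_batch: List[str]) -> List[str]:
    """Called once per mini-batch; each returned item becomes one output row
    when the deployment uses the APPEND_ROW output action."""
    return score_files(mini_batch, generate)
```

A script like this would be attached to the deployment through the code_configuration argument of ModelBatchDeployment (a CodeConfiguration pointing at the script's folder and file name), together with an environment that contains the required packages.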
