Hi,
I'm trying to deploy an LLM to a batch endpoint in Azure Machine Learning. I was able to create an online endpoint, but I'm running into issues with the batch endpoint.
The goal of the project is to deploy an LLM that runs every morning and produces an output. The output needs to be written to storage (Blob, SQL, etc.).
In the Model Catalog in the Azure ML Studio GUI, I don't see an option to deploy a model to a batch endpoint; the deployment is always made to an online endpoint. With the Python SDK, I get an error:
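from azure.ai.ml.constants import BatchDeploymentOutputAction
from azure.ai.ml.entities import (
    BatchRetrySettings,
    ModelBatchDeployment,
    ModelBatchDeploymentSettings,
)

# ml_client, endpoint_name and model_id are defined earlier in the notebook.
deployment = ModelBatchDeployment(
    name=f"{endpoint_name}-deployment",
    description="first deployment",
    model=model_id,
    compute="AmlCluster",
    endpoint_name=endpoint_name,
    settings=ModelBatchDeploymentSettings(
        instance_count=1,
        max_concurrency_per_instance=1,
        mini_batch_size=1,
        # Append each returned row to a single output file.
        output_action=BatchDeploymentOutputAction.APPEND_ROW,
        output_file_name="predictions.csv",
        retry_settings=BatchRetrySettings(max_retries=3, timeout=3000),
        logging_level="info",
    ),
)
# Blocks until the long-running operation finishes; this is the call that raises.
ml_client.batch_deployments.begin_create_or_update(deployment).result()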
HttpResponseError: Operation returned an invalid status 'OK' Content: { "id": "/subscriptions/.../providers/Microsoft.MachineLearningServices/locations/.../mfeOperationsStatus/bdbes:...", "name": "bdbes:...", "status": "Failed" }
*Note that I've replaced the hashes with three dots.
Below is the screenshot showing the failed deployment from the batch endpoint page.
I have a few questions:
- Is batch deployment possible in Azure Machine Learning with Model Catalog?
- Is batch deployment possible in Azure Machine Learning if I use the notebook to download a HuggingFace model and register it manually?
- Is there a better alternative for achieving this goal, e.g. using Azure Virtual Machines or similar?
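For the second question, my understanding is that a manually registered model needs a scoring script exposing `init()` and `run(mini_batch)`, which the deployment above does not provide. Below is a minimal sketch of what I assume such a script would look like. `load_generator` and `to_csv_row` are hypothetical helpers I made up for illustration, and the placeholder generator stands in for the real model-loading code (for example, a HuggingFace pipeline); I have not confirmed that this resolves the error.

```python
import csv
import io
import os
from typing import Callable, List, Optional


def load_generator(model_dir: str) -> Callable[[str], str]:
    """Hypothetical placeholder: swap in real model loading from model_dir."""
    return lambda prompt: f"[generated for] {prompt}"


def to_csv_row(*fields: str) -> str:
    """Hypothetical helper: format fields as one properly quoted CSV row."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="").writerow(fields)
    return buf.getvalue()


generator: Optional[Callable[[str], str]] = None


def init() -> None:
    """Called once per worker; load the model a single time here."""
    global generator
    # Azure ML sets AZUREML_MODEL_DIR to the registered model's folder;
    # the "." fallback is only an assumption for running this locally.
    model_dir = os.environ.get("AZUREML_MODEL_DIR", ".")
    generator = load_generator(model_dir)


def run(mini_batch: List[str]) -> List[str]:
    """Called per mini-batch; mini_batch is a list of input file paths."""
    rows = []
    for file_path in mini_batch:
        with open(file_path, encoding="utf-8") as f:
            prompt = f.read().strip()
        # One returned element per input file, appended as a row to the output.
        rows.append(to_csv_row(os.path.basename(file_path), generator(prompt)))
    return rows
```

If this is the right direction, I assume the script would be attached to the deployment through a code configuration together with an environment, and each input file would hold one prompt.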
Thanks!