Model Evaluations - Create
Evaluate an existing model.
Status codes returned:
- 201: Operation completed successfully.
- 400: The request was malformed.
- 409: An evaluation with the specified name already exists.
PUT /models/{name}/evaluations/{evaluationName}?api-version=2023-04-01-preview
URI Parameters
Name | In | Required | Type | Description |
---|---|---|---|---|
evaluationName | path | True | string | A name that can be used to uniquely identify the evaluation after it has been created. Regex pattern: |
name | path | True | string | The name of the model to evaluate. Regex pattern: |
api-version | query | True | string | Requested API version. |
Request Body
Media Types: "application/json-patch+json"
Name | Required | Type | Description |
---|---|---|---|
evaluationParameters | True | ModelEvaluationParameters | Parameters for specifying how a model is evaluated. |
createdDateTime | | string | Read only. The date and time when the evaluation run was first created, in UTC. |
error | | ErrorResponseDetails | Error info. |
modelName | | string | Read only. The model to evaluate. |
modelPerformance | | ModelPerformance | Performance metrics for a custom trained model. |
name | | string | Read only. The name that is used to uniquely identify the evaluation run. |
status | | ModelEvaluationState | Read only. The current state of the evaluation run. |
updatedDateTime | | string | Read only. The date and time when the evaluation run was last updated, in UTC. |
Responses
Name | Type | Description |
---|---|---|
201 Created | ModelEvaluation | Created |
Other Status Codes | ErrorResponse | Error. Headers: x-ms-error-code: string |
Examples
ModelEvaluations_Create
Sample request
PUT /models/my_model_name/evaluations/my_evaluation_name?api-version=2023-04-01-preview
{
"evaluationParameters": {
"testDatasetName": "my_test_dataset_name"
}
}
Sample response
{
"name": "my_evaluation_name",
"modelName": "my_model_name",
"createdDateTime": "2023-01-13T20:46:22.127Z",
"updatedDateTime": "2023-01-13T20:46:22.127Z",
"status": "notStarted",
"evaluationParameters": {
"testDatasetName": "my_test_dataset_name"
}
}
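The request above can be issued from Python with only the standard library. A minimal sketch: the endpoint host and the `Ocp-Apim-Subscription-Key` auth header are assumptions for illustration (they are not part of this reference page and will vary by deployment); the URL shape and JSON body follow the sample request.

```python
import json
import urllib.request

API_VERSION = "2023-04-01-preview"


def build_url(endpoint: str, model_name: str, evaluation_name: str) -> str:
    """Build the PUT URL for creating a model evaluation."""
    return (f"{endpoint}/models/{model_name}/evaluations/"
            f"{evaluation_name}?api-version={API_VERSION}")


def create_evaluation(url: str, api_key: str, test_dataset_name: str) -> dict:
    """PUT the evaluation and return the parsed response body (the
    ModelEvaluation object shown in the sample response)."""
    body = json.dumps(
        {"evaluationParameters": {"testDatasetName": test_dataset_name}}
    ).encode("utf-8")
    req = urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={
            "Content-Type": "application/json",
            # Auth header name is an assumption; consult your service's
            # authentication documentation.
            "Ocp-Apim-Subscription-Key": api_key,
        },
    )
    # 400 (malformed request) and 409 (name conflict) raise HTTPError here.
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


# Hypothetical host, mirroring the sample request's resource names:
url = build_url("https://YOUR-RESOURCE.example.com",
                "my_model_name", "my_evaluation_name")
```

On success the service returns 201 with the evaluation body; inspect `status` (initially `notStarted`) to track progress.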
Definitions
Name | Description |
---|---|
ErrorResponse | Response returned when an error occurs. |
ErrorResponseDetails | Error info. |
ErrorResponseInnerError | Detailed error. |
ModelEvaluation | Describes an evaluation run for evaluating the accuracy of a model using a test set. |
ModelEvaluationParameters | Parameters for specifying how a model is evaluated. |
ModelEvaluationState | Read only. The current state of the evaluation run. |
ModelPerformance | Performance metrics for a custom trained model. |
ModelTagPerformance | Performance metrics for each tag recognized by a custom trained model. |
ErrorResponse
Response returned when an error occurs.
Name | Type | Description |
---|---|---|
error | ErrorResponseDetails | Error info. |
ErrorResponseDetails
Error info.
Name | Type | Description |
---|---|---|
code | string | Error code. |
details | ErrorResponseDetails[] | List of detailed errors. |
innererror | ErrorResponseInnerError | Detailed error. |
message | string | Error message. |
target | string | Target of the error. |
ErrorResponseInnerError
Detailed error.
Name | Type | Description |
---|---|---|
code | string | Error code. |
innererror | ErrorResponseInnerError | Detailed error. |
message | string | Error message. |
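Because `innererror` nests recursively, the most specific error code is at the bottom of the chain. A small sketch of walking a parsed `ErrorResponseDetails` object to find it (the function name is ours, not part of this reference):

```python
def most_specific_error(error: dict) -> tuple:
    """Follow the innererror chain of an ErrorResponseDetails object and
    return the deepest (code, message) pair available."""
    code = error.get("code")
    message = error.get("message")
    inner = error.get("innererror")
    while inner:
        # Deeper levels override the outer values when present.
        code = inner.get("code", code)
        message = inner.get("message", message)
        inner = inner.get("innererror")
    return code, message
```

This is useful when surfacing a 400 or 409 response to users: the top-level `code` is often generic, while the innermost one names the actual problem.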
ModelEvaluation
Describes an evaluation run for evaluating the accuracy of a model using a test set.
Name | Type | Description |
---|---|---|
createdDateTime | string | Read only. The date and time when the evaluation run was first created, in UTC. |
error | ErrorResponseDetails | Error info. |
evaluationParameters | ModelEvaluationParameters | Parameters for specifying how a model is evaluated. |
modelName | string | Read only. The model to evaluate. |
modelPerformance | ModelPerformance | Performance metrics for a custom trained model. |
name | string | Read only. The name that is used to uniquely identify the evaluation run. |
status | ModelEvaluationState | Read only. The current state of the evaluation run. |
updatedDateTime | string | Read only. The date and time when the evaluation run was last updated, in UTC. |
ModelEvaluationParameters
Parameters for specifying how a model is evaluated.
Name | Type | Description |
---|---|---|
testDatasetName | string | The dataset name used for testing. |
ModelEvaluationState
Read only. The current state of the evaluation run.
Name | Type | Description |
---|---|---|
failed | string | |
notStarted | string | |
running | string | |
succeeded | string | |
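`notStarted` and `running` are transient, while `succeeded` and `failed` are terminal, so a client typically polls the evaluation until it reaches a terminal state. A hedged sketch: `get_evaluation` stands in for a hypothetical helper that GETs the evaluation resource and returns the parsed ModelEvaluation body (the retrieval operation itself is not documented on this page).

```python
import time

# Terminal values of ModelEvaluationState; the other two keep the run alive.
TERMINAL_STATES = {"succeeded", "failed"}


def wait_for_evaluation(get_evaluation, poll_interval_s=10.0, max_wait_s=3600.0):
    """Poll get_evaluation() until status is terminal; return the final body.

    get_evaluation: zero-argument callable returning the parsed
    ModelEvaluation JSON (hypothetical helper, not part of this reference).
    """
    deadline = time.monotonic() + max_wait_s
    while True:
        evaluation = get_evaluation()
        if evaluation["status"] in TERMINAL_STATES:
            return evaluation
        if time.monotonic() >= deadline:
            raise TimeoutError("evaluation did not reach a terminal state")
        time.sleep(poll_interval_s)
```

On `succeeded`, the returned body carries `modelPerformance`; on `failed`, inspect the `error` field instead.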
ModelPerformance
Performance metrics for a custom trained model.
Name | Type | Description |
---|---|---|
accuracyTop1 | number | Read only. For multiclass classification models. The proportion of test samples where the ground truth class matches the predicted class. |
accuracyTop5 | number | Read only. For multiclass classification models. The proportion of test samples where the ground truth class is in the top five predicted classes. |
averagePrecision | number | Read only. A measure of model performance that summarizes precision and recall at different confidence thresholds. |
calibrationECE | number | Read only. For multiclass classification models. Expected calibration error. |
meanAveragePrecision30 | number | Read only. For object detection models. Mean average precision at a threshold of 30%. |
meanAveragePrecision50 | number | Read only. For object detection models. Mean average precision at a threshold of 50%. |
meanAveragePrecision75 | number | Read only. For object detection models. Mean average precision at a threshold of 75%. |
tagPerformance | <string, ModelTagPerformance> | Read only. Performance metrics for each tag recognized by the model. |
ModelTagPerformance
Performance metrics for each tag recognized by a custom trained model.
Name | Type | Description |
---|---|---|
accuracy | number | Read only. For multiclass models. Tag accuracy. |
averagePrecision50 | number | Read only. For object detection models. Average precision at a threshold of 50%. |
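Since `ModelPerformance` mixes classification-only and detection-only fields and nests the per-tag map, a small helper to flatten whatever metrics are present can be handy when reporting results. A sketch over the parsed JSON (the function name and output format are ours):

```python
def summarize_performance(model_performance: dict) -> list:
    """Flatten a ModelPerformance object into printable "key: value" lines.

    Only fields that are present are reported: classification models carry
    the accuracy/calibration fields, object detection models the mAP fields.
    """
    lines = []
    for key in ("accuracyTop1", "accuracyTop5", "calibrationECE",
                "averagePrecision", "meanAveragePrecision30",
                "meanAveragePrecision50", "meanAveragePrecision75"):
        if key in model_performance:
            lines.append(f"{key}: {model_performance[key]:.4f}")
    # tagPerformance maps each tag name to a ModelTagPerformance object.
    for tag, perf in model_performance.get("tagPerformance", {}).items():
        for key in ("accuracy", "averagePrecision50"):
            if key in perf:
                lines.append(f"tag {tag} {key}: {perf[key]:.4f}")
    return lines
```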