Image Analysis - Analyze Image

Reference

Service: Azure AI Services

API Version: 2023-04-01-preview

Analyze the input image. The request contains either an image stream with a content type of 'image/*' or 'application/octet-stream', or a JSON payload with a url property that the service uses to retrieve the image.

POST /imageanalysis:analyze?api-version=2023-04-01-preview

With optional parameters:

POST /imageanalysis:analyze?features={features}&model-name={model-name}&language={language}&smartcrops-aspect-ratios={smartcrops-aspect-ratios}&gender-neutral-caption={gender-neutral-caption}&api-version=2023-04-01-preview

URI Parameters

Name	In	Required	Type	Description
api-version	query	True	string	Requested API version.
features	query		VisualFeature[]	The visual features requested: tags, objects, caption, denseCaptions, read, smartCrops, people. This parameter needs to be specified if the parameter "model-name" is not specified.
gender-neutral-caption	query		boolean	Boolean flag for enabling gender-neutral captioning for caption and denseCaptions features. If this parameter is not specified, the default value is "false".
language	query		string	The desired language for output generation. If this parameter is not specified, the default value is "en". See https://aka.ms/cv-languages for a list of supported languages.
model-name	query		string	The name of the custom trained model. This parameter needs to be specified if the parameter "features" is not specified.
smartcrops-aspect-ratios	query		string	A list of aspect ratios to use for smartCrops feature. Aspect ratios are calculated by dividing the target crop width by the height. Supported values are between 0.75 and 1.8 (inclusive). Multiple values should be comma-separated. If this parameter is not specified, the service will return one crop suggestion with an aspect ratio it sees fit between 0.5 and 2.0 (inclusive).
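
The query parameters above can be assembled programmatically. The following is a minimal Python sketch, not an official client: the function name `build_analyze_url` and the `base_url` argument are hypothetical, and joining multiple `features` values with commas is an assumption (the table only states comma separation for `smartcrops-aspect-ratios`). It enforces the two rules stated in the table: at least one of `features` or `model-name` must be present, and explicit aspect ratios must lie between 0.75 and 1.8.

```python
from urllib.parse import urlencode

API_VERSION = "2023-04-01-preview"


def build_analyze_url(base_url, features=None, model_name=None, language=None,
                      smartcrops_aspect_ratios=None, gender_neutral_caption=None):
    """Build the analyze URL. base_url is a hypothetical service endpoint."""
    # The table requires one of the two parameters to be specified.
    if not features and not model_name:
        raise ValueError("specify 'features' or 'model-name'")

    params = {}
    if features:
        # Assumption: multiple features are sent as one comma-separated value.
        params["features"] = ",".join(features)
    if model_name:
        params["model-name"] = model_name
    if language:
        params["language"] = language
    if smartcrops_aspect_ratios:
        for ratio in smartcrops_aspect_ratios:
            # Supported explicit values are 0.75 to 1.8, inclusive.
            if not 0.75 <= ratio <= 1.8:
                raise ValueError(f"aspect ratio out of range: {ratio}")
        params["smartcrops-aspect-ratios"] = ",".join(
            str(ratio) for ratio in smartcrops_aspect_ratios)
    if gender_neutral_caption is not None:
        params["gender-neutral-caption"] = "true" if gender_neutral_caption else "false"
    params["api-version"] = API_VERSION

    # safe="," keeps the comma separators readable instead of encoding them as %2C.
    query = urlencode(params, safe=",")
    return f"{base_url.rstrip('/')}/imageanalysis:analyze?{query}"
```

Calling `build_analyze_url("https://example.invalid", model_name="my_model_name")` yields a URL with the same path and query string as the custom model sample request in the Examples section.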

Request Body

Name	Required	Type	Description
url	True	string	Publicly reachable URL of an image.
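
The request accepts two body forms: a JSON document with a url property, or the raw image bytes. The sketch below shows one way to prepare either form in Python. The helper name `build_request` is hypothetical, the `application/json` content type for the JSON form is an assumption (the reference only lists 'image/*' and 'application/octet-stream' for the stream form), and authentication headers are left out because this page does not describe them.

```python
import json


def build_request(image_url=None, image_bytes=None):
    """Return (headers, body) for exactly one of the two request forms."""
    if (image_url is None) == (image_bytes is None):
        raise ValueError("provide exactly one of image_url or image_bytes")

    if image_url is not None:
        # JSON form: the service retrieves the image from the given URL.
        body = json.dumps({"url": image_url}).encode("utf-8")
        return {"Content-Type": "application/json"}, body

    # Stream form: the image bytes are sent directly as the request body.
    return {"Content-Type": "application/octet-stream"}, image_bytes
```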

Responses

Name	Type	Description
200 OK	ImageAnalysisResult	Success
Other Status Codes	ErrorResponse	Error. Headers: x-ms-error-code (string)

Examples

AnalyzeImage_CustomModel

Sample request

HTTP

POST /imageanalysis:analyze?model-name=my_model_name&api-version=2023-04-01-preview

{
  "url": "https://example.com/image.jpg"
}

Sample response

Status code: 200

{
  "customModelResult": {
    "objectsResult": {
      "values": [
        {
          "id": "1",
          "boundingBox": {
            "x": 197,
            "y": 68,
            "w": 356,
            "h": 394
          },
          "tags": [
            {
              "name": "class1",
              "confidence": 0.92431640625
            }
          ]
        },
        {
          "id": "2",
          "boundingBox": {
            "x": 0,
            "y": 77,
            "w": 241,
            "h": 359
          },
          "tags": [
            {
              "name": "class1",
              "confidence": 0.87890625
            }
          ]
        }
      ]
    }
  },
  "modelVersion": "2023-04-01-preview",
  "metadata": {
    "width": 660,
    "height": 495
  }
}
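
A response shaped like the sample above can be reduced to a short list of detections. The following Python sketch is illustrative only: the function name `detected_objects` and the `min_confidence` threshold are hypothetical, and the embedded JSON is a trimmed copy of the sample response.

```python
import json

# Trimmed copy of the sample response above.
SAMPLE_RESPONSE = """
{
  "customModelResult": {
    "objectsResult": {
      "values": [
        {"id": "1",
         "boundingBox": {"x": 197, "y": 68, "w": 356, "h": 394},
         "tags": [{"name": "class1", "confidence": 0.92431640625}]},
        {"id": "2",
         "boundingBox": {"x": 0, "y": 77, "w": 241, "h": 359},
         "tags": [{"name": "class1", "confidence": 0.87890625}]}
      ]
    }
  },
  "modelVersion": "2023-04-01-preview",
  "metadata": {"width": 660, "height": 495}
}
"""


def detected_objects(result, min_confidence=0.0):
    """Return (id, best tag name, confidence, boundingBox) per detected object."""
    values = (result.get("customModelResult", {})
                    .get("objectsResult", {})
                    .get("values", []))
    rows = []
    for obj in values:
        tags = obj.get("tags", [])
        if not tags:
            continue
        # Keep the highest-confidence classification for each object.
        best = max(tags, key=lambda tag: tag["confidence"])
        if best["confidence"] >= min_confidence:
            rows.append((obj["id"], best["name"], best["confidence"], obj["boundingBox"]))
    return rows


result = json.loads(SAMPLE_RESPONSE)
for object_id, name, confidence, box in detected_objects(result, min_confidence=0.9):
    print(object_id, name, confidence, box)
```

With a threshold of 0.9, only the first object in the sample passes the filter.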

Definitions

Name	Description
AdultMatch	An object describing adult content match.
AdultResult	An object describing whether the image contains adult-oriented content and/or is racy.
BoundingBox	A bounding box for an area inside an image.
CaptionResult	A brief description of what the image depicts.
CropRegion	A region identified for smart cropping. There will be one region returned for each requested aspect ratio.
DenseCaption	A brief description of what the image depicts.
DenseCaptionsResult	A list of captions.
DetectedObject	Describes a detected object in an image.
DetectedPerson	A person detected in an image.
DocumentLine	A content line object consisting of an adjacent sequence of content elements, such as words and selection marks.
DocumentPage	The content and layout elements extracted from a page from the input.
DocumentSpan	Contiguous region of the concatenated content property, specified as an offset and length.
DocumentStyle	An object representing observed text styles.
DocumentWord	A word object consisting of a contiguous sequence of characters. For non-space delimited languages, such as Chinese, Japanese, and Korean, each character is represented as its own word.
ErrorResponse	Response returned when an error occurs.
ErrorResponseDetails	Error info.
ErrorResponseInnerError	Detailed error.
ImageAnalysisResult	Describes the combined results of different types of image analysis.
ImageMetadataApiModel	The image metadata information such as height and width.
ImagePredictionResult	Describes the prediction result of an image.
ImageUrl	A JSON document with a URL pointing to the image that is to be analyzed.
ObjectsResult	Describes detected objects in an image.
PeopleResult	An object describing whether the image contains people.
ReadResult	The results of a Read operation.
SmartCropsResult	Smart cropping result.
Tag	An entity observation in the image, along with the confidence score.
TagsResult	A list of tags with confidence level.
VisualFeature	The visual features requested: tags, objects, caption, denseCaptions, read, smartCrops, people. This parameter needs to be specified if the parameter "model-name" is not specified.

AdultMatch

An object describing adult content match.

Name	Type	Description
confidence	number	A value indicating the confidence level of matched adult content.
isMatch	boolean	A value indicating if the image is matched adult content.

AdultResult

An object describing whether the image contains adult-oriented content and/or is racy.

Name	Type	Description
adult	AdultMatch	An object describing adult content match.
gore	AdultMatch	An object describing adult content match.
racy	AdultMatch	An object describing adult content match.

BoundingBox

A bounding box for an area inside an image.

Name	Type	Description
h	integer	Height measured from the top-left point of the area, in pixels.
w	integer	Width measured from the top-left point of the area, in pixels.
x	integer	Left-coordinate of the top left point of the area, in pixels.
y	integer	Top-coordinate of the top left point of the area, in pixels.
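
Because a BoundingBox stores a top-left point plus a width and height, deriving the opposite corner or testing overlap takes a small amount of arithmetic. The Python sketch below illustrates this with the two boxes from the sample response. The helper names `corners` and `intersection_area` are hypothetical and are not part of the API.

```python
def corners(box):
    """Return (left, top, right, bottom) in pixels for a BoundingBox dict."""
    return box["x"], box["y"], box["x"] + box["w"], box["y"] + box["h"]


def intersection_area(a, b):
    """Return the overlapping area of two boxes in square pixels (0 if disjoint)."""
    a_left, a_top, a_right, a_bottom = corners(a)
    b_left, b_top, b_right, b_bottom = corners(b)
    width = min(a_right, b_right) - max(a_left, b_left)
    height = min(a_bottom, b_bottom) - max(a_top, b_top)
    return max(width, 0) * max(height, 0)


first = {"x": 197, "y": 68, "w": 356, "h": 394}
second = {"x": 0, "y": 77, "w": 241, "h": 359}
print(corners(first))                    # (197, 68, 553, 462)
print(intersection_area(first, second))  # 15796
```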

CaptionResult

A brief description of what the image depicts.

Name	Type	Description
confidence	number	The level of confidence the service has in the caption.
text	string	The text of the caption.

CropRegion

A region identified for smart cropping. There will be one region returned for each requested aspect ratio.

Name	Type	Description
aspectRatio	number	The aspect ratio of the crop region.
boundingBox	BoundingBox	A bounding box for an area inside an image.

DenseCaption

A brief description of what the image depicts.

Name	Type	Description
boundingBox	BoundingBox	A bounding box for an area inside an image.
confidence	number	The level of confidence the service has in the caption.
text	string	The text of the caption.

DenseCaptionsResult

A list of captions.

Name	Type	Description
values	DenseCaption[]	A list of captions.

DetectedObject

Describes a detected object in an image.

Name	Type	Description
boundingBox	BoundingBox	A bounding box for an area inside an image.
id	string	Id of the detected object.
tags	Tag[]	Classification confidences of the detected object.

DetectedPerson

A person detected in an image.

Name	Type	Description
boundingBox	BoundingBox	A bounding box for an area inside an image.
confidence	number	Confidence score of having observed the person in the image, as a value ranging from 0 to 1.

DocumentLine

A content line object consisting of an adjacent sequence of content elements, such as words and selection marks.

Name	Type	Description
boundingBox	number[]	Bounding box of the line.
content	string	Concatenated content of the contained elements in reading order.
spans	DocumentSpan[]	Location of the line in the reading order concatenated content.

DocumentPage

The content and layout elements extracted from a page from the input.

Name	Type	Description
angle	number	The general orientation of the content in clockwise direction, measured in degrees between (-180, 180].
height	number	The height of the image/PDF in pixels/inches, respectively.
lines	DocumentLine[]	Extracted lines from the page, potentially containing both textual and visual elements.
pageNumber	integer	1-based page number in the input document.
spans	DocumentSpan[]	Location of the page in the reading order concatenated content.
width	number	The width of the image/PDF in pixels/inches, respectively.
words	DocumentWord[]	Extracted words from the page.

DocumentSpan

Contiguous region of the concatenated content property, specified as an offset and length.

Name	Type	Description
length	integer	Number of characters in the content represented by the span.
offset	integer	Zero-based index of the content represented by the span.
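
How a span maps onto the content string depends on the stringIndexType reported in ReadResult. The Python sketch below covers two of the listed values and is only an illustration: the function name `slice_span` is hypothetical, and 'textElements' is not handled because grapheme segmentation needs more than the standard library. Python strings are indexed by Unicode code point, so 'utf16CodeUnit' offsets are applied to a UTF-16 encoding of the content.

```python
def slice_span(content, span, string_index_type):
    """Return the part of content that a DocumentSpan refers to."""
    offset, length = span["offset"], span["length"]

    if string_index_type == "unicodeCodePoint":
        # Python str indices already count code points.
        return content[offset:offset + length]

    if string_index_type == "utf16CodeUnit":
        # Each UTF-16 code unit is two bytes in the little-endian encoding.
        encoded = content.encode("utf-16-le")
        return encoded[2 * offset:2 * (offset + length)].decode("utf-16-le")

    raise ValueError(f"unsupported stringIndexType: {string_index_type}")


text = "Hi \U0001F600 there"  # the emoji is 1 code point but 2 UTF-16 code units
print(slice_span(text, {"offset": 5, "length": 5}, "unicodeCodePoint"))  # there
print(slice_span(text, {"offset": 6, "length": 5}, "utf16CodeUnit"))     # there
```

The same word sits at a different offset under each index type, so mixing the two produces shifted or truncated text as soon as the content contains characters outside the Basic Multilingual Plane.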

DocumentStyle

An object representing observed text styles.

Name	Type	Description
confidence	number	Confidence of correctly identifying the style.
isHandwritten	boolean	Is content handwritten or not.
spans	DocumentSpan[]	Location of the text elements in the concatenated content the style applies to.

DocumentWord

A word object consisting of a contiguous sequence of characters. For non-space delimited languages, such as Chinese, Japanese, and Korean, each character is represented as its own word.

Name	Type	Description
boundingBox	number[]	Bounding box of the word.
confidence	number	Confidence of correctly extracting the word.
content	string	Text content of the word.
span	DocumentSpan	Contiguous region of the concatenated content property, specified as an offset and length.

ErrorResponse

Response returned when an error occurs.

Name	Type	Description
error	ErrorResponseDetails	Error info.

ErrorResponseDetails

Error info.

Name	Type	Description
code	string	Error code.
details	ErrorResponseDetails[]	List of detailed errors.
innererror	ErrorResponseInnerError	Detailed error.
message	string	Error message.
target	string	Target of the error.

ErrorResponseInnerError

Detailed error.

Name	Type	Description
code	string	Error code.
innererror	ErrorResponseInnerError	Detailed error.
message	string	Error message.
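
Since innererror nests recursively, a client usually needs to walk the chain to reach the most specific code. A minimal Python sketch follows. The function name `error_chain` is hypothetical, and the codes and messages in the example payload are invented for illustration rather than taken from the service.

```python
def error_chain(error_response):
    """Return [(code, message), ...] from the outermost error to the innermost."""
    details = error_response.get("error", {})
    chain = [(details.get("code"), details.get("message"))]
    inner = details.get("innererror")
    while inner:
        chain.append((inner.get("code"), inner.get("message")))
        inner = inner.get("innererror")
    return chain


# Hypothetical payload shaped like ErrorResponse; the values are made up.
example = {
    "error": {
        "code": "ExampleOuterCode",
        "message": "Outer message.",
        "innererror": {
            "code": "ExampleInnerCode",
            "message": "Inner message.",
        },
    }
}
for code, message in error_chain(example):
    print(code, message)
```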

ImageAnalysisResult

Describes the combined results of different types of image analysis.

Name	Type	Description
adultResult	AdultResult	An object describing whether the image contains adult-oriented content and/or is racy.
captionResult	CaptionResult	A brief description of what the image depicts.
customModelResult	ImagePredictionResult	Describes the prediction result of an image.
denseCaptionsResult	DenseCaptionsResult	A list of captions.
metadata	ImageMetadataApiModel	The image metadata information such as height and width.
modelVersion	string	Model Version.
objectsResult	ObjectsResult	Describes detected objects in an image.
peopleResult	PeopleResult	An object describing whether the image contains people.
readResult	ReadResult	The results of a Read operation.
smartCropsResult	SmartCropsResult	Smart cropping result.
tagsResult	TagsResult	A list of tags with confidence level.

ImageMetadataApiModel

The image metadata information such as height and width.

Name	Type	Description
height	integer	The height of the image in pixels.
width	integer	The width of the image in pixels.

ImagePredictionResult

Describes the prediction result of an image.

Name	Type	Description
objectsResult	ObjectsResult	Describes detected objects in an image.
tagsResult	TagsResult	A list of tags with confidence level.

ImageUrl

A JSON document with a URL pointing to the image that is to be analyzed.

Name	Type	Description
url	string	Publicly reachable URL of an image.

ObjectsResult

Describes detected objects in an image.

Name	Type	Description
values	DetectedObject[]	An array of detected objects.

PeopleResult

An object describing whether the image contains people.

Name	Type	Description
values	DetectedPerson[]	An array of detected people.

ReadResult

The results of a Read operation.

Name	Type	Description
content	string	Concatenated string representation of all textual and visual elements in reading order.
pages	DocumentPage[]	A list of analyzed pages.
stringIndexType	string	The method used to compute string offset and length, possible values include: 'textElements', 'unicodeCodePoint', 'utf16CodeUnit' etc.
styles	DocumentStyle[]	Extracted font styles.

SmartCropsResult

Smart cropping result.

Name	Type	Description
values	CropRegion[]	Recommended regions for cropping the image.

Tag

An entity observation in the image, along with the confidence score.

Name	Type	Description
confidence	number	The level of confidence that the entity was observed.
name	string	Name of the entity.

TagsResult

A list of tags with confidence level.

Name	Type	Description
values	Tag[]	A list of tags with confidence level.

VisualFeature

The visual features requested: tags, objects, caption, denseCaptions, read, smartCrops, people. This parameter needs to be specified if the parameter "model-name" is not specified.

Name	Type	Description
caption	string
denseCaptions	string
objects	string
people	string
read	string
smartCrops	string
tags	string
