Analyze Image - Analyze Image

Service: Azure AI Services

API Version: 3.2

This operation extracts a rich set of visual features based on the image content. Two input methods are supported: (1) uploading an image or (2) specifying an image URL. An optional request parameter lets you choose which features to return. By default, image categories are returned in the response. A successful response is returned in JSON. If the request fails, the response contains an error code and a message to help you understand what went wrong.

POST {Endpoint}/vision/v3.2/analyze

With optional parameters:

POST {Endpoint}/vision/v3.2/analyze?visualFeatures={visualFeatures}&details={details}&language={language}&descriptionExclude={descriptionExclude}&model-version={model-version}
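The optional-parameter form of the URI can be assembled programmatically. The following is a minimal sketch, not an official client: the helper name `build_analyze_url` and its keyword arguments are hypothetical, and only the path and query parameter names come from the URI template above. Parameters that are not supplied are left out so the service defaults apply.

```python
from urllib.parse import urlencode


def build_analyze_url(endpoint, visual_features=None, details=None,
                      language=None, description_exclude=None,
                      model_version=None):
    """Return the analyze URL, adding only the optional parameters that were set."""
    query = {}
    if visual_features:
        # Multiple values are sent as one comma-separated string.
        query["visualFeatures"] = ",".join(visual_features)
    if details:
        query["details"] = ",".join(details)
    if language:
        query["language"] = language
    if description_exclude:
        query["descriptionExclude"] = ",".join(description_exclude)
    if model_version:
        query["model-version"] = model_version

    base = endpoint.rstrip("/") + "/vision/v3.2/analyze"
    # safe="," keeps the separators as literal commas, as in the sample request.
    return base + ("?" + urlencode(query, safe=",") if query else "")


url = build_analyze_url(
    "https://westus.api.cognitive.microsoft.com",
    visual_features=["Categories", "Tags"],
    details=["Landmarks"],
    language="en",
)
print(url)
```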

URI Parameters

Name	In	Required	Type	Description
Endpoint	path	True	string	Supported Cognitive Services endpoints.
descriptionExclude	query		DescriptionExclude[]	Turn off specified domain models when generating the description.
details	query		Details[]	A string indicating which domain-specific details to return. Multiple values should be comma-separated. Valid detail types include: Celebrities - identifies celebrities if detected in the image, Landmarks - identifies notable landmarks in the image.
language	query		string	The desired language for output generation. If this parameter is not specified, the default value is "en". See https://aka.ms/cv-languages for list of supported languages.
model-version	query		string	Optional parameter to specify the version of the AI model. Accepted values are: "latest", "2021-04-01", "2021-05-01". Defaults to "latest". Regex pattern: `^(latest|\d{4}-\d{2}-\d{2})(-preview)?$`
visualFeatures	query		VisualFeatureTypes[]	A string indicating what visual feature types to return. Multiple values should be comma-separated. Valid visual feature types include: Categories - categorizes image content according to a taxonomy defined in documentation. Tags - tags the image with a detailed list of words related to the image content. Description - describes the image content with a complete English sentence. Faces - detects if faces are present. If present, generates coordinates, gender and age. ImageType - detects if image is clipart or a line drawing. Color - determines the accent color, dominant color, and whether an image is black and white. Adult - detects if the image is pornographic in nature (depicts nudity or a sex act), or is gory (depicts extreme violence or blood). Sexually suggestive content (aka racy content) is also detected. Objects - detects various objects within an image, including the approximate location. The Objects argument is only available in English. Brands - detects various brands within an image, including the approximate location. The Brands argument is only available in English.
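The regex pattern given for model-version can be checked on the client before a request is sent. This is a small sketch under that assumption; the function name `is_valid_model_version` is hypothetical. Note that the pattern only checks the format of the string: a value can match it and still not be one of the accepted model versions.

```python
import re

# Pattern from the model-version row: "latest" or a YYYY-MM-DD date,
# optionally followed by "-preview".
MODEL_VERSION_PATTERN = re.compile(r"(latest|\d{4}-\d{2}-\d{2})(-preview)?")


def is_valid_model_version(value):
    """Return True if the whole string has the documented format."""
    # fullmatch anchors at both ends, so the ^ and $ of the pattern are implied.
    return MODEL_VERSION_PATTERN.fullmatch(value) is not None


for candidate in ("latest", "2021-04-01", "2021-05-01-preview", "v3.2"):
    print(candidate, is_valid_model_version(candidate))
```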

Request Header

Name	Required	Type	Description
Ocp-Apim-Subscription-Key	True	string

Request Body

Name	Required	Type	Description
url	True	string	Publicly reachable URL of an image.
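Putting the header and body together, a request object could be prepared as sketched below using only the Python standard library. The function name `make_analyze_request`, the placeholder key, and the example image URL are hypothetical. The `Content-Type: application/json` header is an assumption for a JSON body and is not listed in the Request Header table. The request is built but not sent.

```python
import json
from urllib.request import Request


def make_analyze_request(analyze_url, subscription_key, image_url):
    """Build (but do not send) a POST request carrying the image URL as JSON."""
    body = json.dumps({"url": image_url}).encode("utf-8")
    return Request(
        analyze_url,
        data=body,
        method="POST",
        headers={
            "Ocp-Apim-Subscription-Key": subscription_key,
            # Assumption: the body is JSON, so it is labeled as such.
            "Content-Type": "application/json",
        },
    )


req = make_analyze_request(
    "https://westus.api.cognitive.microsoft.com/vision/v3.2/analyze",
    "<your-subscription-key>",          # placeholder, not a real key
    "https://example.com/image.jpg",    # hypothetical image URL
)
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it; that step is left out here because it needs a valid key and endpoint.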

Responses

Name	Type	Description
200 OK	ImageAnalysis	The response includes the extracted features in JSON format. Here are the definitions for the enumeration types: ClipartType: non-clipart = 0, ambiguous = 1, normal-clipart = 2, good-clipart = 3. LineDrawingType: non-LineDrawing = 0, LineDrawing = 1.
Other Status Codes	ComputerVisionErrorResponse	Error response.
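The integer codes listed for ClipartType and LineDrawingType in the 200 OK description can be mapped to named values on the client. This is an illustrative sketch; the Python class and member names are hypothetical, and only the numeric values and their meanings come from the description above.

```python
from enum import IntEnum


class ClipartType(IntEnum):
    NON_CLIPART = 0
    AMBIGUOUS = 1
    NORMAL_CLIPART = 2
    GOOD_CLIPART = 3


class LineDrawingType(IntEnum):
    NON_LINE_DRAWING = 0
    LINE_DRAWING = 1


# The "imageType" object as it appears in the sample response.
image_type = {"clipArtType": 0, "lineDrawingType": 0}

clip = ClipartType(image_type["clipArtType"])
line = LineDrawingType(image_type["lineDrawingType"])
print(clip.name, line.name)
```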

Security

Ocp-Apim-Subscription-Key

Type: apiKey
In: header

Examples

Successful AnalyzeImage request

Sample request

HTTP

POST https://westus.api.cognitive.microsoft.com/vision/v3.2/analyze?visualFeatures=Categories,Adult,Tags,Description,Faces,Color,ImageType,Objects,Brands&details=Celebrities,Landmarks&language=en


{
  "url": "{url}"
}

Sample response

Status code: 200

{
  "categories": [
    {
      "name": "abstract_",
      "score": 0.00390625
    },
    {
      "name": "people_",
      "score": 0.83984375,
      "detail": {
        "celebrities": [
          {
            "name": "Satya Nadella",
            "faceRectangle": {
              "left": 597,
              "top": 162,
              "width": 248,
              "height": 248
            },
            "confidence": 0.999028444
          }
        ]
      }
    },
    {
      "name": "building_",
      "score": 0.984375,
      "detail": {
        "landmarks": [
          {
            "name": "Forbidden City",
            "confidence": 0.9829016923904419
          }
        ]
      }
    }
  ],
  "adult": {
    "isAdultContent": false,
    "isRacyContent": false,
    "isGoryContent": false,
    "adultScore": 0.0934349000453949,
    "racyScore": 0.06861349195241928,
    "goreScore": 0.012872257380997575
  },
  "tags": [
    {
      "name": "person",
      "confidence": 0.9897908568382263
    },
    {
      "name": "man",
      "confidence": 0.9449388980865479
    },
    {
      "name": "outdoor",
      "confidence": 0.938492476940155
    },
    {
      "name": "window",
      "confidence": 0.8951393961906433
    },
    {
      "name": "pangolin",
      "confidence": 0.7250059783791661,
      "hint": "mammal"
    }
  ],
  "description": {
    "tags": [
      "person",
      "man",
      "outdoor",
      "window",
      "glasses"
    ],
    "captions": [
      {
        "text": "Satya Nadella sitting on a bench",
        "confidence": 0.48293603002174407
      }
    ]
  },
  "requestId": "0dbec5ad-a3d3-4f7e-96b4-dfd57efe967d",
  "metadata": {
    "width": 1500,
    "height": 1000,
    "format": "Jpeg"
  },
  "modelVersion": "2021-04-01",
  "faces": [
    {
      "age": 44,
      "gender": "Male",
      "faceRectangle": {
        "left": 593,
        "top": 160,
        "width": 250,
        "height": 250
      }
    }
  ],
  "color": {
    "dominantColorForeground": "Brown",
    "dominantColorBackground": "Brown",
    "dominantColors": [
      "Brown",
      "Black"
    ],
    "accentColor": "873B59",
    "isBWImg": false
  },
  "imageType": {
    "clipArtType": 0,
    "lineDrawingType": 0
  },
  "objects": [
    {
      "rectangle": {
        "x": 0,
        "y": 0,
        "w": 50,
        "h": 50
      },
      "object": "tree",
      "confidence": 0.9,
      "parent": {
        "object": "plant",
        "confidence": 0.95
      }
    }
  ],
  "brands": [
    {
      "name": "Pepsi",
      "confidence": 0.857,
      "rectangle": {
        "x": 489,
        "y": 79,
        "w": 161,
        "h": 177
      }
    },
    {
      "name": "Coca-Cola",
      "confidence": 0.893,
      "rectangle": {
        "x": 216,
        "y": 55,
        "w": 171,
        "h": 372
      }
    }
  ]
}
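To show how a client might read a response of this shape, the sketch below parses a trimmed copy of the sample response and pulls out a few fields. The helper names (`top_category`, `landmark_names`, `tag_names`, `best_caption`) and the 0.9 threshold are hypothetical choices for illustration; the sketch also assumes that fields such as `detail` are simply absent when nothing was identified, as in the sample.

```python
import json

# Trimmed copy of the sample response, keeping only the fields used below.
SAMPLE = json.loads("""
{
  "categories": [
    {"name": "abstract_", "score": 0.00390625},
    {"name": "building_", "score": 0.984375,
     "detail": {"landmarks": [
       {"name": "Forbidden City", "confidence": 0.9829016923904419}]}}
  ],
  "tags": [
    {"name": "outdoor", "confidence": 0.938492476940155},
    {"name": "pangolin", "confidence": 0.7250059783791661, "hint": "mammal"}
  ],
  "description": {
    "captions": [
      {"text": "Satya Nadella sitting on a bench",
       "confidence": 0.48293603002174407}]
  }
}
""")


def top_category(analysis):
    """Name of the highest-scoring category, or None if there are none."""
    best = max(analysis.get("categories", []),
               key=lambda c: c["score"], default=None)
    return best["name"] if best else None


def landmark_names(analysis):
    """Collect landmark names from the optional 'detail' of each category."""
    return [lm["name"]
            for cat in analysis.get("categories", [])
            for lm in cat.get("detail", {}).get("landmarks", [])]


def tag_names(analysis, min_confidence=0.9):
    """Tags whose confidence is at least min_confidence."""
    return [t["name"] for t in analysis.get("tags", [])
            if t["confidence"] >= min_confidence]


def best_caption(analysis):
    """Text of the most confident caption, or None if there are none."""
    captions = analysis.get("description", {}).get("captions", [])
    best = max(captions, key=lambda c: c["confidence"], default=None)
    return best["text"] if best else None


print(top_category(SAMPLE), landmark_names(SAMPLE), tag_names(SAMPLE))
```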

Definitions

Name	Description
AdultInfo	An object describing whether the image contains adult-oriented content and/or is racy.
BoundingRect	A bounding box for an area inside an image.
Category	An object describing identified category.
CategoryDetail	An object describing additional category details.
CelebritiesModel	An object describing possible celebrity identification.
ColorInfo	An object providing additional metadata describing color attributes.
ComputerVisionError	The API request error.
ComputerVisionErrorCodes	The error code.
ComputerVisionErrorResponse	The API error response.
ComputerVisionInnerError	Details about the API request error.
ComputerVisionInnerErrorCodeValue	The error code.
DescriptionExclude	Turn off specified domain models when generating the description.
Details	A string indicating which domain-specific details to return. Multiple values should be comma-separated. Valid detail types include: Celebrities - identifies celebrities if detected in the image, Landmarks - identifies notable landmarks in the image.
DetectedBrand	A brand detected in an image.
DetectedObject	An object detected in an image.
FaceDescription	An object describing a face identified in the image.
FaceRectangle	An object describing face rectangle.
Gender	Possible gender of the face.
ImageAnalysis	Result of AnalyzeImage operation.
ImageCaption	An image caption, i.e. a brief description of what the image depicts.
ImageDescriptionDetails	A collection of content tags, along with a list of captions sorted by confidence level, and image metadata.
ImageMetadata	Image metadata.
ImageTag	An entity observation in the image, along with the confidence score.
ImageType	An object providing possible image types and matching confidence levels.
ImageUrl
LandmarksModel	A landmark recognized in the image.
ObjectHierarchy	An object detected inside an image.
VisualFeatureTypes	A string indicating what visual feature types to return. Multiple values should be comma-separated. Valid visual feature types include: Categories - categorizes image content according to a taxonomy defined in documentation. Tags - tags the image with a detailed list of words related to the image content. Description - describes the image content with a complete English sentence. Faces - detects if faces are present. If present, generates coordinates, gender and age. ImageType - detects if image is clipart or a line drawing. Color - determines the accent color, dominant color, and whether an image is black and white. Adult - detects if the image is pornographic in nature (depicts nudity or a sex act), or is gory (depicts extreme violence or blood). Sexually suggestive content (aka racy content) is also detected. Objects - detects various objects within an image, including the approximate location. The Objects argument is only available in English. Brands - detects various brands within an image, including the approximate location. The Brands argument is only available in English.

AdultInfo

An object describing whether the image contains adult-oriented content and/or is racy.

Name	Type	Description
adultScore	number	Score from 0 to 1 that indicates how much the content is considered adult-oriented within the image.
goreScore	number	Score from 0 to 1 that indicates how gory the image is.
isAdultContent	boolean	A value indicating if the image contains adult-oriented content.
isGoryContent	boolean	A value indicating if the image is gory.
isRacyContent	boolean	A value indicating if the image is racy.
racyScore	number	Score from 0 to 1 that indicates how suggestive the image is.

BoundingRect

A bounding box for an area inside an image.

Name	Type	Description
h	integer	Height measured from the top-left point of the area, in pixels.
w	integer	Width measured from the top-left point of the area, in pixels.
x	integer	X-coordinate of the top left point of the area, in pixels.
y	integer	Y-coordinate of the top left point of the area, in pixels.
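Since a BoundingRect stores the top-left point plus a width and height, drawing or cropping code often needs the opposite corner as well. A minimal sketch of that conversion follows; the function name `rect_to_corners` is hypothetical, and the rectangle is the one reported for the first brand in the sample response.

```python
def rect_to_corners(rect):
    """Convert {x, y, w, h} to a (left, top, right, bottom) tuple in pixels."""
    left, top = rect["x"], rect["y"]
    return (left, top, left + rect["w"], top + rect["h"])


brand_rect = {"x": 489, "y": 79, "w": 161, "h": 177}
print(rect_to_corners(brand_rect))  # (489, 79, 650, 256)
```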

Category

An object describing identified category.

Name	Type	Description
detail	CategoryDetail	Details of the identified category.
name	string	Name of the category.
score	number	Scoring of the category.

CategoryDetail

An object describing additional category details.

Name	Type	Description
celebrities	CelebritiesModel[]	An array of celebrities if any identified.
landmarks	LandmarksModel[]	An array of landmarks if any identified.

CelebritiesModel

An object describing possible celebrity identification.

Name	Type	Description
confidence	number	Confidence level for the celebrity recognition as a value ranging from 0 to 1.
faceRectangle	FaceRectangle	Location of the identified face in the image.
name	string	Name of the celebrity.

ColorInfo

An object providing additional metadata describing color attributes.

Name	Type	Description
accentColor	string	Possible accent color.
dominantColorBackground	string	Possible dominant background color.
dominantColorForeground	string	Possible dominant foreground color.
dominantColors	string[]	An array of possible dominant colors.
isBWImg	boolean	A value indicating if the image is black and white.
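In the sample response, accentColor is a six-digit hexadecimal string ("873B59") without a leading "#". Assuming that format holds in general, which this reference does not state, it could be converted to an RGB triple as sketched here. The function name `hex_to_rgb` is hypothetical.

```python
def hex_to_rgb(accent_color):
    """Convert a six-digit hex color string to an (r, g, b) tuple of ints."""
    value = accent_color.lstrip("#")  # tolerate an optional leading '#'
    if len(value) != 6:
        raise ValueError(f"expected six hex digits, got {accent_color!r}")
    return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))


print(hex_to_rgb("873B59"))  # (135, 59, 89)
```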

ComputerVisionError

The API request error.

Name	Type	Description
code	ComputerVisionErrorCodes	The error code.
innererror	ComputerVisionInnerError	Inner error contains more specific information.
message	string	A message explaining the error reported by the service.

ComputerVisionErrorCodes

The error code.

Name	Type	Description
InternalServerError	string
InvalidArgument	string
InvalidRequest	string
ServiceUnavailable	string

ComputerVisionErrorResponse

The API error response.

Name	Type	Description
error	ComputerVisionError	Error contents.

ComputerVisionInnerError

Details about the API request error.

Name	Type	Description
code	ComputerVisionInnerErrorCodeValue	The error code.
message	string	Error message.
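The three error types nest as ComputerVisionErrorResponse, then ComputerVisionError, then ComputerVisionInnerError. The sketch below flattens such a body into a single dictionary. The function name `summarize_error` is hypothetical, and the example body is made up for illustration: its code values are taken from the enumerations in this reference, but the message strings are placeholders, not real service output.

```python
import json


def summarize_error(body):
    """Flatten an error response body (JSON text) into one dictionary."""
    error = json.loads(body).get("error", {})
    # innererror may be missing, so fall back to an empty object.
    inner = error.get("innererror") or {}
    return {
        "code": error.get("code"),
        "message": error.get("message"),
        "inner_code": inner.get("code"),
        "inner_message": inner.get("message"),
    }


# Hypothetical body, built only from the field names defined above.
example_body = json.dumps({
    "error": {
        "code": "InvalidRequest",
        "message": "placeholder outer message",
        "innererror": {
            "code": "InvalidImageUrl",
            "message": "placeholder inner message",
        },
    }
})

summary = summarize_error(example_body)
print(summary["code"], summary["inner_code"])
```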

ComputerVisionInnerErrorCodeValue

The error code.

Name	Type	Description
BadArgument	string
CancelledRequest	string
DetectFaceError	string
FailedToProcess	string
InternalServerError	string
InvalidDetails	string
InvalidImageFormat	string
InvalidImageSize	string
InvalidImageUrl	string
InvalidModel	string
InvalidThumbnailSize	string
NotSupportedFeature	string
NotSupportedImage	string
NotSupportedLanguage	string
NotSupportedVisualFeature	string
StorageException	string
Timeout	string
Unspecified	string
UnsupportedMediaType	string

DescriptionExclude

Turn off specified domain models when generating the description.

Name	Type	Description
Celebrities	string
Landmarks	string

Details

A string indicating which domain-specific details to return. Multiple values should be comma-separated. Valid detail types include: Celebrities - identifies celebrities if detected in the image, Landmarks - identifies notable landmarks in the image.

Name	Type	Description
Celebrities	string
Landmarks	string

DetectedBrand

A brand detected in an image.

Name	Type	Description
confidence	number	Confidence score of having observed the brand in the image, as a value ranging from 0 to 1.
name	string	Label for the brand.
rectangle	BoundingRect	Approximate location of the detected brand.

DetectedObject

An object detected in an image.

Name	Type	Description
confidence	number	Confidence score of having observed the object in the image, as a value ranging from 0 to 1.
object	string	Label for the object.
parent	ObjectHierarchy	The parent object, from a taxonomy perspective. The parent object is a more generic form of this object. For example, a 'bulldog' would have a parent of 'dog'.
rectangle	BoundingRect	Approximate location of the detected object.

FaceDescription

An object describing a face identified in the image.

Name	Type	Description
age	integer	Possible age of the face.
faceRectangle	FaceRectangle	Rectangle in the image containing the identified face.
gender	Gender	Possible gender of the face.

FaceRectangle

An object describing face rectangle.

Name	Type	Description
height	integer	Height measured from the top-left point of the face, in pixels.
left	integer	X-coordinate of the top left point of the face, in pixels.
top	integer	Y-coordinate of the top left point of the face, in pixels.
width	integer	Width measured from the top-left point of the face, in pixels.

Gender

Possible gender of the face.

Name	Type	Description
Female	string
Male	string

ImageAnalysis

Result of AnalyzeImage operation.

Name	Type	Description
adult	AdultInfo	An object describing whether the image contains adult-oriented content and/or is racy.
brands	DetectedBrand[]	Array of brands detected in the image.
categories	Category[]	An array indicating identified categories.
color	ColorInfo	An object providing additional metadata describing color attributes.
description	ImageDescriptionDetails	A collection of content tags, along with a list of captions sorted by confidence level, and image metadata.
faces	FaceDescription[]	An array of possible faces within the image.
imageType	ImageType	An object providing possible image types and matching confidence levels.
metadata	ImageMetadata	Image metadata.
modelVersion	string	Version of the AI model.
objects	DetectedObject[]	Array of objects describing what was detected in the image.
requestId	string	Id of the REST API request.
tags	ImageTag[]	A list of tags with confidence level.

ImageCaption

An image caption, i.e. a brief description of what the image depicts.

Name	Type	Description
confidence	number	The level of confidence the service has in the caption.
text	string	The text of the caption.

ImageDescriptionDetails

A collection of content tags, along with a list of captions sorted by confidence level, and image metadata.

Name	Type	Description
captions	ImageCaption[]	A list of captions, sorted by confidence level.
tags	string[]	A collection of image tags.

ImageMetadata

Image metadata.

Name	Type	Description
format	string	Image format.
height	integer	Image height, in pixels.
width	integer	Image width, in pixels.

ImageTag

An entity observation in the image, along with the confidence score.

Name	Type	Description
confidence	number	The level of confidence that the entity was observed.
hint	string	Optional hint/details for this tag.
name	string	Name of the entity.

ImageType

An object providing possible image types and matching confidence levels.

Name	Type	Description
clipArtType	integer	Confidence level that the image is a clip art.
lineDrawingType	integer	Confidence level that the image is a line drawing.

ImageUrl

Name	Type	Description
url	string	Publicly reachable URL of an image.

LandmarksModel

A landmark recognized in the image.

Name	Type	Description
confidence	number	Confidence level for the landmark recognition as a value ranging from 0 to 1.
name	string	Name of the landmark.

ObjectHierarchy

An object detected inside an image.

Name	Type	Description
confidence	number	Confidence score of having observed the object in the image, as a value ranging from 0 to 1.
object	string	Label for the object.
parent	ObjectHierarchy	The parent object, from a taxonomy perspective. The parent object is a more generic form of this object. For example, a 'bulldog' would have a parent of 'dog'.
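Because each ObjectHierarchy can itself carry a parent, a detected object forms a chain from the most specific label to the most generic one. The following sketch walks that chain; the function name `object_lineage` is hypothetical, and the input is the single object from the sample response.

```python
def object_lineage(detected):
    """Return labels from the detected object up through its parents."""
    labels = []
    node = detected
    while node is not None:
        labels.append(node["object"])
        node = node.get("parent")  # None once the top of the chain is reached
    return labels


detected_object = {
    "rectangle": {"x": 0, "y": 0, "w": 50, "h": 50},
    "object": "tree",
    "confidence": 0.9,
    "parent": {"object": "plant", "confidence": 0.95},
}
print(object_lineage(detected_object))  # ['tree', 'plant']
```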

VisualFeatureTypes

A string indicating what visual feature types to return. Multiple values should be comma-separated. Valid visual feature types include: Categories - categorizes image content according to a taxonomy defined in documentation. Tags - tags the image with a detailed list of words related to the image content. Description - describes the image content with a complete English sentence. Faces - detects if faces are present. If present, generates coordinates, gender and age. ImageType - detects if image is clipart or a line drawing. Color - determines the accent color, dominant color, and whether an image is black and white. Adult - detects if the image is pornographic in nature (depicts nudity or a sex act), or is gory (depicts extreme violence or blood). Sexually suggestive content (aka racy content) is also detected. Objects - detects various objects within an image, including the approximate location. The Objects argument is only available in English. Brands - detects various brands within an image, including the approximate location. The Brands argument is only available in English.

Name	Type	Description
Adult	string
Brands	string
Categories	string
Color	string
Description	string
Faces	string
ImageType	string
Objects	string
Tags	string
