Analyze Image
This operation extracts a rich set of visual features from the image content. Two input methods are supported: (1) uploading an image or (2) specifying an image URL. An optional request parameter lets you choose which features to return; by default, image categories are returned in the response. A successful response is returned in JSON. If the request fails, the response contains an error code and a message to help you understand what went wrong.
POST {Endpoint}/vision/v3.2/analyze
POST {Endpoint}/vision/v3.2/analyze?visualFeatures={visualFeatures}&details={details}&language={language}&descriptionExclude={descriptionExclude}&model-version={model-version}
URI Parameters
Name | In | Required | Type | Description |
---|---|---|---|---|
Endpoint | path | True | string | Supported Cognitive Services endpoints. |
descriptionExclude | query | | DescriptionExclude[] | Turn off specified domain models when generating the description. |
details | query | | Details[] | A string indicating which domain-specific details to return. Multiple values should be comma-separated. Valid detail types include: Celebrities - identifies celebrities if detected in the image, Landmarks - identifies notable landmarks in the image. |
language | query | | string | The desired language for output generation. If this parameter is not specified, the default value is "en". See https://aka.ms/cv-languages for the list of supported languages. |
model-version | query | | string | Optional parameter to specify the version of the AI model. Accepted values are: "latest", "2021-04-01", "2021-05-01". Defaults to "latest". Regex pattern: |
visualFeatures | query | | VisualFeatureTypes[] | A string indicating what visual feature types to return. Multiple values should be comma-separated. Valid visual feature types include: Categories - categorizes image content according to a taxonomy defined in documentation. Tags - tags the image with a detailed list of words related to the image content. Description - describes the image content with a complete English sentence. Faces - detects if faces are present. If present, generates coordinates, gender and age. ImageType - detects if image is clipart or a line drawing. Color - determines the accent color, dominant color, and whether an image is black and white. Adult - detects if the image is pornographic in nature (depicts nudity or a sex act), or is gory (depicts extreme violence or blood). Sexually suggestive content (aka racy content) is also detected. Objects - detects various objects within an image, including the approximate location. The Objects argument is only available in English. Brands - detects various brands within an image, including the approximate location. The Brands argument is only available in English. |
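As a rough illustration of how the query parameters above combine, the following sketch assembles the request URL with Python's standard library. The function name `build_analyze_url` and its keyword arguments are hypothetical helpers, not part of the service or any SDK; only the path and the query parameter names come from this page.

```python
from urllib.parse import urlencode


def build_analyze_url(endpoint, visual_features=None, details=None,
                      language=None, description_exclude=None,
                      model_version=None):
    """Assemble the analyze URL (hypothetical helper, not an SDK call).

    List-valued parameters are joined with commas, as the table describes.
    """
    params = {}
    if visual_features:
        params["visualFeatures"] = ",".join(visual_features)
    if details:
        params["details"] = ",".join(details)
    if language:
        params["language"] = language
    if description_exclude:
        params["descriptionExclude"] = ",".join(description_exclude)
    if model_version:
        params["model-version"] = model_version
    url = endpoint.rstrip("/") + "/vision/v3.2/analyze"
    # safe="," keeps the comma separators literal instead of encoding them as %2C.
    return url + ("?" + urlencode(params, safe=",") if params else "")
```

Omitting every optional argument yields the bare `POST {Endpoint}/vision/v3.2/analyze` form, which returns categories by default.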
Request Header
Name | Required | Type | Description |
---|---|---|---|
Ocp-Apim-Subscription-Key | True | string | |
Request Body
Name | Required | Type | Description |
---|---|---|---|
url | True | string | Publicly reachable URL of an image. |
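The header and body described above could be put together as in the sketch below, which builds the request without sending it. The helper name `build_analyze_request` is hypothetical, and the `Content-Type: application/json` header is an assumption for a JSON body; this page lists only the subscription key header.

```python
import json
import urllib.request


def build_analyze_request(analyze_url, subscription_key, image_url):
    """Build (but do not send) a POST request for the URL input method."""
    body = json.dumps({"url": image_url}).encode("utf-8")
    return urllib.request.Request(
        analyze_url,
        data=body,
        method="POST",
        headers={
            "Ocp-Apim-Subscription-Key": subscription_key,
            # Assumption: a JSON body is declared as application/json.
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; that step is left out here because it needs a real endpoint and key.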
Responses
Name | Type | Description |
---|---|---|
200 OK | ImageAnalysis | The response includes the extracted features in JSON format. The enumeration types are defined as follows. ClipartType: non-clipart = 0, ambiguous = 1, normal-clipart = 2, good-clipart = 3. LineDrawingType: non-LineDrawing = 0, LineDrawing = 1. |
Other Status Codes | ComputerVisionErrorResponse | Error response. |
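Because the two image-type fields come back as integers, a client may want to translate them into the labels listed for the 200 response. The sketch below shows one way to do that; the dictionary and function names are hypothetical, and the label spellings are normalized from the enumeration text above.

```python
# Integer-to-label maps taken from the enumeration definitions above.
CLIPART_TYPE = {0: "non-clipart", 1: "ambiguous", 2: "normal-clipart", 3: "good-clipart"}
LINE_DRAWING_TYPE = {0: "non-LineDrawing", 1: "LineDrawing"}


def describe_image_type(image_type):
    """Translate the integer codes of an imageType object into labels."""
    return {
        "clipArt": CLIPART_TYPE.get(image_type.get("clipArtType"), "unknown"),
        "lineDrawing": LINE_DRAWING_TYPE.get(image_type.get("lineDrawingType"), "unknown"),
    }
```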
Security
Ocp-Apim-Subscription-Key
Type: apiKey
In: header
Examples
Successful AnalyzeImage request
Sample request
POST https://westus.api.cognitive.microsoft.com/vision/v3.2/analyze?visualFeatures=Categories,Adult,Tags,Description,Faces,Color,ImageType,Objects,Brands&details=Celebrities,Landmarks&language=en
{
"url": "{url}"
}
Sample response
{
"categories": [
{
"name": "abstract_",
"score": 0.00390625
},
{
"name": "people_",
"score": 0.83984375,
"detail": {
"celebrities": [
{
"name": "Satya Nadella",
"faceRectangle": {
"left": 597,
"top": 162,
"width": 248,
"height": 248
},
"confidence": 0.999028444
}
]
}
},
{
"name": "building_",
"score": 0.984375,
"detail": {
"landmarks": [
{
"name": "Forbidden City",
"confidence": 0.9829016923904419
}
]
}
}
],
"adult": {
"isAdultContent": false,
"isRacyContent": false,
"isGoryContent": false,
"adultScore": 0.0934349000453949,
"racyScore": 0.06861349195241928,
"goreScore": 0.012872257380997575
},
"tags": [
{
"name": "person",
"confidence": 0.9897908568382263
},
{
"name": "man",
"confidence": 0.9449388980865479
},
{
"name": "outdoor",
"confidence": 0.938492476940155
},
{
"name": "window",
"confidence": 0.8951393961906433
},
{
"name": "pangolin",
"confidence": 0.7250059783791661,
"hint": "mammal"
}
],
"description": {
"tags": [
"person",
"man",
"outdoor",
"window",
"glasses"
],
"captions": [
{
"text": "Satya Nadella sitting on a bench",
"confidence": 0.48293603002174407
}
]
},
"requestId": "0dbec5ad-a3d3-4f7e-96b4-dfd57efe967d",
"metadata": {
"width": 1500,
"height": 1000,
"format": "Jpeg"
},
"modelVersion": "2021-04-01",
"faces": [
{
"age": 44,
"gender": "Male",
"faceRectangle": {
"left": 593,
"top": 160,
"width": 250,
"height": 250
}
}
],
"color": {
"dominantColorForeground": "Brown",
"dominantColorBackground": "Brown",
"dominantColors": [
"Brown",
"Black"
],
"accentColor": "873B59",
"isBWImg": false
},
"imageType": {
"clipArtType": 0,
"lineDrawingType": 0
},
"objects": [
{
"rectangle": {
"x": 0,
"y": 0,
"w": 50,
"h": 50
},
"object": "tree",
"confidence": 0.9,
"parent": {
"object": "plant",
"confidence": 0.95
}
}
],
"brands": [
{
"name": "Pepsi",
"confidence": 0.857,
"rectangle": {
"x": 489,
"y": 79,
"w": 161,
"h": 177
}
},
{
"name": "Coca-Cola",
"confidence": 0.893,
"rectangle": {
"x": 216,
"y": 55,
"w": 171,
"h": 372
}
}
]
}
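In the sample response, celebrity and landmark results are nested inside the `detail` object of individual categories rather than at the top level. A minimal sketch for gathering them, assuming the response has already been parsed into a Python dictionary (the function name `collect_details` is hypothetical):

```python
def collect_details(analysis):
    """Gather celebrity and landmark names from every category's detail."""
    celebrities, landmarks = [], []
    for category in analysis.get("categories", []):
        # "detail" is absent for categories without domain-specific results.
        detail = category.get("detail", {})
        celebrities += [c["name"] for c in detail.get("celebrities", [])]
        landmarks += [m["name"] for m in detail.get("landmarks", [])]
    return celebrities, landmarks
```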
Definitions
Name | Description |
---|---|
AdultInfo | An object describing whether the image contains adult-oriented content and/or is racy. |
BoundingRect | A bounding box for an area inside an image. |
Category | An object describing identified category. |
CategoryDetail | An object describing additional category details. |
CelebritiesModel | An object describing possible celebrity identification. |
ColorInfo | An object providing additional metadata describing color attributes. |
ComputerVisionError | The API request error. |
ComputerVisionErrorCodes | The error code. |
ComputerVisionErrorResponse | The API error response. |
ComputerVisionInnerError | Details about the API request error. |
ComputerVisionInnerErrorCodeValue | The error code. |
DescriptionExclude | Turn off specified domain models when generating the description. |
Details | A string indicating which domain-specific details to return. Multiple values should be comma-separated. Valid detail types include: Celebrities - identifies celebrities if detected in the image, Landmarks - identifies notable landmarks in the image. |
DetectedBrand | A brand detected in an image. |
DetectedObject | An object detected in an image. |
FaceDescription | An object describing a face identified in the image. |
FaceRectangle | An object describing face rectangle. |
Gender | Possible gender of the face. |
ImageAnalysis | Result of AnalyzeImage operation. |
ImageCaption | An image caption, i.e. a brief description of what the image depicts. |
ImageDescriptionDetails | A collection of content tags, along with a list of captions sorted by confidence level, and image metadata. |
ImageMetadata | Image metadata. |
ImageTag | An entity observation in the image, along with the confidence score. |
ImageType | An object providing possible image types and matching confidence levels. |
ImageUrl | |
LandmarksModel | A landmark recognized in the image. |
ObjectHierarchy | An object detected inside an image. |
VisualFeatureTypes | A string indicating what visual feature types to return. Multiple values should be comma-separated. Valid visual feature types include: Categories - categorizes image content according to a taxonomy defined in documentation. Tags - tags the image with a detailed list of words related to the image content. Description - describes the image content with a complete English sentence. Faces - detects if faces are present. If present, generates coordinates, gender and age. ImageType - detects if image is clipart or a line drawing. Color - determines the accent color, dominant color, and whether an image is black and white. Adult - detects if the image is pornographic in nature (depicts nudity or a sex act), or is gory (depicts extreme violence or blood). Sexually suggestive content (aka racy content) is also detected. Objects - detects various objects within an image, including the approximate location. The Objects argument is only available in English. Brands - detects various brands within an image, including the approximate location. The Brands argument is only available in English. |
AdultInfo
An object describing whether the image contains adult-oriented content and/or is racy.
Name | Type | Description |
---|---|---|
adultScore | number | Score from 0 to 1 that indicates how much the content is considered adult-oriented within the image. |
goreScore | number | Score from 0 to 1 that indicates how gory the image is. |
isAdultContent | boolean | A value indicating if the image contains adult-oriented content. |
isGoryContent | boolean | A value indicating if the image is gory. |
isRacyContent | boolean | A value indicating if the image is racy. |
racyScore | number | Score from 0 to 1 that indicates how suggestive the image is. |
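A client that only needs to know which moderation flags were raised could reduce an AdultInfo object to a short list, as in this sketch. The function name `moderation_flags` and the short labels are hypothetical; the field names are the ones documented above.

```python
def moderation_flags(adult):
    """Return the labels of the boolean AdultInfo flags that are set."""
    checks = (
        ("adult", "isAdultContent"),
        ("racy", "isRacyContent"),
        ("gory", "isGoryContent"),
    )
    # Missing keys are treated as not flagged.
    return [label for label, key in checks if adult.get(key)]
```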
BoundingRect
A bounding box for an area inside an image.
Name | Type | Description |
---|---|---|
h | integer | Height measured from the top-left point of the area, in pixels. |
w | integer | Width measured from the top-left point of the area, in pixels. |
x | integer | X-coordinate of the top left point of the area, in pixels. |
y | integer | Y-coordinate of the top left point of the area, in pixels. |
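Many drawing and cropping routines expect corner coordinates rather than an origin plus size. The sketch below converts a BoundingRect into `(left, top, right, bottom)`; the function name `rect_to_corners` is hypothetical.

```python
def rect_to_corners(rect):
    """Convert an x/y/w/h BoundingRect into (left, top, right, bottom)."""
    left, top = rect["x"], rect["y"]
    # Width and height are measured from the top-left point.
    return left, top, left + rect["w"], top + rect["h"]
```

For the Pepsi rectangle in the sample response, `rect_to_corners({"x": 489, "y": 79, "w": 161, "h": 177})` gives `(489, 79, 650, 256)`.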
Category
An object describing identified category.
Name | Type | Description |
---|---|---|
detail | CategoryDetail | Details of the identified category. |
name | string | Name of the category. |
score | number | Scoring of the category. |
CategoryDetail
An object describing additional category details.
Name | Type | Description |
---|---|---|
celebrities | CelebritiesModel[] | An array of celebrities if any identified. |
landmarks | LandmarksModel[] | An array of landmarks if any identified. |
CelebritiesModel
An object describing possible celebrity identification.
Name | Type | Description |
---|---|---|
confidence | number | Confidence level for the celebrity recognition as a value ranging from 0 to 1. |
faceRectangle | FaceRectangle | Location of the identified face in the image. |
name | string | Name of the celebrity. |
ColorInfo
An object providing additional metadata describing color attributes.
Name | Type | Description |
---|---|---|
accentColor | string | Possible accent color. |
dominantColorBackground | string | Possible dominant background color. |
dominantColorForeground | string | Possible dominant foreground color. |
dominantColors | string[] | An array of possible dominant colors. |
isBWImg | boolean | A value indicating if the image is black and white. |
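The sample response shows `accentColor` as `"873B59"`, which looks like a six-digit hexadecimal RGB value without a leading `#`. Under that assumption (this page does not state the format explicitly), a hypothetical helper to split it into channels might look like this:

```python
def accent_to_rgb(accent_color):
    """Split a six-digit hex color such as "873B59" into (r, g, b).

    Assumes an RRGGBB string; a leading "#" is tolerated.
    """
    value = accent_color.lstrip("#")
    return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))
```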
ComputerVisionError
The API request error.
Name | Type | Description |
---|---|---|
code | ComputerVisionErrorCodes | The error code. |
innererror | ComputerVisionInnerError | Inner error contains more specific information. |
message | string | A message explaining the error reported by the service. |
ComputerVisionErrorCodes
The error code.
Name | Type | Description |
---|---|---|
InternalServerError | string | |
InvalidArgument | string | |
InvalidRequest | string | |
ServiceUnavailable | string | |
ComputerVisionErrorResponse
The API error response.
Name | Type | Description |
---|---|---|
error | ComputerVisionError | Error contents. |
ComputerVisionInnerError
Details about the API request error.
Name | Type | Description |
---|---|---|
code | ComputerVisionInnerErrorCodeValue | The error code. |
message | string | Error message. |
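The error objects above nest as `error` → `innererror`. A minimal sketch for flattening an error body into one dictionary follows; the function name `summarize_error` is hypothetical, and no real error payload appears on this page, so any input used with it is illustrative only.

```python
import json


def summarize_error(body):
    """Flatten an error response body (JSON text) into a single dict."""
    error = json.loads(body).get("error", {})
    # innererror is optional, so fall back to an empty object.
    inner = error.get("innererror") or {}
    return {
        "code": error.get("code"),
        "message": error.get("message"),
        "inner_code": inner.get("code"),
        "inner_message": inner.get("message"),
    }
```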
ComputerVisionInnerErrorCodeValue
The error code.
Name | Type | Description |
---|---|---|
BadArgument | string | |
CancelledRequest | string | |
DetectFaceError | string | |
FailedToProcess | string | |
InternalServerError | string | |
InvalidDetails | string | |
InvalidImageFormat | string | |
InvalidImageSize | string | |
InvalidImageUrl | string | |
InvalidModel | string | |
InvalidThumbnailSize | string | |
NotSupportedFeature | string | |
NotSupportedImage | string | |
NotSupportedLanguage | string | |
NotSupportedVisualFeature | string | |
StorageException | string | |
Timeout | string | |
Unspecified | string | |
UnsupportedMediaType | string | |
DescriptionExclude
Turn off specified domain models when generating the description.
Name | Type | Description |
---|---|---|
Celebrities | string | |
Landmarks | string | |
Details
A string indicating which domain-specific details to return. Multiple values should be comma-separated. Valid detail types include: Celebrities - identifies celebrities if detected in the image, Landmarks - identifies notable landmarks in the image.
Name | Type | Description |
---|---|---|
Celebrities | string | |
Landmarks | string | |
DetectedBrand
A brand detected in an image.
Name | Type | Description |
---|---|---|
confidence | number | Confidence score of having observed the brand in the image, as a value ranging from 0 to 1. |
name | string | Label for the brand. |
rectangle | BoundingRect | Approximate location of the detected brand. |
DetectedObject
An object detected in an image.
Name | Type | Description |
---|---|---|
confidence | number | Confidence score of having observed the object in the image, as a value ranging from 0 to 1. |
object | string | Label for the object. |
parent | ObjectHierarchy | The parent object, from a taxonomy perspective. The parent object is a more generic form of this object. For example, a 'bulldog' would have a parent of 'dog'. |
rectangle | BoundingRect | Approximate location of the detected object. |
FaceDescription
An object describing a face identified in the image.
Name | Type | Description |
---|---|---|
age | integer | Possible age of the face. |
faceRectangle | FaceRectangle | Rectangle in the image containing the identified face. |
gender | Gender | Possible gender of the face. |
FaceRectangle
An object describing face rectangle.
Name | Type | Description |
---|---|---|
height | integer | Height measured from the top-left point of the face, in pixels. |
left | integer | X-coordinate of the top left point of the face, in pixels. |
top | integer | Y-coordinate of the top left point of the face, in pixels. |
width | integer | Width measured from the top-left point of the face, in pixels. |
Gender
Possible gender of the face.
Name | Type | Description |
---|---|---|
Female | string | |
Male | string | |
ImageAnalysis
Result of AnalyzeImage operation.
Name | Type | Description |
---|---|---|
adult | AdultInfo | An object describing whether the image contains adult-oriented content and/or is racy. |
brands | DetectedBrand[] | Array of brands detected in the image. |
categories | Category[] | An array indicating identified categories. |
color | ColorInfo | An object providing additional metadata describing color attributes. |
description | ImageDescriptionDetails | A collection of content tags, along with a list of captions sorted by confidence level, and image metadata. |
faces | FaceDescription[] | An array of possible faces within the image. |
imageType | ImageType | An object providing possible image types and matching confidence levels. |
metadata | ImageMetadata | Image metadata. |
modelVersion | string | Version of the AI model. |
objects | DetectedObject[] | Array of objects describing what was detected in the image. |
requestId | string | Id of the REST API request. |
tags | ImageTag[] | A list of tags with confidence level. |
ImageCaption
An image caption, i.e. a brief description of what the image depicts.
Name | Type | Description |
---|---|---|
confidence | number | The level of confidence the service has in the caption. |
text | string | The text of the caption. |
ImageDescriptionDetails
A collection of content tags, along with a list of captions sorted by confidence level, and image metadata.
Name | Type | Description |
---|---|---|
captions | ImageCaption[] | A list of captions, sorted by confidence level. |
tags | string[] | A collection of image tags. |
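Although captions are described as sorted by confidence, a client that does not want to rely on ordering can pick the top caption explicitly. A small sketch, with the hypothetical name `best_caption`:

```python
def best_caption(description):
    """Return the text of the highest-confidence caption, or None."""
    captions = description.get("captions", [])
    if not captions:
        return None
    return max(captions, key=lambda c: c["confidence"])["text"]
```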
ImageMetadata
Image metadata.
Name | Type | Description |
---|---|---|
format | string | Image format. |
height | integer | Image height, in pixels. |
width | integer | Image width, in pixels. |
ImageTag
An entity observation in the image, along with the confidence score.
Name | Type | Description |
---|---|---|
confidence | number | The level of confidence that the entity was observed. |
hint | string | Optional hint/details for this tag. |
name | string | Name of the entity. |
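Tag lists often include low-confidence entries, so callers commonly keep only tags above some cutoff. The sketch below illustrates this; the function name `confident_tags` and the default threshold of 0.9 are arbitrary choices for the example, not values recommended by the service.

```python
def confident_tags(tags, threshold=0.9):
    """Return names of ImageTag entries at or above the threshold."""
    return [tag["name"] for tag in tags if tag["confidence"] >= threshold]
```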
ImageType
An object providing possible image types and matching confidence levels.
Name | Type | Description |
---|---|---|
clipArtType | integer | Confidence level that the image is clip art. |
lineDrawingType | integer | Confidence level that the image is a line drawing. |
ImageUrl
Name | Type | Description |
---|---|---|
url | string | Publicly reachable URL of an image. |
LandmarksModel
A landmark recognized in the image.
Name | Type | Description |
---|---|---|
confidence | number | Confidence level for the landmark recognition as a value ranging from 0 to 1. |
name | string | Name of the landmark. |
ObjectHierarchy
An object detected inside an image.
Name | Type | Description |
---|---|---|
confidence | number | Confidence score of having observed the object in the image, as a value ranging from 0 to 1. |
object | string | Label for the object. |
parent | ObjectHierarchy | The parent object, from a taxonomy perspective. The parent object is a more generic form of this object. For example, a 'bulldog' would have a parent of 'dog'. |
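Since each `parent` is itself an object with an optional `parent`, the taxonomy forms a chain from specific to generic. A minimal sketch for walking that chain, using the hypothetical name `object_lineage`:

```python
def object_lineage(detected):
    """List object labels from most specific to most generic."""
    labels = []
    node = detected
    while node:
        labels.append(node["object"])
        # A missing "parent" key ends the chain.
        node = node.get("parent")
    return labels
```

For the `tree` object in the sample response, this returns `["tree", "plant"]`.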
VisualFeatureTypes
A string indicating what visual feature types to return. Multiple values should be comma-separated. Valid visual feature types include: Categories - categorizes image content according to a taxonomy defined in documentation. Tags - tags the image with a detailed list of words related to the image content. Description - describes the image content with a complete English sentence. Faces - detects if faces are present. If present, generates coordinates, gender and age. ImageType - detects if image is clipart or a line drawing. Color - determines the accent color, dominant color, and whether an image is black and white. Adult - detects if the image is pornographic in nature (depicts nudity or a sex act), or is gory (depicts extreme violence or blood). Sexually suggestive content (aka racy content) is also detected. Objects - detects various objects within an image, including the approximate location. The Objects argument is only available in English. Brands - detects various brands within an image, including the approximate location. The Brands argument is only available in English.
Name | Type | Description |
---|---|---|
Adult | string | |
Brands | string | |
Categories | string | |
Color | string | |
Description | string | |
Faces | string | |
ImageType | string | |
Objects | string | |
Tags | string | |
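A client can catch misspelled feature names before sending a request by checking them against the values listed above. The sketch below does an exact, case-sensitive comparison; whether the service itself is case-sensitive is not stated on this page, so that strictness is an assumption, and the names `VISUAL_FEATURES` and `unknown_features` are hypothetical.

```python
# The nine values listed in the VisualFeatureTypes table.
VISUAL_FEATURES = frozenset({
    "Adult", "Brands", "Categories", "Color", "Description",
    "Faces", "ImageType", "Objects", "Tags",
})


def unknown_features(requested):
    """Return requested feature names not in the documented list."""
    # Assumption: names must match the documented spelling exactly.
    return [name for name in requested if name not in VISUAL_FEATURES]
```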