Image Analysis - Analyze Stream
Analyzes the input image. The request contains either an image stream with any content type in ['image/*', 'application/octet-stream'], or a JSON payload that includes a url property used to retrieve the image stream.
POST /imageanalysis:analyze?overload=stream&api-version=2023-04-01-preview
POST /imageanalysis:analyze?overload=stream&features={features}&model-name={model-name}&language={language}&smartcrops-aspect-ratios={smartcrops-aspect-ratios}&gender-neutral-caption={gender-neutral-caption}&api-version=2023-04-01-preview
URI Parameters
Name | In | Required | Type | Description |
---|---|---|---|---|
api-version | query | True | string | Requested API version. |
features | query | | VisualFeature[] | The visual features requested: tags, objects, caption, denseCaptions, read, smartCrops, people. This parameter must be specified if the "model-name" parameter is not specified. |
gender-neutral-caption | query | | boolean | Boolean flag for enabling gender-neutral captioning for the caption and denseCaptions features. If this parameter is not specified, the default value is "false". |
language | query | | string | The desired language for output generation. If this parameter is not specified, the default value is "en". See https://aka.ms/cv-languages for a list of supported languages. |
model-name | query | | string | The name of the custom trained model. This parameter must be specified if the "features" parameter is not specified. |
smartcrops-aspect-ratios | query | | string | A list of aspect ratios to use for the smartCrops feature. Aspect ratios are calculated by dividing the target crop width by the height. Supported values are between 0.75 and 1.8 (inclusive). Multiple values should be comma-separated. If this parameter is not specified, the service returns one crop suggestion with an aspect ratio it sees fit between 0.5 and 2.0 (inclusive). |
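As a rough illustration of how the query parameters above fit together, the following sketch assembles the request URL and applies the two documented constraints (at least one of "features" or "model-name", and explicit aspect ratios between 0.75 and 1.8). The function name `build_analyze_url` and the `endpoint` argument are hypothetical, and joining multiple features with commas is an assumption; the reference only states comma separation for smartcrops-aspect-ratios.

```python
from urllib.parse import urlencode

API_VERSION = "2023-04-01-preview"


def build_analyze_url(endpoint, features=None, model_name=None, language=None,
                      smartcrops_aspect_ratios=None, gender_neutral_caption=None):
    """Build the stream-overload analyze URL (hypothetical helper)."""
    # The reference requires "features" when "model-name" is absent, and vice versa.
    if not features and not model_name:
        raise ValueError('specify "features" or "model-name"')

    # Explicitly requested aspect ratios must lie in [0.75, 1.8].
    ratios = list(smartcrops_aspect_ratios or [])
    for ratio in ratios:
        if not 0.75 <= ratio <= 1.8:
            raise ValueError(f"aspect ratio {ratio} is outside 0.75-1.8")

    params = {"overload": "stream"}
    if features:
        # Assumption: multiple features are sent as one comma-separated value.
        params["features"] = ",".join(features)
    if model_name:
        params["model-name"] = model_name
    if language:
        params["language"] = language
    if ratios:
        params["smartcrops-aspect-ratios"] = ",".join(str(r) for r in ratios)
    if gender_neutral_caption is not None:
        params["gender-neutral-caption"] = "true" if gender_neutral_caption else "false"
    params["api-version"] = API_VERSION

    # safe="," keeps list separators as literal commas instead of "%2C".
    query = urlencode(params, safe=",")
    return f"{endpoint.rstrip('/')}/imageanalysis:analyze?{query}"
```

Calling `build_analyze_url(endpoint, model_name="my_model_name")` yields the same query string as the custom-model sample request in the Examples section, prefixed by whatever base URL is passed in.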
Request Body
Media Types: "application/octet-stream", "image/jpeg", "image/gif", "image/tiff", "image/bmp", "image/png"
Name | Type | Description |
---|---|---|
body | string | An image stream. |
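A minimal sketch of preparing such a request with Python's standard library follows. It only constructs the request object and checks the content type against the media types listed above; nothing is sent. The helper name `make_stream_request` and the `extra_headers` argument are hypothetical. Authentication is not covered by this reference, so any credentials header would need to be supplied through `extra_headers` according to your deployment.

```python
import urllib.request

# Media types listed for the request body of the stream overload.
ALLOWED_MEDIA_TYPES = {
    "application/octet-stream", "image/jpeg", "image/gif",
    "image/tiff", "image/bmp", "image/png",
}


def make_stream_request(url, image_bytes, content_type="application/octet-stream",
                        extra_headers=None):
    """Return a POST request whose body is the raw image bytes (hypothetical helper)."""
    if content_type not in ALLOWED_MEDIA_TYPES:
        raise ValueError(f"unsupported media type: {content_type}")
    if not image_bytes:
        raise ValueError("image stream is empty")

    headers = {"Content-Type": content_type}
    # Assumption: authentication or tracing headers are passed in by the caller.
    headers.update(extra_headers or {})
    return urllib.request.Request(url, data=image_bytes, headers=headers, method="POST")
```

Passing the returned object to `urllib.request.urlopen` would perform the call; reading the image with `open(path, "rb").read()` gives a suitable `image_bytes` value.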
Responses
Name | Type | Description |
---|---|---|
200 OK | ImageAnalysisResult | Success |
Other Status Codes | ErrorResponse | Error. Headers: x-ms-error-code: string |
Examples
AnalyzeImageFromImageStream_CustomModel
Sample request
POST /imageanalysis:analyze?overload=stream&model-name=my_model_name&api-version=2023-04-01-preview
"Ynl0ZXM="
Sample response
{
  "modelVersion": "2023-04-01-preview",
  "customModelResult": {
    "objectsResult": {
      "values": [
        {
          "id": "1",
          "boundingBox": {
            "x": 197,
            "y": 68,
            "w": 356,
            "h": 394
          },
          "tags": [
            {
              "name": "class1",
              "confidence": 0.92431640625
            }
          ]
        },
        {
          "id": "2",
          "boundingBox": {
            "x": 0,
            "y": 77,
            "w": 241,
            "h": 359
          },
          "tags": [
            {
              "name": "class1",
              "confidence": 0.87890625
            }
          ]
        }
      ]
    }
  },
  "metadata": {
    "width": 660,
    "height": 495
  }
}
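To show how a client might consume a response shaped like the sample above, here is a small sketch that walks customModelResult.objectsResult.values and reports the highest-confidence tag for each detected object. The function name `summarize_objects` is hypothetical, and the code assumes the response has already been parsed from JSON into a dict.

```python
def summarize_objects(result):
    """Return (id, best tag name, confidence, (x, y, w, h)) per detected object."""
    custom = result.get("customModelResult") or {}
    objects = (custom.get("objectsResult") or {}).get("values") or []

    rows = []
    for obj in objects:
        box = obj["boundingBox"]
        # Pick the tag with the highest confidence; skip objects without tags.
        best = max(obj.get("tags") or [], key=lambda t: t["confidence"], default=None)
        if best is None:
            continue
        rows.append((obj["id"], best["name"], best["confidence"],
                     (box["x"], box["y"], box["w"], box["h"])))
    return rows
```

For the sample response, this returns two rows, both tagged "class1", with confidences 0.92431640625 and 0.87890625.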
Definitions
Name | Description |
---|---|
AdultMatch | An object describing adult content match. |
AdultResult | An object describing whether the image contains adult-oriented content and/or is racy. |
BoundingBox | A bounding box for an area inside an image. |
CaptionResult | A brief description of what the image depicts. |
CropRegion | A region identified for smart cropping. There will be one region returned for each requested aspect ratio. |
DenseCaption | A brief description of what the image depicts. |
DenseCaptionsResult | A list of captions. |
DetectedObject | Describes a detected object in an image. |
DetectedPerson | A person detected in an image. |
DocumentLine | A content line object consisting of an adjacent sequence of content elements, such as words and selection marks. |
DocumentPage | The content and layout elements extracted from a page from the input. |
DocumentSpan | Contiguous region of the concatenated content property, specified as an offset and length. |
DocumentStyle | An object representing observed text styles. |
DocumentWord | A word object consisting of a contiguous sequence of characters. For non-space delimited languages, such as Chinese, Japanese, and Korean, each character is represented as its own word. |
ErrorResponse | Response returned when an error occurs. |
ErrorResponseDetails | Error info. |
ErrorResponseInnerError | Detailed error. |
ImageAnalysisResult | Describes the combined results of different types of image analysis. |
ImageMetadataApiModel | The image metadata information such as height and width. |
ImagePredictionResult | Describes the prediction result of an image. |
ObjectsResult | Describes detected objects in an image. |
PeopleResult | An object describing whether the image contains people. |
ReadResult | The results of a Read operation. |
SmartCropsResult | Smart cropping result. |
Tag | An entity observation in the image, along with the confidence score. |
TagsResult | A list of tags with confidence level. |
VisualFeature | The visual features requested: tags, objects, caption, denseCaptions, read, smartCrops, people. This parameter must be specified if the "model-name" parameter is not specified. |
AdultMatch
An object describing adult content match.
Name | Type | Description |
---|---|---|
confidence | number | A value indicating the confidence level of matched adult content. |
isMatch | boolean | A value indicating if the image is matched adult content. |
AdultResult
An object describing whether the image contains adult-oriented content and/or is racy.
Name | Type | Description |
---|---|---|
adult | AdultMatch | An object describing adult content match. |
gore | AdultMatch | An object describing adult content match. |
racy | AdultMatch | An object describing adult content match. |
BoundingBox
A bounding box for an area inside an image.
Name | Type | Description |
---|---|---|
h | integer | Height measured from the top-left point of the area, in pixels. |
w | integer | Width measured from the top-left point of the area, in pixels. |
x | integer | Left-coordinate of the top-left point of the area, in pixels. |
y | integer | Top-coordinate of the top-left point of the area, in pixels. |
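Since the box is given as a top-left point plus width and height, drawing or cropping code often needs the opposite corner as well. The sketch below converts a BoundingBox dict into (left, top, right, bottom) and computes what fraction of the image it covers. Both helper names are hypothetical.

```python
def box_to_corners(box):
    """Convert {x, y, w, h} into (left, top, right, bottom) pixel coordinates."""
    left, top = box["x"], box["y"]
    return left, top, left + box["w"], top + box["h"]


def box_area_fraction(box, image_width, image_height):
    """Fraction of the image area covered by the box (no clipping applied)."""
    return (box["w"] * box["h"]) / (image_width * image_height)
```

For the first object in the sample response (x=197, y=68, w=356, h=394), `box_to_corners` returns (197, 68, 553, 462), which lies inside the 660 by 495 image reported in the metadata.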
CaptionResult
A brief description of what the image depicts.
Name | Type | Description |
---|---|---|
confidence | number | The level of confidence the service has in the caption. |
text | string | The text of the caption. |
CropRegion
A region identified for smart cropping. There will be one region returned for each requested aspect ratio.
Name | Type | Description |
---|---|---|
aspectRatio | number | The aspect ratio of the crop region. |
boundingBox | BoundingBox | A bounding box for an area inside an image. |
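Because aspect ratios are defined as crop width divided by height, a client can sanity-check a returned region against its reported aspectRatio. This is a minimal sketch; the helper names, the example region, and the tolerance value are all illustrative assumptions, and the tolerance only allows for rounding to whole pixels.

```python
def crop_aspect_ratio(region):
    """Width divided by height of the region's bounding box."""
    box = region["boundingBox"]
    return box["w"] / box["h"]


def is_ratio_consistent(region, tolerance=0.01):
    """True when the box shape matches the reported aspectRatio within tolerance."""
    return abs(crop_aspect_ratio(region) - region["aspectRatio"]) <= tolerance


# Hypothetical region, not taken from a real service response.
example_region = {
    "aspectRatio": 1.5,
    "boundingBox": {"x": 30, "y": 20, "w": 600, "h": 400},
}
```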
DenseCaption
A brief description of what the image depicts.
Name | Type | Description |
---|---|---|
boundingBox | BoundingBox | A bounding box for an area inside an image. |
confidence | number | The level of confidence the service has in the caption. |
text | string | The text of the caption. |
DenseCaptionsResult
A list of captions.
Name | Type | Description |
---|---|---|
values | DenseCaption[] | A list of captions. |
DetectedObject
Describes a detected object in an image.
Name | Type | Description |
---|---|---|
boundingBox | BoundingBox | A bounding box for an area inside an image. |
id | string | Id of the detected object. |
tags | Tag[] | Classification confidences of the detected object. |
DetectedPerson
A person detected in an image.
Name | Type | Description |
---|---|---|
boundingBox | BoundingBox | A bounding box for an area inside an image. |
confidence | number | Confidence score of having observed the person in the image, as a value ranging from 0 to 1. |
DocumentLine
A content line object consisting of an adjacent sequence of content elements, such as words and selection marks.
Name | Type | Description |
---|---|---|
boundingBox | number[] | Bounding box of the line. |
content | string | Concatenated content of the contained elements in reading order. |
spans | DocumentSpan[] | Location of the line in the reading order concatenated content. |
DocumentPage
The content and layout elements extracted from a page from the input.
Name | Type | Description |
---|---|---|
angle | number | The general orientation of the content in clockwise direction, measured in degrees between (-180, 180]. |
height | number | The height of the image/PDF in pixels/inches, respectively. |
lines | DocumentLine[] | Extracted lines from the page, potentially containing both textual and visual elements. |
pageNumber | integer | 1-based page number in the input document. |
spans | DocumentSpan[] | Location of the page in the reading order concatenated content. |
width | number | The width of the image/PDF in pixels/inches, respectively. |
words | DocumentWord[] | Extracted words from the page. |
DocumentSpan
Contiguous region of the concatenated content property, specified as an offset and length.
Name | Type | Description |
---|---|---|
length | integer | Number of characters in the content represented by the span. |
offset | integer | Zero-based index of the content represented by the span. |
DocumentStyle
An object representing observed text styles.
Name | Type | Description |
---|---|---|
confidence | number | Confidence of correctly identifying the style. |
isHandwritten | boolean | Whether the content is handwritten. |
spans | DocumentSpan[] | Location of the text elements in the concatenated content the style applies to. |
DocumentWord
A word object consisting of a contiguous sequence of characters. For non-space delimited languages, such as Chinese, Japanese, and Korean, each character is represented as its own word.
Name | Type | Description |
---|---|---|
boundingBox | number[] | Bounding box of the word. |
confidence | number | Confidence of correctly extracting the word. |
content | string | Text content of the word. |
span | DocumentSpan | Contiguous region of the concatenated content property, specified as an offset and length. |
ErrorResponse
Response returned when an error occurs.
Name | Type | Description |
---|---|---|
error | ErrorResponseDetails | Error info. |
ErrorResponseDetails
Error info.
Name | Type | Description |
---|---|---|
code | string | Error code. |
details | ErrorResponseDetails[] | List of detailed errors. |
innererror | ErrorResponseInnerError | Detailed error. |
message | string | Error message. |
target | string | Target of the error. |
ErrorResponseInnerError
Detailed error.
Name | Type | Description |
---|---|---|
code | string | Error code. |
innererror | ErrorResponseInnerError | Detailed error. |
message | string | Error message. |
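Because innererror can nest, a client that wants the most specific code has to walk the chain. The sketch below flattens an ErrorResponse body into a list of (code, message) pairs, outermost first. The helper name `error_chain` is hypothetical, and the codes and messages in the example payload are invented for illustration; they are not documented service values.

```python
def error_chain(error_response):
    """Return [(code, message), ...] from the top-level error down through innererror."""
    chain = []
    node = error_response.get("error")
    # Follow innererror links until one is missing or not an object.
    while isinstance(node, dict) and node:
        chain.append((node.get("code"), node.get("message")))
        node = node.get("innererror")
    return chain


# Hypothetical payload: the codes and messages below are made up.
example_error = {
    "error": {
        "code": "ExampleOuterCode",
        "message": "Outer message.",
        "innererror": {
            "code": "ExampleInnerCode",
            "message": "Inner message.",
        },
    }
}
```

The last element of the returned list carries the innermost, and usually most specific, code. The x-ms-error-code response header listed under Responses can be logged alongside it.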
ImageAnalysisResult
Describes the combined results of different types of image analysis.
Name | Type | Description |
---|---|---|
adultResult | AdultResult | An object describing whether the image contains adult-oriented content and/or is racy. |
captionResult | CaptionResult | A brief description of what the image depicts. |
customModelResult | ImagePredictionResult | Describes the prediction result of an image. |
denseCaptionsResult | DenseCaptionsResult | A list of captions. |
metadata | ImageMetadataApiModel | The image metadata information such as height and width. |
modelVersion | string | Model Version. |
objectsResult | ObjectsResult | Describes detected objects in an image. |
peopleResult | PeopleResult | An object describing whether the image contains people. |
readResult | ReadResult | The results of a Read operation. |
smartCropsResult | SmartCropsResult | Smart cropping result. |
tagsResult | TagsResult | A list of tags with confidence level. |
ImageMetadataApiModel
The image metadata information such as height and width.
Name | Type | Description |
---|---|---|
height | integer | The height of the image in pixels. |
width | integer | The width of the image in pixels. |
ImagePredictionResult
Describes the prediction result of an image.
Name | Type | Description |
---|---|---|
objectsResult | ObjectsResult | Describes detected objects in an image. |
tagsResult | TagsResult | A list of tags with confidence level. |
ObjectsResult
Describes detected objects in an image.
Name | Type | Description |
---|---|---|
values | DetectedObject[] | An array of detected objects. |
PeopleResult
An object describing whether the image contains people.
Name | Type | Description |
---|---|---|
values | DetectedPerson[] | An array of detected people. |
ReadResult
The results of a Read operation.
Name | Type | Description |
---|---|---|
content | string | Concatenated string representation of all textual and visual elements in reading order. |
pages | DocumentPage[] | A list of analyzed pages. |
stringIndexType | string | The method used to compute string offset and length. Possible values include 'textElements', 'unicodeCodePoint', and 'utf16CodeUnit'. |
styles | DocumentStyle[] | Extracted font styles. |
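Span offsets and lengths only index into content correctly when interpreted according to stringIndexType. The sketch below recovers the text for a span under two of the listed index types. The helper name `slice_span` is hypothetical, and 'textElements' is deliberately left unsupported because grapheme segmentation needs more than the standard library provides.

```python
def slice_span(content, span, string_index_type):
    """Return the part of content that a DocumentSpan-style dict refers to."""
    offset, length = span["offset"], span["length"]

    if string_index_type == "unicodeCodePoint":
        # Python str indexing already counts Unicode code points.
        return content[offset:offset + length]

    if string_index_type == "utf16CodeUnit":
        # Each UTF-16 code unit is two bytes in the little-endian encoding.
        data = content.encode("utf-16-le")
        return data[2 * offset:2 * (offset + length)].decode("utf-16-le")

    raise ValueError(f"unsupported stringIndexType: {string_index_type}")
```

The two modes differ only when content contains characters outside the Basic Multilingual Plane, such as many emoji, which occupy two UTF-16 code units but a single code point.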
SmartCropsResult
Smart cropping result.
Name | Type | Description |
---|---|---|
values | CropRegion[] | Recommended regions for cropping the image. |
Tag
An entity observation in the image, along with the confidence score.
Name | Type | Description |
---|---|---|
confidence | number | The level of confidence that the entity was observed. |
name | string | Name of the entity. |
TagsResult
A list of tags with confidence level.
Name | Type | Description |
---|---|---|
values | Tag[] | A list of tags with confidence level. |
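A common way to use a TagsResult is to keep only the tags above some confidence cutoff and list them strongest first. The following sketch does that; the helper name `confident_tags`, the default threshold of 0.5, and the example tags are illustrative assumptions rather than service recommendations or real output.

```python
def confident_tags(tags_result, min_confidence=0.5):
    """Return tags at or above min_confidence, sorted by descending confidence."""
    tags = tags_result.get("values") or []
    kept = [tag for tag in tags if tag["confidence"] >= min_confidence]
    return sorted(kept, key=lambda tag: tag["confidence"], reverse=True)


# Hypothetical tagsResult, not taken from a real service response.
example_tags = {
    "values": [
        {"name": "grass", "confidence": 0.41},
        {"name": "dog", "confidence": 0.97},
        {"name": "outdoor", "confidence": 0.88},
    ]
}
```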
VisualFeature
The visual features requested: tags, objects, caption, denseCaptions, read, smartCrops, people. This parameter needs to be specified if the parameter "model-name" is not specified.
Name | Type | Description |
---|---|---|
caption | string | |
denseCaptions | string | |
objects | string | |
people | string | |
read | string | |
smartCrops | string | |
tags | string | |