Call the Image Analysis 4.0 Analyze API

This article demonstrates how to call the Image Analysis 4.0 API to return information about an image's visual features. It also shows you how to parse the returned information.

Prerequisites

This guide assumes you've followed the steps mentioned in the quickstart page. This means:

  • You have created a Computer Vision resource and obtained a key and endpoint URL.
  • You have the appropriate SDK package installed and you have a running quickstart application. You can modify this quickstart application based on code examples here.

Create and authenticate the client

To authenticate against the Image Analysis service, you need a Computer Vision key and endpoint URL. This guide assumes that you've defined the environment variables VISION_KEY and VISION_ENDPOINT with your key and endpoint.

Important

If you use an API key, store it securely somewhere else, such as in Azure Key Vault. Don't include the API key directly in your code, and never post it publicly.

For more information about AI services security, see Authenticate requests to Azure AI services.

Start by creating an ImageAnalysisClient object. For example:

string endpoint = Environment.GetEnvironmentVariable("VISION_ENDPOINT");
string key = Environment.GetEnvironmentVariable("VISION_KEY");

// Create an Image Analysis client.
ImageAnalysisClient client = new ImageAnalysisClient(
    new Uri(endpoint),
    new AzureKeyCredential(key));

Select the image to analyze

You can select an image by providing a publicly accessible image URL, or by passing binary data to the SDK. See Image requirements for supported image formats.

Image URL

Create a Uri object for the image you want to analyze.

Uri imageURL = new Uri("https://aka.ms/azsdk/image-analysis/sample.jpg");

Image buffer

Alternatively, you can pass the image data to the SDK through a BinaryData object. For example, read from a local image file you want to analyze.

using FileStream stream = new FileStream("sample.jpg", FileMode.Open);
BinaryData imageData = BinaryData.FromStream(stream);

Select visual features

The Analysis 4.0 API gives you access to all of the service's image analysis features. Choose which operations to do based on your own use case. See the overview for a description of each feature. The example in this section adds all of the available visual features, but for practical usage you likely need fewer.

Important

The visual features Captions and DenseCaptions are only supported in certain Azure regions. See Region availability.

VisualFeatures visualFeatures =
    VisualFeatures.Caption |
    VisualFeatures.DenseCaptions |
    VisualFeatures.Objects |
    VisualFeatures.Read |
    VisualFeatures.Tags |
    VisualFeatures.People |
    VisualFeatures.SmartCrops;

Select analysis options

Use an ImageAnalysisOptions object to specify various options for the Analyze Image API call.

  • Language: You can specify the language of the returned data. The language is optional, with the default being English. See Language support for a list of supported language codes and which visual features are supported for each language.
  • Gender neutral captions: If you're extracting captions or dense captions (using VisualFeatures.Caption or VisualFeatures.DenseCaptions), you can ask for gender neutral captions. Gender neutral captions are optional, with the default being gendered captions. For example, in English, when you select gender neutral captions, terms like woman or man are replaced with person, and boy or girl are replaced with child.
  • Crop aspect ratio: An aspect ratio is calculated by dividing the target crop width by the height. Supported values are from 0.75 to 1.8 (inclusive). Setting this property is only relevant when VisualFeatures.SmartCrops was selected as part of the visual feature list. If you select VisualFeatures.SmartCrops but don't specify aspect ratios, the service returns one crop suggestion with an aspect ratio it sees fit. In this case, the aspect ratio is between 0.5 and 2.0 (inclusive).

ImageAnalysisOptions options = new ImageAnalysisOptions
{
    GenderNeutralCaption = true,
    Language = "en",
    SmartCropsAspectRatios = new float[] { 0.9F, 1.33F }
};

Call the Analyze API

This section shows you how to make an analysis call to the service.

Call the Analyze method on the ImageAnalysisClient object, as shown here. The call is synchronous and blocks execution until the service returns the results or an error occurs. Alternatively, you can call the non-blocking AnalyzeAsync method.

Use the input objects created in the previous sections. To analyze from an image buffer instead of a URL, replace imageURL in the method call with the imageData variable.

ImageAnalysisResult result = client.Analyze(
    imageURL,
    visualFeatures,
    options);

Get results from the service

The following code shows you how to parse the results of the various Analyze operations.

Console.WriteLine("Image analysis results:");

// Print caption results to the console
Console.WriteLine(" Caption:");
Console.WriteLine($"   '{result.Caption.Text}', Confidence {result.Caption.Confidence:F4}");

// Print dense caption results to the console
Console.WriteLine(" Dense Captions:");
foreach (DenseCaption denseCaption in result.DenseCaptions.Values)
{
    Console.WriteLine($"   '{denseCaption.Text}', Confidence {denseCaption.Confidence:F4}, Bounding box {denseCaption.BoundingBox}");
}

// Print object detection results to the console
Console.WriteLine(" Objects:");
foreach (DetectedObject detectedObject in result.Objects.Values)
{
    Console.WriteLine($"   '{detectedObject.Tags.First().Name}', Bounding box {detectedObject.BoundingBox.ToString()}");
}

// Print text (OCR) analysis results to the console
Console.WriteLine(" Read:");
foreach (DetectedTextBlock block in result.Read.Blocks)
    foreach (DetectedTextLine line in block.Lines)
    {
        Console.WriteLine($"   Line: '{line.Text}', Bounding Polygon: [{string.Join(" ", line.BoundingPolygon)}]");
        foreach (DetectedTextWord word in line.Words)
        {
            Console.WriteLine($"     Word: '{word.Text}', Confidence {word.Confidence.ToString("#.####")}, Bounding Polygon: [{string.Join(" ", word.BoundingPolygon)}]");
        }
    }

// Print tags results to the console
Console.WriteLine(" Tags:");
foreach (DetectedTag tag in result.Tags.Values)
{
    Console.WriteLine($"   '{tag.Name}', Confidence {tag.Confidence:F4}");
}

// Print people detection results to the console
Console.WriteLine(" People:");
foreach (DetectedPerson person in result.People.Values)
{
    Console.WriteLine($"   Person: Bounding box {person.BoundingBox.ToString()}, Confidence {person.Confidence:F4}");
}

// Print smart-crops analysis results to the console
Console.WriteLine(" SmartCrops:");
foreach (CropRegion cropRegion in result.SmartCrops.Values)
{
    Console.WriteLine($"   Aspect ratio: {cropRegion.AspectRatio}, Bounding box: {cropRegion.BoundingBox}");
}

// Print metadata
Console.WriteLine(" Metadata:");
Console.WriteLine($"   Model: {result.ModelVersion}");
Console.WriteLine($"   Image width: {result.Metadata.Width}");
Console.WriteLine($"   Image height: {result.Metadata.Height}");

Troubleshooting

Exception handling

When you interact with Image Analysis using the .NET SDK, any response from the service that doesn't have a 200 (success) status code results in an exception being thrown. For example, if you try to analyze an image that is not accessible due to a broken URL, a 400 status is returned, indicating a bad request, and a corresponding exception is thrown.

In the following snippet, errors are handled gracefully by catching the exception and displaying additional information about the error.

var imageUrl = new Uri("https://some-host-name.com/non-existing-image.jpg");

try
{
    var result = client.Analyze(imageUrl, VisualFeatures.Caption);
}
catch (RequestFailedException e)
{
    if (e.Status != 200)
    {
        Console.WriteLine("Error analyzing image.");
        Console.WriteLine($"HTTP status code {e.Status}: {e.Message}");
    }
    else
    {
        throw;
    }
}

You can learn more about how to enable SDK logging here.

Prerequisites

This guide assumes you've followed the steps of the quickstart. This means:

  • You've created a Computer Vision resource and obtained a key and endpoint URL.
  • You've installed the appropriate SDK package and have a working quickstart application. You can modify this quickstart application based on the code examples here.

Create and authenticate the client

To authenticate against the Image Analysis service, you need a Computer Vision key and endpoint URL. This guide assumes that you've defined the environment variables VISION_KEY and VISION_ENDPOINT with your key and endpoint.

Important

If you use an API key, store it securely somewhere else, such as in Azure Key Vault. Don't include the API key directly in your code, and never post it publicly.

For more information about AI services security, see Authenticate requests to Azure AI services.

Start by creating an ImageAnalysisClient object using one of the constructors. For example:

client = ImageAnalysisClient(
    endpoint=endpoint,
    credential=AzureKeyCredential(key)
)
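
The constructor call above expects endpoint and key to be defined already. A minimal sketch of reading them from the VISION_ENDPOINT and VISION_KEY environment variables might look like the following. The helper name load_vision_settings is hypothetical and isn't part of the SDK.

```python
import os


def load_vision_settings(env=None):
    """Return (endpoint, key) read from VISION_ENDPOINT and VISION_KEY.

    Illustrative helper only; this function is not provided by the SDK.
    """
    # Default to the real process environment; a dict can be passed for testing.
    env = os.environ if env is None else env
    missing = [name for name in ("VISION_ENDPOINT", "VISION_KEY") if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variable(s): " + ", ".join(missing))
    return env["VISION_ENDPOINT"], env["VISION_KEY"]
```

Failing early with a clear message is usually easier to diagnose than a 401 response from the service caused by an empty key.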

Select the image to analyze

You can select an image by providing a publicly accessible image URL, or by reading image data into the SDK's input buffer. See Image requirements for supported image formats.

Image URL

You can use the following sample image URL.

# Define image URL
image_url = "https://zcusa.951200.xyz/azure/ai-services/computer-vision/media/quickstarts/presentation.png"

Image buffer

Alternatively, you can pass in the image as a bytes object. For example, read from a local image file you want to analyze.

# Load image to analyze into a 'bytes' object
with open("sample.jpg", "rb") as f:
    image_data = f.read()

Select visual features

The Analysis 4.0 API gives you access to all of the service's image analysis features. Choose which operations to do based on your own use case. See the overview for a description of each feature. The example in this section adds all of the available visual features, but for practical usage you likely need fewer.

Important

The visual features Captions and DenseCaptions are only supported in certain Azure regions. See Region availability.

visual_features = [
    VisualFeatures.TAGS,
    VisualFeatures.OBJECTS,
    VisualFeatures.CAPTION,
    VisualFeatures.DENSE_CAPTIONS,
    VisualFeatures.READ,
    VisualFeatures.SMART_CROPS,
    VisualFeatures.PEOPLE,
]

Call the analyze_from_url method with options

The following code calls the analyze_from_url method on the client with the features you selected above and the additional options described in the sections that follow. To analyze from an image buffer instead of a URL, call the analyze method instead, with image_data=image_data as the first argument.

# Analyze all visual features from an image URL. This is a synchronous (blocking) call.
result = client.analyze_from_url(
    image_url=image_url,
    visual_features=visual_features,
    smart_crops_aspect_ratios=[0.9, 1.33],
    gender_neutral_caption=True,
    language="en"
)

Select smart cropping aspect ratios

An aspect ratio is calculated by dividing the target crop width by the height. Supported values are from 0.75 to 1.8 (inclusive). Setting this property is only relevant when VisualFeatures.SMART_CROPS was selected as part of the visual feature list. If you select VisualFeatures.SMART_CROPS but don't specify aspect ratios, the service returns one crop suggestion with an aspect ratio it sees fit. In this case, the aspect ratio is between 0.5 and 2.0 (inclusive).
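
As a rough illustration of the rule above, the following sketch computes an aspect ratio from a target width and height and checks a list of ratios against the supported range before they're passed as smart_crops_aspect_ratios. The function and constant names are hypothetical. The SDK doesn't require this check, because the service validates the values itself.

```python
MIN_RATIO = 0.75  # lowest supported value, per the text above
MAX_RATIO = 1.8   # highest supported value, per the text above


def aspect_ratio(width, height):
    """Aspect ratio is the target crop width divided by the height."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return width / height


def check_aspect_ratios(ratios):
    """Return the ratios as a list, or raise if any falls outside 0.75 to 1.8."""
    bad = [r for r in ratios if not MIN_RATIO <= r <= MAX_RATIO]
    if bad:
        raise ValueError(f"Unsupported aspect ratios: {bad}")
    return list(ratios)
```

For example, check_aspect_ratios([0.9, aspect_ratio(4, 3)]) passes, while a ratio of 2.0 is rejected locally even though the service may return such a ratio when you don't specify any.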

Select gender neutral captions

If you're extracting captions or dense captions (using VisualFeatures.CAPTION or VisualFeatures.DENSE_CAPTIONS), you can ask for gender neutral captions. Gender neutral captions are optional, with the default being gendered captions. For example, in English, when you select gender neutral captions, terms like woman or man are replaced with person, and boy or girl are replaced with child.

Specify languages

You can specify the language of the returned data. The language is optional, with the default being English. See Language support for a list of supported language codes and which visual features are supported for each language.

Get results from the service

The following code shows you how to parse the results from the analyze_from_url or analyze operations.

# Print all analysis results to the console
print("Image analysis results:")

if result.caption is not None:
    print(" Caption:")
    print(f"   '{result.caption.text}', Confidence {result.caption.confidence:.4f}")

if result.dense_captions is not None:
    print(" Dense Captions:")
    for caption in result.dense_captions.list:
        print(f"   '{caption.text}', {caption.bounding_box}, Confidence: {caption.confidence:.4f}")

if result.read is not None:
    print(" Read:")
    for line in result.read.blocks[0].lines:
        print(f"   Line: '{line.text}', Bounding polygon {line.bounding_polygon}")
        for word in line.words:
            print(f"     Word: '{word.text}', Bounding polygon {word.bounding_polygon}, Confidence {word.confidence:.4f}")

if result.tags is not None:
    print(" Tags:")
    for tag in result.tags.list:
        print(f"   '{tag.name}', Confidence {tag.confidence:.4f}")

if result.objects is not None:
    print(" Objects:")
    for detected_object in result.objects.list:
        print(f"   '{detected_object.tags[0].name}', {detected_object.bounding_box}, Confidence: {detected_object.tags[0].confidence:.4f}")

if result.people is not None:
    print(" People:")
    for person in result.people.list:
        print(f"   {person.bounding_box}, Confidence {person.confidence:.4f}")

if result.smart_crops is not None:
    print(" Smart Cropping:")
    for smart_crop in result.smart_crops.list:
        print(f"   Aspect ratio {smart_crop.aspect_ratio}: Smart crop {smart_crop.bounding_box}")

print(f" Image height: {result.metadata.height}")
print(f" Image width: {result.metadata.width}")
print(f" Model version: {result.model_version}")
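
The Read loop above looks only at the first block (blocks[0]), which raises an IndexError if the block list is empty. One possible variation walks every block instead. The sketch below uses SimpleNamespace stand-ins in place of a real result object. The attribute names mirror the ones used above, the helper name collect_lines is hypothetical, and the sample text is invented.

```python
from types import SimpleNamespace


def collect_lines(read_result):
    """Return the text of every line in every block of a Read result."""
    if read_result is None:
        return []
    return [line.text for block in read_result.blocks for line in block.lines]


# Stand-in shaped like `result.read`; the text is made-up sample data.
fake_read = SimpleNamespace(blocks=[
    SimpleNamespace(lines=[SimpleNamespace(text="Hello"), SimpleNamespace(text="world")]),
])

print(collect_lines(fake_read))
```

With a real response, you would pass result.read to the helper instead of the stand-in.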

Troubleshooting

Exceptions

The analyze methods raise an HttpResponseError exception for a non-success HTTP status code response from the service. The exception's status_code is the HTTP response status code. The exception's error.message contains a detailed message that allows you to diagnose the issue:

try:
    result = client.analyze( ... )
except HttpResponseError as e:
    print(f"Status code: {e.status_code}")
    print(f"Reason: {e.reason}")
    print(f"Message: {e.error.message}")

For example, when you provide a wrong authentication key:

Status code: 401
Reason: PermissionDenied
Message: Access denied due to invalid subscription key or wrong API endpoint. Make sure to provide a valid key for an active subscription and use a correct regional API endpoint for your resource.

Or when you provide an image URL that doesn't exist or is not accessible:

Status code: 400
Reason: Bad Request
Message: The provided image url is not accessible.
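
If you want friendlier diagnostics, one option is to map the status codes shown above to short hints. This is only a sketch. The helper name and the hint wording are assumptions based on the two example responses, not SDK behavior.

```python
# Hints drawn from the two example responses shown above; extend as needed.
HINTS = {
    401: "Check the key and that the endpoint matches your resource's region.",
    400: "Check that the image URL is reachable and the image format is supported.",
}


def describe_failure(status_code, message):
    """Build a one-line diagnostic from an HTTP status code and service message."""
    hint = HINTS.get(status_code, "See the service message for details.")
    return f"HTTP {status_code}: {message} ({hint})"
```

Inside the except block above, you could call describe_failure(e.status_code, e.error.message) and print or log the result.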

Logging

The client uses the standard Python logging library. The SDK logs HTTP request and response details, which may be useful in troubleshooting. To log to stdout, add the following:

import sys
import logging

# Acquire the logger for this client library. Use 'azure' to affect both
# the 'azure.core' and 'azure.ai.vision.imageanalysis' libraries.
logger = logging.getLogger("azure")

# Set the desired logging level. logging.INFO or logging.DEBUG are good options.
logger.setLevel(logging.INFO)

# Direct logging output to stdout (the default):
handler = logging.StreamHandler(stream=sys.stdout)
# Or direct logging output to a file:
# handler = logging.FileHandler(filename = 'sample.log')
logger.addHandler(handler)

# Optional: change the default logging format. Here we add a timestamp.
formatter = logging.Formatter("%(asctime)s:%(levelname)s:%(name)s:%(message)s")
handler.setFormatter(formatter)

By default, logs redact the values of URL query strings, the values of some HTTP request and response headers (including Ocp-Apim-Subscription-Key, which holds the key), and the request and response payloads. To create logs without redaction, set the method argument logging_enable=True when you create ImageAnalysisClient, or when you call analyze on the client.

# Create an Image Analysis client with non-redacted logging
client = ImageAnalysisClient(
    endpoint=endpoint,
    credential=AzureKeyCredential(key),
    logging_enable=True
)

Non-redacted logs are generated for log level logging.DEBUG only. Be sure to protect non-redacted logs to avoid compromising security. For more information, see Configure logging in the Azure libraries for Python.
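
To see why configuring the parent 'azure' logger is enough, and why the chosen level matters, here is a small standard-library sketch. It doesn't call the SDK. The log messages are invented placeholders, and output goes to an in-memory buffer so the effect can be inspected.

```python
import io
import logging

# Attach a handler to the parent "azure" logger, as in the snippet above,
# but write to an in-memory buffer instead of stdout.
buffer = io.StringIO()
handler = logging.StreamHandler(stream=buffer)
handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))

parent = logging.getLogger("azure")
parent.setLevel(logging.INFO)
parent.addHandler(handler)

# Child loggers inherit the parent's level and propagate records to its handler.
child = logging.getLogger("azure.ai.vision.imageanalysis")
child.info("request sent")      # emitted: INFO meets the INFO threshold
child.debug("request payload")  # dropped: DEBUG is below the INFO threshold

print(buffer.getvalue())
```

Only the INFO record reaches the handler. Messages logged at DEBUG, which is the level the non-redacted logs use, appear only after you lower the logger level to logging.DEBUG.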

Prerequisites

This guide assumes you've followed the steps in the quickstart page. This means:

  • You have created a Computer Vision resource and obtained a key and endpoint URL.
  • You have the appropriate SDK package installed and you have a running quickstart application. You can modify this quickstart application based on code examples here.

Create and authenticate the client

To authenticate with the Image Analysis service, you need a Computer Vision key and endpoint URL. This guide assumes that you've defined the environment variables VISION_KEY and VISION_ENDPOINT with your key and endpoint.

Important

If you use an API key, store it securely somewhere else, such as in Azure Key Vault. Don't include the API key directly in your code, and never post it publicly.

For more information about AI services security, see Authenticate requests to Azure AI services.

Start by creating an ImageAnalysisClient object. For example:

String endpoint = System.getenv("VISION_ENDPOINT");
String key = System.getenv("VISION_KEY");

if (endpoint == null || key == null) {
    System.out.println("Missing environment variable 'VISION_ENDPOINT' or 'VISION_KEY'.");
    System.out.println("Set them before running this sample.");
    System.exit(1);
}

// Create a synchronous Image Analysis client.
ImageAnalysisClient client = new ImageAnalysisClientBuilder()
    .endpoint(endpoint)
    .credential(new KeyCredential(key))
    .buildClient();

Select the image to analyze

You can select an image by providing a publicly accessible image URL, or by reading image data into the SDK's input buffer. See Image requirements for supported image formats.

Image URL

Create an imageUrl string to hold the publicly accessible URL of the image you want to analyze.

String imageUrl = "https://zcusa.951200.xyz/azure/ai-services/computer-vision/media/quickstarts/presentation.png";

Image buffer

Alternatively, you can pass in the image as a memory buffer using a BinaryData object. For example, read from a local image file you want to analyze.

BinaryData imageData = BinaryData.fromFile(new File("sample.png").toPath());

Select visual features

The Analysis 4.0 API gives you access to all of the service's image analysis features. Choose which operations to do based on your own use case. See the overview for a description of each feature. The example in this section adds all of the available visual features, but for practical usage you likely need fewer.

Important

The visual features Captions and DenseCaptions are only supported in certain Azure regions. See Region availability.

// visualFeatures: Select one or more visual features to analyze.
List<VisualFeatures> visualFeatures = Arrays.asList(
            VisualFeatures.SMART_CROPS,
            VisualFeatures.CAPTION,
            VisualFeatures.DENSE_CAPTIONS,
            VisualFeatures.OBJECTS,
            VisualFeatures.PEOPLE,
            VisualFeatures.READ,
            VisualFeatures.TAGS);

Select analysis options

Use an ImageAnalysisOptions object to specify various options for the Analyze API call.

  • Language: You can specify the language of the returned data. The language is optional, with the default being English. See Language support for a list of supported language codes and which visual features are supported for each language.
  • Gender neutral captions: If you're extracting captions or dense captions (using VisualFeatures.CAPTION or VisualFeatures.DENSE_CAPTIONS), you can ask for gender neutral captions. Gender neutral captions are optional, with the default being gendered captions. For example, in English, when you select gender neutral captions, terms like woman or man are replaced with person, and boy or girl are replaced with child.
  • Crop aspect ratio: An aspect ratio is calculated by dividing the target crop width by the height. Supported values are from 0.75 to 1.8 (inclusive). Setting this property is only relevant when VisualFeatures.SMART_CROPS was selected as part of the visual feature list. If you select VisualFeatures.SMART_CROPS but don't specify aspect ratios, the service returns one crop suggestion with an aspect ratio it sees fit. In this case, the aspect ratio is between 0.5 and 2.0 (inclusive).

// Specify analysis options (or set `options` to null for defaults)
ImageAnalysisOptions options = new ImageAnalysisOptions()
    .setLanguage("en")
    .setGenderNeutralCaption(true)
    .setSmartCropsAspectRatios(Arrays.asList(0.9, 1.33, 1.78));

Call the analyzeFromUrl method

This section shows you how to make an analysis call to the service.

Call the analyzeFromUrl method on the ImageAnalysisClient object, as shown here. The call is synchronous and blocks until the service returns the results or an error occurs. Alternatively, you can use an ImageAnalysisAsyncClient object instead, and call its analyzeFromUrl method, which is non-blocking.

To analyze from an image buffer instead of a URL, call the analyze method instead, and pass in imageData as the first argument.

try {
    // Analyze all visual features from an image URL. This is a synchronous (blocking) call.
    ImageAnalysisResult result = client.analyzeFromUrl(
        imageUrl,
        visualFeatures,
        options);

    printAnalysisResults(result);

} catch (HttpResponseException e) {
    System.out.println("Exception: " + e.getClass().getSimpleName());
    System.out.println("Status code: " + e.getResponse().getStatusCode());
    System.out.println("Message: " + e.getMessage());
} catch (Exception e) {
    System.out.println("Message: " + e.getMessage());
}

Get results from the service

The following code shows you how to parse the results from the analyzeFromUrl and analyze operations.

// Print all analysis results to the console
public static void printAnalysisResults(ImageAnalysisResult result) {

    System.out.println("Image analysis results:");

    if (result.getCaption() != null) {
        System.out.println(" Caption:");
        System.out.println("   \"" + result.getCaption().getText() + "\", Confidence "
            + String.format("%.4f", result.getCaption().getConfidence()));
    }

    if (result.getDenseCaptions() != null) {
        System.out.println(" Dense Captions:");
        for (DenseCaption denseCaption : result.getDenseCaptions().getValues()) {
            System.out.println("   \"" + denseCaption.getText() + "\", Bounding box "
                + denseCaption.getBoundingBox() + ", Confidence " + String.format("%.4f", denseCaption.getConfidence()));
        }
    }

    if (result.getRead() != null) {
        System.out.println(" Read:");
        for (DetectedTextLine line : result.getRead().getBlocks().get(0).getLines()) {
            System.out.println("   Line: '" + line.getText()
                + "', Bounding polygon " + line.getBoundingPolygon());
            for (DetectedTextWord word : line.getWords()) {
                System.out.println("     Word: '" + word.getText()
                    + "', Bounding polygon " + word.getBoundingPolygon()
                    + ", Confidence " + String.format("%.4f", word.getConfidence()));
            }
        }
    }

    if (result.getTags() != null) {
        System.out.println(" Tags:");
        for (DetectedTag tag : result.getTags().getValues()) {
            System.out.println("   \"" + tag.getName() + "\", Confidence " + String.format("%.4f", tag.getConfidence()));
        }
    }

    if (result.getObjects() != null) {
        System.out.println(" Objects:");
        for (DetectedObject detectedObject : result.getObjects().getValues()) {
            System.out.println("   \"" + detectedObject.getTags().get(0).getName() + "\", Bounding box "
                + detectedObject.getBoundingBox() + ", Confidence " + String.format("%.4f", detectedObject.getTags().get(0).getConfidence()));
        }
    }

    if (result.getPeople() != null) {
        System.out.println(" People:");
        for (DetectedPerson person : result.getPeople().getValues()) {
            System.out.println("   Bounding box "
                + person.getBoundingBox() + ", Confidence " + String.format("%.4f", person.getConfidence()));
        }
    }

    if (result.getSmartCrops() != null) {
        System.out.println(" Crop Suggestions:");
        for (CropRegion cropRegion : result.getSmartCrops().getValues()) {
            System.out.println("   Aspect ratio "
                + cropRegion.getAspectRatio() + ": Bounding box " + cropRegion.getBoundingBox());
        }
    }

    System.out.println(" Image height = " + result.getMetadata().getHeight());
    System.out.println(" Image width = " + result.getMetadata().getWidth());
    System.out.println(" Model version = " + result.getModelVersion());
}

Troubleshooting

Exceptions

The analyze methods throw HttpResponseException when the service responds with a non-success HTTP status code. The exception's getResponse().getStatusCode() holds the HTTP response status code. The exception's getMessage() contains a detailed message that allows you to diagnose the issue:

try {
    ImageAnalysisResult result = client.analyze(...);
} catch (HttpResponseException e) {
    System.out.println("Exception: " + e.getClass().getSimpleName());
    System.out.println("Status code: " + e.getResponse().getStatusCode());
    System.out.println("Message: " + e.getMessage());
} catch (Exception e) {
    System.out.println("Message: " + e.getMessage());
}

For example, when you provide a wrong authentication key:

Exception: ClientAuthenticationException
Status code: 401
Message: Status code 401, "{"error":{"code":"401","message":"Access denied due to invalid subscription key or wrong API endpoint. Make sure to provide a valid key for an active subscription and use a correct regional API endpoint for your resource."}}"

Or when you provide an image in a format that isn't recognized:

Exception: HttpResponseException
Status code: 400
Message: Status code 400, "{"error":{"code":"InvalidRequest","message":"Image format is not valid.","innererror":{"code":"InvalidImageFormat","message":"Input data is not a valid image."}}}"

Enable HTTP request/response logging

Reviewing the HTTP request sent or response received over the wire to the Image Analysis service can be useful in troubleshooting. The Image Analysis client library supports a built-in console logging framework for temporary debugging purposes. It also supports more advanced logging using the SLF4J interface. For detailed information, see Use logging in the Azure SDK for Java.

The sections below discuss enabling console logging using the built-in framework.

By setting environment variables

You can enable console logging of HTTP request and response for your entire application by setting the following two environment variables. This change affects every Azure client that supports logging HTTP request and response.

  • Set environment variable AZURE_LOG_LEVEL to debug
  • Set environment variable AZURE_HTTP_LOG_DETAIL_LEVEL to one of the following values:

Value             Logging level
none              HTTP request/response logging is disabled
basic             Logs only URLs, HTTP methods, and time to finish the request.
headers           Logs everything in BASIC, plus all the request and response headers.
body              Logs everything in BASIC, plus all the request and response body.
body_and_headers  Logs everything in HEADERS and BODY.

By setting httpLogOptions

To enable console logging of HTTP request and response for a single client:

  • Set environment variable AZURE_LOG_LEVEL to debug
  • Add a call to httpLogOptions when building the ImageAnalysisClient:

ImageAnalysisClient client = new ImageAnalysisClientBuilder()
    .endpoint(endpoint)
    .credential(new KeyCredential(key))
    .httpLogOptions(new HttpLogOptions().setLogLevel(HttpLogDetailLevel.BODY_AND_HEADERS))
    .buildClient();

The enum HttpLogDetailLevel defines the supported logging levels.

By default, when logging, certain HTTP header and query parameter values are redacted. It's possible to override this default by specifying which headers and query parameters are safe to log:

ImageAnalysisClient client = new ImageAnalysisClientBuilder()
    .endpoint(endpoint)
    .credential(new KeyCredential(key))
    .httpLogOptions(new HttpLogOptions().setLogLevel(HttpLogDetailLevel.BODY_AND_HEADERS)
        .addAllowedHeaderName("safe-to-log-header-name")
        .addAllowedQueryParamName("safe-to-log-query-parameter-name"))
    .buildClient();

For example, to get a complete un-redacted log of the HTTP request, apply the following:

    .httpLogOptions(new HttpLogOptions().setLogLevel(HttpLogDetailLevel.BODY_AND_HEADERS)
        .addAllowedHeaderName("Ocp-Apim-Subscription-Key")
        .addAllowedQueryParamName("features")
        .addAllowedQueryParamName("language")
        .addAllowedQueryParamName("gender-neutral-caption")
        .addAllowedQueryParamName("smartcrops-aspect-ratios")
        .addAllowedQueryParamName("model-version"))

Add more to the above to get an un-redacted HTTP response. When you share an un-redacted log, make sure it doesn't contain secrets such as your subscription key.

Prerequisites

This guide assumes you have followed the steps mentioned in the quickstart. This means:

  • You have created a Computer Vision resource and obtained a key and endpoint URL.
  • You have the appropriate SDK package installed and you have a running quickstart application. You can modify this quickstart application based on the code examples here.

Create and authenticate the client

To authenticate against the Image Analysis service, you need a Computer Vision key and endpoint URL. This guide assumes that you've defined the environment variables VISION_KEY and VISION_ENDPOINT with your key and endpoint.

Important

If you use an API key, store it securely somewhere else, such as in Azure Key Vault. Don't include the API key directly in your code, and never post it publicly.

For more information about AI services security, see Authenticate requests to Azure AI services.

Start by creating an ImageAnalysisClient object. For example:

// Load the .env file if it exists
require("dotenv").config();

const endpoint = process.env['VISION_ENDPOINT'] || '<your_endpoint>';
const key = process.env['VISION_KEY'] || '<your_key>';

const credential = new AzureKeyCredential(key);
const client = createClient(endpoint, credential);

Select the image to analyze

You can select an image by providing a publicly accessible image URL, or by reading image data into the SDK's input buffer. See Image requirements for supported image formats.

Image URL

You can use the following sample image URL.

const imageUrl = 'https://zcusa.951200.xyz/azure/ai-services/computer-vision/media/quickstarts/presentation.png';

Image buffer

Alternatively, you can pass in the image as a data array. For example, read from a local image file you want to analyze.

const imagePath = '../sample.jpg';
const imageData = fs.readFileSync(imagePath);

Select visual features

The Analysis 4.0 API gives you access to all of the service's image analysis features. Choose which operations to do based on your own use case. See the Overview for a description of each feature. The example in this section adds all of the available visual features, but for practical usage you likely need fewer.

Important

The visual features Caption and DenseCaptions are only supported in certain Azure regions. See Region availability.

const features = [
  'Caption',
  'DenseCaptions',
  'Objects',
  'People',
  'Read',
  'SmartCrops',
  'Tags'
];

Call the Analyze API with options

The following code calls the Analyze Image API with the features you selected above and other options, defined next. To analyze from an image buffer instead of a URL, set body to imageData and contentType to 'application/octet-stream' in the method call.

const result = await client.path('/imageanalysis:analyze').post({
  body: {
      url: imageUrl
  },
  queryParameters: {
      features: features,
      'language': 'en',
      'gender-neutral-captions': 'true',
      'smartCrops-aspect-ratios': [0.9, 1.33]
  },
  contentType: 'application/json'
});
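Switching between a URL and a buffer only changes the body and contentType of the request. The following is a minimal sketch of one way to build the options object for either input. buildAnalyzeOptions is a hypothetical helper, not part of the SDK, and it assumes that binary data is sent with the application/octet-stream content type.

```javascript
// Hypothetical helper (not part of the SDK): builds the options object passed to
// client.path('/imageanalysis:analyze').post(...) for either kind of image input.
function buildAnalyzeOptions(image, features) {
  // Node.js Buffers are Uint8Array instances, so this covers fs.readFileSync output.
  const isBinary = image instanceof Uint8Array;
  return {
    // A URL is wrapped in a JSON body; binary data is sent as the body itself.
    body: isBinary ? image : { url: image },
    queryParameters: { features: features },
    // Assumption: binary input uses the application/octet-stream content type.
    contentType: isBinary ? 'application/octet-stream' : 'application/json'
  };
}

// The example URL and bytes below are placeholders.
const fromUrl = buildAnalyzeOptions('https://example.com/sample.jpg', ['Caption', 'Read']);
console.log(fromUrl.contentType); // application/json

const fromBuffer = buildAnalyzeOptions(Buffer.from([0xff, 0xd8, 0xff]), ['Caption']);
console.log(fromBuffer.contentType); // application/octet-stream
```

With a helper like this, the call becomes client.path('/imageanalysis:analyze').post(buildAnalyzeOptions(imageData, features)).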

Select smart cropping aspect ratios

An aspect ratio is calculated by dividing the target crop width by the height. Supported values are from 0.75 to 1.8 (inclusive). Setting this property is only relevant when VisualFeatures.SmartCrops was selected as part of the visual feature list. If you select VisualFeatures.SmartCrops but don't specify aspect ratios, the service returns one crop suggestion with an aspect ratio it sees fit. In this case, the aspect ratio is between 0.5 and 2.0 (inclusive).
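As an illustration of the width-by-height calculation and the supported range, the following sketch converts a target crop size into an aspect ratio and rejects values outside 0.75 to 1.8. toAspectRatio is a hypothetical helper, and rounding to two decimals is a choice made here for readability, not a service requirement.

```javascript
// Supported range for requested smart-crop aspect ratios (inclusive), as stated above.
const MIN_ASPECT_RATIO = 0.75;
const MAX_ASPECT_RATIO = 1.8;

// Hypothetical helper: converts a target crop size to an aspect ratio (width / height)
// and throws if the ratio is outside the supported range.
function toAspectRatio(width, height) {
  if (!(width > 0) || !(height > 0)) {
    throw new RangeError('width and height must be positive numbers');
  }
  const ratio = width / height;
  if (ratio < MIN_ASPECT_RATIO || ratio > MAX_ASPECT_RATIO) {
    throw new RangeError(
      `Aspect ratio ${ratio.toFixed(2)} is outside [${MIN_ASPECT_RATIO}, ${MAX_ASPECT_RATIO}]`
    );
  }
  // Round to two decimals (a readability choice, not a service requirement).
  return Math.round(ratio * 100) / 100;
}

console.log(toAspectRatio(1600, 1200)); // 1.33
console.log(toAspectRatio(900, 1000));  // 0.9
```

Checking the range on the client side surfaces a mistake before the request is sent, instead of as a 400 response from the service.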

Select gender neutral captions

If you're extracting captions or dense captions (using VisualFeatures.Caption or VisualFeatures.DenseCaptions), you can ask for gender neutral captions. Gender neutral captions are optional, with the default being gendered captions. For example, in English, when you select gender neutral captions, terms like woman or man are replaced with person, and boy or girl are replaced with child.

Specify languages

You can specify the language of the returned data. The language is optional, with the default being English. See Language support for a list of supported language codes and which visual features are supported for each language.

Get results from the service

The following code shows you how to parse the results of the various analyze operations.

const iaResult = result.body;

console.log(`Model Version: ${iaResult.modelVersion}`);
console.log(`Image Metadata: ${JSON.stringify(iaResult.metadata)}`);
if (iaResult.captionResult) {
  console.log(`Caption: ${iaResult.captionResult.text} (confidence: ${iaResult.captionResult.confidence})`);
}
if (iaResult.denseCaptionsResult) {
  iaResult.denseCaptionsResult.values.forEach(denseCaption => console.log(`Dense Caption: ${JSON.stringify(denseCaption)}`));
}
if (iaResult.objectsResult) {
  iaResult.objectsResult.values.forEach(object => console.log(`Object: ${JSON.stringify(object)}`));
}
if (iaResult.peopleResult) {
  iaResult.peopleResult.values.forEach(person => console.log(`Person: ${JSON.stringify(person)}`));
}
if (iaResult.readResult) {
  iaResult.readResult.blocks.forEach(block => console.log(`Text Block: ${JSON.stringify(block)}`));
}
if (iaResult.smartCropsResult) {
  iaResult.smartCropsResult.values.forEach(smartCrop => console.log(`Smart Crop: ${JSON.stringify(smartCrop)}`));
}
if (iaResult.tagsResult) {
  iaResult.tagsResult.values.forEach(tag => console.log(`Tag: ${JSON.stringify(tag)}`));
}
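Instead of printing raw JSON, you may want specific values from the result. The following sketch pulls the recognized text lines out of readResult and keeps only the tags above a confidence threshold. getTextLines and getConfidentTags are hypothetical helpers, the sample object is made-up data, and the 0.8 threshold is an arbitrary value for illustration.

```javascript
// Collects every line of recognized text from a Read result, in the order returned.
function getTextLines(iaResult) {
  const blocks = iaResult.readResult ? iaResult.readResult.blocks : [];
  return blocks.flatMap(block => block.lines.map(line => line.text));
}

// Returns the names of tags whose confidence is at least minConfidence.
function getConfidentTags(iaResult, minConfidence) {
  const tags = iaResult.tagsResult ? iaResult.tagsResult.values : [];
  return tags.filter(tag => tag.confidence >= minConfidence).map(tag => tag.name);
}

// Made-up sample data that mimics the shape of an analysis result.
const sample = {
  readResult: {
    blocks: [{ lines: [{ text: 'Hello', words: [] }, { text: 'World', words: [] }] }]
  },
  tagsResult: {
    values: [
      { name: 'outdoor', confidence: 0.95 },
      { name: 'grass', confidence: 0.42 }
    ]
  }
};

console.log(getTextLines(sample).join(' '));         // Hello World
console.log(getConfidentTags(sample, 0.8).join(',')); // outdoor
```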

Troubleshooting

Logging

Enabling logging may help uncover useful information about failures. To see a log of HTTP requests and responses, set the AZURE_LOG_LEVEL environment variable to info. Alternatively, you can enable logging at runtime by calling setLogLevel in the @azure/logger package:

const { setLogLevel } = require("@azure/logger");

setLogLevel("info");

For more detailed instructions on how to enable logs, you can look at the @azure/logger package docs.

Prerequisites

This guide assumes you have successfully followed the steps mentioned in the quickstart page. This means:

  • You have created a Computer Vision resource and obtained a key and endpoint URL.
  • You have successfully made a curl.exe call to the service (or used an alternative tool). You can modify the curl.exe call based on the examples here.

Authenticate against the service

To authenticate against the Image Analysis service, you need a Computer Vision key and endpoint URL.

Important

If you use an API key, store it securely somewhere else, such as in Azure Key Vault. Don't include the API key directly in your code, and never post it publicly.

For more information about AI services security, see Authenticate requests to Azure AI services.

The SDK example assumes that you defined the environment variables VISION_KEY and VISION_ENDPOINT with your key and endpoint.

Authentication is done by adding the HTTP request header Ocp-Apim-Subscription-Key and setting it to your vision key. The call is made to the URL <endpoint>/computervision/imageanalysis:analyze?api-version=2024-02-01, where <endpoint> is your unique computer vision endpoint URL. You add query strings based on your analysis options.

Select the image to analyze

The code in this guide uses remote images referenced by URL. You might want to try different images on your own to see the full capability of the Image Analysis features.

Image URL

When analyzing a remote image, you specify the image's URL by formatting the request body like this: {"url":"https://zcusa.951200.xyz/azure/cognitive-services/computer-vision/images/windows-kitchen.jpg"}. The Content-Type should be application/json.

Image file

To analyze a local image, you'd put the binary image data in the HTTP request body. The Content-Type should be application/octet-stream or multipart/form-data.
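The header and body rules above can be sketched in JavaScript as follows. This is a hedged illustration rather than an official sample: buildRestRequest is a hypothetical helper, the contoso endpoint and example.com image URL are placeholders, the POST method is assumed, and sending the request relies on the built-in fetch in Node.js 18 or later.

```javascript
// Hypothetical helper: builds the URL and fetch options for an Analyze call.
// The image is either a URL string (sent as JSON) or binary data (sent as
// application/octet-stream). The query argument holds the analysis options.
function buildRestRequest(endpoint, key, image, query) {
  const isBinary = image instanceof Uint8Array;
  const base = endpoint.replace(/\/+$/, ''); // drop trailing slashes
  return {
    url: `${base}/computervision/imageanalysis:analyze?api-version=2024-02-01&${query}`,
    init: {
      method: 'POST', // assumption: the Analyze operation is called with POST
      headers: {
        'Ocp-Apim-Subscription-Key': key,
        'Content-Type': isBinary ? 'application/octet-stream' : 'application/json'
      },
      body: isBinary ? image : JSON.stringify({ url: image })
    }
  };
}

// Placeholder endpoint, key, and image URL for illustration only.
const request = buildRestRequest(
  'https://contoso.cognitiveservices.azure.com/',
  '<your_key>',
  'https://example.com/sample.jpg',
  'features=caption'
);
console.log(request.url);

// To send the request (Node.js 18 or later):
// const response = await fetch(request.url, request.init);
// console.log(await response.json());
```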

Select analysis options

Select visual features when using the standard model

The Analysis 4.0 API gives you access to all of the service's image analysis features. Choose which operations to do based on your own use case. See the overview for a description of each feature. The example in this section adds all of the available visual features, but for practical usage you likely need fewer.

The visual features 'Caption' and 'DenseCaptions' are only supported in certain Azure regions. See Region availability.

Note

The REST API uses the terms Smart Crops and Smart Crops Aspect Ratios. The SDK uses the terms Crop Suggestions and Cropping Aspect Ratios. They both refer to the same service operation. Similarly, the REST API uses the term Read for detecting text in the image using Optical Character Recognition (OCR), whereas the SDK uses the term Text for the same operation.

You can specify which features you want to use by setting the URL query parameters of the Analysis 4.0 API. A parameter can have multiple values, separated by commas.

URL parameter    Value            Description
features         read             Reads the visible text in the image and outputs it as structured JSON data.
features         caption          Describes the image content with a complete sentence in supported languages.
features         denseCaptions    Generates detailed captions for up to 10 prominent image regions.
features         smartCrops       Finds the rectangle coordinates that would crop the image to a desired aspect ratio while preserving the area of interest.
features         objects          Detects various objects within an image, including the approximate location. The Objects argument is only available in English.
features         tags             Tags the image with a detailed list of words related to the image content.
features         people           Detects people appearing in images, including the approximate locations.

A populated URL might look like this:

<endpoint>/computervision/imageanalysis:analyze?api-version=2024-02-01&features=tags,read,caption,denseCaptions,smartCrops,objects,people

Set model name when using a custom model

You can also do image analysis with a custom trained model. To create and train a model, see Create a custom Image Analysis model. Once your model is trained, all you need is the model's name. You do not need to specify visual features if you use a custom model.

To use a custom model, don't use the features query parameter. Instead, set the model-name parameter to the name of your model as shown here. Replace MyCustomModelName with your custom model name.

<endpoint>/computervision/imageanalysis:analyze?api-version=2023-02-01&model-name=MyCustomModelName

Specify languages

You can specify the language of the returned data. The language is optional, with the default being English. See Language support for a list of supported language codes and which visual features are supported for each language.

The language option only applies when you're using the standard model.

The following URL query parameter specifies the language. The default value is en.

URL parameter    Value    Description
language         en       English
language         es       Spanish
language         ja       Japanese
language         pt       Portuguese
language         zh       Simplified Chinese

A populated URL might look like this:

<endpoint>/computervision/imageanalysis:analyze?api-version=2024-02-01&features=caption&language=en

Select gender neutral captions

If you're extracting captions or dense captions, you can ask for gender neutral captions. Gender neutral captions are optional, with the default being gendered captions. For example, in English, when you select gender neutral captions, terms like woman or man are replaced with person, and boy or girl are replaced with child.

The gender neutral caption option only applies when you're using the standard model.

Add the optional query string gender-neutral-caption with values true or false (the default).

A populated URL might look like this:

<endpoint>/computervision/imageanalysis:analyze?api-version=2024-02-01&features=caption&gender-neutral-caption=true

Select smart cropping aspect ratios

An aspect ratio is calculated by dividing the target crop width by the height. Supported values are from 0.75 to 1.8 (inclusive). Setting this property is only relevant when VisualFeatures.SmartCrops was selected as part of the visual feature list. If you select VisualFeatures.SmartCrops but don't specify aspect ratios, the service returns one crop suggestion with an aspect ratio it sees fit. In this case, the aspect ratio is between 0.5 and 2.0 (inclusive).

Smart cropping aspect ratios only apply when you're using the standard model.

Add the optional query string smartcrops-aspect-ratios, with one or more aspect ratios separated by a comma.

A populated URL might look like this:

<endpoint>/computervision/imageanalysis:analyze?api-version=2024-02-01&features=smartCrops&smartcrops-aspect-ratios=0.8,1.2
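The query parameters described in this section can be combined in a single request. The following sketch assembles them into one query string. buildAnalyzeQuery and its option names are hypothetical, and the helper assumes the values contain only letters, digits, dots, and commas, so it doesn't percent-encode them.

```javascript
// Hypothetical helper: builds the query string for an Analyze call from the
// options described in this section. Only options that are set are included.
// Values are assumed to be plain tokens, so they are not percent-encoded.
function buildAnalyzeQuery({ features, language, genderNeutralCaption, smartCropsAspectRatios }) {
  const params = [
    ['api-version', '2024-02-01'],
    ['features', features.join(',')]
  ];
  if (language !== undefined) {
    params.push(['language', language]);
  }
  if (genderNeutralCaption !== undefined) {
    params.push(['gender-neutral-caption', String(genderNeutralCaption)]);
  }
  if (smartCropsAspectRatios !== undefined) {
    params.push(['smartcrops-aspect-ratios', smartCropsAspectRatios.join(',')]);
  }
  return params.map(([name, value]) => `${name}=${value}`).join('&');
}

const query = buildAnalyzeQuery({
  features: ['caption', 'smartCrops'],
  language: 'en',
  genderNeutralCaption: true,
  smartCropsAspectRatios: [0.8, 1.2]
});
console.log(query);
// api-version=2024-02-01&features=caption,smartCrops&language=en&gender-neutral-caption=true&smartcrops-aspect-ratios=0.8,1.2
```

Append the result to <endpoint>/computervision/imageanalysis:analyze? to form the full request URL.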

Get results from the service

Get results using the standard model

This section shows you how to make an analysis call to the service using the standard model, and get the results.

The service returns a 200 HTTP response, and the body contains the returned data in the form of a JSON string. The following text is an example of a JSON response.

{
    "modelVersion": "string",
    "captionResult": {
      "text": "string",
      "confidence": 0.0
    },
    "denseCaptionsResult": {
      "values": [
        {
          "text": "string",
          "confidence": 0.0,
          "boundingBox": {
            "x": 0,
            "y": 0,
            "w": 0,
            "h": 0
          }
        }
      ]
    },
    "metadata": {
      "width": 0,
      "height": 0
    },
    "tagsResult": {
      "values": [
        {
          "name": "string",
          "confidence": 0.0
        }
      ]
    },
    "objectsResult": {
      "values": [
        {
          "id": "string",
          "boundingBox": {
            "x": 0,
            "y": 0,
            "w": 0,
            "h": 0
          },
          "tags": [
            {
              "name": "string",
              "confidence": 0.0
            }
          ]
        }
      ]
    },
    "readResult": {
      "blocks": [
        {
          "lines": [
            {
              "text": "string",
              "boundingPolygon": [
                {
                  "x": 0,
                  "y": 0
                },
                {
                    "x": 0,
                    "y": 0
                },
                {
                    "x": 0,
                    "y": 0
                },
                {
                    "x": 0,
                    "y": 0
                }
              ],
              "words": [
                {
                  "text": "string",
                  "boundingPolygon": [
                    {
                        "x": 0,
                        "y": 0
                    },
                    {
                        "x": 0,
                        "y": 0
                    },
                    {
                        "x": 0,
                        "y": 0
                    },
                    {
                        "x": 0,
                        "y": 0
                    }
                  ],
                  "confidence": 0.0
                }
              ]
            }
          ]
        }
      ]
    },
    "smartCropsResult": {
      "values": [
        {
          "aspectRatio": 0.0,
          "boundingBox": {
            "x": 0,
            "y": 0,
            "w": 0,
            "h": 0
          }
        }
      ]
    },
    "peopleResult": {
      "values": [
        {
          "boundingBox": {
            "x": 0,
            "y": 0,
            "w": 0,
            "h": 0
          },
          "confidence": 0.0
        }
      ]
    }
  }
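Once the JSON body is parsed, the nested result objects can be flattened into whatever shape your application needs. As a sketch, the following code reduces each entry in objectsResult to its highest-confidence tag and the corners of its bounding box. summarizeObjects is a hypothetical helper, the sample response is made-up data, and the code assumes that x and y mark the top-left corner of the box and that w and h are its width and height.

```javascript
// Hypothetical helper: summarizes each detected object as its highest-confidence
// tag plus the corner coordinates of its bounding box.
// Assumption: x/y are the top-left corner, w/h are the width and height.
function summarizeObjects(response) {
  const objects = response.objectsResult ? response.objectsResult.values : [];
  return objects.map(({ boundingBox: { x, y, w, h }, tags }) => {
    // Fall back to a placeholder if an object has no tags.
    const best = tags.reduce(
      (top, tag) => (tag.confidence > top.confidence ? tag : top),
      { name: 'unknown', confidence: 0 }
    );
    return {
      name: best.name,
      confidence: best.confidence,
      left: x,
      top: y,
      right: x + w,
      bottom: y + h
    };
  });
}

// Made-up response fragment in the shape shown above.
const sampleResponse = {
  objectsResult: {
    values: [
      {
        boundingBox: { x: 10, y: 20, w: 100, h: 50 },
        tags: [
          { name: 'chair', confidence: 0.6 },
          { name: 'furniture', confidence: 0.9 }
        ]
      }
    ]
  }
};

console.log(JSON.stringify(summarizeObjects(sampleResponse)));
// [{"name":"furniture","confidence":0.9,"left":10,"top":20,"right":110,"bottom":70}]
```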

Error codes

On error, the Image Analysis service response contains a JSON payload that includes an error code and error message. It may also include other details in the form of an inner error code and message. For example:

{
    "error":
    {
        "code": "InvalidRequest",
        "message": "Analyze query is invalid.",
        "innererror":
        {
            "code": "NotSupportedVisualFeature",
            "message": "Specified feature type is not valid"
        }
    }
}
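When you log or display a failure, it helps to include both levels of the payload. The following sketch formats an error response as a single line and appends the inner error when one is present. describeError is a hypothetical helper, and the payload is the example shown above.

```javascript
// Hypothetical helper: formats an error payload as a single line, appending
// the inner error code and message when the payload includes them.
function describeError(payload) {
  const { code, message, innererror } = payload.error;
  const outer = `${code}: ${message}`;
  return innererror ? `${outer} (${innererror.code}: ${innererror.message})` : outer;
}

// The example error payload from this section.
const errorPayload = {
  error: {
    code: 'InvalidRequest',
    message: 'Analyze query is invalid.',
    innererror: {
      code: 'NotSupportedVisualFeature',
      message: 'Specified feature type is not valid'
    }
  }
};

console.log(describeError(errorPayload));
// InvalidRequest: Analyze query is invalid. (NotSupportedVisualFeature: Specified feature type is not valid)
```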

Following is a list of common errors and their causes. List items are presented in the following format:

  • HTTP response code
    • Error code and message in the JSON response
      • [Optional] Inner error code and message in the JSON response

List of common errors:

  • 400 Bad Request
    • InvalidRequest - Image URL is badly formatted or not accessible. Make sure the image URL is valid and publicly accessible.
    • InvalidRequest - The image size is not allowed to be zero or larger than 20971520 bytes. Reduce the size of the image by compressing it and/or resizing, and resubmit your request.
    • InvalidRequest - The feature 'Caption' is not supported in this region. The feature is only supported in specific Azure regions. See Quickstart prerequisites for the list of supported Azure regions.
    • InvalidRequest - The provided image content type ... is not supported. The HTTP header Content-Type in the request isn't an allowed type:
      • For an image URL, Content-Type should be application/json
      • For binary image data, Content-Type should be application/octet-stream or multipart/form-data
    • InvalidRequest - Either 'features' or 'model-name' needs to be specified in the query parameter.
    • InvalidRequest - Image format is not valid
      • InvalidImageFormat - Image format is not valid. See the Image requirements section for supported image formats.
    • InvalidRequest - Analyze query is invalid
      • NotSupportedVisualFeature - Specified feature type is not valid. Make sure the features query string has a valid value.
      • NotSupportedLanguage - The input language is not supported. Make sure the language query string has a valid value for the selected visual feature. See Language support for the supported language codes.
      • BadArgument - 'smartcrops-aspect-ratios' aspect ratio is not in allowed range [0.75 to 1.8]
  • 401 PermissionDenied
    • 401 - Access denied due to invalid subscription key or wrong API endpoint. Make sure to provide a valid key for an active subscription and use a correct regional API endpoint for your resource.
  • 404 Resource Not Found
    • 404 - Resource not found. The service couldn't find the custom model based on the name provided by the model-name query string.

Next steps