Azure Document Translation client library for Java - version 1.0.0-beta.2

Document Translation is a cloud-based machine translation feature of the Azure AI Translator service. You can translate multiple complex documents across all supported languages and dialects while preserving the original document structure and data format. The Document Translation API supports two translation processes:

Asynchronous batch translation supports the processing of multiple documents and large files. The batch translation process requires an Azure Blob storage account with storage containers for your source and translated documents.

Synchronous single-file translation processes one file per request. It doesn't require an Azure Blob storage account. The response contains the translated document and is returned directly to the calling client.

Documentation

Various documentation is available to help you get started

Getting started

Prerequisites

Adding the package to your project

<dependency>
    <groupId>com.azure</groupId>
    <artifactId>azure-ai-translation-document</artifactId>
    <version>1.0.0-beta.2</version>
</dependency>

Authentication

Interaction with the service using the client library begins with creating an instance of the DocumentTranslationClient class. You need an endpoint and either an API key or a TokenCredential to instantiate a document translation client object. The same applies to SingleDocumentTranslationClient.

Get an API key

You can get the endpoint, API key, and region from the Cognitive Services resource or Document Translator service resource information in the [Azure Portal][azure_portal].

Alternatively, use the [Azure CLI][azure_cli] snippet below to get the API key from the Translator service resource.

az cognitiveservices account keys list --resource-group <your-resource-group-name> --name <your-resource-name>

Create a DocumentTranslationClient using endpoint and API key credential

Once you have the value for the API key, create an AzureKeyCredential. This will allow you to update the API key without creating a new client.

With the value of the endpoint and AzureKeyCredential, you can create the DocumentTranslationClient:

String endpoint = System.getenv("DOCUMENT_TRANSLATION_ENDPOINT");
String apiKey = System.getenv("DOCUMENT_TRANSLATION_API_KEY");

AzureKeyCredential credential = new AzureKeyCredential(apiKey);

DocumentTranslationClient client = new DocumentTranslationClientBuilder()
                    .endpoint(endpoint)
                    .credential(credential)
                    .buildClient();

You can similarly create the SingleDocumentTranslationClient:

String endpoint = System.getenv("DOCUMENT_TRANSLATION_ENDPOINT");
String apiKey = System.getenv("DOCUMENT_TRANSLATION_API_KEY");

AzureKeyCredential credential = new AzureKeyCredential(apiKey);

SingleDocumentTranslationClient client = new SingleDocumentTranslationClientBuilder()
                    .endpoint(endpoint)
                    .credential(credential)
                    .buildClient();
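
Both snippets above read the endpoint and API key from environment variables. `System.getenv` returns null for an unset variable, and that tends to surface only later as a less obvious failure when the client is built. The following is a minimal sketch of a fail-fast lookup using only the Java standard library; `EnvConfig` and `require` are hypothetical names for illustration and are not part of the client library.

```java
import java.util.Map;

// Hypothetical helper (not part of the Azure SDK): fail fast when a required
// setting such as DOCUMENT_TRANSLATION_ENDPOINT is missing or blank.
final class EnvConfig {
    private EnvConfig() { }

    // Returns the trimmed value stored under `name`, or throws
    // IllegalStateException when the entry is absent or blank.
    static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Missing required environment variable: " + name);
        }
        return value.trim();
    }

    public static void main(String[] args) {
        try {
            String endpoint = require(System.getenv(), "DOCUMENT_TRANSLATION_ENDPOINT");
            String apiKey = require(System.getenv(), "DOCUMENT_TRANSLATION_API_KEY");
            System.out.println("Endpoint configured: " + endpoint);
            // Never print the key itself; its length is enough for a sanity check.
            System.out.println("API key length: " + apiKey.length());
        } catch (IllegalStateException e) {
            System.err.println(e.getMessage());
        }
    }
}
```

The values returned by `require` can then be passed to `endpoint(...)` and `new AzureKeyCredential(...)` exactly as in the snippets above. Taking the map as a parameter keeps the helper easy to test without touching the real environment.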

Key concepts

DocumentTranslationClient and DocumentTranslationAsyncClient

A DocumentTranslationClient is the primary interface for developers using the Document Translator client library. It provides synchronous operations for document translation, such as get supported formats, get translation status, get translations status, get document status, get documents status, cancel translation, and batch translation.

For asynchronous operations use DocumentTranslationAsyncClient.

A SingleDocumentTranslationClient provides an interface for developers to synchronously translate a single document.

For asynchronous operations use SingleDocumentTranslationAsyncClient.

Input

A batch request element (BatchRequest) is a single unit of input to be processed by the document translation models in the Document Translator service. Operations on DocumentTranslationClient may take a single BatchRequest element or a collection of elements.

Examples

The following section provides several code snippets using the client created above and covers the main features of this client library. The snippets below use the synchronous clients; keep in mind that the azure-ai-translation-document package supports both synchronous and asynchronous APIs.

Get Supported Formats

Gets a list of document and glossary formats supported by the Document Translation feature. The list includes common file extensions and content-type if using the upload API.

SupportedFileFormats documentResponse = documentTranslationClient.getSupportedFormats(FileFormatType.DOCUMENT);
List<FileFormat> documentFileFormats = documentResponse.getValue();
for (FileFormat fileFormat : documentFileFormats) {
    System.out.println("FileFormat:" + fileFormat.getFormat());
    System.out.println("FileExtensions:" + fileFormat.getFileExtensions());
    System.out.println("ContentTypes:" + fileFormat.getContentTypes());
    System.out.println("Type:" + fileFormat.getType());
}

SupportedFileFormats glossaryResponse = documentTranslationClient.getSupportedFormats(FileFormatType.GLOSSARY);
List<FileFormat> glossaryFileFormats = glossaryResponse.getValue();
for (FileFormat fileFormat : glossaryFileFormats) {
    System.out.println("FileFormat:" + fileFormat.getFormat());
    System.out.println("FileExtensions:" + fileFormat.getFileExtensions());
    System.out.println("ContentTypes:" + fileFormat.getContentTypes());
    System.out.println("Type:" + fileFormat.getType());
}

Please refer to the service documentation for a conceptual discussion of documentFormats and [glossaryFormats][glossaryFormats_doc].
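
Because each supported format pairs file extensions with content types, one use of this list is to pick the content type for a local file before uploading it. The sketch below shows that lookup with a plain `Map` standing in for the data returned by `getSupportedFormats`; `ContentTypeLookup` is a hypothetical helper, and the two sample entries are illustrative only, not the service's actual list.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper: resolve a content type from a file name using an
// extension -> content type table built from the supported formats list.
final class ContentTypeLookup {
    private ContentTypeLookup() { }

    // Returns the content type registered for the file's extension, or an
    // empty Optional when the name has no extension or it is not in the table.
    static Optional<String> contentTypeFor(String fileName, Map<String, String> contentTypeByExtension) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return Optional.empty();
        }
        // Extensions are compared in lower case, including the leading dot.
        String extension = fileName.substring(dot).toLowerCase(Locale.ROOT);
        return Optional.ofNullable(contentTypeByExtension.get(extension));
    }

    public static void main(String[] args) {
        // Illustrative entries; in practice, fill the map from
        // FileFormat.getFileExtensions() and FileFormat.getContentTypes().
        Map<String, String> table = new LinkedHashMap<>();
        table.put(".txt", "text/plain");
        table.put(".html", "text/html");

        System.out.println(contentTypeFor("Report.TXT", table).orElse("unsupported"));
        System.out.println(contentTypeFor("archive.zip", table).orElse("unsupported"));
    }
}
```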

Batch Translation

Executes an asynchronous batch translation request. The method requires an Azure Blob storage account with storage containers for your source and translated documents.

SyncPoller<TranslationStatus, Void> response
    = documentTranslationClient
        .beginStartTranslation(
            new StartTranslationDetails(Arrays.asList(new BatchRequest(
                new SourceInput("https://myblob.blob.core.windows.net/sourceContainer")
                    .setFilter(new DocumentFilter().setPrefix("pre").setSuffix(".txt"))
                    .setLanguage("en")
                    .setStorageSource(StorageSource.AZURE_BLOB),
                Arrays
                    .asList(
                        new TargetInput("https://myblob.blob.core.windows.net/destinationContainer1", "fr")
                            .setCategory("general")
                            .setGlossaries(Arrays.asList(new Glossary(
                                "https://myblob.blob.core.windows.net/myglossary/en_fr_glossary.xlf", "XLIFF")
                                .setStorageSource(StorageSource.AZURE_BLOB)))
                            .setStorageSource(StorageSource.AZURE_BLOB),
                        new TargetInput("https://myblob.blob.core.windows.net/destinationContainer2", "es")
                            .setCategory("general")
                            .setStorageSource(StorageSource.AZURE_BLOB)))
                .setStorageType(StorageInputType.FOLDER))));

Please refer to the service documentation for a conceptual discussion of batchTranslation.
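
The `DocumentFilter` in the request above limits the batch to source documents whose names start with "pre" and end with ".txt". To preview which blobs such a filter would select, the same rule can be applied locally to a list of blob names. This is only a sketch: the service applies the filter on its side, `BlobNameFilter` is a hypothetical helper, and case-sensitive matching against the blob name is an assumption here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: local approximation of a prefix/suffix document filter.
final class BlobNameFilter {
    private BlobNameFilter() { }

    // Keeps the names that start with `prefix` and end with `suffix`.
    // A null prefix or suffix means "no restriction" for that side.
    static List<String> select(List<String> blobNames, String prefix, String suffix) {
        List<String> selected = new ArrayList<>();
        for (String name : blobNames) {
            boolean prefixOk = prefix == null || name.startsWith(prefix);
            boolean suffixOk = suffix == null || name.endsWith(suffix);
            if (prefixOk && suffixOk) {
                selected.add(name);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("pre-report.txt", "pre-notes.docx", "summary.txt");
        // Prints [pre-report.txt]
        System.out.println(select(names, "pre", ".txt"));
    }
}
```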

Single Document Translation

Synchronously translate a single document.

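// createDocumentContent() is a helper defined elsewhere in the sample;
// it builds the DocumentFileDetails for the file to translate.
DocumentFileDetails document = createDocumentContent();
DocumentTranslateContent documentTranslateContent = new DocumentTranslateContent(document);
String targetLanguage = "hi";

// The translated document is returned directly in the response body.
BinaryData response = singleDocumentTranslationClient.documentTranslate(targetLanguage, documentTranslateContent);
String translatedResponse = response.toString();
System.out.println("Translated Response: " + translatedResponse);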

Please refer to the service documentation for a conceptual discussion of singleDocumentTranslation.
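
The snippet above converts the response to a string, which suits plain-text formats. For binary formats such as .docx, it is usually safer to write the returned bytes to disk unchanged. The following is a minimal sketch using only `java.nio.file`; it assumes the bytes are obtained from the response with `BinaryData.toBytes()`, and `TranslatedFileWriter` and the file names are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: persist a translated document without re-encoding it.
final class TranslatedFileWriter {
    private TranslatedFileWriter() { }

    // Writes the bytes to `target`, creating parent directories if needed and
    // replacing any existing file. Returns the path that was written.
    static Path save(byte[] translatedBytes, Path target) throws IOException {
        Path parent = target.toAbsolutePath().getParent();
        if (parent != null) {
            Files.createDirectories(parent);
        }
        return Files.write(target, translatedBytes);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for response.toBytes(); a real call would use the service response.
        byte[] translatedBytes = "translated content".getBytes(StandardCharsets.UTF_8);
        Path target = Files.createTempDirectory("translated").resolve("out").resolve("document.txt");
        System.out.println("Saved to: " + save(translatedBytes, target));
    }
}
```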

Cancel Translation

Cancels a translation job that is currently processing or queued (pending) as indicated in the request by the id query parameter.

DocumentTranslationClient documentTranslationClient = new DocumentTranslationClientBuilder()
    .endpoint("{endpoint}")
    .credential(new AzureKeyCredential("{key}"))
    .buildClient();
       
SyncPoller<TranslationStatus, Void> response
    = documentTranslationClient
        .beginStartTranslation(
            new StartTranslationDetails(Arrays.asList(new BatchRequest(
                new SourceInput("https://myblob.blob.core.windows.net/sourceContainer")
                    .setFilter(new DocumentFilter().setPrefix("pre").setSuffix(".txt"))
                    .setLanguage("en")
                    .setStorageSource(StorageSource.AZURE_BLOB),
                Arrays
                    .asList(
                        new TargetInput("https://myblob.blob.core.windows.net/destinationContainer1", "fr")
                            .setCategory("general")
                            .setGlossaries(Arrays.asList(new Glossary(
                                "https://myblob.blob.core.windows.net/myglossary/en_fr_glossary.xlf", "XLIFF")
                                .setStorageSource(StorageSource.AZURE_BLOB)))
                            .setStorageSource(StorageSource.AZURE_BLOB),
                        new TargetInput("https://myblob.blob.core.windows.net/destinationContainer2", "es")
                            .setCategory("general")
                            .setStorageSource(StorageSource.AZURE_BLOB)))
                .setStorageType(StorageInputType.FOLDER))));

String translationId = response.poll().getValue().getId();
documentTranslationClient.cancelTranslation(translationId);        
TranslationStatus translationStatus = documentTranslationClient.getTranslationStatus(translationId);

System.out.println("Translation ID is: " + translationStatus.getId());
System.out.println("Translation status is: " + translationStatus.getStatus().toString());

Please refer to the service documentation for a conceptual discussion of cancelTranslation.

Get Translations Status

Gets a list and the status of all translation jobs submitted by the user (associated with the resource).

SyncPoller<TranslationStatus, Void> response = documentTranslationClient
        .beginStartTranslation(
                new StartTranslationDetails(Arrays.asList(new BatchRequest(
                        new SourceInput("https://myblob.blob.core.windows.net/sourceContainer")
                                .setFilter(new DocumentFilter().setPrefix("pre").setSuffix(".txt"))
                                .setLanguage("en")
                                .setStorageSource(StorageSource.AZURE_BLOB),
                        Arrays
                                .asList(
                                        new TargetInput(
                                                "https://myblob.blob.core.windows.net/destinationContainer1",
                                                "fr")
                                                .setCategory("general")
                                                .setGlossaries(Arrays.asList(new Glossary(
                                                        "https://myblob.blob.core.windows.net/myglossary/en_fr_glossary.xlf",
                                                        "XLIFF")
                                                        .setStorageSource(StorageSource.AZURE_BLOB)))
                                                .setStorageSource(StorageSource.AZURE_BLOB),
                                        new TargetInput(
                                                "https://myblob.blob.core.windows.net/destinationContainer2",
                                                "es")
                                                .setCategory("general")
                                                .setStorageSource(StorageSource.AZURE_BLOB)))
                        .setStorageType(StorageInputType.FOLDER))));

PagedIterable<TranslationStatus> translationStatuses = documentTranslationClient.getTranslationsStatus();
for (TranslationStatus translationStatus : translationStatuses) {
    System.out.println("Translation ID is: " + translationStatus.getId());
    System.out.println("Translation status is: " + translationStatus.getStatus().toString());
}

Please refer to the service documentation for a conceptual discussion of getTranslationsStatus.

Get Translation Status

Request a summary of the status for a specific translation job. The response includes the overall job status and the status for documents that are being translated as part of that job.

SyncPoller<TranslationStatus, Void> response
    = documentTranslationClient
        .beginStartTranslation(
            new StartTranslationDetails(Arrays.asList(new BatchRequest(
                new SourceInput("https://myblob.blob.core.windows.net/sourceContainer")
                    .setFilter(new DocumentFilter().setPrefix("pre").setSuffix(".txt"))
                    .setLanguage("en")
                    .setStorageSource(StorageSource.AZURE_BLOB),
                Arrays
                    .asList(
                        new TargetInput("https://myblob.blob.core.windows.net/destinationContainer1", "fr")
                            .setCategory("general")
                            .setGlossaries(Arrays.asList(new Glossary(
                                "https://myblob.blob.core.windows.net/myglossary/en_fr_glossary.xlf", "XLIFF")
                                .setStorageSource(StorageSource.AZURE_BLOB)))
                            .setStorageSource(StorageSource.AZURE_BLOB),
                        new TargetInput("https://myblob.blob.core.windows.net/destinationContainer2", "es")
                            .setCategory("general")
                            .setStorageSource(StorageSource.AZURE_BLOB)))
                .setStorageType(StorageInputType.FOLDER))));

String translationId = response.poll().getValue().getId();      
TranslationStatus translationStatus = documentTranslationClient.getTranslationStatus(translationId);

System.out.println("Translation ID is: " + translationStatus.getId());
System.out.println("Translation status is: " + translationStatus.getStatus().toString());

Please refer to the service documentation for a conceptual discussion of getTranslationStatus.
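
The snippet above calls `poll()` once, so the status it prints may still describe a job that is in progress. When checking status in a loop, it helps to separate final states from intermediate ones. The sketch below assumes the status strings NotStarted, Running, Cancelling, Succeeded, Failed, Cancelled, and ValidationFailed; treat that list as an assumption to verify against the service documentation. `TranslationStates` is a hypothetical helper.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper: decide whether a translation job status is final.
final class TranslationStates {
    // Assumed final states; NotStarted, Running, and Cancelling are treated as in progress.
    private static final Set<String> TERMINAL = Collections.unmodifiableSet(
        new HashSet<>(Arrays.asList("Succeeded", "Failed", "Cancelled", "ValidationFailed")));

    private TranslationStates() { }

    // Returns true when no further status change is expected for the job.
    static boolean isTerminal(String status) {
        return status != null && TERMINAL.contains(status);
    }

    public static void main(String[] args) {
        for (String status : Arrays.asList("NotStarted", "Running", "Succeeded")) {
            System.out.println(status + " -> terminal: " + isTerminal(status));
        }
    }
}
```

With the client shown above, the argument would be `translationStatus.getStatus().toString()`, and polling can stop once `isTerminal` returns true.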

Get Documents Status

Gets the status for all documents in a translation job.

SyncPoller<TranslationStatus, Void> response = documentTranslationClient
        .beginStartTranslation(
                new StartTranslationDetails(Arrays.asList(new BatchRequest(
                        new SourceInput("https://myblob.blob.core.windows.net/sourceContainer")
                                .setFilter(new DocumentFilter().setPrefix("pre").setSuffix(".txt"))
                                .setLanguage("en")
                                .setStorageSource(StorageSource.AZURE_BLOB),
                        Arrays
                                .asList(
                                        new TargetInput(
                                                "https://myblob.blob.core.windows.net/destinationContainer1",
                                                "fr")
                                                .setCategory("general")
                                                .setGlossaries(Arrays.asList(new Glossary(
                                                        "https://myblob.blob.core.windows.net/myglossary/en_fr_glossary.xlf",
                                                        "XLIFF")
                                                        .setStorageSource(StorageSource.AZURE_BLOB)))
                                                .setStorageSource(StorageSource.AZURE_BLOB),
                                        new TargetInput(
                                                "https://myblob.blob.core.windows.net/destinationContainer2",
                                                "es")
                                                .setCategory("general")
                                                .setStorageSource(StorageSource.AZURE_BLOB)))
                        .setStorageType(StorageInputType.FOLDER))));

String translationId = response.poll().getValue().getId();

// Add Status filter
List<String> succeededStatusList = Arrays.asList(Status.SUCCEEDED.toString());
try {
    PagedIterable<DocumentStatus> documentStatusResponse = documentTranslationClient
            .getDocumentsStatus(translationId, null, null, null, succeededStatusList, null, null, null);
    for (DocumentStatus documentStatus : documentStatusResponse) {
        String id = documentStatus.getId();
        System.out.println("Document Translation ID is: " + id);
        String status = documentStatus.getStatus().toString();
        System.out.println("Document Translation status is: " + status);
    }
} catch (Exception e) {
    System.err.println("An exception occurred: " + e.getMessage());
    e.printStackTrace();
}

Please refer to the service documentation for a conceptual discussion of getDocumentsStatus.

Get Document Status

Request the status for a specific document in a job.

SyncPoller<TranslationStatus, Void> response = documentTranslationClient
        .beginStartTranslation(
                new StartTranslationDetails(Arrays.asList(new BatchRequest(
                        new SourceInput("https://myblob.blob.core.windows.net/sourceContainer")
                                .setFilter(new DocumentFilter().setPrefix("pre").setSuffix(".txt"))
                                .setLanguage("en")
                                .setStorageSource(StorageSource.AZURE_BLOB),
                        Arrays
                                .asList(
                                        new TargetInput(
                                                "https://myblob.blob.core.windows.net/destinationContainer1",
                                                "fr")
                                                .setCategory("general")
                                                .setGlossaries(Arrays.asList(new Glossary(
                                                        "https://myblob.blob.core.windows.net/myglossary/en_fr_glossary.xlf",
                                                        "XLIFF")
                                                        .setStorageSource(StorageSource.AZURE_BLOB)))
                                                .setStorageSource(StorageSource.AZURE_BLOB),
                                        new TargetInput(
                                                "https://myblob.blob.core.windows.net/destinationContainer2",
                                                "es")
                                                .setCategory("general")
                                                .setStorageSource(StorageSource.AZURE_BLOB)))
                        .setStorageType(StorageInputType.FOLDER))));

String translationId = response.poll().getValue().getId();

// Add Status filter
List<String> succeededStatusList = Arrays.asList(Status.SUCCEEDED.toString());
try {
    PagedIterable<DocumentStatus> documentStatusResponse = documentTranslationClient
            .getDocumentsStatus(translationId, null, null, null, succeededStatusList, null, null, null);
    for (DocumentStatus documentsStatus : documentStatusResponse) {
        String id = documentsStatus.getId();
        System.out.println("Document Translation ID is: " + id);
        DocumentStatus documentStatus = documentTranslationClient.getDocumentStatus(translationId, id);
        System.out.println("Document ID is: " + documentStatus.getId());
        System.out.println("Document Status is: " + documentStatus.getStatus().toString());
        System.out.println("Characters Charged is: " + documentStatus.getCharacterCharged().toString());
        System.out.println("Document path is: " + documentStatus.getPath());
        System.out.println("Document source path is: " + documentStatus.getSourcePath());
    }
} catch (Exception e) {
    System.err.println("An exception occurred: " + e.getMessage());
    e.printStackTrace();
}

Please refer to the service documentation for a conceptual discussion of getDocumentStatus.

Troubleshooting

When you interact with the Document Translator service using the Document Translation client library, errors returned by the service correspond to the same HTTP status codes returned for REST API requests.

For example, if you submit a document translation request without a target language, a 400 error is returned, indicating "Bad Request".
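
Since errors map to HTTP status codes, calling code can branch on the code to decide what to do next. The sketch below groups codes into broad actions. The grouping follows common HTTP conventions rather than anything defined by the service, and `HttpErrorTriage` and its `Action` values are hypothetical names.

```java
// Hypothetical helper: map an HTTP status code to a coarse follow-up action.
final class HttpErrorTriage {
    enum Action { FIX_REQUEST, CHECK_CREDENTIALS, RETRY_LATER, INVESTIGATE }

    private HttpErrorTriage() { }

    static Action triage(int statusCode) {
        if (statusCode == 401 || statusCode == 403) {
            return Action.CHECK_CREDENTIALS; // key, token, or permissions problem
        }
        if (statusCode == 429 || statusCode >= 500) {
            return Action.RETRY_LATER;       // throttling or a server-side failure
        }
        if (statusCode >= 400) {
            return Action.FIX_REQUEST;       // for example 400 for a missing target language
        }
        return Action.INVESTIGATE;           // not an error code; unexpected here
    }

    public static void main(String[] args) {
        // Prints FIX_REQUEST
        System.out.println(triage(400));
    }
}
```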

Next steps

Samples showing how to use this client library are available in this GitHub repository. Samples are provided for each main functional area.

Contributing

For details on contributing to this repository, see the contributing guide.

  1. Fork it
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create new Pull Request