FormTrainingClient Class

FormTrainingClient is the Form Recognizer interface to use for creating and managing custom models. It provides methods for training models on the forms you provide, as well as methods for viewing and deleting models, accessing account properties, copying models to another Form Recognizer resource, and composing models from a collection of existing models trained with labels.

Note

FormTrainingClient should be used with API versions <=v2.1.

To use API versions 2022-08-31 and up, instantiate a DocumentModelAdministrationClient.

Inheritance
azure.ai.formrecognizer._form_base_client.FormRecognizerClientBase
FormTrainingClient

Constructor

FormTrainingClient(endpoint: str, credential: AzureKeyCredential | TokenCredential, **kwargs: Any)

Parameters

Name Description
endpoint
Required
str

Supported Cognitive Services endpoints (protocol and hostname, for example: https://westus2.api.cognitive.microsoft.com).

credential
Required

Credentials needed for the client to connect to Azure. This is an instance of AzureKeyCredential if using an API key, or a token credential from azure.identity.

Keyword-Only Parameters

Name Description
api_version

The API version of the service to use for requests. It defaults to API version v2.1. Setting to an older version may result in reduced feature compatibility. To use the latest supported API version and features, instantiate a DocumentModelAdministrationClient instead.

Examples

Creating the FormTrainingClient with an endpoint and API key.


   import os

   from azure.core.credentials import AzureKeyCredential
   from azure.ai.formrecognizer import FormTrainingClient

   endpoint = os.environ["AZURE_FORM_RECOGNIZER_ENDPOINT"]
   key = os.environ["AZURE_FORM_RECOGNIZER_KEY"]

   form_training_client = FormTrainingClient(endpoint, AzureKeyCredential(key))

Creating the FormTrainingClient with a token credential.


   """DefaultAzureCredential will use the values from these environment
   variables: AZURE_CLIENT_ID, AZURE_TENANT_ID, AZURE_CLIENT_SECRET
   """
   import os

   from azure.ai.formrecognizer import FormTrainingClient
   from azure.identity import DefaultAzureCredential

   endpoint = os.environ["AZURE_FORM_RECOGNIZER_ENDPOINT"]
   credential = DefaultAzureCredential()

   form_training_client = FormTrainingClient(endpoint, credential)

Methods

begin_copy_model

Copy a custom model stored in this resource (the source) to the user-specified target Form Recognizer resource. Call this method on the source Form Recognizer resource (the one that holds the model to be copied). The target parameter should be the output of the target resource's call to the get_copy_authorization method.

begin_create_composed_model

Creates a composed model from a collection of existing models that were trained with labels.

A composed model allows multiple models to be called with a single model ID. When a document is submitted to be analyzed with a composed model ID, a classification step is first performed to route it to the correct custom model.

New in version v2.1: The begin_create_composed_model client method

begin_training

Create and train a custom model. The request must include a training_files_url parameter that is an externally accessible Azure storage blob container URI (preferably a Shared Access Signature URI). Note that a container URI (without SAS) is accepted only when the container is public or has a managed identity configured. For more about configuring managed identities to work with Form Recognizer, see https://docs.microsoft.com/azure/applied-ai-services/form-recognizer/managed-identities. Models are trained using documents of the following content types: 'application/pdf', 'image/jpeg', 'image/png', 'image/tiff', or 'image/bmp'. Other types of content in the container are ignored.

New in version v2.1: The model_name keyword argument

close

Close the FormTrainingClient session.

delete_model

Mark model for deletion. Model artifacts will be permanently removed within a predetermined period.

get_account_properties

Get information about the models on the Form Recognizer account.

get_copy_authorization

Generate authorization for copying a custom model into the target Form Recognizer resource. This should be called by the target resource (where the model will be copied to) and the output can be passed as the target parameter into begin_copy_model.

get_custom_model

Get a description of a custom model, including the types of forms it can recognize, and the fields it will extract for each form type.

get_form_recognizer_client

Get an instance of a FormRecognizerClient from FormTrainingClient.

list_custom_models

List information for each model, including model ID, model status, and when it was created and last modified.

send_request

Runs a network request using the client's existing pipeline.

The request URL can be relative to the base URL. The service API version used for the request is the same as the client's unless otherwise specified. Overriding the client's configured API version in a relative URL is supported on clients with API version 2022-08-31 and later. Overriding it in an absolute URL is supported on clients with any API version. This method does not raise if the response is an error; to raise an exception, call raise_for_status() on the returned response object. For more information about how to send custom requests with this method, see https://aka.ms/azsdk/dpcodegen/python/send_request.

begin_copy_model

Copy a custom model stored in this resource (the source) to the user-specified target Form Recognizer resource. Call this method on the source Form Recognizer resource (the one that holds the model to be copied). The target parameter should be the output of the target resource's call to the get_copy_authorization method.

begin_copy_model(model_id: str, target: Dict[str, str | int], **kwargs: Any) -> LROPoller[CustomFormModelInfo]

Parameters

Name Description
model_id
Required
str

Model identifier of the model to copy to target resource.

target
Required

The copy authorization generated from the target resource's call to get_copy_authorization.

Keyword-Only Parameters

Name Description
continuation_token
str

A continuation token to restart a poller from a saved state.

Returns

Type Description

An instance of an LROPoller. Call result() on the poller object to return a CustomFormModelInfo.

Exceptions

Type Description

Examples

Copy a model from the source resource to the target resource


   source_client = FormTrainingClient(endpoint=source_endpoint, credential=AzureKeyCredential(source_key))

   poller = source_client.begin_copy_model(
       model_id=source_model_id,
       target=target  # output from target client's call to get_copy_authorization()
   )
   copied_over_model = poller.result()

   print("Model ID: {}".format(copied_over_model.model_id))
   print("Status: {}".format(copied_over_model.status))
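
The copy flow involves two clients: the target resource issues the authorization and the source resource starts the copy. The sketch below strings those two calls together. The helper name copy_model_between_resources and its parameter names are illustrative and not part of the library; it only assumes that the two client arguments expose get_copy_authorization and begin_copy_model as documented here.

```python
def copy_model_between_resources(source_client, target_client, model_id,
                                 target_resource_id, target_region):
    """Copy `model_id` from the source resource to the target resource.

    Hypothetical helper: both clients are expected to behave like
    FormTrainingClient instances bound to their respective resources.
    """
    # Step 1: the *target* resource generates the copy authorization.
    target = target_client.get_copy_authorization(
        resource_id=target_resource_id,
        resource_region=target_region,
    )
    # Step 2: the *source* resource starts the copy with that authorization.
    poller = source_client.begin_copy_model(model_id=model_id, target=target)
    # Step 3: block until the long-running operation completes.
    return poller.result()
```

The returned object is whatever the poller yields, which for a real client should be a CustomFormModelInfo carrying the model ID to use on the target resource.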

begin_create_composed_model

Creates a composed model from a collection of existing models that were trained with labels.

A composed model allows multiple models to be called with a single model ID. When a document is submitted to be analyzed with a composed model ID, a classification step is first performed to route it to the correct custom model.

New in version v2.1: The begin_create_composed_model client method

begin_create_composed_model(model_ids: List[str], **kwargs: Any) -> LROPoller[CustomFormModel]

Parameters

Name Description
model_ids
Required

List of model IDs to use in the composed model.

Keyword-Only Parameters

Name Description
model_name
str

An optional, user-defined name to associate with your model.

continuation_token
str

A continuation token to restart a poller from a saved state.

Returns

Type Description

An instance of an LROPoller. Call result() on the poller object to return a CustomFormModel.

Exceptions

Type Description

Examples

Create a composed model


   import os

   from azure.core.credentials import AzureKeyCredential
   from azure.ai.formrecognizer import FormTrainingClient

   endpoint = os.environ["AZURE_FORM_RECOGNIZER_ENDPOINT"]
   key = os.environ["AZURE_FORM_RECOGNIZER_KEY"]
   po_supplies = os.environ['PURCHASE_ORDER_OFFICE_SUPPLIES_SAS_URL_V2']
   po_equipment = os.environ['PURCHASE_ORDER_OFFICE_EQUIPMENT_SAS_URL_V2']
   po_furniture = os.environ['PURCHASE_ORDER_OFFICE_FURNITURE_SAS_URL_V2']
   po_cleaning_supplies = os.environ['PURCHASE_ORDER_OFFICE_CLEANING_SUPPLIES_SAS_URL_V2']

   form_training_client = FormTrainingClient(endpoint=endpoint, credential=AzureKeyCredential(key))
   supplies_poller = form_training_client.begin_training(
       po_supplies, use_training_labels=True, model_name="Purchase order - Office supplies"
   )
   equipment_poller = form_training_client.begin_training(
       po_equipment, use_training_labels=True, model_name="Purchase order - Office Equipment"
   )
   furniture_poller = form_training_client.begin_training(
       po_furniture, use_training_labels=True, model_name="Purchase order - Furniture"
   )
   cleaning_supplies_poller = form_training_client.begin_training(
       po_cleaning_supplies, use_training_labels=True, model_name="Purchase order - Cleaning Supplies"
   )
   supplies_model = supplies_poller.result()
   equipment_model = equipment_poller.result()
   furniture_model = furniture_poller.result()
   cleaning_supplies_model = cleaning_supplies_poller.result()

   models_trained_with_labels = [
       supplies_model.model_id,
       equipment_model.model_id,
       furniture_model.model_id,
       cleaning_supplies_model.model_id
   ]

   poller = form_training_client.begin_create_composed_model(
       models_trained_with_labels, model_name="Office Supplies Composed Model"
   )
   model = poller.result()

   print("Office Supplies Composed Model Info:")
   print("Model ID: {}".format(model.model_id))
   print("Model name: {}".format(model.model_name))
   print("Is this a composed model?: {}".format(model.properties.is_composed_model))
   print("Status: {}".format(model.status))
   print("Composed model creation started on: {}".format(model.training_started_on))
   print("Creation completed on: {}".format(model.training_completed_on))
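
When the number of component models varies, the same pattern can be written as a loop. This is a minimal sketch rather than the library's recommended approach: train_and_compose is a hypothetical helper, and it assumes every container URL points at a labeled training set, since composed models require models trained with labels.

```python
def train_and_compose(client, container_urls, composed_name):
    """Train one labeled model per container URL, then compose them.

    Hypothetical helper: `client` is expected to behave like a
    FormTrainingClient.
    """
    # Start all training operations before waiting on any of them,
    # so the service can work on them concurrently.
    pollers = [
        client.begin_training(url, use_training_labels=True)
        for url in container_urls
    ]
    # Wait for each training operation and collect the resulting model IDs.
    model_ids = [poller.result().model_id for poller in pollers]
    # Compose the trained models under a single model ID.
    composed_poller = client.begin_create_composed_model(
        model_ids, model_name=composed_name
    )
    return composed_poller.result()
```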


begin_training

Create and train a custom model. The request must include a training_files_url parameter that is an externally accessible Azure storage blob container URI (preferably a Shared Access Signature URI). Note that a container URI (without SAS) is accepted only when the container is public or has a managed identity configured. For more about configuring managed identities to work with Form Recognizer, see https://docs.microsoft.com/azure/applied-ai-services/form-recognizer/managed-identities. Models are trained using documents of the following content types: 'application/pdf', 'image/jpeg', 'image/png', 'image/tiff', or 'image/bmp'. Other types of content in the container are ignored.

New in version v2.1: The model_name keyword argument

begin_training(training_files_url: str, use_training_labels: bool, **kwargs: Any) -> LROPoller[CustomFormModel]

Parameters

Name Description
training_files_url
Required
str

An Azure Storage blob container's SAS URI. A container URI (without SAS) can be used if the container is public or has a managed identity configured. For more information on setting up a training data set, see: https://aka.ms/azsdk/formrecognizer/buildtrainingset.

use_training_labels
Required

Whether to train with labels or not. Corresponding labeled files must exist in the blob container if set to True.

Keyword-Only Parameters

Name Description
prefix
str

A case-sensitive prefix string to filter documents in the source path for training. For example, when using an Azure storage blob URI, use the prefix to restrict sub folders for training.

include_subfolders

A flag to indicate if subfolders within the set of prefix folders will also need to be included when searching for content to be preprocessed. Not supported if training with labels.

model_name
str

An optional, user-defined name to associate with your model.

continuation_token
str

A continuation token to restart a poller from a saved state.

Returns

Type Description

An instance of an LROPoller. Call result() on the poller object to return a CustomFormModel.

Exceptions

Type Description

Note that if the training fails, the exception is raised, but a model with an "invalid" status is still created. You can delete this model by calling

Examples

Training a model (without labels) with your custom forms.


   import os

   from azure.ai.formrecognizer import FormTrainingClient
   from azure.core.credentials import AzureKeyCredential

   endpoint = os.environ["AZURE_FORM_RECOGNIZER_ENDPOINT"]
   key = os.environ["AZURE_FORM_RECOGNIZER_KEY"]
   container_sas_url = os.environ["CONTAINER_SAS_URL_V2"]

   form_training_client = FormTrainingClient(endpoint, AzureKeyCredential(key))
   poller = form_training_client.begin_training(container_sas_url, use_training_labels=False)
   model = poller.result()

   # Custom model information
   print("Model ID: {}".format(model.model_id))
   print("Status: {}".format(model.status))
   print("Model name: {}".format(model.model_name))
   print("Training started on: {}".format(model.training_started_on))
   print("Training completed on: {}".format(model.training_completed_on))

   print("Recognized fields:")
   # Loop through the submodels, which contain the fields they were trained on
   for submodel in model.submodels:
       print("...The submodel has form type '{}'".format(submodel.form_type))
       for name, field in submodel.fields.items():
           print("...The model found field '{}' to have label '{}'".format(
               name, field.label
           ))

close

Close the FormTrainingClient session.

close() -> None

Exceptions

Type Description
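
Because the client exposes close(), the standard library's contextlib.closing can guarantee the session is released even if an intermediate call raises. The sketch below is one possible pattern, not the only one; list_model_ids is a hypothetical helper and it assumes the client argument offers list_custom_models and close as documented on this page.

```python
from contextlib import closing


def list_model_ids(client):
    """Return all custom model IDs, closing the client afterwards.

    Hypothetical helper: `client` is expected to behave like a
    FormTrainingClient.
    """
    # closing() calls client.close() when the block exits,
    # whether it exits normally or through an exception.
    with closing(client) as open_client:
        return [info.model_id for info in open_client.list_custom_models()]
```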

delete_model

Mark model for deletion. Model artifacts will be permanently removed within a predetermined period.

delete_model(model_id: str, **kwargs: Any) -> None

Parameters

Name Description
model_id
Required
str

Model identifier.

Keyword-Only Parameters

Name Description
continuation_token
str

A continuation token to restart a poller from a saved state.

Returns

Type Description

Exceptions

Type Description

Examples

Delete a custom model.


   from azure.core.exceptions import ResourceNotFoundError

   form_training_client.delete_model(model_id=custom_model.model_id)

   try:
       form_training_client.get_custom_model(model_id=custom_model.model_id)
   except ResourceNotFoundError:
       print("Successfully deleted model with id {}".format(custom_model.model_id))

get_account_properties

Get information about the models on the Form Recognizer account.

get_account_properties(**kwargs: Any) -> AccountProperties

Keyword-Only Parameters

Name Description
continuation_token
str

A continuation token to restart a poller from a saved state.

Returns

Type Description

Summary of models on account - custom model count, custom model limit.

Exceptions

Type Description

Examples

Get properties for the Form Recognizer account.


   form_training_client = FormTrainingClient(endpoint=endpoint, credential=AzureKeyCredential(key))
   # First, we see how many custom models we have, and what our limit is
   account_properties = form_training_client.get_account_properties()
   print("Our account has {} custom models, and we can have at most {} custom models\n".format(
       account_properties.custom_model_count, account_properties.custom_model_limit
   ))
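
A common use of these two properties is checking whether there is room for another model before starting training. The following is a small sketch under that assumption; remaining_model_capacity is a hypothetical helper name, and it relies only on the custom_model_count and custom_model_limit attributes shown above.

```python
def remaining_model_capacity(client):
    """Return how many more custom models the account can hold.

    Hypothetical helper: `client` is expected to behave like a
    FormTrainingClient.
    """
    properties = client.get_account_properties()
    # The limit minus the current count is the number of free slots.
    return properties.custom_model_limit - properties.custom_model_count
```

A caller might skip or postpone begin_training when this returns zero.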

get_copy_authorization

Generate authorization for copying a custom model into the target Form Recognizer resource. This should be called by the target resource (where the model will be copied to) and the output can be passed as the target parameter into begin_copy_model.

get_copy_authorization(resource_id: str, resource_region: str, **kwargs: Any) -> Dict[str, str | int]

Parameters

Name Description
resource_id
Required
str

Azure Resource Id of the target Form Recognizer resource where the model will be copied to.

resource_region
Required
str

Location of the target Form Recognizer resource. A valid Azure region name supported by Cognitive Services. For example, 'westus', 'eastus' etc. See https://azure.microsoft.com/global-infrastructure/services/?products=cognitive-services for the regional availability of Cognitive Services.

Keyword-Only Parameters

Name Description
continuation_token
str

A continuation token to restart a poller from a saved state.

Returns

Type Description

A dictionary with values for the copy authorization - "modelId", "accessToken", "resourceId", "resourceRegion", and "expirationDateTimeTicks".

Exceptions

Type Description

Examples

Authorize the target resource to receive the copied model


   target_client = FormTrainingClient(endpoint=target_endpoint, credential=AzureKeyCredential(target_key))

   target = target_client.get_copy_authorization(
       resource_region=target_region,
       resource_id=target_resource_id
   )
   # model ID that target client will use to access the model once copy is complete
   print("Model ID: {}".format(target["modelId"]))

get_custom_model

Get a description of a custom model, including the types of forms it can recognize, and the fields it will extract for each form type.

get_custom_model(model_id: str, **kwargs: Any) -> CustomFormModel

Parameters

Name Description
model_id
Required
str

Model identifier.

Keyword-Only Parameters

Name Description
continuation_token
str

A continuation token to restart a poller from a saved state.

Returns

Type Description

CustomFormModel

Exceptions

Type Description

Examples

Get a custom model with a model ID.


   custom_model = form_training_client.get_custom_model(model_id=model.model_id)
   print("\nModel ID: {}".format(custom_model.model_id))
   print("Status: {}".format(custom_model.status))
   print("Model name: {}".format(custom_model.model_name))
   print("Is this a composed model?: {}".format(custom_model.properties.is_composed_model))
   print("Training started on: {}".format(custom_model.training_started_on))
   print("Training completed on: {}".format(custom_model.training_completed_on))
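
The description returned by get_custom_model can also be reduced to a simple mapping of form types to field names, which is convenient for logging or validation. This is an illustrative sketch: fields_by_form_type is a hypothetical helper, and it assumes the model exposes submodels with form_type and fields attributes as used in the begin_training example.

```python
def fields_by_form_type(client, model_id):
    """Map each form type of a custom model to its sorted field names.

    Hypothetical helper: `client` is expected to behave like a
    FormTrainingClient.
    """
    model = client.get_custom_model(model_id=model_id)
    # `fields` is keyed by field name, so iterating it yields the names.
    # An empty or missing submodel list yields an empty mapping.
    return {
        submodel.form_type: sorted(submodel.fields)
        for submodel in (model.submodels or [])
    }
```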

get_form_recognizer_client

Get an instance of a FormRecognizerClient from FormTrainingClient.

get_form_recognizer_client(**kwargs: Any) -> FormRecognizerClient

Keyword-Only Parameters

Name Description
continuation_token
str

A continuation token to restart a poller from a saved state.

Returns

Type Description

A FormRecognizerClient

Exceptions

Type Description
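
The returned FormRecognizerClient is typically used to run a trained model against a document. The sketch below assumes the recognizer's begin_recognize_custom_forms method and the fields/value attributes of its results; recognize_fields is a hypothetical helper name, and `form` stands for the document bytes or a binary stream supplied by the caller.

```python
def recognize_fields(training_client, model_id, form):
    """Analyze `form` with a custom model and return field values per form.

    Hypothetical helper: `training_client` is expected to behave like a
    FormTrainingClient.
    """
    # Obtain a recognizer client from the training client.
    recognizer = training_client.get_form_recognizer_client()
    poller = recognizer.begin_recognize_custom_forms(model_id=model_id, form=form)
    recognized_forms = poller.result()
    # One dictionary per recognized form: field name -> extracted value.
    return [
        {name: field.value for name, field in recognized.fields.items()}
        for recognized in recognized_forms
    ]
```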

list_custom_models

List information for each model, including model ID, model status, and when it was created and last modified.

list_custom_models(**kwargs: Any) -> ItemPaged[CustomFormModelInfo]

Keyword-Only Parameters

Name Description
continuation_token
str

A continuation token to restart a poller from a saved state.

Returns

Type Description

ItemPaged[CustomFormModelInfo]

Exceptions

Type Description

Examples

List model information for each model on the account.


   custom_models = form_training_client.list_custom_models()

   print("We have models with the following IDs:")
   for model_info in custom_models:
       print(model_info.model_id)
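
Listing can be combined with delete_model to clean up models left behind by failed training runs, which remain on the account with an "invalid" status. This is a minimal sketch, assuming the status attribute compares equal to the string "invalid"; delete_invalid_models is a hypothetical helper name.

```python
def delete_invalid_models(client):
    """Delete every custom model whose status is "invalid".

    Hypothetical helper: `client` is expected to behave like a
    FormTrainingClient. Returns the IDs of the deleted models.
    """
    # Collect the IDs first so the listing is not modified while iterating.
    invalid_ids = [
        info.model_id
        for info in client.list_custom_models()
        if info.status == "invalid"
    ]
    for model_id in invalid_ids:
        client.delete_model(model_id=model_id)
    return invalid_ids
```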

send_request

Runs a network request using the client's existing pipeline.

The request URL can be relative to the base URL. The service API version used for the request is the same as the client's unless otherwise specified. Overriding the client's configured API version in a relative URL is supported on clients with API version 2022-08-31 and later. Overriding it in an absolute URL is supported on clients with any API version. This method does not raise if the response is an error; to raise an exception, call raise_for_status() on the returned response object. For more information about how to send custom requests with this method, see https://aka.ms/azsdk/dpcodegen/python/send_request.

send_request(request: HttpRequest, *, stream: bool = False, **kwargs) -> HttpResponse

Parameters

Name Description
request
Required

The network request you want to make.

Keyword-Only Parameters

Name Description
stream

Whether the response payload will be streamed. Defaults to False.

Returns

Type Description

The response of your network call. Does not do error handling on your response.

Exceptions

Type Description
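
Since send_request does not raise on error responses, callers who want exceptions need to call raise_for_status() themselves. The wrapper below is a small sketch of that pattern; send_checked is a hypothetical helper name, and the request argument is assumed to be an HttpRequest built by the caller (for a real client, typically azure.core.rest.HttpRequest).

```python
def send_checked(client, request, **kwargs):
    """Send a custom request and raise if the service returned an error.

    Hypothetical helper: `client` is expected to behave like a
    FormTrainingClient, and `request` like an HttpRequest.
    """
    response = client.send_request(request, **kwargs)
    # send_request itself never raises for HTTP error codes,
    # so surface them explicitly here.
    response.raise_for_status()
    return response
```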