How to tag repeating data in AI Builder model training?

Sagar Biradar 20 Reputation points
2025-01-02T11:42:14.49+00:00

I have a PDF where the format repeats on every page. There are 10 invoices in the same PDF (10 pages). If I want to tag them as one field only, how can this be accomplished, as AI Builder doesn't allow using the same tag multiple times?

Azure AI Document Intelligence
An Azure service that turns documents into usable data. Previously known as Azure Form Recognizer.

Accepted answer
  1. Pavankumar Purilla 2,290 Reputation points Microsoft Vendor
    2025-01-02T23:16:10.1166667+00:00

    Hi Sagar Biradar,
    Greetings & Welcome to Microsoft Q&A forum! Thanks for posting your query!

    I understand that you want to tag repeating data in AI Builder, such as multiple invoices in a single PDF.
    Here are some best practices for tagging fields in structured or semi-structured documents:

    For each field to be tagged, draw a rectangle around the field on the first page and assign the corresponding field name. If the field appears on multiple pages, use the "Continue tagging" option to tag the same field across different pages, ensuring consistency throughout the document.

    For tables, draw a rectangle around the entire table and select a table name. Define rows and columns by left-clicking between row separators and pressing Ctrl + left-click for columns. This approach ensures accurate table data extraction, even across multiple pages.

    For repeated sections, such as multiple invoices, tag each instance separately. Since AI Builder does not allow the same tag to be used multiple times, create unique tags for each occurrence, such as "InvoiceDate1," "InvoiceDate2," and so on, to differentiate between repeated data.
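Once the model returns fields named with numeric suffixes, you will probably want to fold them back into one list per logical field. The following is a minimal sketch of that post-processing step, not part of AI Builder itself. It assumes the extracted values are available as a flat name-to-value mapping; the function name `group_numbered_fields` and the sample field names are hypothetical, and the actual output shape in your flow may differ.

```python
import re
from collections import defaultdict

# Matches a field name that ends in digits, e.g. "InvoiceDate10" -> ("InvoiceDate", "10").
_NUMBERED = re.compile(r"^(?P<base>.*?)(?P<index>\d+)$")


def group_numbered_fields(fields):
    """Collapse fields like InvoiceDate1, InvoiceDate2 into {"InvoiceDate": [...]}.

    Assumption: `fields` is a flat dict of tag name -> extracted value.
    Caveat: any tag that ends in digits is treated as a repeated field,
    so a genuine single field named e.g. "Address2" would also be grouped.
    """
    grouped = defaultdict(dict)
    result = {}
    for name, value in fields.items():
        match = _NUMBERED.match(name)
        if match and match.group("base"):
            grouped[match.group("base")][int(match.group("index"))] = value
        else:
            # No numeric suffix: keep the field unchanged.
            result[name] = value
    for base, by_index in grouped.items():
        # Sort numerically so that 10 comes after 2, not after 1.
        result[base] = [by_index[i] for i in sorted(by_index)]
    return result


# Hypothetical sample values for illustration only.
sample = {
    "Vendor": "Contoso",
    "InvoiceDate1": "2025-01-01",
    "InvoiceDate10": "2025-01-10",
    "InvoiceDate2": "2025-01-02",
}
invoices = group_numbered_fields(sample)
```

Sorting on the integer suffix rather than on the tag name keeps the values in page order when there are ten or more occurrences.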

    By following these steps, you can effectively tag repeating data in AI Builder, ensuring the model learns to extract data accurately from structured or semi-structured documents.

    Hope this helps. Do let us know if you have any further queries.


    If this answers your query, please click Accept Answer and select Yes for "Was this answer helpful".
