Problem statement: I have a PDF that contains questions with their options, and some questions also have an associated figure (with or without labels). I want to extract the questions and their respective options along with the figures, where present. How can I extract only the figures with labels?

How to extract Figures with labels from the image

Accepted answer

Pavankumar Purilla 1,965 Reputation points Microsoft Vendor

2024-11-21T17:22:21.1933333+00:00
Hi Anchit Gupta,
Greetings & Welcome to Microsoft Q&A forum! Thanks for posting your query!

To address your requirement:

To extract figures with labels from a PDF, follow these steps:

Convert the PDF to Images: Use a library like pdf2image to convert each page of the PDF into images for processing.

Use Azure Document Intelligence: Leverage the prebuilt-layout model to detect and extract text regions, questions, options, and layout elements. This will help identify figures and their associated text labels.

Extract Graphical Labels: For graphical or embedded text labels within figures, utilize the Azure Computer Vision Read API to perform OCR on the figure regions.

Optional Customization: If the figures or labels follow a unique pattern, consider training a custom model using Azure Custom Vision or Document Intelligence Custom Model for better accuracy.
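As a rough illustration of the Document Intelligence step, the sketch below pulls figure regions out of a layout result that has already been loaded as a Python dict. It assumes the dict mirrors the JSON shape of the service's `analyzeResult` (a `figures` list whose items carry `boundingRegions` and an optional `caption`); `list_figures` is a hypothetical helper name, not part of any SDK.

```python
from typing import Any


def list_figures(analyze_result: dict[str, Any]) -> list[dict[str, Any]]:
    """Collect page number, polygon and caption text for every detected figure.

    Assumption: `analyze_result` follows the REST JSON layout, i.e.
    figures[].boundingRegions[].{pageNumber, polygon} and figures[].caption.content.
    """
    figures = []
    for fig in analyze_result.get("figures", []):
        # The caption is optional; fall back to an empty string when absent.
        caption = (fig.get("caption") or {}).get("content", "")
        # A figure can span more than one region, so emit one entry per region.
        for region in fig.get("boundingRegions", []):
            figures.append(
                {
                    "page": region["pageNumber"],
                    "polygon": region["polygon"],
                    "caption": caption,
                }
            )
    return figures
```

Each returned entry keeps the polygon exactly as the service reported it, so any conversion to pixel coordinates can be done in a separate step.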

Hope this helps. Do let us know if you have any further queries.

If this answers your query, please click Accept Answer and select Yes for "Was this answer helpful".
Anchit Gupta 20 Reputation points

2024-11-22T09:33:07.5666667+00:00

Hi Pavankumar Purilla,

Thank you for the answer. I used Azure Document Intelligence and got the desired result, but I also want to draw a bounding box around the figure. I tried the sample code on GitHub for extracting bounding box coordinates, but I cannot locate the figure from those coordinates because I don't know which units they are expressed in.
Is there any way to use the bounding box coordinates to draw a box around the figure?

Anchit Gupta 20 Reputation points

2024-11-22T11:05:13.2066667+00:00

Hi Pavankumar Purilla,

Thank you for the answer. I used Azure Document Intelligence and got the desired result, but I also want to draw a bounding box around the figure. For that I used the sample code from GitHub (https://github.com/Azure-Samples/document-intelligence-code-samples/blob/main/Python(v4.0)/Layout_model/sample_analyze_layout.py).

It gives me the coordinates of the figure, but I cannot draw a bounding box around the figure because I don't know which units the coordinates are expressed in.

Please share some thoughts on how to create a bounding box from the returned coordinates.

For example, Document Intelligence Studio highlights the figures in the extracted pages, and I want to create the same kind of bounding box.

I have attached an image for reference.

Pavankumar Purilla 1,965 Reputation points Microsoft Vendor

2024-11-22T16:02:54.4733333+00:00

Hi Anchit Gupta
Hope you are doing well.
To create a bounding box around a figure, note that the coordinates returned by the service are normalized values in the range [0, 1], relative to the width and height of the image or document analyzed. To draw the bounding box, multiply these normalized coordinates by the actual dimensions of the image (width and height) to convert them into absolute pixel coordinates. Once converted, use a library like Pillow or OpenCV to draw the bounding box. For example, if the normalized coordinates are [(x1, y1), (x2, y2), (x3, y3), (x4, y4)], scale each x by the image width and each y by the image height, then use the scaled coordinates to draw the box on the image. This approach replicates how Document Intelligence Studio highlights figures.
I hope this information helps. Thank you!

Anchit Gupta 20 Reputation points

2024-11-25T04:39:26.6866667+00:00

Hi Pavankumar Purilla,
Thanks for your reply.
However, I am unable to get the desired result with your suggestion.

Let's say there is a highlighted figure whose coordinates are:

"polygon": [ 2.0227, 4.083, 6.1379, 4.0809, 6.1407, 7.7027, 2.0256, 7.7047 ]

Image dimensions: 1655x2340
If I multiply the coordinates of the highlighted figure by the image dimensions, the resulting coordinates are:
[(3347.5685, 9554.220000000001), (10158.2245, 9549.305999999999), (10162.8585, 18024.318), (3352.368, 18028.998)]

which are outside the bounds of the image.

Please explain how this actually works, in case I have misunderstood.
Thank You!

Pavankumar Purilla 1,965 Reputation points Microsoft Vendor

2024-11-26T00:09:54.19+00:00

Hi Anchit Gupta,
Hope you are doing great.
The values go out of bounds because these coordinates are not normalized to the range [0, 1]. Document Intelligence reports each polygon in the page's unit, which is given by the `unit` field of the page: inches for PDFs and pixels for images. The polygon you shared is in inches, so multiplying it by the full image dimensions overshoots. Instead, scale each x by (image width / page width) and each y by (image height / page height), taking the page `width` and `height` from the same result, and round to the nearest integer. If the page is A4 (about 8.27 x 11.69 in), a 1655x2340 image works out to roughly 200 pixels per inch, so the first point (2.0227, 4.083) lands near pixel (405, 817), which is inside the image. If the analyzed input is already an image, the polygon is in pixels and can be used directly.
I hope this information helps. Thank you!
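The polygon quoted in this thread has values greater than 1, which fits coordinates expressed in page units (inches for a PDF) rather than fractions of the image. A minimal sketch of the conversion under that assumption follows; `polygon_to_pixels` and `bounding_box` are hypothetical helper names, and the page size of 8.275 x 11.7 in is assumed for illustration only (in practice it would come from the page's `width` and `height` in the analysis result).

```python
def polygon_to_pixels(polygon, page_width, page_height, image_width, image_height):
    """Convert a flat [x1, y1, x2, y2, ...] polygon in page units to pixel points.

    page_width / page_height are in the same unit as the polygon (e.g. inches);
    image_width / image_height are the pixel size of the rendered page image.
    """
    scale_x = image_width / page_width    # pixels per page unit, horizontally
    scale_y = image_height / page_height  # pixels per page unit, vertically
    xs = polygon[0::2]
    ys = polygon[1::2]
    return [(round(x * scale_x), round(y * scale_y)) for x, y in zip(xs, ys)]


def bounding_box(points):
    """Axis-aligned box (left, top, right, bottom) enclosing the pixel points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


# Polygon and image size taken from the thread; page size is an assumption.
polygon = [2.0227, 4.083, 6.1379, 4.0809, 6.1407, 7.7027, 2.0256, 7.7047]
points = polygon_to_pixels(polygon, 8.275, 11.7, 1655, 2340)
box = bounding_box(points)
print(points)
print(box)
```

The resulting points or box can then be drawn with Pillow or OpenCV, as suggested earlier in the thread. Using the ratio of image size to page size, rather than a hard-coded DPI, keeps the conversion correct whatever resolution the page was rendered at.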