Improving Scientific Notation Accuracy with Azure Document Intelligence Layout Model

Hongqian Li 65 Reputation points
2025-07-23T17:55:26.25+00:00

We’re currently using Azure Document Intelligence’s Prebuilt Layout model to extract content from structured documents. It works well for layout and structure, but it does not accurately preserve scientific notation. For example, expressions like 2.0 × 10⁻⁵ are extracted as 2.0 X 10-5, and superscript/subscript formatting is lost.

Our goal is to retain the layout extraction capabilities of the Prebuilt Layout model, but enhance it with better accuracy for mathematical/scientific expressions.

  1. Is it possible to build a custom model that extends the Prebuilt Layout model? That is, we want the same layout detection, but with improved text extraction (especially for scientific formats).
  2. If not directly extendable, what’s the recommended approach to combine layout recognition with better scientific notation handling? For example:
    1. How can we combine with Azure Vision OCR (Read API v4) to infer superscripts/subscripts more accurately?

We’re looking for implementation guidance or documentation that can help us bridge the gap without losing the benefits of the prebuilt layout model.

Thanks in advance for any insights or references!

Azure AI Document Intelligence

Accepted answer
  Sina Salam 24,096 Reputation points Volunteer Moderator
  2025-07-31T18:18:01.14+00:00

    Hello Hongqian Li,

    Welcome to the Microsoft Q&A and thank you for posting your questions here.

    I understand that you would like to improve scientific notation accuracy with the Azure Document Intelligence Layout model.

    My best advice is to start with a hybrid pipeline combining:

    • Read API v4 for OCR
    • Prebuilt Layout for structure
    • Custom post-processing for notation formatting

    Move to a custom model if your layout is consistent and notation accuracy is critical.
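As an illustration of the custom post-processing step, here is a minimal sketch of a regex-based normalizer that rewrites flattened OCR output back into proper scientific notation. This is not an official API; the pattern and the Unicode superscript mapping are my assumptions and should be tuned against your documents:

```python
import re

# Map ASCII digits and signs to their Unicode superscript forms.
_SUPERSCRIPTS = str.maketrans("0123456789+-", "⁰¹²³⁴⁵⁶⁷⁸⁹⁺⁻")

# Matches flattened scientific notation such as "2.0 X 10-5" or "3.1 x 10 7".
_SCI_PATTERN = re.compile(
    r"(?P<mantissa>\d+(?:\.\d+)?)\s*[xX×]\s*10\s*(?P<exp>[+-]?\d+)"
)

def normalize_scientific_notation(text: str) -> str:
    """Rewrite flattened OCR output like '2.0 X 10-5' as '2.0 × 10⁻⁵'."""
    def _rebuild(m: re.Match) -> str:
        exponent = m.group("exp").translate(_SUPERSCRIPTS)
        return f"{m.group('mantissa')} × 10{exponent}"
    return _SCI_PATTERN.sub(_rebuild, text)
```

Note the inherent ambiguity: a string like "10-5" could also be a page range or hyphenated token, so a purely textual pass can over-correct. That is why combining it with geometric evidence from the OCR bounding boxes (as in the fusion step below the answer describes) is the safer approach.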

    For your scenario, the recommended tools and approach are:

    1. To preserve document layout, use Azure DI's Prebuilt Layout model.
    2. To detect superscripts and scientific notation, combine the Azure Read API v4 with heuristics, or use MathPix's Math OCR.
    3. To fuse the two result sets accurately, implement custom logic that performs IoU matching on normalized coordinates.
    4. For LaTeX/MathML export, use math-aware reconstruction via MathPix or a custom solution.
    5. For custom templates, use field tagging and regex patterns with an Azure DI Custom Model.
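Steps 2 and 3 above could be sketched as follows. This is only a sketch under assumptions: the `{"text": ..., "box": (x0, y0, x1, y1)}` word/line shape is hypothetical (the real SDKs return polygons that you would first reduce to bounding boxes and normalize by page size), and the thresholds are starting points to tune:

```python
Box = tuple[float, float, float, float]  # (x0, y0, x1, y1), page-relative

def normalize_box(box: Box, page_w: float, page_h: float) -> Box:
    """Scale absolute page coordinates into the [0, 1] range."""
    x0, y0, x1, y1 = box
    return (x0 / page_w, y0 / page_h, x1 / page_w, y1 / page_h)

def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)
    if inter == 0.0:
        return 0.0
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def overlap_ratio(word: Box, line: Box) -> float:
    """Fraction of the word box covered by the line box. For word-in-line
    matching this is more robust than raw IoU, because a single word is much
    smaller than its line and would score a low IoU even when fully inside."""
    ix0, iy0 = max(word[0], line[0]), max(word[1], line[1])
    ix1, iy1 = min(word[2], line[2]), min(word[3], line[3])
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)
    word_area = (word[2] - word[0]) * (word[3] - word[1])
    return inter / word_area if word_area > 0.0 else 0.0

def match_words_to_lines(words: list[dict], lines: list[dict],
                         min_overlap: float = 0.5) -> list[tuple[dict, dict]]:
    """Assign each OCR word to the layout line that best covers it."""
    matches = []
    for w in words:
        best = max(lines, key=lambda ln: overlap_ratio(w["box"], ln["box"]),
                   default=None)
        if best is not None and overlap_ratio(w["box"], best["box"]) >= min_overlap:
            matches.append((w, best))
    return matches

def is_superscript(word_box: Box, line_box: Box) -> bool:
    """Heuristic: a word is likely a superscript if it is noticeably smaller
    than its line and its vertical centre sits in the upper part of the line.
    The 0.75 and 0.15 factors are assumptions to calibrate on your data."""
    line_h = line_box[3] - line_box[1]
    word_h = word_box[3] - word_box[1]
    word_mid = (word_box[1] + word_box[3]) / 2
    line_mid = (line_box[1] + line_box[3]) / 2
    return word_h < 0.75 * line_h and word_mid < line_mid - 0.15 * line_h
```

In practice you would feed the Layout model's lines and the Read API's words (both normalized with `normalize_box`) into `match_words_to_lines`, then flag matched words with `is_superscript` to decide which tokens to rewrite as exponents.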

    References as requested:

    1. Azure Read API: https://learn.microsoft.com/en-us/azure/ai-services/computer-vision/how-to/call-read-api
    2. Document Intelligence Layout Model: https://learn.microsoft.com/en-us/azure/ai-services/document-intelligence/concept-layout
    3. MathPix OCR (optional): https://docs.mathpix.com/
    4. MathML for Scientific Publishing: https://developer.mozilla.org/en-US/docs/Web/MathML
    5. Coordinate Alignment for OCR: https://docs.opencv.org/4.x/da/d6e/tutorial_py_geometric_transformations.html

    I hope this is helpful! Do not hesitate to let me know if you have any other questions or clarifications.


    Please don't forget to close the thread by upvoting and accepting this as the answer if it is helpful.


0 additional answers
