Hello Hongqian Li,
Welcome to Microsoft Q&A, and thank you for posting your question here.
I understand that you would like to improve the accuracy of scientific notation extraction with the Azure Document Intelligence Layout model.
My best advice is to start with a hybrid pipeline that combines:
- Read API v4 for OCR
- Prebuilt Layout for structure
- Custom post-processing for notation formatting
Move to a custom model if the document layout is consistent and the notations are critical.
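To illustrate the custom post-processing step, here is a minimal Python sketch that rewrites common OCR renderings of scientific notation into plain e-notation. The function name `normalize_scientific`, the regex, and the choice of e-notation as the output format are my own assumptions for illustration and are not part of any Azure SDK. It only rewrites cases where the exponent is explicit (a caret, Unicode superscripts, or a signed exponent flattened onto the baseline), because something like "2 x 100" cannot be told apart from 2 × 10⁰ by text alone.

```python
import re

# Map Unicode superscript digits/signs and the Unicode minus to ASCII.
_TO_ASCII = str.maketrans("⁰¹²³⁴⁵⁶⁷⁸⁹⁻⁺−", "0123456789-+-")

# mantissa, a multiplication sign, "10", then an explicit exponent:
#   caret form        10^-3
#   superscript form  10⁻³
#   flattened form    10-3  (sign required, otherwise ambiguous)
_SCI = re.compile(
    r"(?P<mantissa>[+-]?\d+(?:\.\d+)?)\s*[xX×*·]\s*10"
    r"(?:\s*\^\s*(?P<caret>[+\-−]?\d+)"
    r"|(?P<sup>[⁺⁻]?[⁰¹²³⁴⁵⁶⁷⁸⁹]+)"
    r"|(?P<flat>[+\-−]\d+))"
)


def _to_e_notation(match: re.Match) -> str:
    # Exactly one of the three exponent groups is set for any match.
    raw = match.group("caret") or match.group("sup") or match.group("flat")
    exponent = int(raw.translate(_TO_ASCII))
    return f"{match.group('mantissa')}e{exponent}"


def normalize_scientific(text: str) -> str:
    """Rewrite 'a x 10^b' style notation in OCR text as 'aeb'."""
    return _SCI.sub(_to_e_notation, text)
```

With this sketch, `normalize_scientific("1.5 x 10-3 mol")` returns `"1.5e-3 mol"` and `normalize_scientific("6.02×10²³")` returns `"6.02e23"`. You would likely need to extend the pattern for the notations that actually appear in your documents.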
Here is how each requirement in your scenario maps to a tool or approach:
- To preserve document layout, we use Azure DI's Prebuilt Layout Model.
- For detecting superscript/scientific notations, we combine Azure Read API v4 with heuristics or leverage MathPix's Math OCR.
- Accurate fusion of elements is achieved through custom logic implementing IoU matching with coordinate normalization.
- LaTeX/MathML export capability is enabled via math-aware reconstruction using MathPix or custom solutions.
- Handling custom templates involves field tagging and regex patterns through Azure DI's Custom Model.
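The IoU matching with coordinate normalization mentioned above could look roughly like the following Python sketch. It assumes each bounding polygon arrives as a flat list `[x0, y0, x1, y1, ...]` together with the page width and height in the same unit, which is how the Document Intelligence REST response reports them as far as I know; please verify this against the output of the SDK version you use. The names `Box`, `normalize`, `iou`, and `match_boxes`, and the 0.5 threshold, are hypothetical choices for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class Box:
    """Axis-aligned box in normalized [0, 1] page coordinates."""
    x0: float
    y0: float
    x1: float
    y1: float


def normalize(polygon: Sequence[float], page_width: float, page_height: float) -> Box:
    """Convert a flat polygon (pixels or inches) to a unit-free bounding box."""
    xs = [x / page_width for x in polygon[0::2]]
    ys = [y / page_height for y in polygon[1::2]]
    return Box(min(xs), min(ys), max(xs), max(ys))


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes; 0.0 when they do not overlap."""
    inter_w = min(a.x1, b.x1) - max(a.x0, b.x0)
    inter_h = min(a.y1, b.y1) - max(a.y0, b.y0)
    if inter_w <= 0 or inter_h <= 0:
        return 0.0
    inter = inter_w * inter_h
    area_a = (a.x1 - a.x0) * (a.y1 - a.y0)
    area_b = (b.x1 - b.x0) * (b.y1 - b.y0)
    return inter / (area_a + area_b - inter)


def match_boxes(
    layout_boxes: Sequence[Box],
    ocr_boxes: Sequence[Box],
    threshold: float = 0.5,
) -> List[Optional[int]]:
    """For each layout box, return the index of the best-overlapping
    OCR box, or None if no OCR box reaches the IoU threshold."""
    matches: List[Optional[int]] = []
    for layout_box in layout_boxes:
        best_index, best_score = None, threshold
        for index, ocr_box in enumerate(ocr_boxes):
            score = iou(layout_box, ocr_box)
            if score >= best_score:
                best_index, best_score = index, score
        matches.append(best_index)
    return matches
```

Normalizing first matters because the two services may report coordinates in different units (for example, pixels for images and inches for PDFs), so raw boxes are not directly comparable. The greedy matching here allows one OCR box to be matched by several layout boxes; if you need a strict one-to-one assignment, you would have to add that constraint yourself.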
References as requested:
- Azure Read API: https://learn.microsoft.com/en-us/azure/ai-services/computer-vision/how-to/call-read-api
- Document Intelligence Layout Model: https://learn.microsoft.com/en-us/azure/ai-services/document-intelligence/concept-layout
- MathPix OCR (Optional): https://docs.mathpix.com/
- MathML for Scientific Publishing: https://developer.mozilla.org/en-US/docs/Web/MathML
- Coordinate Alignment for OCR: https://docs.opencv.org/4.x/da/d6e/tutorial_py_geometric_transformations.html
I hope this is helpful! Do not hesitate to let me know if you have any other questions or clarifications.
If this is helpful, please don't forget to close out the thread by upvoting and accepting it as an answer.