Document Image Quality Assessment, Part 2

Mar 10, 2023

Aditya Mangal | Insurance Samadhan | Shivpoojan Saini

Welcome back, dear reader! In Part-1, we dipped our toes into the exciting world of Image Quality Assessment and made some impressive progress using classification and regression models. However, like a master chef constantly seeking to refine their signature dish, we’re not quite satisfied yet. Our goal demands more accuracy, more precision, more… well, more everything! So in Part-2, we’re taking our exploration to the next level by delving into the world of OCR’s recognition model. This technology has the potential to provide us with the kind of granular detail and precision that our use case demands. So let’s roll up our sleeves, put on our thinking caps, and see where this rabbit hole leads us! But first, if you haven’t checked out Part-1 yet, what are you waiting for? 🤔🤔 Don’t be left behind — catch up now and join us on this exciting journey.

Thanks to the MMOCR team for creating such a wonderful toolbox. MMOCR is an open-source toolbox based on PyTorch and mmdetection for text detection, text recognition, and downstream tasks such as key information extraction. It is part of the OpenMMLab project.

To set up MMOCR, refer to its documentation, which is very well written.

We tested several recognition models on a 20k-image sample of the MJSynth dataset and compared their predictions with the ground-truth text that ships with MJSynth. This tells us which recognition model works best on word images.
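For readers who want to build a similar sample, here is a minimal sketch of how the ground-truth words and a 20k subset could be prepared. It assumes MJSynth file names of the form `<index>_<word>_<lexicon id>.jpg`; the helper names `label_from_filename` and `sample_paths` are hypothetical and not part of MMOCR or of our original pipeline.

```python
import random
from pathlib import Path


def label_from_filename(path):
    """Return the ground-truth word encoded in an MJSynth-style file name.

    Assumes names of the form '<index>_<word>_<lexicon id>.jpg'.
    """
    stem = Path(path).stem  # e.g. '115_Lube_45484'
    parts = stem.split("_")
    if len(parts) < 3:
        raise ValueError(f"unexpected file name: {path}")
    # Join the middle parts in case the word itself contains an underscore.
    return "_".join(parts[1:-1])


def sample_paths(paths, k=20_000, seed=0):
    """Pick a reproducible random subset of at most k image paths."""
    paths = sorted(paths)  # fixed order so the same seed gives the same sample
    rng = random.Random(seed)
    return rng.sample(paths, min(k, len(paths)))
```

Fixing the seed keeps the sample identical across runs, so every recognition model is scored on the same images.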

Let’s take a look at the code to recognize the word image and its output.

Import the libraries that will be used in the code

import cv2
import matplotlib.pyplot as plt
from mmocr.ocr import MMOCR

Create an instance of the MMOCR class

model_name = 'ABINet'
ocr = MMOCR(recog=model_name)

Read text image

image = cv2.imread('sample.png')

Recognize the text image with ABINet Recognition Model

result = ocr.readtext(image)

Output

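print(result)
# OpenCV loads images as BGR; convert to RGB so matplotlib shows true colors.
plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
plt.show()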

For every image, we are interested in the confidence score that the model returns alongside the recognized text. With that in hand, we can compare the recognition models on our 20k-image sample of the MJSynth dataset.
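As a rough illustration of pulling the text and confidence out of one prediction, here is a small sketch. It assumes each prediction is a dict with `text` and `score` keys, and that `score` is either a single number or a list of per-character scores. That layout is an assumption, so check it against what `print(result)` shows for your MMOCR version. The helper `text_and_confidence` is hypothetical.

```python
def text_and_confidence(prediction):
    """Return (text, confidence) from one recognition result.

    Assumes `prediction` is a dict with 'text' and 'score' keys. A list of
    per-character scores is reduced to its mean.
    """
    score = prediction["score"]
    if isinstance(score, (list, tuple)):
        score = sum(score) / len(score) if score else 0.0
    return prediction["text"], float(score)
```

Averaging per-character scores is only one possible choice; taking the minimum would penalize a single badly recognized character more strongly.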

We counted a prediction as correct when the predicted text matched the actual text, and as incorrect otherwise. Both strings were lowercased before the comparison.
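The counting rule described above can be sketched in a few lines. The function name `count_matches` and the sample words are made up for illustration; this is not our exact evaluation script.

```python
def count_matches(predicted, actual):
    """Count case-insensitive exact matches between two word lists."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    correct = sum(p.lower() == a.lower() for p, a in zip(predicted, actual))
    return {"correct": correct, "incorrect": len(actual) - correct}


counts = count_matches(["Lube", "hel1o"], ["lube", "hello"])
print(counts)  # {'correct': 1, 'incorrect': 1}
```

Running this once per recognition model on the same sample gives directly comparable correct and incorrect counts.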

Here, we can conclude that ABINet works well on the word-image dataset. Next, we can tweak the ABINet architecture and build a model for our Image Quality Assessment use case. If you haven't read Part-1 yet, go and check it out for the complete story of Image Quality Assessment.



