A testing system for OCR solutions by MIL Team

Many Computer Vision tasks focus on detecting and extracting objects in an image, and some of our projects are related to this task as well.

The latest project by our team is a testing system for Optical Character Recognition (OCR) solutions. An OCR solution consists of a set of methods and models for detecting text in an image and recognizing its characters. Its output is a list of quadrilateral coordinates (boxes) enclosing the words, together with the text inside each box.

By the end of the article you will know:

  1. Why building a testing system for OCR solutions is not a trivial matter.

What’s wrong with existing libraries?

To compute most of the metrics, you first need to match predicted boxes with correct (ground-truth) boxes. Sounds easy, so what’s the catch?

Usually, object correspondence is determined by the IoU (intersection over union) measure: a box is considered correctly predicted when its IoU with a ground-truth box exceeds a set threshold. The most popular Python library for computing detection metrics is Object Detection Metrics. Its main disadvantages are:

  1. It handles only text boxes parallel to the coordinate axes; in other words, it cannot handle oblique text.
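To make this limitation concrete, here is a minimal sketch in plain Python. It is not the Object Detection Metrics API; the function names and the sample coordinates are made up for illustration. It computes the IoU of two axis-aligned boxes, then shows how much area is lost when an oblique word box has to be replaced by its axis-aligned bounding box.

```python
def aabb_iou(a, b):
    """IoU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def quad_area(quad):
    """Area of a simple polygon given as ordered (x, y) vertices (shoelace formula)."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(quad, quad[1:] + quad[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def bounding_box(quad):
    """Smallest axis-aligned box (x_min, y_min, x_max, y_max) containing the quad."""
    xs = [x for x, _ in quad]
    ys = [y for _, y in quad]
    return (min(xs), min(ys), max(xs), max(ys))


# A thin word box rotated by 45 degrees (hypothetical coordinates).
oblique = [(0, 1), (1, 0), (11, 10), (10, 11)]
box = bounding_box(oblique)
box_area = (box[2] - box[0]) * (box[3] - box[1])

# The quad covers only about 17% of its axis-aligned bounding box,
# so an IoU computed on bounding boxes says little about the real overlap.
print(quad_area(oblique) / box_area)
```

A predicted box would then count as correct when `aabb_iou(pred, gt)` exceeds the chosen threshold, which is only meaningful when the text really is axis-aligned.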

What do we suggest?

We suggest building an engine that finds pairs of predicted and correct boxes, keeps the capabilities of the existing library, and addresses its disadvantages. To do this, we use two main tools:

  1. The Shapely library for exact IoU computation. It has a user-friendly interface and works with arbitrary convex polygons.
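As an illustration of how Shapely can be used for this, here is a minimal sketch of IoU for two quadrilaterals. The helper name `quad_iou`, the input format (four ordered `(x, y)` vertices) and the choice to return 0 for invalid polygons are our assumptions for the example, not a description of the actual engine.

```python
from shapely.geometry import Polygon


def quad_iou(quad_a, quad_b):
    """IoU of two quads, each given as four (x, y) vertices in order."""
    poly_a, poly_b = Polygon(quad_a), Polygon(quad_b)
    # Self-intersecting ("bow-tie") quads are invalid; treat them as no match.
    if not (poly_a.is_valid and poly_b.is_valid):
        return 0.0
    inter = poly_a.intersection(poly_b).area
    union = poly_a.area + poly_b.area - inter
    return inter / union if union > 0 else 0.0


# Two 2x2 squares shifted by one unit: intersection 2, union 6.
square_a = [(0, 0), (2, 0), (2, 2), (0, 2)]
square_b = [(1, 0), (3, 0), (3, 2), (1, 2)]
print(quad_iou(square_a, square_b))

# An oblique box compared with itself overlaps completely.
oblique = [(0, 1), (1, 0), (11, 10), (10, 11)]
print(quad_iou(oblique, oblique))
```

Because the intersection is computed on the polygons themselves, rotated boxes need no special handling.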

Using these tools, we suggest implementing the following algorithm:

  1. Compute the centres of all predicted boxes and add them to a KD-tree.
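The idea can be sketched as follows. Only the first step (centres of predicted boxes in a KD-tree) comes from the text above; everything else here is our assumption for the sake of a runnable example: SciPy's `cKDTree` as the KD-tree, the vertex mean as the box centre, querying the `k` nearest predicted centres for each correct box, and the names `match_boxes`, `k` and `iou_threshold`.

```python
import numpy as np
from scipy.spatial import cKDTree
from shapely.geometry import Polygon


def quad_iou(quad_a, quad_b):
    """IoU of two quads given as four ordered (x, y) vertices."""
    poly_a, poly_b = Polygon(quad_a), Polygon(quad_b)
    if not (poly_a.is_valid and poly_b.is_valid):
        return 0.0
    inter = poly_a.intersection(poly_b).area
    union = poly_a.area + poly_b.area - inter
    return inter / union if union > 0 else 0.0


def match_boxes(pred_quads, gt_quads, k=3, iou_threshold=0.5):
    """Return (gt_index, pred_index, iou) for each correct box that has a match.

    Assumed procedure: only the k predicted boxes whose centres are closest
    to the centre of a correct box are checked with the exact IoU.
    """
    if not pred_quads or not gt_quads:
        return []
    # Step 1 from the text: centres of predicted boxes go into a KD-tree.
    pred_centres = np.asarray(pred_quads, dtype=float).mean(axis=1)
    gt_centres = np.asarray(gt_quads, dtype=float).mean(axis=1)
    tree = cKDTree(pred_centres)

    k = min(k, len(pred_quads))
    _, idx = tree.query(gt_centres, k=k)
    idx = np.asarray(idx).reshape(len(gt_quads), -1)  # uniform shape for k == 1

    matches = []
    for gt_i, candidates in enumerate(idx):
        best_iou, best_pred = 0.0, None
        for pred_i in candidates:
            iou = quad_iou(pred_quads[pred_i], gt_quads[gt_i])
            if iou > best_iou:
                best_iou, best_pred = iou, int(pred_i)
        if best_pred is not None and best_iou >= iou_threshold:
            matches.append((gt_i, best_pred, best_iou))
    return matches


gt = [[(0, 0), (4, 0), (4, 2), (0, 2)], [(10, 0), (14, 0), (14, 2), (10, 2)]]
pred = [
    [(10, 0), (14, 0), (14, 2), (10, 2)],
    [(0, 0), (4, 0), (4, 2.2), (0, 2.2)],
    [(50, 50), (54, 50), (54, 52), (50, 52)],
]
for gt_i, pred_i, iou in match_boxes(pred, gt):
    print(gt_i, pred_i, round(iou, 3))
```

This sketch does not enforce one-to-one matching: two correct boxes could in principle pick the same predicted box, so a real engine would need an extra rule for that case.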

Why does this method work?

  1. It finds the best pair because it is a sound heuristic that works especially well when predicted boxes don’t overlap.

All in all, the algorithm by MIL Team processes text documents of several hundred words in less than a second, whereas a quadratic search could take up to half a minute.

We hope this method will be useful for your future research projects. Wishing you fast algorithms and high-quality models!

MIL Team is a united, professional group of researchers, developers and engineers conducting R&D projects in the field of Artificial Intelligence.
