Detect Text vs Picture for OCR


Post by gvandyk » 2016-02-23T02:22:25-07:00

I have an image that contains both text and pictures.

I need to clean the image for OCR and am using Fred's textcleaner script. Unfortunately, because the script cleans the image to make it ready for OCR, I lose the pictures embedded in the page.

What is the best approach to run the textcleaner script only on images that contain nothing but text, and to skip it for images that contain both pictures and text?

Or, what do you suggest doing with images like this one, which contain both text and pictures, before OCR?

Example:

https://www.dropbox.com/s/0vqr1oo8jqt9dw2/page.jpg?dl=0
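
To make the question concrete, here is a minimal sketch of the kind of check being asked about. It is not taken from textcleaner or from ImageMagick. It assumes the page is already available as a flat sequence of 0-255 grayscale values (loading the pixels is left out), and it guesses that a page is text-only when very few pixels are mid-tones, since scanned text is mostly dark ink on a light background while photographs tend to have many gray levels. The function names and the thresholds (64, 192, 0.10) are hypothetical and would need tuning on real scans.

```python
from typing import Sequence


def midtone_fraction(gray: Sequence[int], low: int = 64, high: int = 192) -> float:
    """Return the fraction of pixels strictly between low and high (0-255 scale)."""
    if not gray:
        raise ValueError("empty image")
    mid = sum(1 for v in gray if low < v < high)
    return mid / len(gray)


def looks_text_only(gray: Sequence[int], max_midtone: float = 0.10) -> bool:
    """Guess that a page is text-only when few of its pixels are mid-tones.

    The 0.10 cutoff is an assumed starting point, not a measured value.
    """
    return midtone_fraction(gray) <= max_midtone


# Synthetic stand-ins for real pages (assumption: real pixel data would be
# read from the scanned image instead).
text_page = [255] * 900 + [0] * 100                               # white paper, black ink
mixed_page = [255] * 500 + [0] * 100 + list(range(70, 190, 3)) * 10  # plus a gray "photo" area

for name, page in (("text_page", text_page), ("mixed_page", mixed_page)):
    action = "run textcleaner" if looks_text_only(page) else "skip textcleaner"
    print(name, action)
```

A check like this would only decide whether to run textcleaner on a whole page. It would likely misjudge pages with anti-aliased or heavily JPEG-compressed text, and it cannot tell line-art pictures from text.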
