Check if image contains text

Post by galv »

I want to check if an image contains text. I know I can run OCR on it but I want it to be faster than that. If it contains text then it should OCR, if not it should discard the image.

Any ideas?

Post by anthony »

Comparing images, sorting images by type...
http://www.imagemagick.org/Usage/compare/#type_general

Text vs Line Drawing -- text is lots of small disconnected objects, typically in rows.
Anthony Thyssen -- Webmaster for ImageMagick Example Pages
https://imagemagick.org/Usage/

Post by whugemann »

You should describe the task more specifically: do you want to check whether the image contains MOSTLY text or any text at all? The latter will probably be rather difficult, but the former should be easier. A page that contains only text is almost purely black and white and should have an average grey value of roughly 80%. Therefore, some checks on the histogram should give you an idea of whether the image is potentially a text page.
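A rough sketch of such a histogram check, written in Python rather than ImageMagick. It assumes the 8-bit grey values of the image have already been read into a list (how they are obtained is left open). The function names and all cut-off values (`low`, `high`, `mean_range`, `min_extremes`) are illustrative guesses, not tested recommendations.

```python
def grey_stats(pixels, low=64, high=192):
    """Return (mean brightness 0..1, share of near-black or near-white pixels).

    pixels: 8-bit grey values, 0 = black, 255 = white.
    """
    pixels = list(pixels)
    total = len(pixels)
    mean = sum(pixels) / (255 * total)
    extremes = sum(1 for p in pixels if p <= low or p >= high) / total
    return mean, extremes


def may_be_text_page(pixels, mean_range=(0.70, 0.95), min_extremes=0.90):
    """Cheap pre-filter: mostly black-and-white, with a bright average."""
    mean, extremes = grey_stats(pixels)
    return mean_range[0] <= mean <= mean_range[1] and extremes >= min_extremes


page = [255] * 85 + [0] * 15   # synthetic "page": 85% white, 15% black
wall = [128] * 100             # synthetic uniform mid-grey "wall"
print(may_be_text_page(page), may_be_text_page(wall))  # prints: True False
```

A check like this can only reject obvious non-text images cheaply; a white wall with a few dark marks would pass it, so it is a pre-filter rather than a decision.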

You could also try rotating the image by, say, 5°, then run '-deskew' and check whether its result differs significantly from the original. If so, the image probably contains text.
Wolfgang Hugemann

Post by galv »

whugemann, I want to check whether the image contains mostly text. Like if a webcam is looking at a wall or at a book. Can you please give an example of checking the histogram like you say?

anthony, I think that discerning the small disconnected objects in rows would take more time than doing OCR on the image.

Post by fmw42 »

If you have a page of text, then it should be arranged in rows. If you -deskew the image to correct for any rotation, you can then average the image down to one column using -scale and either convert it to txt format or make a profile. Then look for alternating bright and dark bands. You can even threshold the column so that the bands are pure black and white. If they are regularly spaced, then the image is likely text. If there is noise in the text, remove it first using -statistic median, or -morphology open or close depending upon the polarity of the image (black on white, or white on black).

Input:
(input image not preserved)

convert text.jpg -deskew 40% -threshold 50% -scale 1x! -negate -threshold 0 -negate txt:

# ImageMagick pixel enumeration: 1,214,255,gray
0,0: (255,255,255)  #FFFFFF  gray(255,255,255)
0,1: (255,255,255)  #FFFFFF  gray(255,255,255)
0,2: (255,255,255)  #FFFFFF  gray(255,255,255)
0,3: (255,255,255)  #FFFFFF  gray(255,255,255)
0,4: (255,255,255)  #FFFFFF  gray(255,255,255)
0,5: (  0,  0,  0)  #000000  gray(0,0,0)
0,6: (  0,  0,  0)  #000000  gray(0,0,0)
0,7: (  0,  0,  0)  #000000  gray(0,0,0)
0,8: (  0,  0,  0)  #000000  gray(0,0,0)
0,9: (  0,  0,  0)  #000000  gray(0,0,0)
0,10: (  0,  0,  0)  #000000  gray(0,0,0)
0,11: (  0,  0,  0)  #000000  gray(0,0,0)
0,12: (  0,  0,  0)  #000000  gray(0,0,0)
0,13: (  0,  0,  0)  #000000  gray(0,0,0)
0,14: (  0,  0,  0)  #000000  gray(0,0,0)
0,15: (  0,  0,  0)  #000000  gray(0,0,0)
0,16: (  0,  0,  0)  #000000  gray(0,0,0)
0,17: (  0,  0,  0)  #000000  gray(0,0,0)
0,18: (  0,  0,  0)  #000000  gray(0,0,0)
0,19: (  0,  0,  0)  #000000  gray(0,0,0)
0,20: (  0,  0,  0)  #000000  gray(0,0,0)
0,21: (  0,  0,  0)  #000000  gray(0,0,0)
0,22: (  0,  0,  0)  #000000  gray(0,0,0)
0,23: (255,255,255)  #FFFFFF  gray(255,255,255)
0,24: (255,255,255)  #FFFFFF  gray(255,255,255)
0,25: (255,255,255)  #FFFFFF  gray(255,255,255)
0,26: (255,255,255)  #FFFFFF  gray(255,255,255)
0,27: (255,255,255)  #FFFFFF  gray(255,255,255)
0,28: (  0,  0,  0)  #000000  gray(0,0,0)
0,29: (  0,  0,  0)  #000000  gray(0,0,0)
0,30: (  0,  0,  0)  #000000  gray(0,0,0)
0,31: (  0,  0,  0)  #000000  gray(0,0,0)
0,32: (  0,  0,  0)  #000000  gray(0,0,0)
0,33: (  0,  0,  0)  #000000  gray(0,0,0)
0,34: (  0,  0,  0)  #000000  gray(0,0,0)
0,35: (  0,  0,  0)  #000000  gray(0,0,0)
0,36: (  0,  0,  0)  #000000  gray(0,0,0)
0,37: (  0,  0,  0)  #000000  gray(0,0,0)
0,38: (  0,  0,  0)  #000000  gray(0,0,0)
0,39: (  0,  0,  0)  #000000  gray(0,0,0)
0,40: (  0,  0,  0)  #000000  gray(0,0,0)
0,41: (  0,  0,  0)  #000000  gray(0,0,0)
0,42: (  0,  0,  0)  #000000  gray(0,0,0)
0,43: (  0,  0,  0)  #000000  gray(0,0,0)
0,44: (  0,  0,  0)  #000000  gray(0,0,0)
0,45: (  0,  0,  0)  #000000  gray(0,0,0)
0,46: (  0,  0,  0)  #000000  gray(0,0,0)
0,47: (255,255,255)  #FFFFFF  gray(255,255,255)
0,48: (255,255,255)  #FFFFFF  gray(255,255,255)
0,49: (255,255,255)  #FFFFFF  gray(255,255,255)
0,50: (255,255,255)  #FFFFFF  gray(255,255,255)
0,51: (  0,  0,  0)  #000000  gray(0,0,0)
0,52: (  0,  0,  0)  #000000  gray(0,0,0)
0,53: (  0,  0,  0)  #000000  gray(0,0,0)
0,54: (  0,  0,  0)  #000000  gray(0,0,0)
0,55: (  0,  0,  0)  #000000  gray(0,0,0)
0,56: (  0,  0,  0)  #000000  gray(0,0,0)
0,57: (  0,  0,  0)  #000000  gray(0,0,0)
0,58: (  0,  0,  0)  #000000  gray(0,0,0)
0,59: (  0,  0,  0)  #000000  gray(0,0,0)
0,60: (  0,  0,  0)  #000000  gray(0,0,0)
0,61: (  0,  0,  0)  #000000  gray(0,0,0)
0,62: (  0,  0,  0)  #000000  gray(0,0,0)
0,63: (  0,  0,  0)  #000000  gray(0,0,0)
0,64: (  0,  0,  0)  #000000  gray(0,0,0)
0,65: (  0,  0,  0)  #000000  gray(0,0,0)
0,66: (  0,  0,  0)  #000000  gray(0,0,0)
0,67: (  0,  0,  0)  #000000  gray(0,0,0)
0,68: (  0,  0,  0)  #000000  gray(0,0,0)
0,69: (255,255,255)  #FFFFFF  gray(255,255,255)
0,70: (255,255,255)  #FFFFFF  gray(255,255,255)
0,71: (255,255,255)  #FFFFFF  gray(255,255,255)
0,72: (255,255,255)  #FFFFFF  gray(255,255,255)
0,73: (255,255,255)  #FFFFFF  gray(255,255,255)
0,74: (  0,  0,  0)  #000000  gray(0,0,0)
0,75: (  0,  0,  0)  #000000  gray(0,0,0)
0,76: (  0,  0,  0)  #000000  gray(0,0,0)
0,77: (  0,  0,  0)  #000000  gray(0,0,0)
0,78: (  0,  0,  0)  #000000  gray(0,0,0)
0,79: (  0,  0,  0)  #000000  gray(0,0,0)
0,80: (  0,  0,  0)  #000000  gray(0,0,0)
0,81: (  0,  0,  0)  #000000  gray(0,0,0)
0,82: (  0,  0,  0)  #000000  gray(0,0,0)
0,83: (  0,  0,  0)  #000000  gray(0,0,0)
0,84: (  0,  0,  0)  #000000  gray(0,0,0)
0,85: (  0,  0,  0)  #000000  gray(0,0,0)
0,86: (  0,  0,  0)  #000000  gray(0,0,0)
0,87: (  0,  0,  0)  #000000  gray(0,0,0)
0,88: (  0,  0,  0)  #000000  gray(0,0,0)
0,89: (  0,  0,  0)  #000000  gray(0,0,0)
0,90: (  0,  0,  0)  #000000  gray(0,0,0)
0,91: (  0,  0,  0)  #000000  gray(0,0,0)
0,92: (  0,  0,  0)  #000000  gray(0,0,0)
0,93: (255,255,255)  #FFFFFF  gray(255,255,255)
0,94: (  0,  0,  0)  #000000  gray(0,0,0)
0,95: (  0,  0,  0)  #000000  gray(0,0,0)
0,96: (255,255,255)  #FFFFFF  gray(255,255,255)
0,97: (255,255,255)  #FFFFFF  gray(255,255,255)
0,98: (  0,  0,  0)  #000000  gray(0,0,0)
0,99: (  0,  0,  0)  #000000  gray(0,0,0)
0,100: (  0,  0,  0)  #000000  gray(0,0,0)
0,101: (  0,  0,  0)  #000000  gray(0,0,0)
0,102: (  0,  0,  0)  #000000  gray(0,0,0)
0,103: (  0,  0,  0)  #000000  gray(0,0,0)
0,104: (  0,  0,  0)  #000000  gray(0,0,0)
0,105: (  0,  0,  0)  #000000  gray(0,0,0)
0,106: (  0,  0,  0)  #000000  gray(0,0,0)
0,107: (  0,  0,  0)  #000000  gray(0,0,0)
0,108: (  0,  0,  0)  #000000  gray(0,0,0)
0,109: (  0,  0,  0)  #000000  gray(0,0,0)
0,110: (  0,  0,  0)  #000000  gray(0,0,0)
0,111: (  0,  0,  0)  #000000  gray(0,0,0)
0,112: (  0,  0,  0)  #000000  gray(0,0,0)
0,113: (  0,  0,  0)  #000000  gray(0,0,0)
0,114: (  0,  0,  0)  #000000  gray(0,0,0)
0,115: (  0,  0,  0)  #000000  gray(0,0,0)
0,116: (255,255,255)  #FFFFFF  gray(255,255,255)
0,117: (255,255,255)  #FFFFFF  gray(255,255,255)
0,118: (255,255,255)  #FFFFFF  gray(255,255,255)
0,119: (255,255,255)  #FFFFFF  gray(255,255,255)
0,120: (  0,  0,  0)  #000000  gray(0,0,0)
0,121: (  0,  0,  0)  #000000  gray(0,0,0)
0,122: (  0,  0,  0)  #000000  gray(0,0,0)
0,123: (  0,  0,  0)  #000000  gray(0,0,0)
0,124: (  0,  0,  0)  #000000  gray(0,0,0)
0,125: (  0,  0,  0)  #000000  gray(0,0,0)
0,126: (  0,  0,  0)  #000000  gray(0,0,0)
0,127: (  0,  0,  0)  #000000  gray(0,0,0)
0,128: (  0,  0,  0)  #000000  gray(0,0,0)
0,129: (  0,  0,  0)  #000000  gray(0,0,0)
0,130: (  0,  0,  0)  #000000  gray(0,0,0)
0,131: (  0,  0,  0)  #000000  gray(0,0,0)
0,132: (  0,  0,  0)  #000000  gray(0,0,0)
0,133: (  0,  0,  0)  #000000  gray(0,0,0)
0,134: (  0,  0,  0)  #000000  gray(0,0,0)
0,135: (  0,  0,  0)  #000000  gray(0,0,0)
0,136: (  0,  0,  0)  #000000  gray(0,0,0)
0,137: (  0,  0,  0)  #000000  gray(0,0,0)
0,138: (  0,  0,  0)  #000000  gray(0,0,0)
0,139: (  0,  0,  0)  #000000  gray(0,0,0)
0,140: (255,255,255)  #FFFFFF  gray(255,255,255)
0,141: (255,255,255)  #FFFFFF  gray(255,255,255)
0,142: (255,255,255)  #FFFFFF  gray(255,255,255)
0,143: (255,255,255)  #FFFFFF  gray(255,255,255)
0,144: (  0,  0,  0)  #000000  gray(0,0,0)
0,145: (  0,  0,  0)  #000000  gray(0,0,0)
0,146: (  0,  0,  0)  #000000  gray(0,0,0)
0,147: (  0,  0,  0)  #000000  gray(0,0,0)
0,148: (  0,  0,  0)  #000000  gray(0,0,0)
0,149: (  0,  0,  0)  #000000  gray(0,0,0)
0,150: (  0,  0,  0)  #000000  gray(0,0,0)
0,151: (  0,  0,  0)  #000000  gray(0,0,0)
0,152: (  0,  0,  0)  #000000  gray(0,0,0)
0,153: (  0,  0,  0)  #000000  gray(0,0,0)
0,154: (  0,  0,  0)  #000000  gray(0,0,0)
0,155: (  0,  0,  0)  #000000  gray(0,0,0)
0,156: (  0,  0,  0)  #000000  gray(0,0,0)
0,157: (  0,  0,  0)  #000000  gray(0,0,0)
0,158: (  0,  0,  0)  #000000  gray(0,0,0)
0,159: (  0,  0,  0)  #000000  gray(0,0,0)
0,160: (  0,  0,  0)  #000000  gray(0,0,0)
0,161: (  0,  0,  0)  #000000  gray(0,0,0)
0,162: (  0,  0,  0)  #000000  gray(0,0,0)
0,163: (255,255,255)  #FFFFFF  gray(255,255,255)
0,164: (255,255,255)  #FFFFFF  gray(255,255,255)
0,165: (  0,  0,  0)  #000000  gray(0,0,0)
0,166: (  0,  0,  0)  #000000  gray(0,0,0)
0,167: (  0,  0,  0)  #000000  gray(0,0,0)
0,168: (  0,  0,  0)  #000000  gray(0,0,0)
0,169: (  0,  0,  0)  #000000  gray(0,0,0)
0,170: (  0,  0,  0)  #000000  gray(0,0,0)
0,171: (  0,  0,  0)  #000000  gray(0,0,0)
0,172: (  0,  0,  0)  #000000  gray(0,0,0)
0,173: (  0,  0,  0)  #000000  gray(0,0,0)
0,174: (  0,  0,  0)  #000000  gray(0,0,0)
0,175: (  0,  0,  0)  #000000  gray(0,0,0)
0,176: (  0,  0,  0)  #000000  gray(0,0,0)
0,177: (  0,  0,  0)  #000000  gray(0,0,0)
0,178: (  0,  0,  0)  #000000  gray(0,0,0)
0,179: (  0,  0,  0)  #000000  gray(0,0,0)
0,180: (  0,  0,  0)  #000000  gray(0,0,0)
0,181: (  0,  0,  0)  #000000  gray(0,0,0)
0,182: (  0,  0,  0)  #000000  gray(0,0,0)
0,183: (  0,  0,  0)  #000000  gray(0,0,0)
0,184: (  0,  0,  0)  #000000  gray(0,0,0)
0,185: (255,255,255)  #FFFFFF  gray(255,255,255)
0,186: (255,255,255)  #FFFFFF  gray(255,255,255)
0,187: (255,255,255)  #FFFFFF  gray(255,255,255)
0,188: (255,255,255)  #FFFFFF  gray(255,255,255)
0,189: (255,255,255)  #FFFFFF  gray(255,255,255)
0,190: (  0,  0,  0)  #000000  gray(0,0,0)
0,191: (  0,  0,  0)  #000000  gray(0,0,0)
0,192: (  0,  0,  0)  #000000  gray(0,0,0)
0,193: (  0,  0,  0)  #000000  gray(0,0,0)
0,194: (  0,  0,  0)  #000000  gray(0,0,0)
0,195: (  0,  0,  0)  #000000  gray(0,0,0)
0,196: (  0,  0,  0)  #000000  gray(0,0,0)
0,197: (  0,  0,  0)  #000000  gray(0,0,0)
0,198: (  0,  0,  0)  #000000  gray(0,0,0)
0,199: (  0,  0,  0)  #000000  gray(0,0,0)
0,200: (  0,  0,  0)  #000000  gray(0,0,0)
0,201: (  0,  0,  0)  #000000  gray(0,0,0)
0,202: (  0,  0,  0)  #000000  gray(0,0,0)
0,203: (  0,  0,  0)  #000000  gray(0,0,0)
0,204: (  0,  0,  0)  #000000  gray(0,0,0)
0,205: (  0,  0,  0)  #000000  gray(0,0,0)
0,206: (  0,  0,  0)  #000000  gray(0,0,0)
0,207: (  0,  0,  0)  #000000  gray(0,0,0)
0,208: (  0,  0,  0)  #000000  gray(0,0,0)
0,209: (255,255,255)  #FFFFFF  gray(255,255,255)
0,210: (255,255,255)  #FFFFFF  gray(255,255,255)
0,211: (255,255,255)  #FFFFFF  gray(255,255,255)
0,212: (255,255,255)  #FFFFFF  gray(255,255,255)
0,213: (255,255,255)  #FFFFFF  gray(255,255,255)


convert text.jpg -deskew 40% -threshold 50% -scale 1x! -negate -threshold 0 -negate -rotate 90 miff:- | im_profile - text_profile.gif

(output image text_profile.gif not preserved)

Post by whugemann »

In regard to the histogram check, I have no ready-made answer. But if you check Fred's example, you will find that the histogram has a pronounced peak at the brighter values (which represents the white background), while the rest of the grey values are distributed almost evenly. This could be used for a first rough automatic text check.

Fred's method is more sophisticated, but lacks the final step of full automation, i.e. recognition of the pattern (?). You could try comparing the result of Fred's manipulation to a standard black-and-white strip and define a certain threshold as to when the original image has to be regarded as text.

Post by fmw42 »

Just measure the width and spacing of the white areas from the txt output to see whether they are regular.
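One way to do that measurement, sketched in Python on the `txt:` output shown earlier in the thread. The function names and the thresholds (`min_lines`, `max_cv`, `min_run`) are assumptions for illustration. The sketch measures the dark bands (the text rows after the negate/threshold/negate step); the white gaps could be tested the same way.

```python
import re
import statistics

# Matches lines such as "0,5: (  0,  0,  0)  #000000  gray(0,0,0)"
PIXEL_RE = re.compile(r"^\s*\d+,(\d+):\s*\(\s*(\d+)")


def parse_profile(txt_output):
    """Return the first channel value of each pixel, ordered by row (y)."""
    rows = {}
    for line in txt_output.splitlines():
        match = PIXEL_RE.match(line)
        if match:  # the '# ImageMagick pixel enumeration' header is skipped
            rows[int(match.group(1))] = int(match.group(2))
    return [rows[y] for y in sorted(rows)]


def run_lengths(values, threshold=128):
    """Run-length encode the profile as (is_dark, length) pairs."""
    runs = []
    for value in values:
        dark = value < threshold
        if runs and runs[-1][0] == dark:
            runs[-1][1] += 1
        else:
            runs.append([dark, 1])
    return [(dark, length) for dark, length in runs]


def looks_like_text(values, min_lines=3, max_cv=0.25, min_run=3):
    """Heuristic: several dark bands of similar height suggest rows of text."""
    dark = [n for is_dark, n in run_lengths(values) if is_dark and n >= min_run]
    if len(dark) < min_lines:
        return False
    # Coefficient of variation: spread of band heights relative to their mean
    cv = statistics.pstdev(dark) / statistics.mean(dark)
    return cv <= max_cv


# Synthetic stand-in for real `txt:` output: three 18-row bands, 5-row gaps.
values = ([255] * 5 + [0] * 18) * 3 + [255] * 5
sample = "# ImageMagick pixel enumeration: 1,%d,255,gray\n" % len(values)
sample += "\n".join("0,%d: (%3d,%3d,%3d)" % (y, v, v, v)
                    for y, v in enumerate(values))
print(looks_like_text(parse_profile(sample)))  # prints: True
```

The `min_run` filter drops very short bands, such as the two-row band around rows 94-95 in the listing above, before the regularity test.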

Post by anthony »

galv wrote: anthony, I think that discerning the small disconnected objects in rows would take more time than doing OCR on the image.
Actually that is exactly what OCR does, though as a specialised bit of software it would do this faster!

However, there is a two-pass morphology method that is VERY fast and gives you a count of how many distinct objects are present in a thresholded (pure binary) image. It is called 'labeling', but has not been implemented in IM (lack of time). I really should do that - some time.
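For reference, the two-pass idea can be sketched in plain Python as below. This is only an illustration of the general technique (a raster scan that assigns provisional labels and records equivalences in a union-find table, followed by a pass that resolves them), not code from ImageMagick or from Fred's script. `count_objects` is a made-up name, the image is assumed to be a rectangular list of rows of 0/1 values, and 8-connectivity is an arbitrary choice.

```python
def count_objects(image):
    """Count 8-connected foreground regions in a binary image (rows of 0/1)."""
    parent = [0]  # union-find table; index 0 is reserved for the background

    def find(label):
        # Follow parent links to the root, shortening the path on the way
        while parent[label] != label:
            parent[label] = parent[parent[label]]
            label = parent[label]
        return label

    labels = [[0] * len(row) for row in image]

    # Pass 1: provisional labels, recording which labels touch each other
    for y, row in enumerate(image):
        for x, pixel in enumerate(row):
            if not pixel:
                continue
            neighbours = []  # already-visited neighbours: left and row above
            if x > 0 and labels[y][x - 1]:
                neighbours.append(labels[y][x - 1])
            if y > 0:
                for nx in (x - 1, x, x + 1):
                    if 0 <= nx < len(row) and labels[y - 1][nx]:
                        neighbours.append(labels[y - 1][nx])
            if not neighbours:
                parent.append(len(parent))  # new label is its own root
                labels[y][x] = len(parent) - 1
            else:
                roots = [find(n) for n in neighbours]
                smallest = min(roots)
                labels[y][x] = smallest
                for root in roots:
                    parent[root] = smallest  # merge equivalent labels

    # Pass 2: resolve every label to its root and count the distinct roots
    return len({find(label) for row in labels for label in row if label})


image = [
    [1, 1, 0, 0, 1],
    [0, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [1, 0, 1, 0, 1],
]
print(count_objects(image))  # prints: 5
```

A full labeling pass would write the resolved labels back into the image; since only the object count is wanted here, the second pass just counts distinct roots.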

Slower techniques do exist, however, including the row segmentation script divide_vert and a script by Fred for labeling.

whugemann, the rotate-then-deskew test is a nice idea, especially if you then resize the result to a single column and look for a 'square-wave' pattern indicating rows of text.

The 80% grey check should be combined with, and be part of, the initial 'mostly black and white' test, even before the 'deskew' step.

Post by anthony »

whugemann wrote: Fred's method is more sophisticated, but lacks the final step of full automation, i.e. recognition of the pattern (?). You could try comparing the result of Fred's manipulation to a standard black-and-white strip and define a certain threshold as to when the original image has to be regarded as text.
Has anyone thought of looking at the Fourier transform of the text? That should generate a very distinct pattern, which can be looked for in an automated way, regardless of rotation (though it is still better to de-skew it).

It should also let you extract the period of the rows, even if that period is NOT a power of two.
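As a minimal sketch of the one-dimensional version of this idea, the Python below takes a single-column brightness profile (such as the one produced by the -scale 1x! step earlier in the thread) and reports the strongest period. It uses a naive discrete Fourier transform from the standard library, which works for any length, not only powers of two. The function name `dominant_period` and the returned "share" measure are assumptions made for illustration.

```python
import cmath


def dominant_period(profile):
    """Return (period in rows, share of non-DC spectral energy) of the
    strongest frequency in a 1-D brightness profile, or (None, 0.0) if flat."""
    n = len(profile)
    mean = sum(profile) / n
    centred = [v - mean for v in profile]  # remove the DC component
    energies = {}
    for k in range(1, n // 2 + 1):  # naive O(n^2) DFT; fine for one column
        coeff = sum(v * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, v in enumerate(centred))
        energies[k] = abs(coeff) ** 2
    total = sum(energies.values())
    if total == 0:
        return None, 0.0
    best = max(energies, key=energies.get)
    return n / best, energies[best] / total


# Synthetic profile: 18 dark rows plus 6 bright rows, repeated 8 times
profile = ([0] * 18 + [255] * 6) * 8
period, share = dominant_period(profile)
print(period)  # prints: 24.0
```

The period comes out as n/k for an integer k, so a row pitch that does not divide the column height evenly is spread over neighbouring frequency bins and should be treated as an estimate. A high share for one frequency (and its harmonics) is what suggests evenly spaced rows.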