Check if image contains text

Posted: 2012-06-21T16:17:01-07:00
by galv
I want to check whether an image contains text. I know I can run OCR on it, but I want something faster than that. If the image contains text, it should be passed to OCR; if not, it should be discarded.

Any ideas?

Re: Check if image contains text

Posted: 2012-06-21T19:21:27-07:00
by anthony
Comparing images, sorting images by type...
http://www.imagemagick.org/Usage/compare/#type_general

Text vs Line Drawing -- text is lots of small disconnected objects, typically in rows.

Re: Check if image contains text

Posted: 2012-06-22T00:38:48-07:00
by whugemann
You should describe the task more specifically: do you want to check whether the image contains MOSTLY text or ANY text? The latter will probably be rather difficult, but the former should be easier. A page that contains only text is almost purely black and white and should have an average grey value of roughly 80%. Therefore, some checks on the histogram should give you an idea of whether the image is potentially a text page.

You could also try to rotate the image by, say, 5°, then run '-deskew' and check whether the result differs significantly from the original. If so, the image probably contains text.

Re: Check if image contains text

Posted: 2012-06-22T07:53:15-07:00
by galv
whugemann, I want to check whether the image contains mostly text. Like if a webcam is looking at a wall or at a book. Can you please give an example of checking the histogram like you say?

anthony, I think that discerning the small disconnected objects in rows would take more time than doing OCR on the image.

Re: Check if image contains text

Posted: 2012-06-22T10:26:44-07:00
by fmw42
If you have a page of text, it should be arranged in rows. If you -deskew the image to correct any rotation, you can then average the image down to one column using -scale and either convert it to txt: format or make a profile. Then look for alternating bright and dark bands. You can even threshold it so that the bands are pure black and white. If the bands are regularly spaced, the image is likely text. If there is noise in the text, remove it first with -statistic median, or with -morphology open or close depending on the polarity of the image (black on white, or white on black).

Input:
Image


convert text.jpg -deskew 40% -threshold 50% -scale 1x! -negate -threshold 0 -negate txt:

# ImageMagick pixel enumeration: 1,214,255,gray
0,0: (255,255,255)  #FFFFFF  gray(255,255,255)
0,1: (255,255,255)  #FFFFFF  gray(255,255,255)
0,2: (255,255,255)  #FFFFFF  gray(255,255,255)
0,3: (255,255,255)  #FFFFFF  gray(255,255,255)
0,4: (255,255,255)  #FFFFFF  gray(255,255,255)
0,5: (  0,  0,  0)  #000000  gray(0,0,0)
0,6: (  0,  0,  0)  #000000  gray(0,0,0)
0,7: (  0,  0,  0)  #000000  gray(0,0,0)
0,8: (  0,  0,  0)  #000000  gray(0,0,0)
0,9: (  0,  0,  0)  #000000  gray(0,0,0)
0,10: (  0,  0,  0)  #000000  gray(0,0,0)
0,11: (  0,  0,  0)  #000000  gray(0,0,0)
0,12: (  0,  0,  0)  #000000  gray(0,0,0)
0,13: (  0,  0,  0)  #000000  gray(0,0,0)
0,14: (  0,  0,  0)  #000000  gray(0,0,0)
0,15: (  0,  0,  0)  #000000  gray(0,0,0)
0,16: (  0,  0,  0)  #000000  gray(0,0,0)
0,17: (  0,  0,  0)  #000000  gray(0,0,0)
0,18: (  0,  0,  0)  #000000  gray(0,0,0)
0,19: (  0,  0,  0)  #000000  gray(0,0,0)
0,20: (  0,  0,  0)  #000000  gray(0,0,0)
0,21: (  0,  0,  0)  #000000  gray(0,0,0)
0,22: (  0,  0,  0)  #000000  gray(0,0,0)
0,23: (255,255,255)  #FFFFFF  gray(255,255,255)
0,24: (255,255,255)  #FFFFFF  gray(255,255,255)
0,25: (255,255,255)  #FFFFFF  gray(255,255,255)
0,26: (255,255,255)  #FFFFFF  gray(255,255,255)
0,27: (255,255,255)  #FFFFFF  gray(255,255,255)
0,28: (  0,  0,  0)  #000000  gray(0,0,0)
0,29: (  0,  0,  0)  #000000  gray(0,0,0)
0,30: (  0,  0,  0)  #000000  gray(0,0,0)
0,31: (  0,  0,  0)  #000000  gray(0,0,0)
0,32: (  0,  0,  0)  #000000  gray(0,0,0)
0,33: (  0,  0,  0)  #000000  gray(0,0,0)
0,34: (  0,  0,  0)  #000000  gray(0,0,0)
0,35: (  0,  0,  0)  #000000  gray(0,0,0)
0,36: (  0,  0,  0)  #000000  gray(0,0,0)
0,37: (  0,  0,  0)  #000000  gray(0,0,0)
0,38: (  0,  0,  0)  #000000  gray(0,0,0)
0,39: (  0,  0,  0)  #000000  gray(0,0,0)
0,40: (  0,  0,  0)  #000000  gray(0,0,0)
0,41: (  0,  0,  0)  #000000  gray(0,0,0)
0,42: (  0,  0,  0)  #000000  gray(0,0,0)
0,43: (  0,  0,  0)  #000000  gray(0,0,0)
0,44: (  0,  0,  0)  #000000  gray(0,0,0)
0,45: (  0,  0,  0)  #000000  gray(0,0,0)
0,46: (  0,  0,  0)  #000000  gray(0,0,0)
0,47: (255,255,255)  #FFFFFF  gray(255,255,255)
0,48: (255,255,255)  #FFFFFF  gray(255,255,255)
0,49: (255,255,255)  #FFFFFF  gray(255,255,255)
0,50: (255,255,255)  #FFFFFF  gray(255,255,255)
0,51: (  0,  0,  0)  #000000  gray(0,0,0)
0,52: (  0,  0,  0)  #000000  gray(0,0,0)
0,53: (  0,  0,  0)  #000000  gray(0,0,0)
0,54: (  0,  0,  0)  #000000  gray(0,0,0)
0,55: (  0,  0,  0)  #000000  gray(0,0,0)
0,56: (  0,  0,  0)  #000000  gray(0,0,0)
0,57: (  0,  0,  0)  #000000  gray(0,0,0)
0,58: (  0,  0,  0)  #000000  gray(0,0,0)
0,59: (  0,  0,  0)  #000000  gray(0,0,0)
0,60: (  0,  0,  0)  #000000  gray(0,0,0)
0,61: (  0,  0,  0)  #000000  gray(0,0,0)
0,62: (  0,  0,  0)  #000000  gray(0,0,0)
0,63: (  0,  0,  0)  #000000  gray(0,0,0)
0,64: (  0,  0,  0)  #000000  gray(0,0,0)
0,65: (  0,  0,  0)  #000000  gray(0,0,0)
0,66: (  0,  0,  0)  #000000  gray(0,0,0)
0,67: (  0,  0,  0)  #000000  gray(0,0,0)
0,68: (  0,  0,  0)  #000000  gray(0,0,0)
0,69: (255,255,255)  #FFFFFF  gray(255,255,255)
0,70: (255,255,255)  #FFFFFF  gray(255,255,255)
0,71: (255,255,255)  #FFFFFF  gray(255,255,255)
0,72: (255,255,255)  #FFFFFF  gray(255,255,255)
0,73: (255,255,255)  #FFFFFF  gray(255,255,255)
0,74: (  0,  0,  0)  #000000  gray(0,0,0)
0,75: (  0,  0,  0)  #000000  gray(0,0,0)
0,76: (  0,  0,  0)  #000000  gray(0,0,0)
0,77: (  0,  0,  0)  #000000  gray(0,0,0)
0,78: (  0,  0,  0)  #000000  gray(0,0,0)
0,79: (  0,  0,  0)  #000000  gray(0,0,0)
0,80: (  0,  0,  0)  #000000  gray(0,0,0)
0,81: (  0,  0,  0)  #000000  gray(0,0,0)
0,82: (  0,  0,  0)  #000000  gray(0,0,0)
0,83: (  0,  0,  0)  #000000  gray(0,0,0)
0,84: (  0,  0,  0)  #000000  gray(0,0,0)
0,85: (  0,  0,  0)  #000000  gray(0,0,0)
0,86: (  0,  0,  0)  #000000  gray(0,0,0)
0,87: (  0,  0,  0)  #000000  gray(0,0,0)
0,88: (  0,  0,  0)  #000000  gray(0,0,0)
0,89: (  0,  0,  0)  #000000  gray(0,0,0)
0,90: (  0,  0,  0)  #000000  gray(0,0,0)
0,91: (  0,  0,  0)  #000000  gray(0,0,0)
0,92: (  0,  0,  0)  #000000  gray(0,0,0)
0,93: (255,255,255)  #FFFFFF  gray(255,255,255)
0,94: (  0,  0,  0)  #000000  gray(0,0,0)
0,95: (  0,  0,  0)  #000000  gray(0,0,0)
0,96: (255,255,255)  #FFFFFF  gray(255,255,255)
0,97: (255,255,255)  #FFFFFF  gray(255,255,255)
0,98: (  0,  0,  0)  #000000  gray(0,0,0)
0,99: (  0,  0,  0)  #000000  gray(0,0,0)
0,100: (  0,  0,  0)  #000000  gray(0,0,0)
0,101: (  0,  0,  0)  #000000  gray(0,0,0)
0,102: (  0,  0,  0)  #000000  gray(0,0,0)
0,103: (  0,  0,  0)  #000000  gray(0,0,0)
0,104: (  0,  0,  0)  #000000  gray(0,0,0)
0,105: (  0,  0,  0)  #000000  gray(0,0,0)
0,106: (  0,  0,  0)  #000000  gray(0,0,0)
0,107: (  0,  0,  0)  #000000  gray(0,0,0)
0,108: (  0,  0,  0)  #000000  gray(0,0,0)
0,109: (  0,  0,  0)  #000000  gray(0,0,0)
0,110: (  0,  0,  0)  #000000  gray(0,0,0)
0,111: (  0,  0,  0)  #000000  gray(0,0,0)
0,112: (  0,  0,  0)  #000000  gray(0,0,0)
0,113: (  0,  0,  0)  #000000  gray(0,0,0)
0,114: (  0,  0,  0)  #000000  gray(0,0,0)
0,115: (  0,  0,  0)  #000000  gray(0,0,0)
0,116: (255,255,255)  #FFFFFF  gray(255,255,255)
0,117: (255,255,255)  #FFFFFF  gray(255,255,255)
0,118: (255,255,255)  #FFFFFF  gray(255,255,255)
0,119: (255,255,255)  #FFFFFF  gray(255,255,255)
0,120: (  0,  0,  0)  #000000  gray(0,0,0)
0,121: (  0,  0,  0)  #000000  gray(0,0,0)
0,122: (  0,  0,  0)  #000000  gray(0,0,0)
0,123: (  0,  0,  0)  #000000  gray(0,0,0)
0,124: (  0,  0,  0)  #000000  gray(0,0,0)
0,125: (  0,  0,  0)  #000000  gray(0,0,0)
0,126: (  0,  0,  0)  #000000  gray(0,0,0)
0,127: (  0,  0,  0)  #000000  gray(0,0,0)
0,128: (  0,  0,  0)  #000000  gray(0,0,0)
0,129: (  0,  0,  0)  #000000  gray(0,0,0)
0,130: (  0,  0,  0)  #000000  gray(0,0,0)
0,131: (  0,  0,  0)  #000000  gray(0,0,0)
0,132: (  0,  0,  0)  #000000  gray(0,0,0)
0,133: (  0,  0,  0)  #000000  gray(0,0,0)
0,134: (  0,  0,  0)  #000000  gray(0,0,0)
0,135: (  0,  0,  0)  #000000  gray(0,0,0)
0,136: (  0,  0,  0)  #000000  gray(0,0,0)
0,137: (  0,  0,  0)  #000000  gray(0,0,0)
0,138: (  0,  0,  0)  #000000  gray(0,0,0)
0,139: (  0,  0,  0)  #000000  gray(0,0,0)
0,140: (255,255,255)  #FFFFFF  gray(255,255,255)
0,141: (255,255,255)  #FFFFFF  gray(255,255,255)
0,142: (255,255,255)  #FFFFFF  gray(255,255,255)
0,143: (255,255,255)  #FFFFFF  gray(255,255,255)
0,144: (  0,  0,  0)  #000000  gray(0,0,0)
0,145: (  0,  0,  0)  #000000  gray(0,0,0)
0,146: (  0,  0,  0)  #000000  gray(0,0,0)
0,147: (  0,  0,  0)  #000000  gray(0,0,0)
0,148: (  0,  0,  0)  #000000  gray(0,0,0)
0,149: (  0,  0,  0)  #000000  gray(0,0,0)
0,150: (  0,  0,  0)  #000000  gray(0,0,0)
0,151: (  0,  0,  0)  #000000  gray(0,0,0)
0,152: (  0,  0,  0)  #000000  gray(0,0,0)
0,153: (  0,  0,  0)  #000000  gray(0,0,0)
0,154: (  0,  0,  0)  #000000  gray(0,0,0)
0,155: (  0,  0,  0)  #000000  gray(0,0,0)
0,156: (  0,  0,  0)  #000000  gray(0,0,0)
0,157: (  0,  0,  0)  #000000  gray(0,0,0)
0,158: (  0,  0,  0)  #000000  gray(0,0,0)
0,159: (  0,  0,  0)  #000000  gray(0,0,0)
0,160: (  0,  0,  0)  #000000  gray(0,0,0)
0,161: (  0,  0,  0)  #000000  gray(0,0,0)
0,162: (  0,  0,  0)  #000000  gray(0,0,0)
0,163: (255,255,255)  #FFFFFF  gray(255,255,255)
0,164: (255,255,255)  #FFFFFF  gray(255,255,255)
0,165: (  0,  0,  0)  #000000  gray(0,0,0)
0,166: (  0,  0,  0)  #000000  gray(0,0,0)
0,167: (  0,  0,  0)  #000000  gray(0,0,0)
0,168: (  0,  0,  0)  #000000  gray(0,0,0)
0,169: (  0,  0,  0)  #000000  gray(0,0,0)
0,170: (  0,  0,  0)  #000000  gray(0,0,0)
0,171: (  0,  0,  0)  #000000  gray(0,0,0)
0,172: (  0,  0,  0)  #000000  gray(0,0,0)
0,173: (  0,  0,  0)  #000000  gray(0,0,0)
0,174: (  0,  0,  0)  #000000  gray(0,0,0)
0,175: (  0,  0,  0)  #000000  gray(0,0,0)
0,176: (  0,  0,  0)  #000000  gray(0,0,0)
0,177: (  0,  0,  0)  #000000  gray(0,0,0)
0,178: (  0,  0,  0)  #000000  gray(0,0,0)
0,179: (  0,  0,  0)  #000000  gray(0,0,0)
0,180: (  0,  0,  0)  #000000  gray(0,0,0)
0,181: (  0,  0,  0)  #000000  gray(0,0,0)
0,182: (  0,  0,  0)  #000000  gray(0,0,0)
0,183: (  0,  0,  0)  #000000  gray(0,0,0)
0,184: (  0,  0,  0)  #000000  gray(0,0,0)
0,185: (255,255,255)  #FFFFFF  gray(255,255,255)
0,186: (255,255,255)  #FFFFFF  gray(255,255,255)
0,187: (255,255,255)  #FFFFFF  gray(255,255,255)
0,188: (255,255,255)  #FFFFFF  gray(255,255,255)
0,189: (255,255,255)  #FFFFFF  gray(255,255,255)
0,190: (  0,  0,  0)  #000000  gray(0,0,0)
0,191: (  0,  0,  0)  #000000  gray(0,0,0)
0,192: (  0,  0,  0)  #000000  gray(0,0,0)
0,193: (  0,  0,  0)  #000000  gray(0,0,0)
0,194: (  0,  0,  0)  #000000  gray(0,0,0)
0,195: (  0,  0,  0)  #000000  gray(0,0,0)
0,196: (  0,  0,  0)  #000000  gray(0,0,0)
0,197: (  0,  0,  0)  #000000  gray(0,0,0)
0,198: (  0,  0,  0)  #000000  gray(0,0,0)
0,199: (  0,  0,  0)  #000000  gray(0,0,0)
0,200: (  0,  0,  0)  #000000  gray(0,0,0)
0,201: (  0,  0,  0)  #000000  gray(0,0,0)
0,202: (  0,  0,  0)  #000000  gray(0,0,0)
0,203: (  0,  0,  0)  #000000  gray(0,0,0)
0,204: (  0,  0,  0)  #000000  gray(0,0,0)
0,205: (  0,  0,  0)  #000000  gray(0,0,0)
0,206: (  0,  0,  0)  #000000  gray(0,0,0)
0,207: (  0,  0,  0)  #000000  gray(0,0,0)
0,208: (  0,  0,  0)  #000000  gray(0,0,0)
0,209: (255,255,255)  #FFFFFF  gray(255,255,255)
0,210: (255,255,255)  #FFFFFF  gray(255,255,255)
0,211: (255,255,255)  #FFFFFF  gray(255,255,255)
0,212: (255,255,255)  #FFFFFF  gray(255,255,255)
0,213: (255,255,255)  #FFFFFF  gray(255,255,255)


convert text.jpg -deskew 40% -threshold 50% -scale 1x! -negate -threshold 0 -negate -rotate 90 miff:- | im_profile - text_profile.gif

Image

Re: Check if image contains text

Posted: 2012-06-23T05:00:00-07:00
by whugemann
Regarding the histogram check, I have no ready-made answer. But if you check Fred's example, you will find that the histogram has a pronounced peak at the brighter values (which represents the white background), while the rest of the grey values are distributed almost evenly. This could be used for a first rough automatic text check.
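A rough sketch of such a histogram check, in Python rather than ImageMagick. It assumes the grey values (0 to 255) have already been read into a flat list by some other means; the function names and all thresholds are made up for illustration and would need tuning on real images.

```python
def gray_stats(pixels, bins=8, max_value=255):
    """Return (mean grey as a 0..1 fraction, share of pixels in the brightest bin, histogram)."""
    counts = [0] * bins
    for p in pixels:
        counts[min(p * bins // (max_value + 1), bins - 1)] += 1
    total = len(pixels)
    mean = sum(pixels) / (total * max_value)
    peak_share = counts[-1] / total
    return mean, peak_share, counts


def may_be_text_page(pixels, mean_range=(0.6, 0.95), min_peak_share=0.5):
    """Hypothetical rule: mean grey near 80% and a dominant peak at the bright end."""
    if not pixels:
        return False
    mean, peak_share, _ = gray_stats(pixels)
    return mean_range[0] <= mean <= mean_range[1] and peak_share >= min_peak_share
```

A check like this only rejects obvious non-text images cheaply (a dark wall, a completely blank page); anything that passes would still need a closer test before being sent to OCR.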

Fred's method is more sophisticated, but lacks the final step of full automation, i.e. recognition of the pattern (?). You could try comparing the result of Fred's manipulation to a standard black-and-white strip and define a threshold for when the original image is to be regarded as text.

Re: Check if image contains text

Posted: 2012-06-23T09:57:44-07:00
by fmw42
Just measure the width and spacing of the white areas in the txt output to see whether they are regular.
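One way to automate that measurement is sketched below in Python. It assumes the txt: output has been captured as a string and that any non-zero first channel value counts as white; the helper names, the minimum number of gaps, and the allowed variation (max_cv) are hypothetical choices, not anything built into ImageMagick.

```python
import re
from statistics import mean, pstdev


def parse_txt_column(txt):
    """Turn 'txt:' pixel enumeration lines into a list of 0 (black) / 1 (white)."""
    column = []
    for line in txt.splitlines():
        # Matches lines such as "0,5: (  0,  0,  0)  #000000  gray(0,0,0)";
        # the "# ImageMagick pixel enumeration" header does not match.
        m = re.match(r"\s*\d+,\d+:\s*\(\s*(\d+)", line)
        if m:
            column.append(1 if int(m.group(1)) > 0 else 0)
    return column


def white_runs(column):
    """Return (start, length) for every run of white pixels."""
    runs, start = [], None
    for i, v in enumerate(column):
        if v == 1 and start is None:
            start = i
        elif v == 0 and start is not None:
            runs.append((start, i - start))
            start = None
    if start is not None:
        runs.append((start, len(column) - start))
    return runs


def spacing_is_regular(column, min_gaps=4, max_cv=0.35):
    """True if the white gaps recur at a roughly constant spacing."""
    starts = [s for s, _ in white_runs(column)]
    spacings = [b - a for a, b in zip(starts, starts[1:])]
    if len(spacings) < min_gaps:
        return False
    # Coefficient of variation: spread of the spacings relative to their mean.
    return pstdev(spacings) / mean(spacings) <= max_cv
```

Short stray runs (a descender touching the next line, a speck of noise) inflate the variation, so cleaning the image first, or ignoring very short runs, would make the decision more robust.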

Re: Check if image contains text

Posted: 2012-06-24T16:25:32-07:00
by anthony
galv wrote: anthony, I think that discerning the small disconnected objects in rows would take more time than doing OCR on the image.

Actually that is exactly what OCR does, though as a specialised piece of software it would do this faster!

However, there is a two-pass morphology method that is VERY fast and gives you a count of how many distinct objects are present in a thresholded (pure binary) image. It is called 'labeling', but has not been implemented in IM (lack of time). I really should do that some time.
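For reference, the general two-pass labeling idea can be sketched outside IM. The Python below is only an illustration of the textbook technique, assuming the binary image is a list of rows with truthy values for foreground and 4-connectivity; count_objects is a made-up name, not an IM feature.

```python
def count_objects(image):
    """Count 4-connected foreground regions using two-pass labeling."""
    parent = [0]  # parent[0] is a dummy entry; label 0 means background

    def find(x):
        # Follow equivalence links to the representative label.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    labels = [[0] * len(row) for row in image]

    # First pass: assign provisional labels and record equivalences.
    for y, row in enumerate(image):
        for x, pixel in enumerate(row):
            if not pixel:
                continue
            up = labels[y - 1][x] if y > 0 else 0
            left = labels[y][x - 1] if x > 0 else 0
            if up and left:
                root_up, root_left = find(up), find(left)
                labels[y][x] = min(root_up, root_left)
                parent[max(root_up, root_left)] = min(root_up, root_left)
            elif up or left:
                labels[y][x] = up or left
            else:
                parent.append(len(parent))  # new label, its own representative
                labels[y][x] = len(parent) - 1

    # Second pass: resolve every provisional label and count distinct objects.
    return len({find(label) for row in labels for label in row if label})
```

A page of text should give a large count of small objects, while a photo of a wall should give very few, so the count alone may already serve as a quick pre-filter.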

Slower techniques do exist, however, including the row segmentation script divide_vert and a script by Fred for labeling.

whugemann: The rotate-then-deskew idea is a nice one, especially if you then resize the result to a single column and look for a 'square-wave' pattern indicating rows of text.

The 80% grey check should be combined with, and be part of, the initial 'mostly black and white' test, even before the 'deskew' step.

Re: Check if image contains text

Posted: 2012-06-24T16:30:34-07:00
by anthony
whugemann wrote: Fred's method is more sophisticated, but lacks the final step of full automation, i.e. recognition of the pattern (?). You could try comparing the result of Fred's manipulation to a standard black-and-white strip and define a threshold for when the original image is to be regarded as text.

Has anyone thought of looking at the Fourier transform of the text? That should generate a very distinct pattern, which can be looked for in an automated way, regardless of rotation (though it is still better to de-skew it).

It should also let you extract the period of the rows, even if that period is NOT a power of two.
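As a simplified illustration, the sketch below applies a plain discrete Fourier transform to the one-column row profile (a list of numbers, one per image row) instead of the full 2D image, so it does not show the rotation-independent part of the idea. It is written in Python with a naive O(n²) DFT to stay self-contained; dominant_period is a hypothetical helper name.

```python
import cmath
import math


def dominant_period(profile):
    """Return (period in rows, share of spectral power) of the strongest frequency."""
    n = len(profile)
    avg = sum(profile) / n
    centered = [v - avg for v in profile]  # remove the constant (DC) term

    best_k, best_power, total = 0, 0.0, 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(v * cmath.exp(-2j * math.pi * k * i / n)
                    for i, v in enumerate(centered))
        power = abs(coeff) ** 2
        total += power
        if power > best_power:
            best_k, best_power = k, power

    if best_k == 0:  # flat profile: no periodic structure at all
        return None, 0.0
    return n / best_k, best_power / total
```

The period comes out as n / k, so it is not tied to powers of two, but it is only resolved in steps of the frequency index; when the profile length is not a whole multiple of the row spacing, the estimate is approximate. A high power share at one frequency suggests regularly spaced rows, and where to put that cut-off is left open here.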