
prepare embossed characters for ocr

Posted: 2012-12-07T19:25:16-07:00
by galv
Hello,

I'm trying to OCR information from a bank card. The problem is that it contains embossed numbers, which are not good for OCR. The image below is the closest I've come, but the characters contain holes. I can't just fill the holes because they're not exactly complete holes. I've also tried several structuring elements to no avail.
Any ideas?

http://i.imgur.com/WCAHI.png

Re: prepare embossed characters for ocr

Posted: 2012-12-07T20:10:49-07:00
by fmw42
Can you supply a link to your source image?

Re: prepare embossed characters for ocr

Posted: 2012-12-07T22:35:54-07:00
by galv
It's at the bottom of the post, maybe you missed it?

edit: I reread what you wrote. Can't find the original for that one but here's another source one.
http://imageshack.us/a/img6/5190/numbers2.jpg

Re: prepare embossed characters for ocr

Posted: 2012-12-08T11:19:52-07:00
by fmw42
Sorry I tried several things but none worked well.

Re: prepare embossed characters for ocr

Posted: 2012-12-09T04:00:36-07:00
by galv
:( . Thanks anyway Fred.

Anthony, maybe you have an idea?

Re: prepare embossed characters for ocr

Posted: 2012-12-09T07:03:08-07:00
by snibgo
I've tried and failed to get good results. Better lighting might help.

I can't help being curious about the application. Bank cards contain features that are easily machine readable: magnetic strips and chips. Why not use those?

Re: prepare embossed characters for ocr

Posted: 2012-12-09T19:25:42-07:00
by galv
It's for an app like card.io .

Re: prepare embossed characters for ocr

Posted: 2012-12-09T21:54:01-07:00
by snibgo
https://www.card.io/how-it-works/

Ah, I understand. A consumer uses their mobile to photograph their card, and all the usual data is captured. Yikes, that's a very difficult problem, given the variety of lighting conditions, wear of the card, and so on. More than a 30-minute problem, I'm afraid.

Re: prepare embossed characters for ocr

Posted: 2012-12-10T12:26:37-07:00
by fmw42
I don't know if this helps, but you can try background division and some log to amplify the dark. But it is mostly white.


convert numbers2.jpg -set colorspace RGB -colorspace gray \
\( -clone 0 -crop 694x3+0+0 +repage \) \
\( -clone 0 -crop 694x3+0+91 +repage \) \
\( -clone 1 -clone 2 -append -scale 694x1! -scale 694x93! \) \
-delete 1,2 +swap -compose divide -evaluate log 100 -composite \
show:
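
Read step by step, the command above appears to estimate the background from the top and bottom three rows, average them per column, and divide the image by that estimate. Below is a minimal Python sketch of the division step only (the log step is left out). The function name, the `band` parameter and the tiny test strip are made up for illustration and are not part of ImageMagick.

```python
def divide_by_background(img, band=3):
    """img: list of rows of grey values in 0..1.
    Returns the image divided by a per-column background estimate."""
    height, width = len(img), len(img[0])
    # Use the top and bottom `band` rows as "pure background" samples.
    edge_rows = img[:band] + img[height - band:]
    background = [sum(row[x] for row in edge_rows) / len(edge_rows)
                  for x in range(width)]
    eps = 1e-6  # guard against division by zero in black columns
    return [[min(1.0, row[x] / max(background[x], eps)) for x in range(width)]
            for row in img]


# Synthetic strip: lighting falls off from left to right.
shading = [1.0, 0.8, 0.6, 0.4]
img = [list(shading) for _ in range(8)]
# Two "stroke" pixels, each at half the local brightness.
img[4][1] = 0.4
img[4][3] = 0.2

flat = divide_by_background(img)
print(flat[0])                 # background rows come out close to 1.0
print(flat[4][1], flat[4][3])  # both strokes end up at about the same grey
```

The point of the division is that a stroke in a bright region and a stroke in a dim region end up at a similar value, so a single threshold has a better chance of working afterwards.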

Re: prepare embossed characters for ocr

Posted: 2012-12-18T18:30:54-07:00
by anthony
galv wrote: :( . Thanks anyway Fred.

Anthony, maybe you have an idea?

It can be very tricky. Using morphology to expand the numbers will fill the interior, but it will also turn the number itself into a blob. There is no guarantee of an inside or outside either.

Perhaps a set of custom kernels that look for off pixels surrounded by fairly distant on pixels.

However, you are only looking for 10 well-known shapes that would almost always be the same size. A basic morphology or even an FFT correlation search should match each number and its position quite quickly. So convert to FFT, multiply with each of the digit patterns, and convert the 10 resulting images back to get the positions of each of the 10 digits.
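
The correlation search idea can be sketched in plain Python. This is a minimal illustration, not an ImageMagick recipe: it computes normalised cross-correlation directly in the spatial domain rather than through an FFT (the FFT route is a faster way to get the same kind of correlation surface). The 3x3 "digit" templates and the small test image are hypothetical stand-ins for real digit patterns.

```python
import math


def ncc(patch, template):
    """Normalised cross-correlation of two equal-sized grey patches."""
    a = [v for row in patch for v in row]
    b = [v for row in template for v in row]
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = math.sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    if denom == 0:
        return 0.0  # flat patch: correlation undefined, treat as no match
    return sum(x * y for x, y in zip(da, db)) / denom


def best_match(image, template):
    """Slide the template over the image; return (score, x, y) of the best fit."""
    th, tw = len(template), len(template[0])
    best = (-2.0, 0, 0)
    for y in range(len(image) - th + 1):
        for x in range(len(image[0]) - tw + 1):
            patch = [row[x:x + tw] for row in image[y:y + th]]
            score = ncc(patch, template)
            if score > best[0]:
                best = (score, x, y)
    return best


# Hypothetical 3x3 templates standing in for real digit shapes.
templates = {
    "1": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "7": [[1, 1, 1], [0, 0, 1], [0, 0, 1]],
}

# Test image containing a "7" at x=1, y=1 and a "1" at x=5, y=1.
image = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
]

found = {digit: best_match(image, t) for digit, t in templates.items()}
for digit, (score, x, y) in found.items():
    print(digit, "best at", (x, y), "score", round(score, 3))
```

With ten templates, one pass per digit gives ten score maps, and the peaks in each map are the candidate positions for that digit.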

Re: prepare embossed characters for ocr

Posted: 2012-12-22T19:19:04-07:00
by galv
Yes, maybe template matching would be a better fit for this problem.

However you got me lost with the FFT stuff. Can you please elaborate with some examples?
I skimmed through http://www.imagemagick.org/Usage/fourier/ . Are you referring to Fourier NCC like Fred's script?

Re: prepare embossed characters for ocr

Posted: 2012-12-22T20:03:25-07:00
by fmw42
galv wrote: Yes, maybe template matching would be a better fit for this problem.

However you got me lost with the FFT stuff. Can you please elaborate with some examples?
I skimmed through http://www.imagemagick.org/Usage/fourier/ . Are you referring to Fourier NCC like Fred's script?

I believe that he was. You can also do it with compare, but it is slower. See
http://www.imagemagick.org/script/compare.php
http://www.imagemagick.org/Usage/compare/
viewtopic.php?f=1&t=14613&p=51076&hilit ... ric#p51076

However, if the scan is not always at the same scale, then I would not expect either technique to work very well, since neither is scale or rotation invariant.

Re: prepare embossed characters for ocr

Posted: 2013-01-01T23:48:37-07:00
by anthony
At least not without a number of other transforms. Something I would love to work on myself but sadly, my life is getting very complicated.

Re: prepare embossed characters for ocr

Posted: 2013-01-03T08:24:03-07:00
by galv
Happy new year guys!

I'm resizing the samples to be the same size as the "ground truth" images (and they're also properly rotated). compare -metric NCC performs well on black-and-white ground truth images, and -metric AE does well on ground truth images with a Sobel-like effect. However, neither performs well if the bounding box defined for the digit is not exact.
Anthony, can you please provide an example of what you had in mind?
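
One possible way to soften the bounding-box sensitivity is to try a few small offsets around the rough box and keep the best score. The Python sketch below does this with a simple count of differing pixels, in the spirit of an absolute-error count. All names, the 3x3 template and the `slack` parameter are hypothetical, and the sketch assumes the images are already binarised.

```python
def differing_pixels(a, b):
    """Count positions where two equal-sized binary patches differ."""
    return sum(1 for ra, rb in zip(a, b) for va, vb in zip(ra, rb) if va != vb)


def crop(image, x, y, w, h):
    """Cut a w x h window whose top-left corner is at (x, y)."""
    return [row[x:x + w] for row in image[y:y + h]]


def best_offset(image, template, x0, y0, slack=1):
    """Try every shift within +/-slack of the rough box at (x0, y0).
    Returns (error, dx, dy) for the shift with the fewest differing pixels."""
    th, tw = len(template), len(template[0])
    best = None
    for dy in range(-slack, slack + 1):
        for dx in range(-slack, slack + 1):
            x, y = x0 + dx, y0 + dy
            # Skip windows that would fall outside the image.
            if x < 0 or y < 0 or y + th > len(image) or x + tw > len(image[0]):
                continue
            err = differing_pixels(crop(image, x, y, tw, th), template)
            if best is None or err < best[0]:
                best = (err, dx, dy)
    return best


seven = [[1, 1, 1], [0, 0, 1], [0, 0, 1]]
image = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0],
]

# The rough bounding box is one pixel too far to the right.
naive = differing_pixels(crop(image, 2, 1, 3, 3), seven)
refined = best_offset(image, seven, 2, 1, slack=1)
print("error at rough box:", naive)
print("best (error, dx, dy):", refined)
```

The cost grows with the square of `slack`, so this only makes sense when the rough box is already close.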

Re: prepare embossed characters for ocr

Posted: 2013-01-03T17:57:29-07:00
by anthony
Essentially the image is transformed into a special polar and log form that makes the Fourier transform pattern insensitive to scale and rotation.
This is applied to both the pattern being looked for and the images being searched.
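
To illustrate why such a mapping helps, here is a minimal Python sketch, assuming the "polar and log form" refers to a log-polar mapping (log of the radius against the angle). In those coordinates, scaling and rotating about the centre become plain shifts, and a shift does not change the magnitude of a Fourier transform. The function names are made up for this example, which works on a single point rather than a whole image.

```python
import math


def to_log_polar(x, y, cx=0.0, cy=0.0):
    """Map a point to (log radius, angle) about the centre (cx, cy)."""
    dx, dy = x - cx, y - cy
    return math.log(math.hypot(dx, dy)), math.atan2(dy, dx)


def scale_and_rotate(x, y, scale, angle):
    """Scale and rotate a point about the origin."""
    c, s = math.cos(angle), math.sin(angle)
    return scale * (x * c - y * s), scale * (x * s + y * c)


point = (3.0, 1.0)
moved = scale_and_rotate(*point, scale=2.0, angle=math.radians(30))

rho0, theta0 = to_log_polar(*point)
rho1, theta1 = to_log_polar(*moved)

shift_rho = rho1 - rho0        # depends only on the scale factor
shift_theta = theta1 - theta0  # depends only on the rotation angle
print(shift_rho, math.log(2.0))
print(shift_theta, math.radians(30))
```

Every point moves by the same amount along each axis, so a scaled and rotated digit turns into a shifted copy of the original in log-polar space.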