prepare embossed characters for ocr

galv
Posts: 62
Joined: 2010-05-23T17:35:59-07:00
Authentication code: 8675308

prepare embossed characters for ocr

Post by galv »

Hello,

I'm trying to OCR information from a bank card. The problem is that the card contains embossed numbers, which OCR handles poorly. The image below is the closest I've come, but the characters contain holes. I can't just fill the holes because they're not fully enclosed. I've also tried several structuring elements, to no avail.
Any ideas?

http://i.imgur.com/WCAHI.png
fmw42
Posts: 25562
Joined: 2007-07-02T17:14:51-07:00
Authentication code: 1152
Location: Sunnyvale, California, USA

Re: prepare embossed characters for ocr

Post by fmw42 »

Can you supply a link to your source image?

Re: prepare embossed characters for ocr

Post by galv »

It's at the bottom of the post, maybe you missed it?

Edit: I reread what you wrote. I can't find the original for that one, but here's another source image.
http://imageshack.us/a/img6/5190/numbers2.jpg

Re: prepare embossed characters for ocr

Post by fmw42 »

Sorry, I tried several things, but none worked well.

Re: prepare embossed characters for ocr

Post by galv »

:( Thanks anyway, Fred.

Anthony, maybe you have an idea?
snibgo
Posts: 12159
Joined: 2010-01-23T23:01:33-07:00
Authentication code: 1151
Location: England, UK

Re: prepare embossed characters for ocr

Post by snibgo »

I've tried and failed to get good results. Better lighting might help.

I can't help being curious about the application. Bank cards contain features that are easily machine readable: magnetic stripes and chips. Why not use those?
snibgo's IM pages: im.snibgo.com

Re: prepare embossed characters for ocr

Post by galv »

It's for an app like card.io.

Re: prepare embossed characters for ocr

Post by snibgo »

https://www.card.io/how-it-works/

Ah, I understand. A consumer uses their mobile to photograph their card, and all the usual data is captured. Yikes, that's a very difficult problem, given the variety of lighting conditions, wear of the card, and so on. More than a 30-minute problem, I'm afraid.

Re: prepare embossed characters for ocr

Post by fmw42 »

I don't know if this helps, but you can try background division and a log transform to amplify the dark areas. The result is still mostly white, though.


convert numbers2.jpg -set colorspace RGB -colorspace gray \
\( -clone 0 -crop 694x3+0+0 +repage \) \
\( -clone 0 -crop 694x3+0+91 +repage \) \
\( -clone 1 -clone 2 -append -scale 694x1! -scale 694x93! \) \
-delete 1,2 +swap -compose divide -evaluate log 100 -composite \
show:
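
As a rough illustration of the background-division idea only (not a port of the command above), here is a minimal Python sketch on a tiny hand-made image. It assumes that the top and bottom rows contain only card background, estimates the lighting per column from those rows, and divides every pixel by that estimate. The function name `divide_by_background`, the `border` parameter and the sample values are all hypothetical, and the log step is left out.

```python
def divide_by_background(gray, border=3):
    """Flatten uneven lighting by dividing each pixel by a per-column
    background estimate taken from the top and bottom `border` rows."""
    height = len(gray)
    edge_rows = gray[:border] + gray[height - border:]
    # Average the edge rows column by column to estimate the lighting.
    background = [sum(col) / len(col) for col in zip(*edge_rows)]
    return [
        [min(1.0, v / b) if b > 0 else 1.0 for v, b in zip(row, background)]
        for row in gray
    ]


# Hypothetical 8x4 image: lighting brightens from left to right, and a
# stroke in rows 3-4 is half as bright as the background around it.
lighting = [0.4, 0.6, 0.8, 1.0]
gray = [[l * (0.5 if 3 <= y <= 4 else 1.0) for l in lighting] for y in range(8)]

flat = divide_by_background(gray)
for row in flat:
    print([round(v, 2) for v in row])
```

After the division the background is close to 1.0 in every column and the stroke is close to 0.5 everywhere, so a single threshold can separate them despite the lighting gradient.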
anthony
Posts: 8883
Joined: 2004-05-31T19:27:03-07:00
Authentication code: 8675308
Location: Brisbane, Australia

Re: prepare embossed characters for ocr

Post by anthony »

galv wrote: :( Thanks anyway, Fred.

Anthony, maybe you have an idea?

It can be very tricky. Using morphology to expand the numbers will fill the interior, but it also turns the number itself into a blob. There is no guarantee of an inside or outside either.

Perhaps try a set of custom kernels that look for "off" pixels surrounded by fairly distant "on" pixels.

However, you are only looking for 10 well-known shapes that are almost always the same size. Basic morphology or even an FFT correlation search should match each digit and its position quite quickly: convert the image to the frequency domain with an FFT, multiply it by the transform of each digit pattern, and convert the 10 resulting images back to get the positions of each of the 10 digits.
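
To make the FFT correlation idea concrete, here is a minimal pure-Python sketch, not ImageMagick code and not necessarily the exact procedure meant above. It assumes power-of-two image dimensions, uses two tiny made-up glyphs (`PLUS` and `BAR`) in place of digit templates, and subtracts the template mean so that flat regions score zero. All names are hypothetical.

```python
import cmath


def fft(values, inverse=False):
    """Recursive radix-2 FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def fft2(grid, inverse=False):
    """2-D FFT: transform the rows, then the columns."""
    rows = [fft(row, inverse) for row in grid]
    cols = [fft(list(col), inverse) for col in zip(*rows)]
    result = [list(row) for row in zip(*cols)]
    if inverse:
        scale = len(grid) * len(grid[0])
        result = [[v / scale for v in row] for row in result]
    return result


def zero_mean(template):
    flat = [v for row in template for v in row]
    mean = sum(flat) / len(flat)
    return [[v - mean for v in row] for row in template]


def correlate(image, template):
    """Circular cross-correlation surface computed in the frequency domain."""
    h, w = len(image), len(image[0])
    padded = [[0.0] * w for _ in range(h)]
    for y, row in enumerate(template):
        for x, v in enumerate(row):
            padded[y][x] = v
    f_img = fft2(image)
    f_tpl = fft2(padded)
    # Multiplying by the complex conjugate gives correlation, not convolution.
    product = [[a * b.conjugate() for a, b in zip(ra, rb)]
               for ra, rb in zip(f_img, f_tpl)]
    return [[v.real for v in row] for row in fft2(product, inverse=True)]


def best_match(image, template):
    """Return the (row, column) of the top-left corner of the best match."""
    surface = correlate(image, zero_mean(template))
    positions = [(y, x) for y in range(len(surface))
                 for x in range(len(surface[0]))]
    return max(positions, key=lambda p: surface[p[0]][p[1]])


PLUS = [[0, 1, 0], [1, 1, 1], [0, 1, 0]]  # hypothetical glyph templates
BAR = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]

IMAGE = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]

print(best_match(IMAGE, PLUS), best_match(IMAGE, BAR))
```

The peak of each correlation surface marks where that glyph sits, so running one correlation per template gives both the identity and the position of every character. A real search would also normalise by local image energy (as NCC does) so that bright regions do not outscore true matches.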
Anthony Thyssen -- Webmaster for ImageMagick Example Pages
https://imagemagick.org/Usage/

Re: prepare embossed characters for ocr

Post by galv »

Yes, maybe template matching would be a better fit for this problem.

However, you lost me with the FFT stuff. Can you please elaborate with some examples?
I skimmed through http://www.imagemagick.org/Usage/fourier/. Are you referring to Fourier NCC like Fred's script?

Re: prepare embossed characters for ocr

Post by fmw42 »

galv wrote: Yes, maybe template matching would be a better fit for this problem.

However you got me lost with the FFT stuff. Can you please elaborate with some examples?
I skimmed through http://www.imagemagick.org/Usage/fourier/ . Are you referring to Fourier NCC like Fred's script?

I believe that he was. You can also do it with compare, but it is slower. See:
http://www.imagemagick.org/script/compare.php
http://www.imagemagick.org/Usage/compare/
viewtopic.php?f=1&t=14613&p=51076&hilit ... ric#p51076

However, if the scan is not always at the same scale, I would not expect either technique to work very well, since neither is scale- or rotation-invariant.

Re: prepare embossed characters for ocr

Post by anthony »

At least not without a number of other transforms. It is something I would love to work on myself, but sadly my life is getting very complicated.

Re: prepare embossed characters for ocr

Post by galv »

Happy new year guys!

I'm resizing the samples to be the same size as the "ground truth" images (and they're also properly rotated). The NCC metric of compare performs well on black-and-white ground truth images, and the AE metric does well on ground truth images with a Sobel-like effect. However, neither performs well if the bounding box defined for the digit is not exact.
Anthony, can you please provide an example of what you had in mind?

Re: prepare embossed characters for ocr

Post by anthony »

Essentially, the image is transformed into a special polar and log (log-polar) form that makes the Fourier transform pattern insensitive to scale and rotation.
This is applied to both the pattern being looked for and the images being searched.
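
The reason a log-polar form helps can be shown with a few lines of arithmetic. This minimal Python sketch covers only the coordinate change, not a full matching pipeline, and it uses made-up sample points. Scaling and rotating points about the centre moves every one of them by the same offset in (log radius, angle) space, and a plain shift is something a correlation search can find. The names `to_log_polar` and `scale_rotate` are hypothetical.

```python
import math


def to_log_polar(x, y):
    """Map a point, given relative to the centre, to (log radius, angle)."""
    return math.log(math.hypot(x, y)), math.atan2(y, x)


def scale_rotate(x, y, scale, angle):
    """Scale and rotate a point about the origin."""
    c, s = math.cos(angle), math.sin(angle)
    return scale * (x * c - y * s), scale * (x * s + y * c)


points = [(3.0, 1.0), (1.0, 2.0), (2.0, 0.5)]  # hypothetical feature points
scale, angle = 1.5, math.radians(20)

shifts = []
for x, y in points:
    r0, a0 = to_log_polar(x, y)
    r1, a1 = to_log_polar(*scale_rotate(x, y, scale, angle))
    shifts.append((r1 - r0, a1 - a0))

# Every point moves by the same amount: (log(scale), angle).
for dr, da in shifts:
    print(round(dr, 6), round(da, 6))
print(round(math.log(scale), 6), round(angle, 6))
```

All three shifts agree with each other and with (log 1.5, 20 degrees in radians), which is why a correlation peak in log-polar space can recover the scale and rotation between a pattern and a scan.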