
How to find ticket and crop it into a single image?

Posted: 2015-09-30T19:35:00-07:00
by blainefox
Hello,

I'm working on an OCR app that needs to read ticket information from photos taken by users. The ticket may not be square to the frame, and may even be distorted in the photo, so I can't simply crop the image to get it.

ImageMagick can detect edges, but I don't know how to go from the edge information to cropping the ticket and correcting the distortion so that I end up with a rectangular image file. Many thanks.

Re: How to find ticket and crop it into a single image?

Posted: 2015-09-30T23:59:10-07:00
by fmw42
Please supply an example image. You can post to some place such as dropbox.com and put the URL here. Also please supply your IM version and platform. See viewtopic.php?f=1&t=9620

Re: How to find ticket and crop it into a single image?

Posted: 2015-10-01T05:17:37-07:00
by blainefox
Image
This is the photo. I need a standard rectangular image of the ticket to ensure OCR accuracy.

Re: How to find ticket and crop it into a single image?

Posted: 2015-10-01T06:11:39-07:00
by snibgo
A perspective transformation, moving the ticket corners to the corners of a rectangle, does what you want. Windows BAT syntax.

Code: Select all

set DIST=42,645,0,0,^
498,544,500,0,^
99,966,0,500,^
614,824,500,500

%IM%convert ^
  ticket.jpg ^
  -distort perspective "%DIST%" ^
  -crop 500x500+0+0 +repage ^
  t.png
You can clean up t.png and pass it to OCR.
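For readers on Linux or macOS, here is a minimal sketch of how the same control-point list could be assembled in POSIX shell. The helper name `build_dist` and its argument order are assumptions made for illustration; the corner values and the 500x500 target are the ones from the BAT example above.

```shell
#!/bin/sh
# build_dist WIDTH HEIGHT TL TR BL BR
# Hypothetical helper: pairs each source corner ("x,y") with the matching
# corner of a WIDTH x HEIGHT rectangle and prints the list of control points
# in the form expected by -distort perspective.
build_dist() {
  w=$1
  h=$2
  # Order: top-left, top-right, bottom-left, bottom-right
  printf '%s,0,0 %s,%s,0 %s,0,%s %s,%s,%s\n' \
    "$3" "$4" "$w" "$5" "$h" "$6" "$w" "$h"
}

DIST=$(build_dist 500 500 42,645 498,544 99,966 614,824)
echo "$DIST"

# The list would then be used like this (not run here):
#   convert ticket.jpg -distort perspective "$DIST" \
#     -crop 500x500+0+0 +repage t.png
```

The control points are separated by spaces here rather than commas; ImageMagick accepts either separator in distort arguments.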

How do we get the corners of the ticket: 42,645 etc? There are many methods, including:

Code: Select all

%IM%convert ^
  ticket.jpg ^
  -contrast-stretch 10%%x10%% ^
  -despeckle ^
  +write t1.png ^
  -threshold 50%% ^
  -despeckle ^
  -median 5x5 ^
  +write t2.png ^
  -canny 0x1+10%%+30%% ^
  +write t3.png ^
  -hough-lines 9x9+50 ^
  +write t4.png ^
  h.mvg
h.mvg gives four lines. The intersections are at the corners.

"+write t1.png" etc are only for debugging, so you can see the intermediate results in t1.png etc. Those lines can be removed.

Re: How to find ticket and crop it into a single image?

Posted: 2015-10-01T08:43:19-07:00
by fmw42
What platform are you on? If Linux/MacOSX/Windows with Cygwin, you can try my scripts unperspective or whiteboard at the link below.

Re: How to find ticket and crop it into a single image?

Posted: 2015-10-04T05:15:53-07:00
by blainefox
Thank you for such detailed instructions! I'll give it a try.

Re: How to find ticket and crop it into a single image?

Posted: 2015-10-04T19:47:53-07:00
by blainefox
snibgo wrote:A perspective transformation, moving the ticket corners to the corners of a rectangle, does what you want. Windows BAT syntax.

Code: Select all

set DIST=42,645,0,0,^
498,544,500,0,^
99,966,0,500,^
614,824,500,500

How do I find the intersections? Thanks again...

Re: How to find ticket and crop it into a single image?

Posted: 2015-10-04T21:48:51-07:00
by snibgo

Re: How to find ticket and crop it into a single image?

Posted: 2015-10-05T00:46:12-07:00
by blainefox
fmw42 wrote:What platform are you on? If Linux/MacOSX/Windows with Cygwin, you can try my scripts unperspective or whiteboard at the link below.
Many thanks. I'm trying your unperspective example 8, but I get a different result with the sample image and arguments you provide (it reports too many peaks).
I'm using Cygwin under Windows 7 x64.

Re: How to find ticket and crop it into a single image?

Posted: 2015-10-05T10:01:57-07:00
by fmw42
blainefox wrote:Many thanks. I'm trying your unperspective example 8, but I get a different result with the sample image and arguments you provide (it reports too many peaks).
The background is not clean enough -- too speckled to extract the picture. You may be better off with snibgo's solution.

Re: How to find ticket and crop it into a single image?

Posted: 2018-01-15T10:39:03-07:00
by MikeG
snibgo wrote:
2015-10-01T06:11:39-07:00
h.mvg gives four lines. The intersections are at the corners.
Thank you for the explanations.
I'm a newcomer to IM, so sorry if my questions are not too smart.
All the code below is from a Linux environment.

I have much the same situation: I need to pass photos to OCR.
There will also be tons of photographed rectangular documents (2-5 Mpix, mostly A4/Legal) that must be converted to flatbed-scanner-like images.

Sample source image http://url.mik.lv/sc/1.jpg.
I've tuned your numbers a bit to get just 4 lines. It takes about 30 seconds to compute.

Code: Select all

convert 1.jpg -contrast-stretch 10%x10% -despeckle -threshold 50% \
 -despeckle -median 15x15 -canny 0x1+10%+30% -hough-lines 50x50+150 h.mvg
Got coordinates

Code: Select all

# Hough line transform: 50x50+150
viewbox 0 0 3264 1836
line 476.236,0 -907.289,1836  # 171
line 0,-180.575 3264,2190.86  # 509
line 2944.17,0 1705.77,1836  # 274
line 0,643.842 3264,2929.32  # 609
And then

Code: Select all

DIST='476.236,0,-907.289,1836,0,-180.575 3264,2190.86,2944.17,0 1705.77,1836,0,643.842 3264,2929.32'
convert 1.jpg -distort perspective "$DIST" +repage c1.jpg
which, after half an hour, generated this strange result: http://url.mik.lv/sc/c1.jpg

I feel that I'm missing something simple but important at the last stage.

It's also probably a good idea (for CPU time) to scale the source image down before finding the corners, but I have no idea how to handle all those coordinates.

Could you please point me in the right direction?
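On the scaling question, one possible approach is to find the corners on a reduced copy and multiply the resulting coordinates by the inverse of the reduction factor before distorting the full-size image. The sketch below is only an illustration: the helper name `scale_points`, the 25% factor and the sample points are assumptions, and the mapping is approximate to within a pixel or so.

```shell
#!/bin/sh
# scale_points FACTOR
# Hypothetical helper: reads "x,y" pairs (one per line) on stdin and prints
# them multiplied by FACTOR, mapping corners found on a reduced copy back to
# the coordinate system of the original image.
scale_points() {
  awk -F, -v k="$1" 'NF == 2 { printf "%.1f,%.1f\n", $1 * k, $2 * k }'
}

# Assumed workflow (not run here):
#   convert 1.jpg -resize 25% small.png
#   ... find the four corners on small.png ...
# then scale them by 4 to match 1.jpg. The two points below are made up.
printf '%s\n' 100,50 25.5,10 | scale_points 4
```

Settings such as the -median size and the -hough-lines threshold would probably need retuning for the smaller image, because the lines it contains are shorter in pixels.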

Re: How to find ticket and crop it into a single image?

Posted: 2018-01-17T04:39:14-07:00
by snibgo
MikeG wrote:I feel that I'm missing something simple, but important, at last stage.
Yes. The first stage finds four lines, giving the coordinates of each end.

Then you distort the image to move one end of the first line to the other end, and so on for the other lines. That makes no sense.

A better idea is to find the intersections of the four lines. This gives the coordinates of the corners of the document. Then distort from those coordinates to the corners of a squared-up rectangle.
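As a minimal sketch of the intersection step, the following POSIX shell function reads the `line` entries of an MVG file like the one posted above and prints every pairwise intersection that falls inside the viewbox or close to it. The function name `corners_from_mvg` and the tolerance parameter are assumptions made for illustration; the sample data is the h.mvg output from MikeG's post.

```shell
#!/bin/sh
# corners_from_mvg FILE [TOLERANCE]
# Hypothetical helper: intersects every pair of "line x1,y1 x2,y2" entries and
# keeps the points within the viewbox, enlarged by TOLERANCE (a fraction of
# the image size, default 0.1) on each side.
corners_from_mvg() {
  awk -v tol="${2:-0.1}" '
    $1 == "viewbox" { w = $4; h = $5 }
    $1 == "line" {
      split($2, a, ","); split($3, b, ",")
      n++; x1[n] = a[1]; y1[n] = a[2]; x2[n] = b[1]; y2[n] = b[2]
    }
    END {
      for (i = 1; i < n; i++) {
        for (j = i + 1; j <= n; j++) {
          dxi = x2[i] - x1[i]; dyi = y2[i] - y1[i]
          dxj = x2[j] - x1[j]; dyj = y2[j] - y1[j]
          den = dxi * dyj - dyi * dxj
          if (den == 0) continue          # parallel lines never meet
          # Parameter t along line i at which it crosses line j
          t = ((x1[j] - x1[i]) * dyj - (y1[j] - y1[i]) * dxj) / den
          x = x1[i] + t * dxi; y = y1[i] + t * dyi
          if (x >= -tol * w && x <= (1 + tol) * w &&
              y >= -tol * h && y <= (1 + tol) * h) {
            printf "%.1f,%.1f\n", x, y
          }
        }
      }
    }' "$1"
}

# Demo with the four lines from the post above
mvg=$(mktemp)
cat > "$mvg" <<'EOF'
# Hough line transform: 50x50+150
viewbox 0 0 3264 1836
line 476.236,0 -907.289,1836  # 171
line 0,-180.575 3264,2190.86  # 509
line 2944.17,0 1705.77,1836  # 274
line 0,643.842 3264,2929.32  # 609
EOF
corners_from_mvg "$mvg"
rm -f "$mvg"
```

The tolerance is there because, with this data, two of the corners appear to fall a few pixels outside the frame, while the intersections of the two near-parallel pairs of lines lie tens of thousands of pixels away and are discarded. The four remaining points still need to be put in a consistent order (for example top-left, top-right, bottom-left, bottom-right) before they are paired with the corners of the target rectangle for -distort perspective.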

Re: How to find ticket and crop it into a single image?

Posted: 2018-01-17T10:30:39-07:00
by fmw42
If you are using a Unix-like system, then you could try my script, unperspective, at my link below.

Image

unperspective -f 50 -A 99 receipt.jpg receipt_unpersp.jpg


Image

Re: How to find ticket and crop it into a single image?

Posted: 2020-02-20T03:49:45-07:00
by randomlad
fmw42: Is your unperspective tool deprecated? I tried it on your examples and on this receipt picture, but I get the following output for almost every input. My version of ImageMagick is 7.0.9-21.
--- Number Of Peaks Is Greater Than 4 ---
I've been trying to use your program on 4K pictures of whiteboards, but I always get the same message about the number of peaks.

Re: How to find ticket and crop it into a single image?

Posted: 2020-02-20T10:09:59-07:00
by fmw42
Your image probably has too much noise (not clean enough background) or the -f argument is not appropriate. Post your image to some free hosting service and put the URL here and I will look at it and see what is happening.