Remove horizontal line

Post by hank2000 »

Hello,

I'm looking for help removing horizontal lines from images so that I can OCR them. Here's an example:

https://docs.google.com/open?id=0B2mMRo ... XZYWjlEVVk

I would like to remove the line under the word "Billy" so that the OCR engine doesn't get confused by it.

Version: ImageMagick 6.7.7-6 2012-07-31 Q16 on OSX

Thanks in advance,

Hank

Post by fmw42 »

Try:

convert billy.png -morphology close:2 "1x4: 0,1,1,0" result.png

Post by hank2000 »

Wow, awesome. Thanks!

Since I like to understand what I'm using instead of blindly executing it, I have a few questions:

Am I understanding the kernel correctly: does this work when there is a vertical pattern of pixels that are white, black, black, white? Will it only work on "lines" that are only 2 pixels thick? Sorry if I don't quite understand kernels yet.

I ran this with only 1 iteration of "close", and it didn't remove the lines, so I'm wondering what the 2nd iteration is doing that the 1st didn't accomplish. Is it thinning the existing lines so that the 2nd pass removes them completely?

I ask because I need to tune it further to handle images that aren't as clear as my example, and I'd rather solve each issue myself than post every one I come across.

Thanks again.

Post by fmw42 »

It was a quick solution that I did not try to refine. It is a morphological close (dilate followed by erode) that attempts to remove two-pixel-tall horizontal black lines that have white above and below them. I also tried running only one iteration, since I thought that would work, but it left a thinner line. So I ran two iterations, and that worked. Each iteration removes more.

It may be possible, though I have not tested it, to create a 3- or 4-pixel-tall black kernel with 1 white pixel above and below and then run only one iteration. That would be 1x5: 0,1,1,1,0, etc. It may also be possible, but again untested, to use 1x3: 0,1,0 and just run it with multiple iterations. Either you or I would have to test how many iterations are needed for a given thickness of horizontal line, and which approach works best.

I will leave it to you to test further, but it would be appreciated if you would let us know what you find works best.

If you still have trouble, then provide another example and we can see what can be done further.

For more information about the morphology operators, see:

http://www.imagemagick.org/Usage/morphology/

Post by fmw42 »

I tested multiple iterations of 1x3: 0,1,0, but that does not work, and I should have known better. The filter needs to have at least as many ones as the line is thick. So this works in one iteration:

convert billy.png -morphology close:1 "1x5: 0,1,1,1,0" show:


I don't see any visual difference between:


convert billy.png -morphology close:1 "1x5: 0,1,1,1,0" billy_1x5x1.gif

convert billy.png -morphology close:2 "1x4: 0,1,1,0" billy_1x4x2.gif
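
The difference between the 1x3 and 1x5 kernels can be illustrated with a small stand-alone model. This is only a sketch, not ImageMagick's implementation: it assumes white is 1 and black is 0, that zero entries in the kernel are ignored, and that the kernel origin sits at its middle entry. The function names (`kernel_offsets`, `dilate`, `erode`, `close`) are made up for the example, and only odd, symmetric kernels are used.

```python
def kernel_offsets(values):
    """Row offsets of the non-zero kernel entries, relative to the middle entry."""
    origin = (len(values) - 1) // 2  # assumption: origin is the centre entry
    return [i - origin for i, v in enumerate(values) if v]


def dilate(column, offsets):
    """Grow white: a pixel turns white if any pixel under the kernel is white."""
    n = len(column)
    return [max(column[min(max(r + d, 0), n - 1)] for d in offsets)
            for r in range(n)]


def erode(column, offsets):
    """Shrink white: a pixel stays white only if every pixel under the kernel is white."""
    n = len(column)
    return [min(column[min(max(r + d, 0), n - 1)] for d in offsets)
            for r in range(n)]


def close(column, values):
    """Morphological close: dilate, then erode, with the same kernel."""
    offsets = kernel_offsets(values)
    return erode(dilate(column, offsets), offsets)


# One image column crossing a two-pixel-thick black line (1 = white, 0 = black).
column = [1, 1, 1, 0, 0, 1, 1, 1]

# 0,1,0 has a single active entry, so each pass leaves the column as it was.
single = column
for _ in range(5):
    single = close(single, [0, 1, 0])
print(single)

# 0,1,1,1,0 reaches one pixel above and below, so the line is filled in one pass.
print(close(column, [0, 1, 1, 1, 0]))
```

In this model a kernel with one active entry is the identity operation, which is why repeating it changes nothing, while three active entries let white grow into the line from both sides during the dilate step.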

Post by a_j_g_cvt »

This is excellent feedback. Two questions: 1) How would I use a similar script only on lines longer than a certain pixel length? 2) How would I remove the entire line even where it intersects a letter (in this case the Y in Billy)?

Perhaps we could identify a line as a run of pixels (of some minimum height) that has white above and below it for a predetermined minimum horizontal length, anywhere along the line. Once that length is reached, could we remove the line's pixels along its whole length, but no taller than the maximum height found between the white space?

I'd like to see us be able to remove the line below the Y in Billy without removing pixels from the Y itself.

Thank you kindly!

Post by fmw42 »

1) See http://magick.imagemagick.org/script/co ... onents.php and filter on areas larger than your line length x thickness, or by individual id.

2) As far as I know, you cannot. It won't get the part of the line that connects with the bottom of the Y.
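
Both points can be sketched outside ImageMagick. The example below is an assumption-laden illustration in plain Python, not the connected-components feature linked above: it labels 4-connected black regions, then whitens those whose bounding box is wide and short. The names `black_components` and `remove_line_like` and the thresholds are hypothetical.

```python
from collections import deque


def black_components(img):
    """Return the 4-connected components of black (0) pixels as lists of (row, col)."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    components = []
    for r in range(h):
        for c in range(w):
            if img[r][c] != 0 or seen[r][c]:
                continue
            seen[r][c] = True
            queue, comp = deque([(r, c)]), []
            while queue:
                y, x = queue.popleft()
                comp.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and img[ny][nx] == 0 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(comp)
    return components


def remove_line_like(img, min_width, max_height):
    """Whiten components whose bounding box is at least min_width wide and at most max_height tall."""
    out = [row[:] for row in img]
    for comp in black_components(img):
        rows = [y for y, _ in comp]
        cols = [x for _, x in comp]
        width = max(cols) - min(cols) + 1
        height = max(rows) - min(rows) + 1
        if width >= min_width and height <= max_height:
            for y, x in comp:
                out[y][x] = 1
    return out


# '#' is black, '.' is white.
rows = [
    "..........",
    ".#........",  # part of a letter
    ".#........",
    ".########.",  # underline touching the letter
    "..........",
    ".########.",  # free-standing underline
    "..........",
]
img = [[0 if ch == "#" else 1 for ch in row] for row in rows]
out = remove_line_like(img, min_width=6, max_height=2)
for row in out:
    print("".join("#" if v == 0 else "." for v in row))
```

The free-standing underline is whitened, but the underline that touches the letter belongs to one taller component together with the letter, so the size filter leaves it alone. That is the limitation described in point 2.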

Post by a_j_g_cvt »

Here is a link to an example file I am working on:

https://www.dropbox.com/s/2luszvqka7wc2 ... e.png?dl=0

Post by fmw42 »

Connected components won't remove the lines if they are connected to some other part of your text. Sorry, I do not know how to deal with that. If I get any ideas, I will post back here.

Post by snibgo »

This works for the "for sale" example. It turns white all black lines that are at least 50 pixels wide. (The widest character is narrower than this.) Then it removes remaining noise.

Windows BAT syntax.

convert ^
  Test2image.png ^
  -strip ^
  -write mpr:ORG ^
  ( +clone ^
    -negate ^
    -morphology Erode rectangle:50x1 ^
    -mask mpr:ORG -morphology Dilate rectangle:50x1 ^
    +mask ^
  ) ^
  -compose Lighten -composite ^
  ( +clone ^
    -morphology HMT "1x4:1,0,0,1" ^
  ) ^
  -compose Lighten -composite ^
  ( +clone ^
    -morphology HMT "1x3:1,0,1" ^
  ) ^
  -compose Lighten -composite ^
  ( +clone ^
    -morphology HMT "3x1:1,0,1" ^
  ) ^
  -compose Lighten -composite ^
  out.png
snibgo's IM pages: im.snibgo.com
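
The stated effect of the first step, turning white every black line at least 50 pixels wide, can be approximated row by row. The sketch below is a simplification written for illustration, not a translation of the command above: it ignores the masking and the later noise-removal passes, and the name `whiten_long_runs` and the small `min_run` value are assumptions chosen to fit a tiny test image.

```python
def whiten_long_runs(img, min_run):
    """Whiten every horizontal run of black (0) pixels that is at least min_run long."""
    out = [row[:] for row in img]
    for r, row in enumerate(img):
        start = None
        # A trailing white sentinel closes any run that reaches the right edge.
        for c, value in enumerate(row + [1]):
            if value == 0 and start is None:
                start = c
            elif value != 0 and start is not None:
                if c - start >= min_run:
                    for x in range(start, c):
                        out[r][x] = 1
                start = None
    return out


# '#' is black, '.' is white: a vertical stroke crossed by a long horizontal line.
rows = [
    "..#.....",
    "..#.....",
    "########",
    "..#.....",
]
img = [[0 if ch == "#" else 1 for ch in row] for row in rows]
out = whiten_long_runs(img, min_run=5)
for row in out:
    print("".join("#" if v == 0 else "." for v in row))
```

Unlike a component-size filter, this removes the line even where it crosses the stroke, at the cost of a one-pixel gap in the stroke on that row. Any glyph with a horizontal stroke longer than `min_run` would be whitened in the same way.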

Post by a_j_g_cvt »

Thank you, snibgo! This worked very well on the sample image I provided. However, other areas of the document with large fonts were affected, because those letters partly consist of long straight lines more than 50 pixels long. I tried increasing the rectangle height, but this left some odd artifacts. Here is a link to a file the script struggles with: https://www.dropbox.com/s/uxj92oteykx5o ... e.png?dl=0 (Test3image.png)

Any ideas on how to remove the lines from the initial test image (Test2image.png) without affecting text with large font in the second image (Test3image.png)? I believe the lines we want to remove all have one border which is almost completely 0 pixels. This may be a clue; however, the large E and letters like T have similar characteristics. Thank you kindly!

Post by snibgo »

test3image.png has a different problem, so it needs a different solution, e.g.:

convert test3image.png -blur 0x3 -level 30%,70% b.png

If both types of problem occur in the same image, then your difficulties are large. The problem then becomes determining which area suffers from which problem. Then you can chop the image into pieces, apply the appropriate solution to each, and reassemble.

That problem is more difficult. You could start by taking your entire image (or a sample of images), and chop them manually. Find the solution for each piece. Find what identifies each area, so you can automatically chop them, and automatically solve each one.
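
The arithmetic behind a blur followed by `-level 30%,70%` can be sketched on a single row of pixel values. This is a rough illustration only: it swaps the Gaussian blur for a simple box blur, works on values normalised to the range 0 to 1, and uses hypothetical helper names (`box_blur`, `level`). It says nothing about what is actually wrong with test3image.png, which the thread does not describe.

```python
def box_blur(values, radius):
    """Average each value with its neighbours within `radius` (edges are replicated)."""
    n = len(values)
    out = []
    for i in range(n):
        window = [values[min(max(j, 0), n - 1)] for j in range(i - radius, i + radius + 1)]
        out.append(sum(window) / len(window))
    return out


def level(value, black=0.30, white=0.70):
    """Linear stretch: `black` maps to 0, `white` maps to 1, everything else is clamped."""
    scaled = (value - black) / (white - black)
    return min(max(scaled, 0.0), 1.0)


# 1.0 is white, 0.0 is black: a 1-pixel dark feature, then a 7-pixel dark feature.
profile = [1.0] * 4 + [0.0] + [1.0] * 4 + [0.0] * 7 + [1.0] * 4
result = [level(v) for v in box_blur(profile, radius=2)]
print([round(v, 2) for v in result])
```

After blurring, the 1-pixel feature is only slightly darker than the background, so the level step pushes it back to pure white, while the centre of the 7-pixel feature stays fully black. Features that are thin relative to the blur radius fade out and solid ones are restored with hard edges.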

Post by a_j_g_cvt »

snibgo, thank you again! This is very helpful. I am including a link to a new image which combines the previous examples into one image. The ultimate goal is to write one script which removes the underlines without affecting any text, in both smaller and larger fonts. I believe it can be done; I just don't know how...

https://www.dropbox.com/s/7cupzdzpneiv8 ... e.png?dl=0

Thank you kindly -

Post by a_j_g_cvt »

snibgo, one other thing: I ran your original script against the composite image of both test images. Here is the result:

https://www.dropbox.com/s/ytjgpnq4tta9o ... t.png?dl=0

Have a great day!