Cleaning up noise around text

mark0978
Posts: 4
Joined: 2011-05-09T08:51:23-07:00

Post by mark0978 » 2011-05-09T09:14:23-07:00

I've tried -noise radius and -noise geometry and they don't seem to do what I want at all. I have some b&w images (TIFF G4 Fax compression) with lots of noise around the characters. This noise takes the form of pixel blobs that are 1 pixel wide in most cases.

My desire is to do the following 3 steps (in this order):

1. White out all black pixels that are 1 pixel wide
2. White out all black pixels that are 1 pixel tall
3. White out all black pixels that are 1 pixel wide

So the question is, do I have to crack out my C++ skills, or can I do this with imagemagick?

fmw42
Posts: 22234
Joined: 2007-07-02T17:14:51-07:00
Location: Sunnyvale, California, USA

Post by fmw42 » 2011-05-09T10:01:30-07:00

See -morphology close (it would be open if your image were white letters on black, but you need to use close for black letters on a white background).

http://www.imagemagick.org/Usage/morphology/#basic


You will have to pick the shape and size of the filter to correspond to the noise you want to remove. If the noise is tall and thin, use a short, wide filter, and vice versa.

Can you post a link to your image? It would help to have that to know if this is a viable approach.
Last edited by fmw42 on 2011-05-09T16:13:22-07:00, edited 1 time in total.
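
To make the idea of a morphological close concrete, here is a minimal pure-Python sketch on a tiny binary grid. It is not ImageMagick's implementation: the grid convention (1 = white foreground, 0 = black), the helper names, and the way image edges are handled (out-of-bounds neighbours are simply skipped) are all assumptions of this sketch. The DIAMOND offsets are meant to resemble a radius-1 diamond kernel.

```python
# Convention assumed for this sketch: 1 = white (foreground), 0 = black.
DIAMOND = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]  # resembles a radius-1 diamond


def parse(rows):
    """Turn strings of '.' (white) and '#' (black) into a grid of 1s and 0s."""
    return [[0 if ch == "#" else 1 for ch in row] for row in rows]


def dilate(img, offsets):
    """White grows: a pixel is white if any in-bounds kernel neighbour is white."""
    h, w = len(img), len(img[0])
    return [[int(any(img[y + dy][x + dx]
                     for dy, dx in offsets
                     if 0 <= y + dy < h and 0 <= x + dx < w))
             for x in range(w)] for y in range(h)]


def erode(img, offsets):
    """White shrinks: a pixel is white only if all in-bounds kernel neighbours are white."""
    h, w = len(img), len(img[0])
    return [[int(all(img[y + dy][x + dx]
                     for dy, dx in offsets
                     if 0 <= y + dy < h and 0 <= x + dx < w))
             for x in range(w)] for y in range(h)]


def close(img, offsets):
    """Close = dilate followed by erode (the kernel here is symmetric)."""
    return erode(dilate(img, offsets), offsets)


# A 3x5 black "stroke" plus one isolated black noise pixel.
source = parse([
    ".........",
    ".###.....",
    ".###.....",
    ".###..#..",
    ".###.....",
    ".###.....",
    ".........",
])

closed = close(source, DIAMOND)
for row in closed:
    print("".join("." if p else "#" for p in row))
```

In this toy case the isolated pixel disappears, but the four corners of the stroke are shaved off as well, which is the kind of collateral damage to thin strokes that the kernel choice has to balance.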

Post by mark0978 » 2011-05-09T16:02:25-07:00

Here is a snippet from the image. I've read quite a bit about morphology, but still haven't managed to come up with something that helps with cleanup without doing more damage than it fixes.

http://www.imagehawk.com/images/cleanup.tif

Post by fmw42 » 2011-05-09T16:07:15-07:00

I don't think anything is going to help, as the noise is nearly as big as the thickness of the text characters and the noise is too close to the characters. If it were farther away, then perhaps something might be done.

This is not too bad using morphology close with a square shape. But you can try other shapes.

convert cleanup.tif -morphology close square:1 cleanup_close1.gif

anthony
Posts: 8874
Joined: 2004-05-31T19:27:03-07:00
Location: Brisbane, Australia

Post by anthony » 2011-05-09T16:24:08-07:00

Most of the noise is cleaned up using

    convert cleanup.tif -morphology close diamond show:

However, Fred is right: it is very hard when the noise is so close to the original text.

That said, you specified exactly what you want to do, and turning specific pixels white can be done using a Thicken morphology operation.

For example, remove black pixels that are one pixel wide:

    convert cleanup.tif -morphology thicken '3x1:1,0,1' show:

Remove black pixels that are one pixel high:

    convert cleanup.tif -morphology thicken '1x3:1,0,1' show:

Or do both, one following the other (two rotated kernels):

    convert cleanup.tif -morphology thicken '1x3>:1,0,1' show:

The real problem, however, is your source image. It looks like the text was a JPEG that has been thresholded, and the threshold level was wrong, leaving ringing artefacts in the resulting image.
Anthony Thyssen -- Webmaster for ImageMagick Example Pages
http://www.imagemagick.org/Usage/
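
As an illustration of what a thicken step with a three-element kernel does, here is a small pure-Python sketch. It is a toy model rather than ImageMagick's code: the grid convention (1 = white, 0 = black), the (dy, dx, expected) kernel representation, and the rule that a kernel touching the image border never matches are assumptions made here. ONE_WIDE and ONE_HIGH are hypothetical names that mirror the '3x1:1,0,1' and '1x3:1,0,1' kernels.

```python
def parse(rows):
    """Turn strings of '.' (white) and '#' (black) into a grid of 1s and 0s."""
    return [[0 if ch == "#" else 1 for ch in row] for row in rows]


# Hit-and-miss kernels as (dy, dx, expected) triples; 1 = white, 0 = black.
ONE_WIDE = [(0, -1, 1), (0, 0, 0), (0, 1, 1)]   # white, black, white in a row
ONE_HIGH = [(-1, 0, 1), (0, 0, 0), (1, 0, 1)]   # white, black, white in a column


def thicken(img, kernel):
    """Set a pixel to white wherever the whole kernel pattern matches around it."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(h):
        for x in range(w):
            # Assumption of this sketch: out-of-bounds neighbours never match.
            if all(0 <= y + dy < h and 0 <= x + dx < w
                   and img[y + dy][x + dx] == want
                   for dy, dx, want in kernel):
                out[y][x] = 1
    return out


def count_black(img):
    return sum(row.count(0) for row in img)


# A 1-pixel-wide vertical line, a solid 3x3 block, and a 1-pixel-high horizontal line.
source = parse([
    ".......",
    ".#.###.",
    ".#.###.",
    ".#.###.",
    ".......",
    "...###.",
    ".......",
])

step1 = thicken(source, ONE_WIDE)   # removes the 1-pixel-wide line
step2 = thicken(step1, ONE_HIGH)    # removes the 1-pixel-high line
step3 = thicken(step2, ONE_WIDE)    # the third pass requested in the first post

print(count_black(source), count_black(step1), count_black(step2), count_black(step3))
```

Each pass only flips a black pixel that has white on both sides along the kernel's axis, so the solid block survives all three passes untouched.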

Post by mark0978 » 2011-05-10T16:20:48-07:00

The morphology really does clean up the image for human readability, but when I zoom in, I think square is possibly going to hurt the OCR :-(

However diamond may actually help quite a bit.

I get an invalid argument for -morphology when I use this command:

convert cleanup.tif -morphology thicken '3x1:1,0,1'

so I'll try to update tonight and see if that will do the trick.

Version: ImageMagick 6.6.9-4 2011-04-01 Q16 http://www.imagemagick.org
Copyright: Copyright (C) 1999-2011 ImageMagick Studio LLC
Features: OpenMP

These images were made on a really expensive ($250,000) scanner. I'm guessing they didn't know how to use it properly. We are working with them to do a better job on future scans (including 300 dpi).

Thanks for the help.

Post by fmw42 » 2011-05-10T16:31:05-07:00

You need to specify an output image!

convert cleanup.tif -morphology thicken '3x1:1,0,1' result.gif

The following, as Anthony suggested, with diamond rather than square, works well.

convert cleanup.tif -morphology close diamond:1 cleanup_close1.gif

HugoRune
Posts: 90
Joined: 2009-03-11T02:45:12-07:00

Post by HugoRune » 2011-05-10T16:39:12-07:00

Due to the nature of the ringing noise, all black noise specks are separated by at least 1 pixel from the letters.

One good approach to remove this noise would be to dilate the image so that at least one "seed" part of each letter remains, then erode these seeds while using the original image as a mask; in effect a flood-fill for each letter.

This way the shape of the letters and other large blobs is preserved perfectly, and smaller blobs disappear.


The biggest dilate that still leaves a part of each letter shape seems to be a 3x4 rectangle for the example data; perhaps use something smaller to be on the safe side.

This command first dilates with that 3x4 rectangle, and then erodes until the letters are all whole again

convert cleanup.tif -write MPR:source ^
  -morphology close rectangle:3x4 ^
  -morphology erode square    MPR:source -compose Lighten -composite ^
  -morphology erode square    MPR:source -composite ^
  -morphology erode square    MPR:source -composite ^
  -morphology erode square    MPR:source -composite ^
  -morphology erode square    MPR:source -composite ^
  -morphology erode square    MPR:source -composite ^
  -morphology erode square    MPR:source -composite ^
  -morphology erode square    MPR:source -composite ^
  -morphology erode square    MPR:source -composite ^
  cleaned.png
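
The seed-and-regrow idea can be sketched in a few lines of pure Python. This is only a toy model under stated assumptions: 1 = white and 0 = black, a plain 3x3 square is used to produce the seeds (a stand-in for the rectangle in the command above), out-of-bounds neighbours are skipped, and the loop stops when nothing changes instead of running a fixed number of steps. All function names are hypothetical.

```python
def parse(rows):
    """Turn strings of '.' (white) and '#' (black) into a grid of 1s and 0s."""
    return [[0 if ch == "#" else 1 for ch in row] for row in rows]


def window(img, y, x):
    """Values of the in-bounds 3x3 neighbourhood around (y, x)."""
    h, w = len(img), len(img[0])
    return [img[y + dy][x + dx]
            for dy in (-1, 0, 1) for dx in (-1, 0, 1)
            if 0 <= y + dy < h and 0 <= x + dx < w]


def shrink_black(img):
    """Dilate white: black survives only where its whole 3x3 window is black."""
    return [[max(window(img, y, x)) for x in range(len(img[0]))]
            for y in range(len(img))]


def grow_black(img):
    """Erode white: black spreads to every pixel with a black 3x3 neighbour."""
    return [[min(window(img, y, x)) for x in range(len(img[0]))]
            for y in range(len(img))]


def lighten(a, b):
    """Pixel-wise maximum, so white in either image wins."""
    return [[max(p, q) for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def reconstruct(seed, source, max_steps=100):
    """Regrow black from the seeds, never beyond black pixels of the source."""
    cur = seed
    for _ in range(max_steps):
        nxt = lighten(grow_black(cur), source)
        if nxt == cur:
            break
        cur = nxt
    return cur


# A letter-like blob with a one-pixel bump, plus noise specks kept apart from it.
source = parse([
    "..........",
    ".####..#..",
    ".####.....",
    ".#####....",
    ".####...#.",
    ".####...#.",
    "..........",
])

seed = shrink_black(source)
cleaned = reconstruct(seed, source)
for row in cleaned:
    print("".join("." if p else "#" for p in row))
```

Because every growth step is clipped to the black pixels of the original, the blob comes back with its exact outline, including the bump, while specks that left no seed never reappear.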

Post by fmw42 » 2011-05-10T16:44:39-07:00

Very clever approach!

Fred

Post by anthony » 2011-05-10T19:19:13-07:00

HugoRune wrote: Due to the nature of the ringing noise, all black noise specks are separated by at least 1 pixel from the letters.

One good approach to remove this noise would be to dilate the image so that at least one "seed" part of each letter remains, then erode these seeds while using the original image as a mask; in effect a flood-fill for each letter.

This is basically known as "conditional dilation" (or, for a negated image, "conditional erode"), and while I have not explored this enough to generate examples, it should actually be available RIGHT NOW!


The trick is to use a 'write mask' (the original image) on the 'seed image' and then dilate to infinity.
At this time I only have quick notes on using image write masks in
http://www.imagemagick.org/Usage/maskin ... ping_masks
For morphology I would make sure the write mask is boolean by specifying it using -clip-mask.
The clip mask should be white where you do not want the image to be updated.

Hmmm... This is my first attempt at conditional morphology, exactly as I envisaged!

convert cleanup.tif -write MPR:source \
    -morphology close rectangle:3x4 \
    -clip-mask MPR:source \
    -morphology erode:8 square \
    +clip-mask   cleaned.png

Hey it works!!! 8)

This is the equivalent of HugoRune's conditional erode and gets the same result.

NOTE: do not use an infinite erode (iteration count = -1), as it will not end (at least not for a long time). Morphology does not actually understand write masks, so it sees pixel changes even though they are never written, and as such it never sees a final 'static' image. In IMv7 (yet to fork), use of infinite iterations to 'seed flood fill' may be possible.
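
The claimed equivalence between the masked erode and the erode-plus-Lighten version can be checked on a toy grid. The sketch below is not ImageMagick code: it assumes 1 = white and 0 = black, a 3x3 square, a mask that protects pixels where it is white, and a seed image that is already white wherever the mask is white. The fixed count of 8 steps simply mirrors erode:8; the function names are hypothetical.

```python
def parse(rows):
    """Turn strings of '.' (white) and '#' (black) into a grid of 1s and 0s."""
    return [[0 if ch == "#" else 1 for ch in row] for row in rows]


def erode_white(img):
    """3x3 erode of white: black spreads to every pixel with a black neighbour."""
    h, w = len(img), len(img[0])
    return [[min(img[y + dy][x + dx]
                 for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                 if 0 <= y + dy < h and 0 <= x + dx < w)
             for x in range(w)] for y in range(h)]


def step_lighten(cur, source):
    """Erode, then take the pixel-wise maximum with the source image."""
    eroded = erode_white(cur)
    return [[max(e, s) for e, s in zip(re, rs)] for re, rs in zip(eroded, source)]


def step_clip_mask(cur, mask):
    """Erode, but keep the current value wherever the mask is white (protected)."""
    eroded = erode_white(cur)
    return [[c if m == 1 else e for c, e, m in zip(rc, re, rm)]
            for rc, re, rm in zip(cur, eroded, mask)]


def run(step, seed, source, iterations=8):
    """Apply one of the step functions a fixed number of times."""
    cur = seed
    for _ in range(iterations):
        cur = step(cur, source)
    return cur


source = parse([
    ".......",
    ".###.#.",
    ".###...",
    ".###.#.",
    ".......",
])
seed = parse([
    ".......",
    ".......",
    "..#....",
    ".......",
    ".......",
])

via_lighten = run(step_lighten, seed, source)
via_mask = run(step_clip_mask, seed, source)
print(via_lighten == via_mask)
```

With a bounded step count the loop always terminates; a larger count is harmless once the shape has fully regrown, since further steps change nothing in this model.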

Post by anthony » 2011-05-10T19:27:33-07:00

fmw42 wrote: You need to specify an output image!

convert cleanup.tif -morphology thicken '3x1:1,0,1' result.gif

The following as Anthony suggested with diamond rather than square works well.

convert cleanup.tif -morphology close diamond:1 cleanup_close1.gif

Both of you missed the '>' in my example, which removes black pixels that are 1 pixel wide and 1 pixel high.
And that is not quite the same as a 'diamond'.

As for the use of the scanner: yes, I'd say they should scan a sample image in a number of ways so that you can figure out what is best. Either that, or have them deliver a raw grayscale (color?) scan so you can adjust thresholding and other parameters yourself.

Post by fmw42 » 2011-05-10T19:41:03-07:00

I did not miss it. I just finished your first example, replacing show: with an output image.

Post by anthony » 2011-05-10T23:13:00-07:00

Doesn't show: work on a Mac?

Post by fmw42 » 2011-05-11T10:58:05-07:00

anthony wrote: Doesn't show: work on a Mac?

Yes, it does (and your commands show just fine), but the user left off both show: and an output and was complaining of getting errors.

I get an invalid argument for -morphology when I use this command:
convert cleanup.tif -morphology thicken '3x1:1,0,1'

So all I was trying to do was remind him of the need for an output image.

Post by anthony » 2011-05-11T18:45:02-07:00

Fair enough.... Back to the problem at hand.

mark0978... Are you satisfied with the solutions provided?