connected-components different verbose results with threshold

gvandyk
Posts: 15
Joined: 2015-09-08T12:40:32-07:00

Post by gvandyk »

Hi,

I am using the following command:

Code: Select all

convert img.gif -define connected-components:verbose=true -connected-components 4 null: > out.txt
On my black-and-white image, this reports only srgb(255,255,255) and srgb(0,0,0).

But when I add an area threshold, such as

Code: Select all

convert img.gif -define connected-components:area-threshold=1000 -define connected-components:verbose=true -connected-components 4 null: > out.txt
it reports srgb values in various shades of gray. The areas are still the same size, but the srgb values are now different.

What is different? I believed the area-threshold only filters out smaller areas and does not change the output.

How can I add a threshold and still get only srgb(255,255,255) and srgb(0,0,0) lines back?

My version of IM is 6.9.0-10.
fmw42
Posts: 25562
Joined: 2007-07-02T17:14:51-07:00
Location: Sunnyvale, California, USA

Post by fmw42 »

After an area threshold is applied, the pixels that are thresholded out are merged with their respective background regions, and new gray levels are computed for the merged regions. That is, if a white pixel is merged into a black region, the region then contains its black pixels plus the white pixel (which is changed to black), and the reported value is recomputed as the average of all the original pixels the region includes. That averaging is why you see shades of gray.

I feel this behavior should be changed, or a new option added, so that after merging the new region keeps the same gray level it had before the merge. Black areas would then still be black and white areas still white, and only the counts would change. I have already submitted a request to the developers for this enhancement.
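Until such an option exists, one possible workaround is to post-process the verbose text. The following is only a sketch: `snap_bw` is a hypothetical helper name, and it assumes each object line contains a mean color of the form srgb(R,G,B) with 0-255 values, as in the output described above. It snaps every mean color to pure black or pure white at a cutoff of 128.

```shell
#!/bin/sh
# snap_bw (hypothetical helper): read connected-components verbose text on
# stdin and replace each srgb(R,G,B) mean color with pure black or white.
# Assumption: colors appear as srgb(R,G,B) with channel values in 0-255.
snap_bw() {
  awk '
    match($0, /srgb\([0-9.]+,[0-9.]+,[0-9.]+\)/) {
      # Pull out the three channel values between the parentheses
      inner = substr($0, RSTART + 5, RLENGTH - 6)
      split(inner, c, ",")
      mean = (c[1] + c[2] + c[3]) / 3
      color = (mean >= 128) ? "srgb(255,255,255)" : "srgb(0,0,0)"
      # Rebuild the line with the snapped color in place of the original
      print substr($0, 1, RSTART - 1) color substr($0, RSTART + RLENGTH)
      next
    }
    { print }   # header and any other lines pass through unchanged
  '
}
```

It could be used by piping the verbose output through it, for example by appending `| snap_bw > out.txt` to the connected-components command instead of `> out.txt`.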

Post by gvandyk »

Is there a way to limit the results from the connected-components verbose call?

I am getting the "too many objects" error and would like to get only the top results (the largest areas) without losing the black-and-white values.

I assume I can filter the srgb results by treating values above 128 as white and those below as black. Is this assumption correct?
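For the "top results" part, a rough sketch of filtering the verbose text after the fact is shown here. `top_objects` is a hypothetical helper name, and it assumes each object line has the layout "id: bounding-box centroid area mean-color", so that the area is the fourth whitespace-separated field. It only trims the listing; it would not by itself prevent the "too many objects" error.

```shell
#!/bin/sh
# top_objects (hypothetical helper): keep the N largest objects from
# connected-components verbose text read on stdin.
# Assumption: object lines look like "  id: WxH+X+Y cx,cy area color".
top_objects() {
  n=${1:-10}
  # Keep only object lines (drop the header), sort by area descending,
  # then keep the first N lines
  grep -E '^ *[0-9]+: ' | sort -k4,4nr | head -n "$n"
}
```

For example, appending `| top_objects 5` to the connected-components command would list only the five largest regions.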

Post by fmw42 »

I have done post-processing in Unix to filter and even recolor the results from the textual output. I am not sure why you are getting "too many objects". Are you using Q16 IM? If so, that should allow 65535 different regions. Perhaps you can post your input image and some of the textual results you are getting.

See my unix bash shell script, kmeans, at my link below for color reduction and the filtering that I used to recolor after the regions were changed.

Post by gvandyk »

A link to a file with the "too many objects" run on my Q16 IM:

https://www.dropbox.com/s/yew6v4a6utrvn ... s.gif?dl=0

I am trying to extract pages from images that I have scanned with a book scanner (cameras), and I need to detect the proper page.

For this, I am looping through gray values from 255 down to 150 with a fuzz value of 25%, using:

Code: Select all

convert infile.jpg -fuzz 25% -fill white -opaque "gray($grayValue)" -fill black +opaque white tmp.gif
The first white area extracted at the highest gray level that is more than 50% of the width of the scanned image is assumed to be a page. This area gives me the margin for one of the page edges (the page end). I then carry on looping until I find the closest other edge that is greater than "glassmargin". This is to determine the middle of the book.

Once these 2 margins are found, the page can be extracted.
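The "more than 50% of the width" test can be sketched against the verbose text as follows. `find_page` is a hypothetical helper name, and the sketch assumes that white regions are reported exactly as srgb(255,255,255) and that the bounding box is the second field in the form WxH+X+Y. It prints the bounding box of the first white region wider than half the image width.

```shell
#!/bin/sh
# find_page (hypothetical helper): given the scanned image width as $1 and
# connected-components verbose text on stdin, print the bounding box of the
# first white region that is wider than half the image.
# Assumptions: white is reported as srgb(255,255,255); field 2 is WxH+X+Y.
find_page() {
  img_width=$1
  awk -v w="$img_width" '
    /srgb\(255,255,255\)/ {
      split($2, b, /[x+]/)          # b[1]=width b[2]=height b[3]=x b[4]=y
      if (b[1] + 0 > w / 2) {       # wider than 50% of the scan
        print $2
        exit
      }
    }
  '
}
```

The x offset and width in the printed bounding box would then give the left and right page margins for that candidate.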

I need the loop because I have thousands of pages with differing backgrounds and page sizes.

It is within this loop that I get the "too many objects" error for some gray/fuzz values.

Is there a better way to determine the page boundaries? This method works, but it takes a very long time to find the proper page boundaries.
Last edited by gvandyk on 2015-10-28T11:40:40-07:00, edited 1 time in total.

Post by gvandyk »

In addition to the above, I am stepping the grayvalue down by 5 each time I go through the loop to find the edges.
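Roughly, the loop looks like the sketch below. The helper names `gray_levels` and `threshold_scan` and the per-level output file names are only illustrative, and a POSIX shell is assumed. The parentheses in gray(...) are quoted so the shell does not interpret them.

```shell
#!/bin/sh
# gray_levels START STOP STEP: print gray levels from START down to STOP.
gray_levels() {
  g=$1
  while [ "$g" -ge "$2" ]; do
    echo "$g"
    g=$((g - $3))
  done
}

# threshold_scan FILE (illustrative driver, needs ImageMagick): write one
# black/white GIF per gray level, from 255 down to 150 in steps of 5.
threshold_scan() {
  infile=$1
  for grayValue in $(gray_levels 255 150 5); do
    convert "$infile" -fuzz 25% -fill white -opaque "gray($grayValue)" \
      -fill black +opaque white "tmp_${grayValue}.gif"
  done
}
```

Calling `threshold_scan IMG_0073.JPG` would then produce tmp_255.gif, tmp_250.gif, and so on down to tmp_150.gif.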

One of the original images:

https://www.dropbox.com/s/97957iwifwi5n ... 3.JPG?dl=0

The GIF linked above was generated with a gray value of 250, so the command below created the "too many objects" GIF.

Code: Select all

convert IMG_0073.JPG -fuzz 25% -fill white -opaque "gray(250)" -fill black +opaque white manyObjs.gif
Last edited by gvandyk on 2015-10-28T11:39:51-07:00, edited 1 time in total.

Post by fmw42 »

CCL is likely giving you too many objects because of all the noise in the image. Does it abort, or does it still give you the textual data up to the point where it reaches too many objects?

You could try to use -morphology either open or close or smooth to get rid of the noise first. See http://www.imagemagick.org/Usage/morphology/#basic

Post by fmw42 »

Also try adding -depth 16 to your CCL command right after reading the input image.

Post by fmw42 »

Using -morphology close octagon:1 seems to avoid the "too many objects" message:

Code: Select all

convert manyObjs.gif -morphology close octagon:1 -define connected-components:verbose=true -connected-components 4 null:
But I would like to see your original image before any processing.

Post by gvandyk »

The original image that the gif was generated from can be found here:

https://www.dropbox.com/s/97957iwifwi5n ... 3.JPG?dl=0

Post by fmw42 »

Here is a potentially different approach using my Unix bash shell script, textcleaner. After running textcleaner, I assumed that all pages extracted from the same book have the same scale and the same region size for the text. So I measured an area around the text with some white border, but not too much. I then resized the image by 1/16 and did a subimage compare search to find the region that best matches a mid-gray region of that estimated text area at the reduced size. Once I had the offsets, I scaled them back up by 16 and did a crop. (A more exact result could be achieved by repeating the subimage search at full resolution on a region that is somewhat bigger, but not full image size.)

textcleaner -f 25 -o 10 -g IMG_0073.JPG tmp.png
http://www.fmwconcepts.com/misc_tests/p ... ct/tmp.png

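# Estimated size of the text region at full resolution, and the downscale factor
width=1850
height=2750
factor=16
# Resize percentage and the text-region size at the reduced scale
pct=$(convert xc: -format "%[fx:100/$factor]" info:)
ww=$(convert xc: -format "%[fx:round($width/$factor)]" info:)
hh=$(convert xc: -format "%[fx:round($height/$factor)]" info:)
convert tmp.png -resize "${pct}%" tmp2.png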
http://www.fmwconcepts.com/misc_tests/p ... t/tmp2.png

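# Search the reduced image for the best match to a mid-gray rectangle of the text-region size
vals=$(compare -metric rmse -subimage-search -dissimilarity-threshold 1 tmp2.png \( -size "${ww}x${hh}" xc:gray \) null: 2>&1)
# Keep the "x,y" offset that follows the "@" in the compare output
coords=$(echo $vals | sed -n 's/^.*[@] \(.*,.*\)/\1/p')
xx=$(echo $coords | cut -d, -f1)
yy=$(echo $coords | cut -d, -f2)
# Scale the offsets back up to full resolution and crop
xoff=$((xx*factor))
yoff=$((yy*factor))
convert tmp.png -crop "${width}x${height}+${xoff}+${yoff}" +repage tmp3.png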
http://www.fmwconcepts.com/misc_tests/p ... t/tmp3.png

Sorry, if you are on Windows, I do not have a corresponding script, but you can process the images similarly using -lat 25x25+10%. The rest of my code would need to be converted to Windows equivalents. I am not a Windows user, so I cannot help with that.