Clean Up a Document for Faxing/OCR

Magick.NET is an object-oriented C# interface to ImageMagick. Use this forum to discuss, make suggestions about, or report bugs concerning Magick.NET.
mattj
Posts: 11
Joined: 2015-01-02T07:02:18-07:00
Authentication code: 6789

Re: Clean Up a Document for Faxing/OCR

Post by mattj »

So my command line process is working fine, but on multi-page PDFs it was only saving the first page. So I modified the initial grayscale conversion and the clone step and saved their results individually to prove they are converting all the pages, and they are. However, when it saves receipt_after.pdf after the -compose, it only saves the first page. Any ideas?

Code: Select all

convert ( receipt.pdf[0--1] -colorspace gray -type grayscale -contrast-stretch 0 ) ( -clone 0--1 -colorspace gray -negate -lat 15x15+5% -contrast-stretch 0 ) -compose copy_opacity -composite -fill "white" -opaque none +matte -deskew 40% -sharpen 0x1 receipt_after.pdf
snibgo
Posts: 12159
Joined: 2010-01-23T23:01:33-07:00
Authentication code: 1151
Location: England, UK

Re: Clean Up a Document for Faxing/OCR

Post by snibgo »

"-composite" works on two images (or three, including a mask). But you have many images, from the input pdf, plus all the clones.

You might get around this with two lists separated by null:, or something similar. But if your pdf is large, you might run out of memory. A better strategy might be to put it in a loop: process page [0], then [1], etc.
snibgo's IM pages: im.snibgo.com
mattj
Posts: 11
Joined: 2015-01-02T07:02:18-07:00
Authentication code: 6789

Re: Clean Up a Document for Faxing/OCR

Post by mattj »

snibgo wrote:"-composite" works on two images (or three, including a mask). But you have many images, from the input pdf, plus all the clones.

You might get around this with two lists separated by null:, or something similar. But if your pdf is large, you might run out of memory. A better strategy might be to put it in a loop: process page [0], then [1], etc.
Yes, I see where it talks about Layers Composition and in their example it's an animated gif. So I tried using the null: and added -layers Composite to the command, but it says it can't find the null:.

Code: Select all

convert ( receipt.pdf[0--1] -colorspace gray -type grayscale -contrast-stretch 0 ) null: ( -clone 0--1 -colorspace gray -negate -lat 15x15+5% -contrast-stretch 0 ) -compose copy_opacity -composite -fill "white" -opaque none +matte -deskew 40% -sharpen 0x1 -layers Composite receipt_after.pdf
I guess plan B, when I start processing the pdf, is to write the pages to disk, process them in a loop, then recombine them. Yuk.
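A minimal Python sketch of that plan B, for illustration only. It reuses the per-page form of the command from the first post (one page in, `-clone 0`, plain `-composite`), runs it once per page, then joins the results. The function names `page_command` and `clean_pdf`, the intermediate `page_NNNN.tif` file names, and the caller-supplied `page_count` are all assumptions, not something from this thread. On Windows the bare name `convert` may resolve to the system utility rather than ImageMagick, so a full path may be needed.

```python
import subprocess


def page_command(src, page, dst):
    """Build the argument list that cleans a single page of `src` into `dst`.

    Hypothetical helper. Passing a list to subprocess avoids shell quoting,
    so the parentheses need no escaping.
    """
    return [
        "convert",
        "(", f"{src}[{page}]", "-colorspace", "gray", "-type", "grayscale",
        "-contrast-stretch", "0", ")",
        "(", "-clone", "0", "-colorspace", "gray", "-negate",
        "-lat", "15x15+5%", "-contrast-stretch", "0", ")",
        "-compose", "copy_opacity", "-composite",
        "-fill", "white", "-opaque", "none", "+matte",
        "-deskew", "40%", "-sharpen", "0x1",
        dst,
    ]


def clean_pdf(src, page_count, dst):
    """Process each page separately, then recombine the pages into `dst`.

    `page_count` is assumed to be known by the caller.
    """
    pages = []
    for page in range(page_count):
        out = f"page_{page:04d}.tif"  # hypothetical intermediate file name
        subprocess.run(page_command(src, page, out), check=True)
        pages.append(out)
    # Joining the cleaned pages only needs the file list and the output name.
    subprocess.run(["convert", *pages, dst], check=True)
```

Only one page is held in memory at a time, at the cost of the intermediate files on disk.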
fmw42
Posts: 25562
Joined: 2007-07-02T17:14:51-07:00
Authentication code: 1152
Location: Sunnyvale, California, USA

Re: Clean Up a Document for Faxing/OCR

Post by fmw42 »

Perhaps your IM version is too old for -layers composite. It is best to always provide your IM version and platform when asking a question. Or perhaps it is not implemented yet in Magick.Net.
mattj
Posts: 11
Joined: 2015-01-02T07:02:18-07:00
Authentication code: 6789

Re: Clean Up a Document for Faxing/OCR

Post by mattj »

fmw42 wrote:Perhaps your IM version is too old for -layers composite. It is best to always provide your IM version and platform when asking a question. Or perhaps it is not implemented yet in Magick.Net.
Sorry, yes I'm on 6.9.0 on Windows.

So what I'm doing *should* work?
fmw42
Posts: 25562
Joined: 2007-07-02T17:14:51-07:00
Authentication code: 1152
Location: Sunnyvale, California, USA

Re: Clean Up a Document for Faxing/OCR

Post by fmw42 »

My guess is that you cannot do the clone after the null: (which, by the way, includes the null: in your list), nor the -compose copy_opacity -composite operation that follows. The only type of composite you can use in that circumstance with the null: is -layers composite. See http://www.imagemagick.org/Usage/anim_mods/#composite

To be honest I really do not understand what you are trying to do. Perhaps you need to explain it in more detail.
snibgo
Posts: 12159
Joined: 2010-01-23T23:01:33-07:00
Authentication code: 1151
Location: England, UK

Re: Clean Up a Document for Faxing/OCR

Post by snibgo »

I think he wants to do something like this, which works fine for me:

Code: Select all

convert logo: wizard: null: ( -clone 0--2 -negate ) -compose CopyOpacity -layers composite x.tiff
I have just 2 inputs but there could be any number. Then null:, then I clone from 0 to -2, so from zero to the one before the null.

Then I declare the compose method, and "-layers composite" does a pair-wise operation. My output has two images, being the inputs but with the lighter parts transparent.

EDIT correcting typo: pair-wise, not pair-size. I mean the first input is paired with the first clone, then the second input with the second clone, and so on to the last input with the last clone.
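To make the indexing concrete, here is a small Python model of the image list. It does not call ImageMagick. It only mimics, under simplifying assumptions, how `-clone 0--2` picks everything before the null: and how the pairing described above splits the list at null:. The helper names `clone_range` and `layers_composite_pairs` are made up for this illustration, and the model assumes both sides of the null: hold the same number of images.

```python
def clone_range(image_list, first, last):
    """Model "-clone first-last"; negative indexes count from the end of the list."""
    n = len(image_list)
    start = first if first >= 0 else n + first
    stop = last if last >= 0 else n + last
    return image_list[start:stop + 1]


def layers_composite_pairs(image_list):
    """Model the pairing: images before "null:" are matched with those after it."""
    split = image_list.index("null:")
    destinations = image_list[:split]
    sources = image_list[split + 1:]
    return list(zip(destinations, sources))


stack = ["logo:", "wizard:", "null:"]

# "-clone 0--2" stops one before the last entry, so the null: is not cloned.
clones = [name + " negated" for name in clone_range(stack, 0, -2)]
stack += clones

print(layers_composite_pairs(stack))
# [('logo:', 'logo: negated'), ('wizard:', 'wizard: negated')]

# "-clone 0--1" would have cloned the null: as well.
print(clone_range(["logo:", "wizard:", "null:"], 0, -1))
# ['logo:', 'wizard:', 'null:']
```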
snibgo's IM pages: im.snibgo.com
fmw42
Posts: 25562
Joined: 2007-07-02T17:14:51-07:00
Authentication code: 1152
Location: Sunnyvale, California, USA

Re: Clean Up a Document for Faxing/OCR

Post by fmw42 »

Nice. A simple command always makes it clearer. Interesting way to do multi-image processing without a loop and without using mogrify, which would not do in this case. A simple loop over each input image would do the same and avoid using too much memory if there are many input images.
mattj
Posts: 11
Joined: 2015-01-02T07:02:18-07:00
Authentication code: 6789

Re: Clean Up a Document for Faxing/OCR

Post by mattj »

Thanks snibgo! I had my null: in the wrong place and also the "-layers Composite" had to replace the "-composite" command that I had. So here's the final command:

Code: Select all

convert ( receipt.pdf[0--1] null: -colorspace gray -type grayscale -contrast-stretch 0 ) ( -clone 0--2 -colorspace gray -negate -lat 15x15+5% -contrast-stretch 0 ) -compose copy_opacity -layers Composite -fill "white" -opaque none +matte -deskew 40% -sharpen 0x1  receipt_after.pdf
This takes a multi-page pdf, turns the background white and sharpens the text a bit, and saves all the pages back to pdf.
fmw42
Posts: 25562
Joined: 2007-07-02T17:14:51-07:00
Authentication code: 1152
Location: Sunnyvale, California, USA

Re: Clean Up a Document for Faxing/OCR

Post by fmw42 »

Probably should be

convert ( receipt.pdf[0--1] -colorspace gray -type grayscale -contrast-stretch 0 ) null: ( -clone 0--2 -colorspace gray -negate -lat 15x15+5% -contrast-stretch 0 ) -compose copy_opacity -layers Composite -fill "white" -opaque none +matte -deskew 40% -sharpen 0x1 receipt_after.pdf

That way you do not spend effort processing null:

All of the posts about this command are really about the command line and not Magick.Net. So they should really have been posted to the User's forum.
mattj
Posts: 11
Joined: 2015-01-02T07:02:18-07:00
Authentication code: 6789

Re: Clean Up a Document for Faxing/OCR

Post by mattj »

fmw42 wrote:Probably should be

convert ( receipt.pdf[0--1] -colorspace gray -type grayscale -contrast-stretch 0 ) null: ( -clone 0--2 -colorspace gray -negate -lat 15x15+5% -contrast-stretch 0 ) -compose copy_opacity -layers Composite -fill "white" -opaque none +matte -deskew 40% -sharpen 0x1 receipt_after.pdf

That way you do not spend effort processing null:
Yep, good point.
fmw42 wrote:
All of the posts about this command are really about the command line and not Magick.Net. So they should really have been posted to the User's forum.
Right, but at first I was trying to accomplish this using Magick.Net and it morphed from there.