
Textcleaner and multi-page docs

Posted: 2016-07-10T08:58:20-07:00
by mstone
I can't seem to get textcleaner to run against multi-page docs like PDFs and TIFFs. Every time I try, the resulting document is truncated to one page, even if I use a convert-friendly filename pattern like out_%d.tif for my outfile argument. Is it a known issue that textcleaner should only be used on single-page docs? I suppose I could split the docs into individual pages prior to cleaning, but I was hoping to avoid that step.

Best,
- Matt

Re: Textcleaner and multi-page docs

Posted: 2016-07-10T10:56:19-07:00
by fmw42
Currently, the majority of my scripts do not work for multi-page image formats. There may be a few very simple ones that do, but I have no idea which ones work and which do not. They were not originally designed for multi-page formats. You will have to separate the pages, run the script on each one, then combine the results. You can do that in a script loop. That is what I would have to do internally to modify the scripts to make them work. Sorry!

I will put a message on my home page about this. I just never gave it a thought.

P.S. PDF files are a special issue, since the size of the resulting image depends upon the density you use when reading in the file. So be cautious of that.
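
For anyone landing on this thread later, the split, clean, and recombine loop described above might look roughly like the sketch below. It assumes ImageMagick's convert and the textcleaner script are both on your PATH. The function name clean_multipage, the DENSITY variable with its default of 300, and the page_/clean_ file names are made up for illustration and are not part of either tool.

```shell
#!/bin/sh
# Sketch only: clean a multi-page document one page at a time.
# Assumes `convert` (ImageMagick) and `textcleaner` are on PATH.
# clean_multipage, DENSITY and the page_/clean_ names are illustrative.

clean_multipage() {
    infile=$1
    outfile=$2
    shift 2    # any remaining arguments are passed through to textcleaner

    tmpdir=$(mktemp -d) || return 1

    # 1. Split into one PNG per page. -density controls how PDF pages are
    #    rasterized; raster input such as TIFF is not resampled by it.
    convert -density "${DENSITY:-300}" "$infile" "$tmpdir/page_%04d.png" \
        || { rm -rf "$tmpdir"; return 1; }

    # 2. Clean each page separately (textcleaner [options] infile outfile).
    for page in "$tmpdir"/page_*.png; do
        textcleaner "$@" "$page" "$tmpdir/clean_${page##*/page_}" \
            || { rm -rf "$tmpdir"; return 1; }
    done

    # 3. Recombine. Zero-padded names keep the glob in page order.
    convert "$tmpdir"/clean_*.png "$outfile"
    status=$?

    rm -rf "$tmpdir"
    return $status
}
```

A call might look like `clean_multipage scan.pdf cleaned.tif -g -e normalize -f 15 -o 5 -s 1`, where the options after the two file names are just example textcleaner settings. Pages go through PNG so that nothing is lost between steps; pick an output format that supports multiple pages (TIFF or PDF) for the final file.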

Re: Textcleaner and multi-page docs

Posted: 2016-07-10T12:46:54-07:00
by mstone
OK, understood. Thanks for the quick reply. At least I can stop trying different permutations of inputs looking for the magic incantation to make it handle multi-page docs! :-)

Splitting the doc into pages and re-assembling will work; it's just not something I wanted to do if it could be handled easily in the convert-clean step.

Thanks again. Love the good work you do!

Best,
- Matt