Search found 20 matches

by isfando
2020-04-15T06:52:13-07:00
Forum: Developers
Topic: Old releases source code
Replies: 5
Views: 9436

Re: Old releases source code

Fred, I want to get the Ubuntu binaries for ImageMagick version 6.7.7-10, specifically for using RMagick in Ruby. I got the source code, but while building it I get the failing test below, which I can't resolve by googling. FAIL: wand/wandtest.sh I would be thankful if you can point me to...
by isfando
2020-03-09T07:32:15-07:00
Forum: Users
Topic: mark empty row in a column to avoid mixup in tesseract output
Replies: 4
Views: 2190

mark empty row in a column to avoid mixup in tesseract output

I need to OCR PDFs of financial statements, and I am using Tesseract to do it. For the sake of this question, assume there are three columns.
Column 1 | Column 2 | Column 3
<keywords> | <number> | <number>
Column 1 contains financial keywords, while Column 2 and Column 3 are numbers represent...
by isfando
2020-03-04T08:28:46-07:00
Forum: Users
Topic: decrease text boldness while not losing accuracy in convert command to improve tesseract ocr
Replies: 5
Views: 1635

Re: decrease text boldness while not losing accuracy in convert command to improve tesseract ocr

I have adapted your approach to a PDF.
*******************************CURRENT APPROACH****************************************
1) input pdf https://drive.google.com/open?id=1nFpEQ2fN89qodqVbhuLu5WnO9b6S4UOT
2) convert -density 300 pdfinputsample.pdf -strip -background white -alpha off -colorspace gra...
by isfando
2020-03-03T12:01:06-07:00
Forum: Users
Topic: decrease text boldness while not losing accuracy in convert command to improve tesseract ocr
Replies: 5
Views: 1635

Re: decrease text boldness while not losing accuracy in convert command to improve tesseract ocr

A thousand thanks, Fred. I have updated the code in the question to show the original code for removing horizontal lines but not the minus sign. I have a few questions, because my knowledge of ImageMagick is very superficial; if you have time to answer, I would be thankful. 1) Why are you not using the below par...
by isfando
2020-03-03T08:50:01-07:00
Forum: Users
Topic: decrease text boldness while not losing accuracy in convert command to improve tesseract ocr
Replies: 5
Views: 1635

decrease text boldness while not losing accuracy in convert command to improve tesseract ocr

I need to OCR PDFs of financial statements. Right now I am using a convert command, but I am not satisfied with the output in some cases. I want to decrease the boldness of the text as far as possible without losing accuracy. Also, the whole picture looks a bit blurry after conversion. ****************...
by isfando
2018-09-27T06:23:38-07:00
Forum: Users
Topic: Remove horizontal summation lines but keep a minus
Replies: 28
Views: 13281

Re: Remove horizontal summation lines but keep a minus

@snibgo I was able to put together the following bash script. (It feels good to contribute something to the topic.)
#!/bin/bash
INPDF=$1
PAGES=$(exiftool -args -PageCount "$INPDF" | cut -d'=' -f2)
N=$(( PAGES ))
# ImageMagick page indices are zero-based, so loop over 0..N-1
for ((I=0; I<N; I++)); do
convert -density 300 "$INPDF[$I]" -depth 8 -strip -background white ...
by isfando
2018-09-26T07:00:46-07:00
Forum: Users
Topic: Remove horizontal summation lines but keep a minus
Replies: 28
Views: 13281

Re: Remove horizontal summation lines but keep a minus

@snibgo I was able to put together the following bash script. (It feels good to contribute something to the topic.)
#!/bin/bash
INPDF=$1
PAGES=$(exiftool -args -PageCount "$INPDF" | cut -d'=' -f2)
N=$(( PAGES ))
# ImageMagick page indices are zero-based, so loop over 0..N-1
for ((I=0; I<N; I++)); do
convert -density 300 "$INPDF[$I]" -depth 8 -strip -background white -...
by isfando
2018-09-26T06:07:36-07:00
Forum: Users
Topic: Remove horizontal summation lines but keep a minus
Replies: 28
Views: 13281

Re: Remove horizontal summation lines but keep a minus

@snibgo Okay, thanks a lot. Sorry, I had to ask a lot of questions because I have no previous experience with the topic.
by isfando
2018-09-26T04:24:46-07:00
Forum: Users
Topic: Remove horizontal summation lines but keep a minus
Replies: 28
Views: 13281

Re: Remove horizontal summation lines but keep a minus

@snibgo

Sorry, I meant to ask: do you have an alternative to this script in the form of a bash script that can be run on a Linux server?
by isfando
2018-09-26T04:04:12-07:00
Forum: Users
Topic: Remove horizontal summation lines but keep a minus
Replies: 28
Views: 13281

Re: Remove horizontal summation lines but keep a minus

@snibgo

Do you have a shell script alternative to this batch script?

Code:
set INPDF=sam.pdf
for /F "usebackq" %%L in (`exiftool -args -PageCount %INPDF%`) do set %%L
set /A LASTPAGE=%-PageCount%-1
for /L %%I in (0,1,%LASTPAGE%) do call DoOnePage %INPDF%[%%I] out_%%I.png
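
A possible shell counterpart to the batch loop above, as a minimal sketch rather than a tested port. It assumes exiftool is on the PATH and that `exiftool -args -PageCount` prints a line of the form `-PageCount=N`, as the batch script relies on. The function names (`page_count`, `page_indices`, `do_one_page`, `process_pdf`) are made up for illustration, and `do_one_page` is only a placeholder for whatever DoOnePage does.

```shell
#!/usr/bin/env bash
# Sketch of a shell counterpart to the batch loop.
# Assumption: exiftool is installed; all function names here are illustrative.

# Print the page count reported by exiftool ("-PageCount=N" -> "N").
page_count() {
    exiftool -args -PageCount "$1" | cut -d'=' -f2
}

# Print the zero-based page indices 0..N-1, one per line.
page_indices() {
    idx=0
    while [ "$idx" -lt "$1" ]; do
        printf '%s\n' "$idx"
        idx=$((idx + 1))
    done
}

# Hypothetical placeholder for DoOnePage: replace the body with the
# real per-page convert command.
do_one_page() {
    echo "would process $1 -> $2"
}

# Call do_one_page once per page, e.g. sam.pdf[0] -> out_0.png.
process_pdf() {
    n=$(page_count "$1")
    for p in $(page_indices "$n"); do
        do_one_page "${1}[${p}]" "out_${p}.png"
    done
}
```

After sourcing the file, `process_pdf sam.pdf` would walk the pages. The indices start at 0, matching the `(0,1,%LASTPAGE%)` range in the batch version.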
by isfando
2018-09-25T15:42:33-07:00
Forum: Users
Topic: Remove horizontal summation lines but keep a minus
Replies: 28
Views: 13281

Re: Remove horizontal summation lines but keep a minus

@bratpit

Okay, I will give it a try.
by isfando
2018-09-25T15:40:03-07:00
Forum: Users
Topic: Remove horizontal summation lines but keep a minus
Replies: 28
Views: 13281

Re: Remove horizontal summation lines but keep a minus

@isfando: You have reverted to small characters, as you had in your first posts. Why? Your later post had larger characters, which will give better quality, thus better OCR.
@snibgo What shows that I have reverted to small characters? I am not able to grasp it. I am trying to streamline the approa...
by isfando
2018-09-25T07:59:21-07:00
Forum: Users
Topic: Remove horizontal summation lines but keep a minus
Replies: 28
Views: 13281

Re: Remove horizontal summation lines but keep a minus

OK, I got the point. Your guidance is indeed very helpful. I was able to construct your script on my machine; it goes through the PDF and runs your convert command page by page. But now the resulting PNG images are not sharp. Below I have given steps for two approaches and the result from ear...
by isfando
2018-09-25T06:37:59-07:00
Forum: Users
Topic: Remove horizontal summation lines but keep a minus
Replies: 28
Views: 13281

Re: Remove horizontal summation lines but keep a minus

Thanks for your suggestion. I will try to make use of it. But in my case I will run the script on a server machine where memory is not a problem; the server has 128 GB of RAM. If the original convert script could be changed to handle a PDF file as a whole, it would make my work quite easy. I would want the...
by isfando
2018-09-25T04:23:53-07:00
Forum: Users
Topic: Remove horizontal summation lines but keep a minus
Replies: 28
Views: 13281

Re: Remove horizontal summation lines but keep a minus

I made the suggested changes and the results are pretty good. Thanks a lot. One last question in this regard: how can I feed a multi-page PDF to your code so that the code is applied to each page and I get one PNG image per page of the PDF?
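
One way to get one PNG per page without an explicit loop, as a rough sketch: ImageMagick numbers the output files itself when the output name contains a printf-style `%d` pattern. The function name `pdf_to_pngs` and the `CONVERT_BIN` variable are assumptions made for illustration (the variable only exists so the command can be dry-run with `echo`). The processing operators from the thread are not shown here; they would go between the input and the output name, and this only works if each of them can be applied to every page independently.

```shell
#!/usr/bin/env bash
# Sketch: let ImageMagick split a multi-page PDF into numbered PNGs.
# Assumption: CONVERT_BIN is a hypothetical override, defaulting to convert.
CONVERT_BIN=${CONVERT_BIN:-convert}

# pdf_to_pngs INPUT.pdf PREFIX
# Writes PREFIX_000.png, PREFIX_001.png, ... (one file per page).
pdf_to_pngs() {
    "$CONVERT_BIN" -density 300 "$1" "${2}_%03d.png"
}
```

For example, `pdf_to_pngs sam.pdf out` would read every page of sam.pdf at 300 dpi and write out_000.png, out_001.png, and so on. If the per-page command uses clones or compositing, a page-by-page loop is probably the safer choice.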