
Remove Lines from Scanned Images

Posted: 2019-01-29T08:32:43-07:00
by shobhitkapil
Hi Team,

I am looking for help removing all lines from a scanned PDF. The scenario: I have tax bills in which the values sit inside rectangular boxes, so I want to remove exactly the rectangle borders and keep only the text.

I need all the code in C#. I see many related posts in PHP, but I need some assistance with C#.

I am using Magick.NET-Q16-x64.dll for image processing before passing the images to Tesseract to extract the text.
If you need images, I can provide those too.

Thanks,
Shobhit

Re: Remove Lines from Scanned Images

Posted: 2019-01-29T14:20:44-07:00
by fmw42
What have you tried? Can you provide an example image?

Re: Remove Lines from Scanned Images

Posted: 2019-01-30T01:49:42-07:00
by shobhitkapil
Thanks for the reply...

Please find a sample image at the link below:
https://www.dropbox.com/s/4liae22jjwwzk ... l.pdf?dl=0

Below is a brief explanation, again, of exactly what I am looking for:

Registration  Evaluation  Vin            Bill Number
AP31BE3785    16800       ABCD123DRF3GH  23005

I need the data to look like this, but it all sits inside a table with rectangular borders, so I need to remove those borders.

Thanks,
Shobhit
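
[Editor's sketch] The request above, erasing table borders while keeping the text, can be illustrated with a simple run-length filter: border lines form long unbroken runs of dark pixels, while text strokes form short ones. The following is a minimal, dependency-free Python sketch of that idea, not Magick.NET code and not a method taken from this thread. The function name `remove_long_runs`, the `min_len` threshold, and the 1 = ink / 0 = paper convention are all assumptions made for illustration. The loops are plain enough to port to C# directly.

```python
def remove_long_runs(img, min_len):
    """Return a copy of a binary image (1 = ink, 0 = paper) in which every
    horizontal or vertical run of ink at least min_len pixels long is erased.

    Hypothetical helper for illustration; img is a list of equal-length rows.
    """
    h = len(img)
    w = len(img[0]) if h else 0
    erase = [[False] * w for _ in range(h)]

    # Mark long horizontal runs (candidate horizontal border lines).
    for y in range(h):
        x = 0
        while x < w:
            if img[y][x]:
                start = x
                while x < w and img[y][x]:
                    x += 1
                if x - start >= min_len:
                    for i in range(start, x):
                        erase[y][i] = True
            else:
                x += 1

    # Mark long vertical runs (candidate vertical border lines).
    for x in range(w):
        y = 0
        while y < h:
            if img[y][x]:
                start = y
                while y < h and img[y][x]:
                    y += 1
                if y - start >= min_len:
                    for i in range(start, y):
                        erase[i][x] = True
            else:
                y += 1

    # Build the result without modifying the input image.
    return [[0 if erase[y][x] else img[y][x] for x in range(w)]
            for y in range(h)]


if __name__ == "__main__":
    # A 9x7 box border with a short 3-pixel "text" stroke inside it.
    box = [[1] * 9] + [[1] + [0] * 7 + [1] for _ in range(5)] + [[1] * 9]
    box[3][3] = box[3][4] = box[3][5] = 1
    for row in remove_long_runs(box, 5):
        print("".join("#" if p else "." for p in row))
```

On a real scan this would only be a starting point: the page has to be binarized first, `min_len` has to exceed the longest stroke in the text, skewed or broken lines may not form long enough runs, and characters that touch a border will lose the pixels they share with it.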

Re: Remove Lines from Scanned Images

Posted: 2019-01-30T10:39:34-07:00
by fmw42