convert PDF to plain text

Posted: 2018-09-07T15:40:58-07:00
by vargheseg
I have a business need to convert a PDF file that contains tabular data. Is it possible to convert the PDF document to a single text file while keeping the tabular context?

I do not need the table or cell boundaries, but I need the columns and rows to stay in the correct context.

Re: convert PDF to plain text

Posted: 2018-09-07T16:21:25-07:00
by snibgo
ImageMagick is a raster image processor. So it can read a PDF and create raster images from it. If you want formatted text, IM is not the appropriate tool. Try something like pdftotext.
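
A minimal sketch of the pdftotext route, assuming the Poppler or Xpdf command-line tools are installed (pdftotext is not part of ImageMagick). Its -layout option tries to keep the physical page layout, which usually keeps table columns aligned, though results depend on how the PDF was produced. The pdf_to_text wrapper and the file names are hypothetical, for illustration only.

```shell
# Hypothetical helper: wraps pdftotext (Poppler/Xpdf), not ImageMagick.
pdf_to_text() {
  in="$1"
  out="$2"
  # Fail early if the tool is not installed.
  if ! command -v pdftotext >/dev/null 2>&1; then
    echo "pdftotext not found; install the Poppler or Xpdf utilities" >&2
    return 127
  fi
  # Fail early if the input file does not exist.
  if [ ! -f "$in" ]; then
    echo "input not found: $in" >&2
    return 1
  fi
  # -layout asks pdftotext to preserve the original physical layout,
  # so text in table columns tends to stay in its column.
  pdftotext -layout "$in" "$out"
}
```

Usage would look like: pdf_to_text in.pdf out.txt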

Re: convert PDF to plain text

Posted: 2018-09-07T18:21:05-07:00
by vargheseg
Is it possible to convert it to a PostScript file?

Re: convert PDF to plain text

Posted: 2018-09-07T18:22:32-07:00
by vargheseg
I doubt it, as PostScript files have no raster images.

Re: convert PDF to plain text

Posted: 2018-09-08T03:52:00-07:00
by snibgo
vargheseg wrote: Is it possible to convert it to a PostScript file?
With IM, you can:

Code:

magick in.pdf out.ps
... but that will rasterize the PDF and embed the raster image in the PS file. If you want the text to remain as editable text, IM is the wrong tool.
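
For comparison, a hedged sketch of a non-IM route: the pdftops tool from Poppler or Xpdf converts PDF to PostScript without going through a raster image, so text should generally stay as text rather than pixels. This is an assumption about what would suit the use case, not something tested here. The pdf_to_ps wrapper and file names are hypothetical.

```shell
# Hypothetical helper: wraps pdftops (Poppler/Xpdf), not ImageMagick.
pdf_to_ps() {
  in="$1"
  out="$2"
  # Fail early if the tool is not installed.
  command -v pdftops >/dev/null 2>&1 || {
    echo "pdftops not found; install the Poppler or Xpdf utilities" >&2
    return 127
  }
  # Fail early if the input file does not exist.
  [ -f "$in" ] || {
    echo "input not found: $in" >&2
    return 1
  }
  # Convert the PDF to PostScript.
  pdftops "$in" "$out"
}
```

Usage would look like: pdf_to_ps in.pdf out.ps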