convert PDF to plain text

vargheseg
Posts: 11
Joined: 2016-02-08T14:37:40-07:00

convert PDF to plain text

Post by vargheseg »

I have a business need to convert a PDF file that contains tabular data. Is it possible to convert the PDF document to a single text file while keeping the tabular context?

I do not need the table or cell boundaries, but each value must stay in its correct column and row.
snibgo
Posts: 12159
Joined: 2010-01-23T23:01:33-07:00
Location: England, UK

Re: convert PDF to plain text

Post by snibgo »

ImageMagick is a raster image processor. So it can read a PDF and create raster images from it. If you want formatted text, IM is not the appropriate tool. Try something like pdftotext.
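As a minimal sketch of that suggestion, assuming the pdftotext utility from Poppler (or Xpdf) is installed: its -layout option tries to keep the physical layout of the page, so table columns tend to stay aligned with spaces. The function name pdf_to_text and the file names are placeholders for illustration, not part of any tool.

```shell
# Hypothetical helper: extract text from a PDF while keeping column alignment.
# Assumes pdftotext (Poppler or Xpdf) is on the PATH.
pdf_to_text() {
    # -layout asks pdftotext to preserve the original physical layout as best
    # it can, so tabular data keeps its row and column positions.
    pdftotext -layout "$1" "$2"
}

# Example call (in.pdf and out.txt are placeholder names):
#   pdf_to_text in.pdf out.txt
```

How well the columns line up depends on how the PDF was produced, so the output is worth checking against the original table.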
snibgo's IM pages: im.snibgo.com
vargheseg
Posts: 11
Joined: 2016-02-08T14:37:40-07:00

Re: convert PDF to plain text

Post by vargheseg »

Is it possible to convert it to a PostScript file?
vargheseg
Posts: 11
Joined: 2016-02-08T14:37:40-07:00

Re: convert PDF to plain text

Post by vargheseg »

I doubt it, as PostScript files have no raster images.
snibgo
Posts: 12159
Joined: 2010-01-23T23:01:33-07:00
Location: England, UK

Re: convert PDF to plain text

Post by snibgo »

vargheseg wrote: Is it possible to convert it to a PostScript file?
With IM, you can:

magick in.pdf out.ps
... but that will rasterize the PDF and embed the raster image in the PS file. If you want the text to remain as editable text, IM is the wrong tool.
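If the text needs to stay as text, one option outside IM is a dedicated PDF-to-PostScript converter. The sketch below assumes the pdftops utility from Poppler (or Xpdf) is installed. The function name pdf_to_ps and the file names are placeholders for illustration.

```shell
# Hypothetical helper: convert a PDF to PostScript without going through IM.
# Assumes pdftops (Poppler or Xpdf) is on the PATH.
pdf_to_ps() {
    # pdftops takes the input PDF followed by the output PostScript file.
    pdftops "$1" "$2"
}

# Example call (in.pdf and out.ps are placeholder names):
#   pdf_to_ps in.pdf out.ps
```

This is an alternative to the magick command above, not a replacement for it. Whether the result suits a given workflow depends on the fonts and content in the PDF.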