
UnicodeDecodeError when using PythonMagick 0.9.13 and Python 3.6

Posted: 2018-07-23T08:40:55-07:00
by rcasae
Hello,

I have some imaging code that converts TIF files into PDFs. It was originally written for Python 2.7 using PythonMagick, and it worked great. Now that I am porting it to Python 3.6, I'm running into a problem with the PythonMagick Blob datatype. I have a blob that I write image data to, but whenever I try to read that data back from the blob (Blob.data), I get a Unicode error:

Code:

Traceback (most recent call last):
  File "image_experimenting.py", line 35, in <module>
    convert_page_in_tif_file_to_pdf(img, 1)
  File "image_experimenting.py", line 27, in convert_page_in_tif_file_to_pdf
    print(pdf_blob.data)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xda in position 634: invalid continuation byte
This Unicode error occurs on any attempt to access the Blob data, so I have found no way to work with the data without triggering it.
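A likely explanation, offered as an assumption rather than something confirmed in this thread: under Python 3 the binding appears to hand the blob contents back as a str, which means the raw bytes get decoded as UTF-8 first, and binary PDF data is generally not valid UTF-8. The sketch below reproduces the same kind of failure with plain Python and no PythonMagick; the byte string is a made-up stand-in for real PDF output.

```python
# Hypothetical stand-in for PDF output: 0xda starts a two-byte UTF-8
# sequence, but the next byte (0x41) is not a valid continuation byte.
raw = b"%PDF-1.3\n\xda\x41"

try:
    raw.decode("utf-8")
    decode_error = None
except UnicodeDecodeError as exc:
    decode_error = exc

# The exception records why decoding failed, where it failed, and keeps
# a reference to the original, undecoded bytes.
print(decode_error.reason)
print(decode_error.start)
print(decode_error.object == raw)
```

The point to notice is that the exception keeps the original bytes in its `object` attribute, so the data is not lost even though decoding failed.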

Has anyone else run into this issue before? Or is there a recommended alternative?
I've also included code below that can reproduce the issue. The last line of the function convert_page_in_tif_file_to_pdf, which attempts to print the blob data, throws the error.

Thanks for any help,
R

Code:

import io
import os


import PythonMagick
from PIL import Image


def convert_page_in_tif_file_to_pdf(tif_img, page_number):
    tif_page = io.BytesIO()
    tif_img.seek(page_number)
    tif_img.save(tif_page, format='PNG')
    binary_data = tif_page.getvalue()

    img_blob = PythonMagick.Blob()
    img_blob.update(binary_data)

    py_image = PythonMagick.Image()
    py_image.read(img_blob)
    py_image.magick('PDF')

    pdf_blob = PythonMagick.Blob()
    py_image.write(pdf_blob)

    print(pdf_blob.data)




if __name__ == "__main__":
    img_path = os.path.join('C:', os.path.sep, 'Dev', 'PythonDev', 'img', 'a.TIF')
    with Image.open(img_path) as img:
        convert_page_in_tif_file_to_pdf(img, 1)

Re: UnicodeDecodeError when using PythonMagick 0.9.13 and Python 3.6

Posted: 2018-07-23T11:37:09-07:00
by rcasae
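UPDATE - I was actually able to find a 'solution' to this issue. The UnicodeDecodeError exception keeps the raw bytes in its object attribute, so I catch the error and take the data from there: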

Code:

                        
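# Reading .data raises UnicodeDecodeError under Python 3;
# the exception's .object attribute holds the undecoded bytes.
try:
    pdf_blob_data = pdf_blob.data
except UnicodeDecodeError as e:
    pdf_blob_data = e.object
pdf_pages.append(io.BytesIO(pdf_blob_data))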
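The workaround above could be wrapped in a small helper so the try/except does not have to be repeated at every call site. This is only a sketch: `blob_to_bytes` and `FakeBlob` are hypothetical names, and `FakeBlob` merely imitates the behaviour reported in this thread (reading `.data` decodes the bytes as UTF-8) so the example runs without PythonMagick installed.

```python
import io


def blob_to_bytes(blob):
    """Return the contents of a blob-like object as bytes.

    Hypothetical helper. It relies on the behaviour reported in this
    thread: reading ``blob.data`` may raise UnicodeDecodeError, and the
    exception's ``object`` attribute then holds the undecoded bytes.
    """
    try:
        data = blob.data
    except UnicodeDecodeError as exc:
        return bytes(exc.object)
    # If decoding happened to succeed, the data arrives as str;
    # encoding it back to UTF-8 recovers the original bytes.
    if isinstance(data, str):
        return data.encode("utf-8")
    return bytes(data)


class FakeBlob:
    """Stand-in for PythonMagick.Blob (an assumption, for illustration)."""

    def __init__(self, raw):
        self._raw = raw

    @property
    def data(self):
        # Imitates the reported behaviour: bytes are decoded as UTF-8.
        return self._raw.decode("utf-8")


# Made-up bytes that are not valid UTF-8, like the PDF data in the traceback.
pdf_page = io.BytesIO(blob_to_bytes(FakeBlob(b"%PDF-1.3\n\xda\x41")))
```

With a real blob, the call would presumably look like `pdf_pages.append(io.BytesIO(blob_to_bytes(pdf_blob)))`. Note that this still depends on the exception carrying the complete data, which is worth verifying against a known-good PDF before relying on it.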