
Codepage for JSON output

Posted: 2018-07-05T22:28:20-07:00
by AlexRozen
Let's imagine that we have someimage.jpg with an embedded comment containing some non-Latin characters.

I am trying to use the following command:
magick.exe convert someimage.jpg someimage.json

The resulting JSON does contain the comment, but it is written in the current Windows ANSI codepage (1251 in my locale).
I am sure that the correct encoding for JSON output should be something more universal, like UTF-8.

Can that be configured via the CLI, or is it a bug?

Re: Codepage for JSON output

Posted: 2018-07-05T22:41:38-07:00
by fmw42
I cannot answer the question about JSON output. But in ImageMagick 7, one uses magick, not convert and not magick convert.

Re: Codepage for JSON output

Posted: 2018-07-06T00:25:55-07:00
by snibgo
@AlexRozen: please link to a sample image file that contains a comment with non-Latin characters.

Please also say what version of IM you use, on what platform (I guess Windows).

Re: Codepage for JSON output

Posted: 2018-07-06T05:57:34-07:00
by AlexRozen
I have checked it with ImageMagick-7.0.8-5-portable-Q16-x64
The sample image is here: https://drive.google.com/file/d/1R-bRWZ ... sp=sharing

Use the command:
magick.exe convert IMG_3010.JPG IMG_3010.JSON

The resulting JSON file contains "comment": "Надежда"

The comment is pure Cyrillic, so it can be represented in cp1251 correctly. But I can't be sure the same holds on other platforms and/or distributions.

P.S. I have checked the binary content of this JPEG file and of another DICOM image file. Both contain Cyrillic strings stored as raw cp1251 bytes.
So it seems that they were originally stored without Unicode, and ImageMagick has no chance of determining their true codepage :(
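
A quick way to confirm this kind of finding is to test whether the stored bytes are valid UTF-8. The sketch below is only an illustration: `raw` is a stand-in built from the comment text shown above, not bytes read from the actual file, and `is_valid_utf8` is a hypothetical helper name.

```python
# Stand-in for the comment bytes found inside the image file (assumption:
# the real file stores the same text, encoded as cp1251).
raw = "Надежда".encode("cp1251")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# One byte per character, all in the 0xC0-0xFF range used by cp1251 Cyrillic.
print(raw.hex(" "))
# The same bytes are rejected by a strict UTF-8 decoder.
print(is_valid_utf8(raw))
# Text that really is UTF-8 passes the check.
print(is_valid_utf8("Надежда".encode("utf-8")))
```

A failed UTF-8 check only shows that the text is in some legacy encoding; it does not say which one.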

Re: Codepage for JSON output

Posted: 2018-07-06T06:28:02-07:00
by snibgo
AlexRozen wrote: So it seems that they were originally stored without Unicode, and ImageMagick has no chance of determining their true codepage
Yes, as you say, the text is encoded as CP 1251, not UTF-8. IM can't guess which codepage was used.
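
To illustrate why guessing is not possible, and what a workaround might look like when the codepage is known in advance, here is a minimal Python sketch. It is not part of ImageMagick: `candidates` and `to_utf8` are hypothetical names, and the JSON bytes are a hand-made stand-in for the real output file.

```python
import json

# Stand-in for the comment bytes as stored in the image (assumed cp1251).
raw = "Надежда".encode("cp1251")

# Several single-byte codepages accept these bytes without any error,
# each producing different text, so the bytes alone cannot identify
# the codepage that was actually used.
candidates = {cp: raw.decode(cp) for cp in ("cp1251", "cp1252", "koi8_r")}


def to_utf8(data: bytes, source_codepage: str) -> bytes:
    """Re-encode JSON text from a known legacy codepage to UTF-8."""
    text = data.decode(source_codepage)
    json.loads(text)  # sanity check: the decoded text must still parse as JSON
    return text.encode("utf-8")


# Hand-made stand-in for the JSON written on a cp1251 Windows system.
legacy_json = b'{"comment": "' + raw + b'"}'
utf8_json = to_utf8(legacy_json, "cp1251")
```

The caller has to supply `source_codepage`; passing the wrong one still succeeds but yields mojibake, which is the same problem ImageMagick faces.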