Combining diacritics sometimes cut off using Pango


Post by CatBus »

It looks like the bounding rectangle for text sometimes isn't quite large enough to accommodate certain combining diacritics.

convert -pointsize 100 -font Mangal pango:"बहुत चुकी" example.png

You'll see that in the first word, there's a combining diacritic that gets a few pixels shaved off the bottom, while in the second word, that combining mark is placed higher and remains intact.

I'm not sure if this is really a bug in ImageMagick, or just a limitation of the Pango integration, but I've seen this problem on Windows and Linux, in IM6 and IM7, in Arabic, Thai, and Hindi, using Arial, Tahoma, and Mangal, respectively. Technically, in Unicode, you can just stack combining diacritics over & over to levels never actually used in any language, so a reasonable cutoff for the bounding rectangle is still probably desirable--the current one is just a little small for normal words in these languages.

There is also an easy workaround. If you insert a blank line before & after the text you render, all diacritics are displayed intact. I'm certainly happy doing that if that's the only way to prevent this issue.
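For anyone who wants to script that workaround, here is a minimal sketch. It assumes a POSIX shell and the same font and point size as the command above; `pad_text` and `padded.txt` are hypothetical names made up for this example, and the render step is skipped if `convert` is not on the PATH.

```shell
#!/bin/sh
# Hypothetical helper: wrap the text in one blank line before and one after,
# which is the workaround described above.
pad_text() {
    printf '\n%s\n\n' "$1"
}

# Write the padded text to a file (file name is an assumption for this sketch).
pad_text "बहुत चुकी" > padded.txt

# Render only if ImageMagick is installed; report a failure instead of aborting,
# since the Pango delegate or the Mangal font may be missing on this machine.
if command -v convert >/dev/null 2>&1; then
    convert -pointsize 100 -font Mangal pango:@padded.txt example_padded.png \
        || echo "render failed (is the Pango delegate installed?)" >&2
fi
```

`printf` is used rather than `echo` so the leading and trailing newlines come out the same way in every shell. The resulting image is taller than the unpadded one because of the two extra blank lines.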

Post by fmw42 »

We cannot see your characters. Are you sure you are actually giving it UTF-8 compatible characters? Can you put your text into a UTF-8 compatible text file and run the following?

convert -pointsize 100 -font Mangal pango:"@testfile.txt" example.png
Does the result work any better?
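
One possible way to confirm that the text file really is valid UTF-8 before rendering is sketched below. It assumes `iconv` is available (it is on most Linux and macOS systems); `is_utf8` is a hypothetical helper name, and the first line only creates a sample `testfile.txt` so the snippet runs on its own.

```shell
#!/bin/sh
# Create a sample input file so this sketch is self-contained.
printf '%s\n' "बहुत चुकी" > testfile.txt

# Hypothetical helper: succeeds only if the file decodes cleanly as UTF-8.
# iconv exits with a non-zero status when it meets an invalid byte sequence.
is_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

if is_utf8 testfile.txt; then
    echo "testfile.txt is valid UTF-8"
else
    echo "testfile.txt is NOT valid UTF-8" >&2
fi
```

This only checks the encoding of the bytes. It says nothing about whether the chosen font has glyphs for the script.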

Post by CatBus »

No, it's the same inline and from a text file, and yes, it's UTF-8. You'd need a Hindi font to see the characters. Mangal is a pretty common one you can get with any recent version of MS Office.

I chose the Hindi example because it was the most apparent. I could be staring right at the Thai and Arabic examples and not see the problem unless I lined up the two images in Photoshop. All of them are what I'd call minor--the text is still perfectly readable, there's just an odd edge on the character where there shouldn't be.