Annotate fails for unicode chars from broader set

magiotrope
Posts: 11
Joined: 2012-03-19T06:05:50-07:00

Post by magiotrope »

I tried Times New Roman and other fonts. Regardless, characters such as '∥', '⊞', '≅', and many others, including some composite (combining) characters from various alphabets, are rendered as question marks by Annotate.

(This is the output of the code below: http://www.flight.us/misc/myannot.jpg and this is a snapshot of the same text displayed in OpenOffice Writer on the same system with the same font: http://www.flight.us/misc/oowriter_unicode_render.jpg )

Code: Select all

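use Image::Magick;

# Create a small white canvas to draw on.
my $bi = Image::Magick->new(size => '120x60');
$bi->Read('xc:white');
my $text = 'A ∥  ⊞ ≅ B';
# $timesfont is set elsewhere to the Times New Roman font.
# The encoding must be a quoted string; the bare {UTF-8} was not valid.
$bi->Annotate(font=>$timesfont, pointsize=>20, fill=>'black', text=>$text,
              encoding=>'UTF-8', geometry=>'+20+40'
    );

$bi->Write( "pix/myannot.jpg" );

Any ideas how to get PerlMagick to render these correctly?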

More common symbols and letters are rendered fine.
anthony
Posts: 8883
Joined: 2004-05-31T19:27:03-07:00
Location: Brisbane, Australia

Re: Annotate fails for unicode chars from broader set

Post by anthony »

ImageMagick will handle Unicode just fine.

The text must be in UTF-8, but that is pretty much the implied standard. You can use a converter like "iconv" or "recode" to convert other text encodings (like UTF-16, ISO8859-1, or gb2312) to UTF-8.
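As a rough sketch of that conversion step, the commands below turn a small ISO8859-1 sample into UTF-8 with iconv and then dump the bytes to confirm the result. The file names latin1.txt and utf8.txt are made up for illustration, and the sample word is just an assumption chosen because it has one non-ASCII letter.

```shell
# Sample input: "café" encoded as ISO-8859-1 (octal 351 = 0xE9 = "é").
# latin1.txt and utf8.txt are hypothetical file names.
printf 'caf\351\n' > latin1.txt

# Convert to UTF-8 so the text can be handed to ImageMagick.
iconv -f ISO-8859-1 -t UTF-8 latin1.txt > utf8.txt

# Dump the bytes in hex: the single byte e9 should now be the pair c3 a9.
od -An -tx1 utf8.txt
```

If the source encoding is something else, only the -f argument should need to change; `iconv -l` lists the names your system accepts.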

What it does need, however, is a font that contains the Unicode characters.
Most fonts are incomplete, with only 'common' Unicode characters defined.
And almost none have the very large set of Asian characters!

I do not think Times has all those symbols, though the Microsoft "mincho" font generally does.

You can test with the command line...

Code: Select all

   convert -font Mincho -pointsize 36  label:"A ∥  ⊞ ≅ B"  mincho_test.png

No problems.

Code: Select all

   convert -font Times -pointsize 36  label:"A ∥  ⊞ ≅ B"  times_test.png

I get an image containing "A ? ? ? B", so obviously the 'glyphs' are not defined in that font! I also get the same thing from an "Arial" font. Other fonts often use dotted boxes for unknown characters to better differentiate them from question marks.
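One way to check glyph coverage without rendering anything is to ask fontconfig which installed fonts cover a given code point. This is only a sketch and rests on assumptions not stated in this thread: that the fontconfig tool fc-list is installed, and that the fonts you care about are registered with it. The helper name charset_pattern is made up for this example.

```shell
# Code points from the test string: U+2225 (∥), U+229E (⊞), U+2245 (≅).
codepoints="2225 229e 2245"

# Hypothetical helper: build a fontconfig pattern that matches
# fonts whose character set includes the given hex code point.
charset_pattern() {
    printf ':charset=%s' "$1"
}

if command -v fc-list >/dev/null 2>&1; then
    for cp in $codepoints; do
        echo "Fonts covering U+$cp:"
        # Print only the family names, de-duplicated.
        fc-list "$(charset_pattern "$cp")" family | sort -u
    done
else
    echo "fc-list not found; install fontconfig to run this check" >&2
fi
```

If Times New Roman does not show up in a list, any program that still displays that character while "using" Times New Roman is most likely falling back to another font for it.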

See IM Examples, Text to Image Handling, Unicode
http://www.imagemagick.org/Usage/text/#unicode

For adding new fonts to IM for easy use, see...
http://www.imagemagick.org/Usage/#font
But you can also specify the TTF font file directly.

Code: Select all

   -font /path/to/fonts/mincho.ttf
Anthony Thyssen -- Webmaster for ImageMagick Example Pages
https://imagemagick.org/Usage/
magiotrope
Posts: 11
Joined: 2012-03-19T06:05:50-07:00

Re: Annotate fails for unicode chars from broader set

Post by magiotrope »

"ms-mincho.ttf" works for me as well. I was certain the problem was not a shortcoming of the font because, as I said originally, I see these characters rendered perfectly in OpenOffice Writer when I select the Times New Roman font. Then, just to be sure, I moved the cursor through every one of these characters, and Writer confirmed that Times New Roman was selected. (Could it be doing some sneaky behind-the-scenes substitution?)