
Annotate fails for unicode chars from broader set

Posted: 2012-03-19T10:17:25-07:00
by magiotrope
I tried Times New Roman and other fonts. Regardless, characters such as '∥', '⊞', '≅', and many others, including some composite (combining) characters from various alphabets, are rendered as question marks by Annotate.

(This is the output of the code below: http://www.flight.us/misc/myannot.jpg and this is a snapshot of the same text displayed in OpenOffice Writer on the same system, with the same font: http://www.flight.us/misc/oowriter_unicode_render.jpg )

Code: Select all

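use Image::Magick;

# $timesfont is set elsewhere in the script to the Times New Roman font
my $bi = Image::Magick->new(size => '120x60');
$bi->Read('xc:white');
my $text = 'A ∥  ⊞ ≅ B';
# encoding expects a plain string, not the hash reference that {UTF-8} creates
$bi->Annotate(font => $timesfont, pointsize => 20, fill => 'black', text => $text,
              encoding => 'UTF-8', geometry => '+20+40');

$bi->Write('pix/myannot.jpg');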
Any ideas how to get PerlMagick to render these correctly?

More common symbols and letters are rendered fine.

Re: Annotate fails for unicode chars from broader set

Posted: 2012-03-19T17:50:03-07:00
by anthony
ImageMagick will handle Unicode just fine.

The text must be in UTF-8, but that is pretty much the implied standard. You can use a converter like "iconv" or "recode" to convert other text encodings (such as UTF-16, ISO-8859-1, or GB2312) to UTF-8.
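
As a rough sketch of that conversion step, the snippet below pipes a short ISO-8859-1 string through "iconv" and prints the resulting UTF-8 bytes as hex. The sample string and the helper name latin1_to_utf8 are made up for illustration; only the iconv options -f and -t are the real interface.

```shell
# Hypothetical helper: convert ISO-8859-1 text on stdin to UTF-8 on stdout.
latin1_to_utf8() {
    iconv -f ISO-8859-1 -t UTF-8
}

# 'caf\351' is "café" in ISO-8859-1 (octal 351 = 0xE9 = é).
# od prints the converted bytes as hex; tr strips the spaces and newline.
utf8_hex=$(printf 'caf\351' | latin1_to_utf8 | od -An -tx1 | tr -d ' \n')

# The single byte e9 becomes the two-byte UTF-8 sequence c3 a9.
echo "$utf8_hex"
```

The converted text can then be handed to ImageMagick, for example by piping it into a command that uses label:@- so the label text is read from standard input.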

What it does need, however, is a font that contains the Unicode characters.
Most fonts are incomplete, with only 'common' Unicode characters defined.
And almost none has the very large set of Asian characters!

I do not think Times has all those symbols, though the Microsoft "mincho" font generally does.

You can test with the command line...

Code: Select all

   convert -font Mincho -pointsize 36  label:"A ∥  ⊞ ≅ B"  mincho_test.png
No problems.

Code: Select all

   convert -font Times -pointsize 36  label:"A ∥  ⊞ ≅ B"  times_test.png
I get an image containing "A ? ? ? B", so obviously the 'glyphs' are not defined in that font! I also get the same thing from an "Arial" font. Other fonts often use dotted boxes for unknown characters to better differentiate them from question marks.
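
If fontconfig happens to be installed, one possible way to see which fonts cover a given character before rendering is to query "fc-list" by code point. This is only a sketch: fc-list belongs to fontconfig, not ImageMagick, so the fonts it reports are not necessarily the ones ImageMagick has configured, and the helper name fonts_with_glyph is made up for illustration. The code points are those of the three symbols above (∥ U+2225, ⊞ U+229E, ≅ U+2245).

```shell
# Code points (hex) of the three symbols from the thread: ∥ ⊞ ≅
codepoints="2225 229e 2245"

# Hypothetical helper: list font families whose coverage includes code point $1.
fonts_with_glyph() {
    fc-list ":charset=$1" family | sort -u
}

# Only run the query when fontconfig's fc-list is actually available.
if command -v fc-list >/dev/null 2>&1; then
    for cp in $codepoints; do
        echo "U+$cp:"
        fonts_with_glyph "$cp"
    done
else
    echo "fc-list not found; install fontconfig to run this check"
fi
```

A font that is missing from any of the three lists would be expected to show a placeholder for that symbol.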

See IM Examples, Text to Image Handling, Unicode
http://www.imagemagick.org/Usage/text/#unicode

For adding new fonts to IM for easy use see...
http://www.imagemagick.org/Usage/#font
But you can specify the TTF font file directly too.

Code: Select all

   -font /path/to/fonts/mincho.ttf

Re: Annotate fails for unicode chars from broader set

Posted: 2012-03-20T00:51:36-07:00
by magiotrope
"ms-mincho.ttf" works for me as well. I was certain the font was not at fault because, as I said originally, these characters render perfectly in OpenOffice Writer when I select the Times New Roman font. Then, just to be sure, I moved the cursor through every one of these characters, and Writer confirmed that Times New Roman was selected. (Could it be doing some sneaky behind-the-scenes substitution?)