caption word wrap problem in Chinese (UTF-8) with whitespace

lne1030
Posts: 7
Joined: 2011-08-05T23:59:35-07:00

caption word wrap problem in Chinese (UTF-8) with whitespace

Post by lne1030 »

viewtopic.php?f=3&t=14179&start=0
That topic discussed Chinese strings without whitespace.

Now I need to generate an image with this command:
convert -background black -fill white -font wts11.ttf -pointsize 20 -size 100x -encoding utf8 caption:"一二三四五六七八九十 一二三四五六七八九十" caption.gif


The Chinese string is split by a whitespace. The characters before the whitespace are not wrapped, while the characters after the whitespace are wrapped correctly.

If I run another command:
convert -background black -fill white -font wts11.ttf -pointsize 20 -size 100x -encoding utf8 caption:"一二三四五六七八九十 一二三四五六七八九十" caption.gif
Here the whitespace is replaced by a full-width (SBC case) space.

Then caption.gif contains garbled characters such as "???".


The Chinese font can be downloaded from http://cle.linux.org.tw/fonts/wangfonts/wts11.ttf
My ImageMagick version is 6.7.0-7 2011-07-04 Q16, on Mac OS X Snow Leopard.
magick
Site Admin
Posts: 11064
Joined: 2003-05-31T11:32:55-07:00

Re: caption word wrap problem in Chinese (UTF-8) with whites

Post by magick »

We can reproduce the problem you reported and will get a patch in ImageMagick 6.7.1-3 Beta within a few days. Thanks.
lne1030
Posts: 7
Joined: 2011-08-05T23:59:35-07:00

Re: caption word wrap problem in Chinese (UTF-8) with whites

Post by lne1030 »

magick wrote: We can reproduce the problem you reported and will get a patch in ImageMagick 6.7.1-3 Beta within a few days. Thanks.
How and when can I get the patch?

I think we need much more testing to avoid this kind of bug... :lol:
anthony
Posts: 8883
Joined: 2004-05-31T19:27:03-07:00
Location: Brisbane, Australia

Re: caption word wrap problem in Chinese (UTF-8) with whites

Post by anthony »

The patch is in the very latest IM, which you can download as the specified beta release from the IM web site, or from the SVN branch for version 6.7.1 (the main trunk is being used for the new IM v7 development).
Anthony Thyssen -- Webmaster for ImageMagick Example Pages
https://imagemagick.org/Usage/
lne1030
Posts: 7
Joined: 2011-08-05T23:59:35-07:00

Re: caption word wrap problem in Chinese (UTF-8) with whites

Post by lne1030 »

anthony wrote: The patch is in the very latest IM, which you can download as the specified beta release from the IM web site, or from the SVN branch for version 6.7.1 (the main trunk is being used for the new IM v7 development).
Do you mean downloading the newest source code, ImageMagick-6.7.1-3.zip, from the mirrors?

I didn't find the SVN URL on the IM web site...
anthony
Posts: 8883
Joined: 2004-05-31T19:27:03-07:00
Location: Brisbane, Australia

Re: caption word wrap problem in Chinese (UTF-8) with whites

Post by anthony »

The Subversion link is in the first line (embedded in the text) of the Download page.
http://imagemagick.org/script/download.php
Specifically...
http://imagemagick.org/script/subversion.php

mkdir IM; cd IM;
svn co https://magick.imagemagick.org/subversion/ImageMagick/branches/ImageMagick-6.7.1/ .
Anyone can download, but only developers can upload.


NOTE: the main trunk is being used for IMv7 development, which is at the alpha stage. Many problems will be present until the core work is complete and the command-line interface (and the examples for IMv7) are updated. It is not recommended for non-developers at this time.
lne1030
Posts: 7
Joined: 2011-08-05T23:59:35-07:00

Re: caption word wrap problem in Chinese (UTF-8) with whites

Post by lne1030 »

convert -background black -fill white -font wts11.ttf -pointsize 20 -size 100x -encoding utf8 caption:"一二三四五六七八九十 一二三四五六七八九十" caption.gif
When I run this command, garbled characters appear in the image...

I think this is another bug.
fmw42
Posts: 25562
Joined: 2007-07-02T17:14:51-07:00
Location: Sunnyvale, California, USA

Re: caption word wrap problem in Chinese (UTF-8) with whites

Post by fmw42 »

You did not say which version of IM your last test used. IM 6.7.1.7 is available at http://www.imagemagick.org/script/binary-releases.php or http://www.imagemagick.org/download/www ... ource.html
magick
Site Admin
Posts: 11064
Joined: 2003-05-31T11:32:55-07:00

Re: caption word wrap problem in Chinese (UTF-8) with whites

Post by magick »

We can reproduce the problem you reported. We'll try to get a patch in ImageMagick 6.7.1-8 Beta within a day or two. Thanks.
lne1030
Posts: 7
Joined: 2011-08-05T23:59:35-07:00

Re: caption word wrap problem in Chinese (UTF-8) with whites

Post by lne1030 »

magick wrote:We can reproduce the problem you reported. We'll try to get a patch in ImageMagick 6.7.1-8 Beta within a day or two. Thanks.
I updated to the newest code from SVN, and that bug has disappeared.

Good job.

But there is another bug:
convert -background black -fill white -font wts11.ttf -pointsize 20 -size 200x -encoding utf8 caption:"一二三四五六 一二三四五" caption.gif
With this command, what I want is just a whitespace, not a line break.

In Chinese, "一二三四五六" is six words, not one word.
magick
Site Admin
Posts: 11064
Joined: 2003-05-31T11:32:55-07:00

Re: caption word wrap problem in Chinese (UTF-8) with whites

Post by magick »

Caption breaks the text if it can't fit in the allocated space. It tries to break on whitespace; otherwise it breaks between two Unicode characters. If you do not want a break, use label: instead of caption:.
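
The rule described in this reply (prefer a break at whitespace, otherwise break between two characters) can be modelled roughly as follows. This is only an illustrative sketch, not ImageMagick's actual code: the function names are hypothetical, widths are counted in abstract cells (2 for a wide CJK character, 1 otherwise) rather than measured from a font, and runs of whitespace are collapsed to a single space.

```python
import unicodedata


def char_width(ch):
    # Assumed width model: wide/fullwidth characters take 2 cells, others 1.
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1


def text_width(text):
    return sum(char_width(ch) for ch in text)


def wrap_prefer_whitespace(text, max_width):
    """Greedy wrap: break at whitespace if possible, else between characters."""
    lines, line = [], ""
    for word in text.split():  # split() also splits on U+3000
        candidate = f"{line} {word}" if line else word
        if text_width(candidate) <= max_width:
            line = candidate
            continue
        if line:
            # The word does not fit: the whitespace becomes the break point.
            lines.append(line)
            line = ""
        # The word starts a fresh line; split it between characters if needed.
        for ch in word:
            if line and text_width(line + ch) > max_width:
                lines.append(line)
                line = ""
            line += ch
    if line:
        lines.append(line)
    return lines


print(wrap_prefer_whitespace("一二 三四", 8))
```

With a budget of 8 cells (standing in for four 20-point characters in an 80-pixel column), "一二三四" stays on one line, while "一二 三四" is split at the space into "一二" and "三四".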
lne1030
Posts: 7
Joined: 2011-08-05T23:59:35-07:00
Authentication code: 8675308

Re: caption word wrap problem in Chinese (UTF-8) with whites

Post by lne1030 »

I'm sorry, I didn't explain my question clearly.

convert -background black -fill white \
			-font wts11.ttf \
			-pointsize 20 -size 80x \
			-encoding utf8 \
			caption:"一二三四" \
			caption.gif
With this command, the characters "一二三四" are on one line. That is fine.

convert -background black -fill white \
			-font wts11.ttf \
			-pointsize 20 -size 80x \
			-encoding utf8 \
			caption:"一二 三四" \
			caption.gif
This command has a whitespace between "一二" and "三四". The result is "一二 " on the first line and "三四" on the second line.

But by Chinese convention, the result should be "一二 三" on the first line and "四" on the second line. Chinese does not need to break a line at a whitespace; it only breaks between two Chinese characters.

In English, a word is a run of characters between two whitespaces or other symbols. In Chinese, one character is one word, so there is no need to break the line at a whitespace.

http://en.wikipedia.org/wiki/Space_%28p ... ween_words
There is a Wikipedia article about this.

I'm sorry that I don't know C++; otherwise I could try to submit a patch.
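
The behaviour requested in this post, where a line may break between any two characters and a space is just another character, could be sketched like this. It is a simplified illustration under stated assumptions, not a proposed ImageMagick patch: the function names are hypothetical, widths are abstract cells (2 for a wide CJK character, 1 otherwise), and real CJK line breaking has further rules (for example, some punctuation may not start a line) that are ignored here.

```python
import unicodedata


def char_width(ch):
    # Assumed width model: wide/fullwidth characters take 2 cells, others 1.
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1


def wrap_between_characters(text, max_width):
    """Greedy wrap that may break between any two characters."""
    lines, line, width = [], "", 0
    for ch in text:
        w = char_width(ch)
        if line and width + w > max_width:
            # Line is full: break right here, whatever the character is.
            lines.append(line.rstrip())
            line, width = "", 0
        if not line and ch.isspace():
            continue  # do not start a line with whitespace
        line += ch
        width += w
    if line:
        lines.append(line.rstrip())
    return lines


print(wrap_between_characters("一二 三四", 8))
```

With a budget of 8 cells, "一二 三四" becomes "一二 三" followed by "四", which is the layout described above.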
magick
Site Admin
Posts: 11064
Joined: 2003-05-31T11:32:55-07:00

Re: caption word wrap problem in Chinese (UTF-8) with whites

Post by magick »

In your original example, the whitespace is an ideographic space (dec 12288). In your last example, it is an ordinary space (dec 32). If you modify your last example to use the ideographic space as the whitespace character, you should get the correct rendering. Otherwise we'll need to add an option to the ImageMagick command line, something like +break-on-whitespace or -language chinese. With ImageMagick, we're only interested in rudimentary text handling. For proper handling of text in any language, you might want to switch to Pango.
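
To tell the two space characters mentioned in this reply apart, it can help to print their code points and swap one for the other before building the caption string. A small sketch; the helper names are hypothetical and nothing here calls ImageMagick itself.

```python
IDEOGRAPHIC_SPACE = "\u3000"  # decimal 12288, UTF-8 bytes e3 80 80
ASCII_SPACE = " "             # decimal 32


def describe_spaces(text):
    """Return (index, code point) for every whitespace character in text."""
    return [(i, ord(ch)) for i, ch in enumerate(text) if ch.isspace()]


def to_ideographic_spaces(text):
    """Replace ordinary spaces with ideographic spaces."""
    return text.replace(ASCII_SPACE, IDEOGRAPHIC_SPACE)


caption = "一二 三四"
print(describe_spaces(caption))
print(describe_spaces(to_ideographic_spaces(caption)))
```

The converted string can then be passed to caption: in place of the original one, as suggested above.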
lne1030
Posts: 7
Joined: 2011-08-05T23:59:35-07:00
Authentication code: 8675308

Re: caption word wrap problem in Chinese (UTF-8) with whites

Post by lne1030 »

Thank you, I will try Pango.

http://www.w3.org/International/article ... #Slide0090
There is a W3C article about CJK line breaking.