caption word wrap problem in Chinese (UTF-8) with whitespace

Posted: 2011-08-06T02:01:01-07:00
by lne1030
viewtopic.php?f=3&t=14179&start=0
That topic discussed wrapping Chinese strings that contain no whitespace.

Now I need to generate an image with this command:
convert -background black -fill white -font wts11.ttf -pointsize 20 -size 100x -encoding utf8 caption:"一二三四五六七八九十 一二三四五六七八九十" caption.gif


The Chinese string is split by a single space. The characters before the space are not wrapped, while the characters after the space are wrapped correctly.

If I run another command:
convert -background black -fill white -font wts11.ttf -pointsize 20 -size 100x -encoding utf8 caption:"一二三四五六七八九十　一二三四五六七八九十" caption.gif
Here the space is replaced by a full-width (SBC case) space.

Then caption.gif shows garbage characters such as "???".


The Chinese font can be downloaded from http://cle.linux.org.tw/fonts/wangfonts/wts11.ttf
My ImageMagick version is 6.7.0-7 2011-07-04 Q16, on Mac OS X Snow Leopard.

Re: caption word wrap problem in Chinese (UTF-8) with whites

Posted: 2011-08-06T08:01:15-07:00
by magick
We can reproduce the problem you reported and will get a patch in ImageMagick 6.7.1-3 Beta within a few days. Thanks.

Re: caption word wrap problem in Chinese (UTF-8) with whites

Posted: 2011-08-07T23:20:59-07:00
by lne1030
magick wrote:We can reproduce the problem you reported and will get a patch in ImageMagick 6.7.1-3 Beta within a few days. Thanks.
How and when can I get the patch?

I think we need much more testing to avoid this kind of bug... :lol:

Re: caption word wrap problem in Chinese (UTF-8) with whites

Posted: 2011-08-07T23:57:02-07:00
by anthony
The patch is in the very latest IM, which you can download either as the specified beta release from the IM web site downloads, or from the SVN branch for version 6.7.1 (the main trunk is being used for the new IM v7 development).

Re: caption word wrap problem in Chinese (UTF-8) with whites

Posted: 2011-08-08T17:21:39-07:00
by lne1030
anthony wrote:The patch is in the very latest IM, which you can download either as the specified beta release from the IM web site downloads, or from the SVN branch for version 6.7.1 (the main trunk is being used for the new IM v7 development).
Do you mean downloading the newest source code, ImageMagick-6.7.1-3.zip, from the mirrors?

I didn't find the SVN URL on the IM web site...

Re: caption word wrap problem in Chinese (UTF-8) with whites

Posted: 2011-08-08T17:31:39-07:00
by anthony
Subversion is the first line (embedded in the text) on the Download page.
http://imagemagick.org/script/download.php
Specifically...
http://imagemagick.org/script/subversion.php

Code:

mkdir IM; cd IM;
svn co https://magick.imagemagick.org/subversion/ImageMagick/branches/ImageMagick-6.7.1/ .
Anyone can download, but only developers can upload.


NOTE: the main trunk is being used for IMv7 development, which is still at the alpha stage. Many problems will be present until the core work is complete and the command line interface (and the examples for IMv7) are updated. It is not recommended for non-developers at this time.

Re: caption word wrap problem in Chinese (UTF-8) with whites

Posted: 2011-08-18T18:03:33-07:00
by lne1030

Code:

convert -background black -fill white -font wts11.ttf -pointsize 20 -size 100x -encoding utf8 caption:"一二三四五六七八九十　一二三四五六七八九十" caption.gif
When I run this command, garbage characters appear in the image...


I think this is another bug...

Re: caption word wrap problem in Chinese (UTF-8) with whites

Posted: 2011-08-18T18:24:36-07:00
by fmw42
You did not say which version of IM your last test was using. IM 6.7.1.7 is available at http://www.imagemagick.org/script/binary-releases.php or http://www.imagemagick.org/download/www ... ource.html

Re: caption word wrap problem in Chinese (UTF-8) with whites

Posted: 2011-08-18T18:29:18-07:00
by magick
We can reproduce the problem you reported. We'll try to get a patch in ImageMagick 6.7.1-8 Beta within a day or two. Thanks.

Re: caption word wrap problem in Chinese (UTF-8) with whites

Posted: 2011-08-19T17:42:42-07:00
by lne1030
magick wrote:We can reproduce the problem you reported. We'll try to get a patch in ImageMagick 6.7.1-8 Beta within a day or two. Thanks.
I updated to the newest code from SVN, and that bug has disappeared.

Good job.

But there is another bug:
convert -background black -fill white -font wts11.ttf -pointsize 20 -size 200x -encoding utf8 caption:"一二三四五六 一二三四五" caption.gif
With this command, all I want at that position is a space, not a line break.

In Chinese, "一二三四五六" is six words, not one word.

Re: caption word wrap problem in Chinese (UTF-8) with whites

Posted: 2011-08-20T06:51:37-07:00
by magick
Caption breaks the text if it can't fit in the allocated space. It tries to break on whitespace; otherwise it breaks between two Unicode characters. If you do not want a break, use label: instead of caption:.
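
The breaking rule described above can be sketched in a few lines of Python. This is only an illustration of the stated policy (prefer a break on whitespace, otherwise break between two characters), not ImageMagick's actual code. The function name wrap_whitespace_first is hypothetical, and the sketch assumes every character is one unit wide, whereas caption: works with rendered widths in pixels.

```python
from __future__ import annotations


def wrap_whitespace_first(text: str, max_chars: int) -> list[str]:
    """Greedy wrap: break at the last whitespace on a full line,
    otherwise break between two characters.

    Assumption: every character counts as one unit of width.
    """
    lines: list[str] = []
    line = ""
    for ch in text:
        if len(line) < max_chars:
            line += ch                    # still room on the current line
            continue
        if ch.isspace():
            lines.append(line)            # break here and drop the whitespace
            line = ""
            continue
        # Index of the last whitespace on the full line, or -1 if there is none.
        cut = max((i for i, c in enumerate(line) if c.isspace()), default=-1)
        if cut >= 0:
            lines.append(line[:cut])      # break at that whitespace
            line = line[cut + 1:] + ch    # carry the remainder to the next line
        else:
            lines.append(line)            # no whitespace: break between characters
            line = ch
    if line:
        lines.append(line)
    return lines


print(wrap_whitespace_first("一二三四五", 4))  # no whitespace available
print(wrap_whitespace_first("一二 三四", 4))   # whitespace available
```

With a width of four characters, the first call breaks between "四" and "五", while the second breaks at the space and yields "一二" and "三四".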

Re: caption word wrap problem in Chinese (UTF-8) with whites

Posted: 2011-08-20T07:39:38-07:00
by lne1030
I'm sorry, I didn't explain my question clearly.

Code:

convert -background black -fill white \
    -font wts11.ttf \
    -pointsize 20 -size 80x \
    -encoding utf8 \
    caption:"一二三四" \
    caption.gif
With this command, the characters "一二三四" are on one line. That is no problem.

Code:

convert -background black -fill white \
    -font wts11.ttf \
    -pointsize 20 -size 80x \
    -encoding utf8 \
    caption:"一二 三四" \
    caption.gif
With this command there is a space between "一二" and "三四". The result is "一二 " on the first line and "三四" on the second line.

But by Chinese convention, the result should be "一二 三" on the first line and "四" on the second line. Chinese does not need a line break at whitespace; lines break only between two Chinese characters.

In English, a word is a run of characters between two whitespaces or other symbols. In Chinese, one character is one word, so there is no need to break the line at whitespace.
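
The convention described here, where a line may break between any two Chinese characters and a space is just another character on the line, can be sketched as follows. The function name wrap_per_character is hypothetical and the width is counted in characters rather than pixels. Real CJK line breaking has further rules that this sketch ignores; for example, the Unicode Line Breaking Algorithm (UAX #14) does not allow some punctuation at the start of a line.

```python
from __future__ import annotations


def wrap_per_character(text: str, max_chars: int) -> list[str]:
    """Fill each line completely and break between any two characters.

    Assumption: every character, including a space, is one unit wide
    and a break is allowed at every position.
    """
    if max_chars < 1:
        raise ValueError("max_chars must be at least 1")
    # Cut the text into consecutive slices of at most max_chars characters.
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]


print(wrap_per_character("一二 三四", 4))
```

For a width of four characters this gives "一二 三" on the first line and "四" on the second, which is the result asked for above.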

http://en.wikipedia.org/wiki/Space_%28p ... ween_words
There is a Wikipedia article about this.

I'm sorry that I don't know C++; otherwise I could try to submit a patch.

Re: caption word wrap problem in Chinese (UTF-8) with whites

Posted: 2011-08-20T09:25:44-07:00
by magick
In your original example, the whitespace is ideographic space (dec 12288). In your last example, it is a traditional space (dec 32). If you modify your last example to use the ideographic space for the whitespace character you should get the correct rendering. Otherwise we'll need to add an option to the ImageMagick command line, something like +break-on-whitespace or -language chinese. With ImageMagick, we're only interested in rudimentary text handling. For proper handling of text in any language, you might want to switch to Pango.
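
The two space characters mentioned above look alike on screen but are different code points, which is easy to check. The sketch below uses only Python's standard library, and the variable names are illustrative. It prints the decimal code point and Unicode name of each space, then swaps the ordinary space in the text of the last example for the ideographic one, as suggested above.

```python
import unicodedata

ASCII_SPACE = " "             # U+0020, decimal 32
IDEOGRAPHIC_SPACE = "\u3000"  # U+3000, decimal 12288

# Show the decimal code point and the official Unicode name of each space.
for ch in (ASCII_SPACE, IDEOGRAPHIC_SPACE):
    print(ord(ch), unicodedata.name(ch))

# Replace the ordinary space with the ideographic space.
text = "一二 三四"
fixed = text.replace(ASCII_SPACE, IDEOGRAPHIC_SPACE)
print([ord(ch) for ch in fixed])
```

The string in fixed can then be used as the caption: text in place of the original.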

Re: caption word wrap problem in Chinese (UTF-8) with whites

Posted: 2011-08-20T17:47:01-07:00
by lne1030
Thank you, I will try Pango.

http://www.w3.org/International/article ... #Slide0090
There is a W3C article about CJK line breaking.