Unicode BOM header

Post any defects you find in the released or beta versions of the ImageMagick software here. Include the ImageMagick version, OS, and any command-line required to reproduce the problem. Got a patch for a bug? Post it here.
snibgo
Posts: 12159
Joined: 2010-01-23T23:01:33-07:00
Authentication code: 1151
Location: England, UK

Unicode BOM header

Post by snibgo »

Windows 7, IM 6.5.8-8:

convert -size 1920x1080 -gravity South ^
-background None ^
-fill blue ^
-font Verdana -pointsize 100 ^
caption:@captCopy.txt ^
captCopy.png

captCopy.txt is encoded in UTF-8:
{ef}{bb}{bf}Copyright {c2}{a9} 2010 Alan Gibson\r\n

Here {ef} denotes the single byte whose hex value is EF, and so on.

The first three bytes are the UTF-8 byte order mark (BOM), commonly added by Windows software to indicate UTF-8 encoding.

ImageMagick correctly interprets {c2}{a9} as the copyright symbol, but writes a question mark for {ef}{bb}{bf}. Could it please ignore the BOM?
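
For anyone who wants to check the bytes for themselves, here is a small Python sketch (Python is my choice for illustration, it is not part of IM). It rebuilds the file contents listed above and shows that a plain UTF-8 decode keeps the BOM as the character U+FEFF, while Python's "utf-8-sig" codec drops a leading BOM. The variable names are made up for the example.

```python
import codecs

# The file contents described above, byte for byte.
raw = b"\xef\xbb\xbfCopyright \xc2\xa9 2010 Alan Gibson\r\n"

# codecs.BOM_UTF8 is the three-byte sequence EF BB BF.
has_bom = raw.startswith(codecs.BOM_UTF8)

# A plain UTF-8 decode keeps the BOM as the first character (U+FEFF).
plain = raw.decode("utf-8")

# The "utf-8-sig" codec drops a leading BOM if one is present.
stripped = raw.decode("utf-8-sig")

print(has_bom)
print(hex(ord(plain[0])))
print(stripped.startswith("Copyright"))
```

A renderer that treats U+FEFF as an ordinary character will try to draw a glyph for it, which would be consistent with the question mark seen here.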
snibgo's IM pages: im.snibgo.com
fmw42
Posts: 25562
Joined: 2007-07-02T17:14:51-07:00
Authentication code: 1152
Location: Sunnyvale, California, USA

Re: Unicode BOM header

Post by fmw42 »

In the meantime, can you not use a Windows editor that skips the BOM? It appears that Notepad can do that. See viewtopic.php?f=1&t=15326&p=54104&hilit=BOM#p54104
snibgo
Posts: 12159
Joined: 2010-01-23T23:01:33-07:00
Authentication code: 1151
Location: England, UK

Re: Unicode BOM header

Post by snibgo »

That is "Notepad++", which is a different program.

There are lots of workarounds. For this purpose, I use a program that strips out any BOMs from a file. But it is annoying; the correct behaviour for IM, I feel, is to act like any text editor and not to display the "?" glyph or any other glyph for a BOM.
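
For illustration, a workaround of that kind can be sketched in a few lines of Python. This is not the program I use, only a minimal sketch that assumes the file is UTF-8 and removes just a single leading BOM. The function names and the output file name below are hypothetical.

```python
import codecs
from pathlib import Path


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if it has one."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def strip_bom_from_file(src: str, dst: str) -> None:
    """Copy src to dst, dropping a leading UTF-8 BOM. All other bytes are untouched."""
    Path(dst).write_bytes(strip_utf8_bom(Path(src).read_bytes()))
```

Calling strip_bom_from_file("captCopy.txt", "captCopyNoBom.txt") and then passing caption:@captCopyNoBom.txt to convert should avoid the stray glyph. Working on raw bytes leaves the rest of the text, including the \r\n line ending, exactly as it was.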
snibgo's IM pages: im.snibgo.com