Failed to open unicode filenames + windows

nicolas1

Post by nicolas1 »

Hello,

It is impossible to open Unicode (UTF-8) filenames in IM on Windows XP (IM 6.3.7, Q8).

Filenames are converted manually from UTF-16 to UTF-8, and then Magick::Image::ping(utf8_filename) is called.

I checked what happens in the debugger. When the "__WINDOWS__" macro is defined, the function "MagickOpenStream" calls "Latin1ToUnicodeString" and then "_wfopen". As far as I can see, it should call "Utf8ToUnicodeString" instead of "Latin1ToUnicodeString".

How is it possible to open Unicode filenames on the Windows platform?

Fixed by adding my own "Utf8ToUnicodeString" routine to the sources, inside "MagickOpenStream". Unicode filenames now open on Windows XP.
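
To illustrate why a Latin-1 conversion breaks UTF-8 names, here is a minimal sketch in portable C. It is not ImageMagick code: the function names latin1_to_utf16 and utf8_to_utf16_basic are hypothetical, and the UTF-8 decoder handles only 1- and 2-byte sequences. Widening byte by byte turns the two UTF-8 bytes of "ñ" (0xC3 0xB1) into two separate UTF-16 code units, so the name handed to "_wfopen" no longer matches the file on disk.

```c
#include <stddef.h>
#include <stdint.h>

/* Latin-1 widening: every input byte becomes one UTF-16 code unit.
   Returns the number of code units written, terminator excluded. */
size_t latin1_to_utf16(const char *in, uint16_t *out)
{
    size_t n = 0;
    for (const unsigned char *p = (const unsigned char *) in; *p != 0; p++)
        out[n++] = *p;
    out[n] = 0;
    return n;
}

/* Minimal UTF-8 decoding for 1- and 2-byte sequences only (an assumption
   made to keep the sketch short). Returns the code unit count, or
   (size_t) -1 for any sequence it does not handle. */
size_t utf8_to_utf16_basic(const char *in, uint16_t *out)
{
    size_t n = 0;
    const unsigned char *p = (const unsigned char *) in;
    while (*p != 0) {
        if (*p < 0x80) {
            out[n++] = *p++;
        } else if ((*p & 0xE0) == 0xC0 && (p[1] & 0xC0) == 0x80) {
            out[n++] = (uint16_t) (((p[0] & 0x1F) << 6) | (p[1] & 0x3F));
            p += 2;
        } else {
            return (size_t) -1;
        }
    }
    out[n] = 0;
    return n;
}
```

For the input bytes 0xC3 0xB1, the first function yields the two units 0x00C3 0x00B1 ("Ã±"), while the second yields the single unit 0x00F1 ("ñ").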
magick
Site Admin
Posts: 11064
Joined: 2003-05-31T11:32:55-07:00

Re: Failed to open unicode filenames + windows

Post by magick »

We coded a "Utf8ToUnicodeString" routine in ImageMagick 6.3.7-8 beta. However, before we release it, we would like to compare our solution with yours to ensure we have the best possible one. Can you post your "Utf8ToUnicodeString" routine here? Thanks.
nicolas1

Re: Failed to open unicode filenames + windows

Post by nicolas1 »

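/* Defined below; declared here so Utf8ToUnicodeString can call it. */
int UTF8_to_UNICODE16(const unsigned char* utf8, unsigned short* unicode);

static wchar_t* Utf8ToUnicodeString(const char *string)
{
  int length;
  wchar_t *unicode_string;

  /* First pass counts the UTF-16 code units, terminator included;
     -1 means the input is not valid UTF-8. */
  length = UTF8_to_UNICODE16((const unsigned char *) string, NULL);
  if (length <= 0)
  {
    /* Return an allocated empty wide string rather than a cast narrow
       literal, so the result can always be freed. */
    unicode_string=(wchar_t *) AcquireQuantumMemory(1, sizeof(wchar_t));
    if (unicode_string != (wchar_t *) NULL)
      unicode_string[0] = 0;
    return(unicode_string);
  }

  unicode_string=(wchar_t *) AcquireQuantumMemory((size_t) length, sizeof(wchar_t));
  if (unicode_string == (wchar_t *) NULL)
    return((wchar_t *) NULL);

  /* Second pass writes the code units and the terminator.
     The cast assumes a 16-bit wchar_t, as on Windows. */
  UTF8_to_UNICODE16((const unsigned char *) string, (unsigned short *) unicode_string);

  return(unicode_string);
}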

int UTF8_to_UNICODE16(const unsigned char* utf8, unsigned short* unicode)
{
    const unsigned char* src = utf8;
    unsigned short u;
    long cnt;
    
    if(unicode)
    {
        unsigned short* dst = unicode;
        while(*src)
        {
            if(!(*src & 0x80))
                *dst = *src;
            else
            if((*src & 0xE0)==0xC0)
            {
                u = *src;
                *dst = (u & 0x1F) << 6;
                src++;
                if((*src & 0xC0)!=0x80)
                    return -1;
                *dst |= (*src & 0x3F);
            }
            else
            if((*src & 0xF0)==0xE0)
            {
                u = *src;
                *dst = u << 12;
                src++;
                if((*src & 0xC0)!=0x80)
                    return -1;
                u = *src;
                *dst |= (u & 0x3F) << 6;
                src++;
                if((*src & 0xC0)!=0x80)
                    return -1;
                *dst |= (*src & 0x3F);
            }
            else
                return -1;
            src++;
            dst++;
        };
        *dst = 0;
        return (dst-unicode)+1;
    };
    
    cnt = 0;

    while(*src)
    {
        if(!(*src & 0x80))
            ;
        else
        if((*src & 0xE0)==0xC0)
        {
            src++;
            if((*src & 0xC0)!=0x80)
                return -1;
        }
        else
        if((*src & 0xF0)==0xE0)
        {
            src++;
            if((*src & 0xC0)!=0x80)
                return -1;
            src++;
            if((*src & 0xC0)!=0x80)
                return -1;
        }
        else
            return -1;
        src++;
        cnt++;
    };
    return cnt+1;
}
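
One limitation of the routine above: it returns -1 for 4-byte UTF-8 sequences (lead bytes 0xF0 to 0xF4), so names containing characters outside the Basic Multilingual Plane are rejected. Such characters need a UTF-16 surrogate pair. The following is a minimal sketch of that one step, under the assumption that the caller has already found a 4-byte lead byte; the function name utf8_4byte_to_surrogates is hypothetical and not part of ImageMagick.

```c
#include <stdint.h>

/* Decodes one 4-byte UTF-8 sequence at s into a UTF-16 surrogate pair.
   Returns 2 (code units written) on success, or -1 if s does not start
   with a well-formed 4-byte sequence. */
int utf8_4byte_to_surrogates(const unsigned char *s, uint16_t out[2])
{
    uint32_t cp;

    if ((s[0] & 0xF8) != 0xF0)
        return -1;
    /* Short-circuit evaluation stops at the first bad byte, so a string
       terminator inside the sequence is never read past. */
    if ((s[1] & 0xC0) != 0x80 || (s[2] & 0xC0) != 0x80 || (s[3] & 0xC0) != 0x80)
        return -1;

    cp = ((uint32_t) (s[0] & 0x07) << 18) | ((uint32_t) (s[1] & 0x3F) << 12) |
         ((uint32_t) (s[2] & 0x3F) << 6) | (uint32_t) (s[3] & 0x3F);
    if (cp < 0x10000 || cp > 0x10FFFF)
        return -1;              /* overlong form or beyond the Unicode range */

    cp -= 0x10000;
    out[0] = (uint16_t) (0xD800 + (cp >> 10));    /* high surrogate */
    out[1] = (uint16_t) (0xDC00 + (cp & 0x3FF));  /* low surrogate */
    return 2;
}
```

A caller would advance the source pointer by four bytes and the destination pointer by two code units after a successful call.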
jmunin

Re: Failed to open unicode filenames + windows

Post by jmunin »

Hello:

I use Windows XP and IM 6.3.7-8 Q8, and I think my problem is the same as (or related to) the subject.

I have the problem with both convert.exe (in console mode) and ImageMagickObject (both my VB6/VBScript code and the simpleTest.vbs example). In the console I try:

convert.exe logo: españa.gif

or (VBScript code; the VB6 code is similar):

Set img = CreateObject("ImageMagickObject.MagickImage.1")
msgs = img.Convert("logo:","-format","%m,%h,%w","camión.tif")


I get the following error (in the console):

convert: unable to open image `espa±a.gif': No such file or directory.

and (VB6 code; the error is identical in VBScript):

Error -2147215503 (identity: 435: unable to open image `C:\Documents and Settings\Administrador\Escritorio\hotfolder\camión.tif': No such file or directory:)

I'm open to (any) suggestions.

Best regards
anthony
Posts: 8883
Joined: 2004-05-31T19:27:03-07:00
Authentication code: 8675308
Location: Brisbane, Australia

Re: Failed to open unicode filenames + windows

Post by anthony »

As the look of the letter changed, you may have an encoding problem. Check that the file name is encoded as UTF-8, in which case that letter should have been at least two bytes, not a single character.
That is what it looks like.

Other than that, I can't say.
Anthony Thyssen -- Webmaster for ImageMagick Example Pages
https://imagemagick.org/Usage/
jmunin

Re: Failed to open unicode filenames + windows

Post by jmunin »

Hello:

Thanks for your reply. For convert.exe (and the other console commands too), I type the filenames directly into the Windows console (no copy-paste), so I assume the character encoding used is the Windows XP default (UTF-16).

I don't understand much about the issue, and there is not much I can do about it (other than renaming all my files, which I would rather not do).

Best regards

P.S. If you use Windows, please copy the following filename:

riñó.jpg

and paste it into the console with the following example:

convert.exe logo: riñó.jpg

Does it work?
anthony
Posts: 8883
Joined: 2004-05-31T19:27:03-07:00
Authentication code: 8675308
Location: Brisbane, Australia

Re: Failed to open unicode filenames + windows

Post by anthony »

Sorry, I'm a UNIX/Linux expert. I generally avoid Windows like the plague. I just have no use for it beyond playing games!

No offense intended to anyone or anything. It is just where my training and knowledge lie.
Anthony Thyssen -- Webmaster for ImageMagick Example Pages
https://imagemagick.org/Usage/
el_supremo
Posts: 1015
Joined: 2005-03-21T21:16:57-07:00

Re: Failed to open unicode filenames + windows

Post by el_supremo »

I use Windows XP Pro with ImageMagick 6.3.5 07/31/07 Q8, and your command works for me.

Pete
jmunin

Re: Failed to open unicode filenames + windows

Post by jmunin »

Hello,

el_supremo, thanks for your reply. Your suggestion (try an old version) has been very useful.

I tried the example code with ImageMagick-5.5.7-Q8-windows-dll on Vista Home and XP, and it works fine!

Then I tried the latest version, ImageMagick-6.3.8-0-Q8-windows-dll, but it fails again:

I open a Windows console and type

convert logo: españa.jpg

or

convert logo: camión.png

and get

convert: unable to open image `espa±a.jpg': No such file or directory.

or

convert: unable to open image `cami¾n.png': No such file or directory.

In short:

ImageMagick-5.5.7-Q8-windows-dll works, but ImageMagick-6.3.7.x and ImageMagick-6.3.8-0-Q8-windows-dll fail.

Best regards
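
The garbled names in these error messages are consistent with the filename reaching ImageMagick as ANSI code page bytes rather than UTF-8. This is an assumption about the poster's setup: in Windows-1252, "ñ" is the single byte 0xF1 and "ó" is 0xF3, and a console using OEM code page 850 displays those bytes as "±" and "¾". A lone 0xF1 followed by an ASCII letter is not valid UTF-8, so a strict UTF-8 conversion rejects the name. A minimal structural check in C (the function name looks_like_utf8 is hypothetical) shows the difference:

```c
/* Returns 1 if s is structurally valid UTF-8 (each lead byte followed by
   the right number of continuation bytes), 0 otherwise. Overlong forms
   and surrogate code points are not checked in this sketch. */
int looks_like_utf8(const char *s)
{
    const unsigned char *p = (const unsigned char *) s;

    while (*p != 0) {
        int extra;

        if (*p < 0x80)
            extra = 0;                      /* ASCII */
        else if ((*p & 0xE0) == 0xC0)
            extra = 1;                      /* 2-byte sequence */
        else if ((*p & 0xF0) == 0xE0)
            extra = 2;                      /* 3-byte sequence */
        else if ((*p & 0xF8) == 0xF0)
            extra = 3;                      /* 4-byte sequence */
        else
            return 0;                       /* stray continuation or bad lead */
        p++;
        while (extra-- > 0) {
            if ((*p & 0xC0) != 0x80)
                return 0;                   /* also catches the terminator */
            p++;
        }
    }
    return 1;
}
```

With the Windows-1252 bytes of "españa.jpg" (0xF1 for "ñ") the check fails, while the UTF-8 bytes (0xC3 0xB1 for "ñ") pass.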
bananas2
Posts: 14
Joined: 2008-02-12T08:51:47-07:00

Re: Failed to open unicode filenames + windows

Post by bananas2 »

We have the same problem with the latest ImageMagick and Japanese filenames.

I've managed to build and debug identify. I think the UTF-8 to Unicode conversion fails for some reason, and _wfopen then returns 0.

System: Windows 2003, Japanese locale

---
As identify is a console app with main(), I assume the argument encoding (code page) is the system default, so expecting argv to be UTF-8 encoded is unreasonable.

What do you think?
bananas2
Posts: 14
Joined: 2008-02-12T08:51:47-07:00

Re: Failed to open unicode filenames + windows

Post by bananas2 »

@Anthony,

argv -> UTF-8 to UTF-16 -> wfopen -> fail.
argv -> MultiByteToWideChar + CP_ACP -> wfopen -> success

code:

/* Convert from the system ANSI code page (CP_ACP) rather than from UTF-8. */
wchars_num = MultiByteToWideChar(CP_ACP, 0, path, -1, NULL, 0);
unicode_path = (wchar_t *) AcquireQuantumMemory(wchars_num, sizeof(wchar_t));
if (unicode_path != (wchar_t *) NULL)
  MultiByteToWideChar(CP_ACP, 0, path, -1, unicode_path, wchars_num);

The same applies to getpathattributes.

*NO* UTF-8 on Windows. I guess this must be fixed.
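
For comparison, the same idea (convert argv from the system's default multibyte encoding instead of assuming UTF-8) can be sketched with standard C only. This is not ImageMagick code: the function name locale_path_to_wide is hypothetical, and the sketch assumes the program has called setlocale(LC_CTYPE, "") at startup so that the C locale matches the system encoding.

```c
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Converts a path in the current locale's multibyte encoding to a newly
   allocated wide string. Returns NULL on invalid input or allocation
   failure; the caller releases the result with free(). */
wchar_t *locale_path_to_wide(const char *path)
{
    mbstate_t state;
    const char *src = path;
    size_t count;
    wchar_t *wide;

    /* Length query: with a NULL destination, mbsrtowcs only counts. */
    memset(&state, 0, sizeof(state));
    count = mbsrtowcs(NULL, &src, 0, &state);
    if (count == (size_t) -1)
        return NULL;            /* bytes are not valid in this locale */

    wide = (wchar_t *) malloc((count + 1) * sizeof(wchar_t));
    if (wide == NULL)
        return NULL;

    /* Real conversion, terminator included. */
    src = path;
    memset(&state, 0, sizeof(state));
    if (mbsrtowcs(wide, &src, count + 1, &state) == (size_t) -1) {
        free(wide);
        return NULL;
    }
    return wide;
}
```

On Windows the result could then be passed to "_wfopen". Unlike the snippet above, this version checks both the conversion and the allocation before using the buffer.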