How can I get Cygwin to recognize file names with special characters such as the n with a tilde above it?

Posted on 2011-10-11
Last Modified: 2013-11-15

I have recently encountered a problem accessing files that have special characters in their names. Specifically, the stat() function in a C program fails on these files. Examples of the special characters are the trademark symbol (™) and the n with a tilde above it (ñ).

I have looked at internationalization, but the recommendation from "" is:
Filenames with unusual (foreign) characters

Windows filesystems use Unicode encoded as UTF-16 to store filename information. If you don't use the UTF-8 character set (see the section called “Internationalization”) then there's a chance that a filename is using one or more characters which have no representation in the character set you're using.

In the default "C" locale, Cygwin creates filenames using the UTF-8 charset. This will always result in some valid filename by default, but again might impose problems when switching to a non-"C" or non-"UTF-8" charset.

To avoid this scenario altogether, always use UTF-8 as the character set.

Suggestions on how to access these files?  

Thanks in advance...
Question by:leonvan

    Expert Comment

    by:Gerwin Jansen
    Hello leonvan, it seems that Cygwin is quite clear about the issue you're facing:

    "Only by using the UTF-8 charset you can avoid this problem safely." - found here, same page you got your info from, right?

    I guess you are trying to access Windows files from Cygwin, and Windows uses UTF-16 to store filename information.

    You could try the stat from MKS Toolkit instead of the stat from Cygwin.
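
Before changing tools, it may help to check which charset the program actually ends up using, since that is what the UTF-8 advice quoted above hinges on. The following is only a minimal sketch using the POSIX calls setlocale() and nl_langinfo(); the helper names active_codeset and codeset_is_utf8 are made up for illustration and are not part of Cygwin.

```c
#include <assert.h>
#include <langinfo.h>
#include <locale.h>
#include <strings.h>

/* Hypothetical helper: adopt the locale named by LANG/LC_CTYPE/LC_ALL and
 * return the name of the charset that is now in effect. */
const char *active_codeset(void)
{
    /* If the environment names a locale the system does not have,
     * setlocale() fails and the previous locale stays in effect. */
    (void)setlocale(LC_ALL, "");
    return nl_langinfo(CODESET);
}

/* Hypothetical helper: returns 1 if the charset name denotes UTF-8
 * (accepting the common spellings "UTF-8" and "UTF8"), otherwise 0. */
int codeset_is_utf8(const char *codeset)
{
    assert(codeset != NULL);
    return strcasecmp(codeset, "UTF-8") == 0 ||
           strcasecmp(codeset, "UTF8") == 0;
}
```

Calling codeset_is_utf8(active_codeset()) early in main() would show whether the names passed to stat() are being interpreted as UTF-8 or as some other charset.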

    Accepted Solution

    Solved with the help of

    If you don't define UNICODE, FindFirstFile/FindNextFile will use the ANSI versions of this API, FindFirstFileA/FindNextFileA, which return file names in the Windows ANSI codepage.  Unless you set your LANG/LC_CTYPE/LC_ALL variables to your current Windows ANSI charset *and* call setlocale, Cygwin will use UTF-8 by default.  In UTF-8 the character ñ has a different multibyte encoding, 0xc3 0xb1, rather than, say, 0xf1 in Windows codepage 1252, so a name returned by the ANSI API does not match what Cygwin functions such as stat() expect.  To avoid this problem, you can use the Unicode API FindFirstFileW/FindNextFileW and convert the filename to the current multibyte charset via wcstombs and friends.
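
To make the byte difference concrete, here is a rough, portable sketch (not the code from the original solution). utf8_encode_bmp shows where 0xc3 0xb1 comes from, and wide_to_multibyte wraps wcstombs() for the conversion step. Both function names are invented for this example, and the sketch relies on the POSIX behavior of wcstombs() returning the required length when the destination is NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <wchar.h>

/* Encode one code point from the Basic Multilingual Plane as UTF-8.
 * Returns the number of bytes written (1 to 3), or 0 if the value is a
 * surrogate or above U+FFFF, which this sketch does not handle. */
size_t utf8_encode_bmp(uint32_t cp, unsigned char out[3])
{
    assert(out != NULL);
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000 && (cp < 0xD800 || cp > 0xDFFF)) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    return 0;
}

/* Convert a wide-character file name to the multibyte charset of the
 * current locale. Returns 0 on success, or -1 if a character has no
 * representation in that charset or the buffer is too small. */
int wide_to_multibyte(const wchar_t *wide, char *out, size_t out_size)
{
    assert(wide != NULL && out != NULL);
    /* POSIX: a NULL destination yields the required byte count. */
    size_t needed = wcstombs(NULL, wide, 0);
    if (needed == (size_t)-1 || needed >= out_size) {
        return -1;
    }
    /* needed < out_size, so the terminating NUL fits as well. */
    (void)wcstombs(out, wide, out_size);
    return 0;
}
```

For ñ (U+00F1), utf8_encode_bmp writes the two bytes 0xc3 0xb1, whereas codepage 1252 stores the single byte 0xf1. In a real program, the assumption is that you would call setlocale(LC_ALL, "") once at startup and then pass the cFileName field filled in by FindFirstFileW/FindNextFileW to wide_to_multibyte before handing the result to stat().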

    Author Closing Comment

    Because this is what solved the problem.
