Problem using fopen with a Hebrew file name inside a UTF-8 encoded file

Hello Experts,

I'm having problems saving a file in PHP whose name contains Hebrew characters.
The problem occurs if the PHP file I run the script from is UTF-8 encoded.
If the PHP file is ANSI encoded, the file is created with the correct name.

I'm working with Windows 2000 Server and the Regional Options of the system are configured for the Hebrew language.

Please advise,
Many thanks,
Doron Tal

$fp = fopen('images/àáÙÕß.dat', 'w');
fwrite($fp, '1');
fwrite($fp, '23');
fclose($fp); // close the handle so the data is flushed to disk

Please see this document:

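If you have to write a file in UTF-8 format, you have to add a header to the file, like this:

$f = fopen("test.txt", "wb");
// adding header: assumed here to be the UTF-8 byte order mark (BOM)
$text = "\xEF\xBB\xBF" . $text; // $text already holds the UTF-8 content to write
fputs($f, $text);
fclose($f);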
doront99 (Author) Commented:
hey mgonullu,

I don't need to write a file in UTF-8 format.

What I need is to give the file a UTF-8 file name.
Please see example attached to my original message.

doront99 (Author) Commented:
The issue is actually about the FILE NAME and not about the file's content.

I get the point.
What is the error you are getting?
doront99 (Author) Commented:
It's not an error, but the name of the file that was created in the folder is wrong.
for example: ó ó¡ó"ó"óŸ.dat instead of àáÙÕß.dat
These are the same name, but the first shows the UTF-8 bytes as individual characters (two chars for each letter in the string).

Marcus Bointon Commented:
You say the system is configured for Hebrew. Is it using an 8-bit encoding (ISO-8859 or a Windows code page) or Unicode? If you are writing UTF-8 filenames on an 8-bit system, it will not work. Make sure your system is expecting UTF-8, or alternatively switch your charset to UTF-8 system-wide.
doront99 (Author) Commented:
From what I have read about the issue on the net, PHP prior to version 6 does not support UTF inside the code (like UTF-8 variables).

So, I guess this is it.


Marcus Bointon Commented:
No, that's not it. This kind of thing should not be affected by whether PHP supports Unicode or not. PHP will happily handle UTF-8 data transparently, even in PHP5. What PHP6 changes is how it processes UTF-8 data: functions like strlen do not understand UTF-8 in PHP5, so you will get unexpected results. However, that has zero impact on being able to put some UTF-8 data into a string and pass it around without problems.

You can't have UTF-8 variable names, but there is nothing stopping you putting literal UTF-8 strings into variables and it will work just fine, for example:

$var = "­)3é4";

(that's a mixture of Cyrillic, Armenian, Katakana, Arabic and Hebrew, fingers crossed that EE copes with it...)

For that string, strlen will give the wrong result in PHP5 (it counts bytes, not characters) but the correct one in PHP6. Either way, if you do:

echo $var;

it will work fine in both.
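
To make the strlen point concrete, here is a minimal sketch. It assumes nothing beyond core PHP with PCRE, and it spells the string out as explicit UTF-8 bytes (two Hebrew letters, alef and bet) so the result does not depend on how the forum or your editor encodes the source. The variable names are just for illustration.

```php
<?php
// Two Hebrew letters (alef, bet) written as explicit UTF-8 bytes, so this
// example does not depend on the encoding of the source file itself.
$word = "\xD7\x90\xD7\x91";

// strlen() counts bytes: each of these letters takes two bytes in UTF-8.
$byteCount = strlen($word);

// The /u modifier makes PCRE treat the subject as UTF-8, so "." matches one
// whole character; preg_match_all() returns the number of matches.
$charCount = preg_match_all('/./us', $word, $matches);

echo "bytes: $byteCount, characters: $charCount\n"; // bytes: 4, characters: 2
```

The string itself is untouched either way; only the counting differs, which is why passing it around or echoing it works.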

Now I would expect that you can use UTF-8 for filenames too, if your underlying OS supports it, as PHP just treats it as a bunch of bytes. It's quite common for OSs to use UTF-16 instead, so you probably just need to make sure that your encodings line up.

I just tried this on OS X:

$var = "­)3é4";
file_put_contents($var, $var);

And it worked perfectly, creating a file containing that string, named using that string.
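
Where the encodings do not line up, as on a Windows system using an 8-bit Hebrew code page, one possible workaround is to convert the name from UTF-8 to that code page before handing it to fopen. The sketch below is only an illustration under stated assumptions: toSystemFileName is a hypothetical helper, 'CP1255' (Windows Hebrew) is a guess at the server's ANSI code page, and it assumes PHP passes the filename bytes to the OS unchanged. The name is written as explicit UTF-8 bytes (alef, bet) rather than the name from the question.

```php
<?php
// Hypothetical helper: converts a UTF-8 file name to an 8-bit code page.
// 'CP1255' (Windows Hebrew) is an assumption about the server's ANSI code page.
function toSystemFileName($utf8Name, $codePage = 'CP1255')
{
    $converted = iconv('UTF-8', $codePage, $utf8Name);
    if ($converted === false) {
        // iconv() fails if a character has no equivalent in the target code page.
        throw new RuntimeException("Cannot represent file name in $codePage");
    }
    return $converted;
}

// "alef bet" as explicit UTF-8 bytes, plus an ASCII extension.
$utf8Name = "\xD7\x90\xD7\x91.dat";
$systemName = toSystemFileName($utf8Name);

// One byte per Hebrew letter after conversion, followed by ".dat".
echo bin2hex($systemName), "\n"; // e0e12e646174

// On the affected setup, the converted name is what would be passed on:
// $fp = fopen('images/' . $systemName, 'w');
```

This only helps for names that the target code page can represent; a name mixing several scripts, like the one above, would make iconv fail on an 8-bit Hebrew code page.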
Marcus Bointon Commented:
Well there's a surprise - EE can't cope with UTF-8, which makes this kind of hard to show...