Solved

Problem using fopen with a Hebrew file name inside a UTF-8 encoded file

Posted on 2008-10-29
Last Modified: 2013-12-13
Hello Experts,

I'm having a problem saving a file in PHP when the file name contains Hebrew characters.
The problem occurs when the PHP file I run the script from is UTF-8 encoded.
If the PHP file is ANSI encoded, the file is created correctly.

I'm working with Windows 2000 Server and the Regional Options of the system are configured for the Hebrew language.

Please advise.
Many thanks,
Doron Tal


<?php
$fp = fopen('images/àáÙÕß.dat', 'w');
fwrite($fp, '1');
fwrite($fp, '23');
fclose($fp);
?>


test-ansi.txt
test-utf.txt
Question by:doront99
9 Comments
 
LVL 9

Assisted Solution

by:mgonullu
mgonullu earned 100 total points
ID: 22839031
Please see this document:
http://www.php.net/fwrite

If you need to write a file in UTF-8 format, you can prepend a byte order mark (BOM) header to the file like this:

<?php
$f=fopen("test.txt", "wb");
$text=utf8_encode("ýaý!");
// adding header: the UTF-8 byte order mark (BOM)
$text="\xEF\xBB\xBF".$text;
fputs($f, $text);
fclose($f);
?>
 

Author Comment

by:doront99
ID: 22839098
hey mgonullu,

I don't need to write a file in UTF-8 format.

What I need is to name the file with a utf-8 file name.
Please see example attached to my original message.

Thanks,
Doron
 

Author Comment

by:doront99
ID: 22839102
The issue is actually about the FILE NAME and not about the file's content.

Thanks
 
LVL 9

Expert Comment

by:mgonullu
ID: 22845893
Hmmm,
I get the point,
what is the error you are getting?
 

Author Comment

by:doront99
ID: 22847772
It's not an error, but the name of the file that was created in the folder is wrong.
For example: ó ó¡ó"ó"óŸ.dat instead of àáÙÕß.dat
Both are the same name, but the first shows the UTF-8 bytes as separate characters (two characters for each letter in the string).

Thanks,
Doron
 
LVL 25

Assisted Solution

by:Marcus Bointon
Marcus Bointon earned 100 total points
ID: 22848478
You say the system is configured for Hebrew. Is it using an 8-bit encoding (ISO-8859 or a Windows code page) or Unicode? If you are writing UTF-8 filenames on an 8-bit system, it will not work. Make sure your system is expecting UTF-8, or alternatively switch your charset to UTF-8 system-wide.
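
If the system does turn out to use an 8-bit Hebrew code page, one possible workaround is to convert the file name before handing it to fopen. This is only a sketch: it assumes PHP 7 or later, the iconv extension, and that the server's code page is Windows-1255, none of which the thread confirms. The helper name toSystemFileName is made up for illustration.

```php
<?php
// Sketch only: assumes the iconv extension is loaded and that the server's
// ANSI code page is Windows-1255 (Hebrew). Both are assumptions.

// Hypothetical helper: convert a UTF-8 file name to the single-byte
// encoding that the file system functions are assumed to expect.
function toSystemFileName(string $utf8Name, string $systemCharset = 'CP1255'): string
{
    $converted = iconv('UTF-8', $systemCharset, $utf8Name);
    if ($converted === false) {
        throw new RuntimeException("Cannot represent file name in $systemCharset");
    }
    return $converted;
}

// Alef and bet written as explicit UTF-8 byte escapes, so the example
// behaves the same however the script file itself is saved.
$utf8Name = "\xD7\x90\xD7\x91.dat";
$systemName = toSystemFileName($utf8Name);

echo bin2hex($utf8Name), "\n";   // two bytes per Hebrew letter
echo bin2hex($systemName), "\n"; // one byte per Hebrew letter

// The converted name would then be used as in the question, for example:
// $fp = fopen('images/' . $systemName, 'w');
```

The conversion fails for any character the target code page cannot represent, which is why the sketch throws instead of silently writing a damaged name.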
 

Accepted Solution

by:
doront99 earned 0 total points
ID: 22892704
From what I read about the issue on the net, PHP prior to version 6 does not support UTF-8 inside the code (like UTF-8 variables).

So, I guess this is it.

Thanks,
Doron
 
LVL 25

Expert Comment

by:Marcus Bointon
ID: 22894238
No, that's not it. This kind of thing should not be affected by whether PHP supports Unicode or not. PHP will happily handle UTF-8 data transparently, even in PHP5. What PHP6 changes is how it processes UTF-8 data: functions like strlen do not understand UTF-8 in PHP5, so you will get unexpected results. However, that has zero impact on being able to put some UTF-8 data into a string and pass it around without problems.

You can't have UTF-8 variable names, but there is nothing stopping you putting literal UTF-8 strings into variables and it will work just fine, for example:

$var = "­)3é4";

(that's a mixture of Cyrillic, Armenian, Katakana, Arabic and Hebrew, fingers crossed that EE copes with it...)

For that string, strlen will give the wrong result in PHP5 but the correct one in PHP6. Either way, if you do:

echo $var;

it will work fine in both.
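
A minimal sketch of that byte-versus-character difference, assuming the mbstring extension is available. The sample string is hypothetical and written as byte escapes so that it survives being posted here; mb_strlen stands in for a UTF-8 aware length function.

```php
<?php
// Sketch assuming the mbstring extension is loaded.
// Hypothetical sample: alef, bet, gimel as explicit UTF-8 bytes.
$var = "\xD7\x90\xD7\x91\xD7\x92";

$byteCount = strlen($var);             // counts bytes, not characters
$charCount = mb_strlen($var, 'UTF-8'); // counts UTF-8 code points

echo $byteCount, " bytes, ", $charCount, " characters\n"; // 6 bytes, 3 characters

// Echoing the string just passes the bytes through unchanged.
echo $var, "\n";
```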

Now I would expect that you can use UTF-8 for filenames too, if your underlying OS supports it, as PHP just treats it as a bunch of bytes. It's quite common for OSs to use UTF-16 instead, so you probably just need to make sure that your encodings line up.

I just tried this on OS X:

<?php
$var = "­)3é4";
file_put_contents($var, $var);
?>

And it worked perfectly, creating a file containing that string, named using that string.
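
To check what name actually lands on disk on a given system, something along these lines could be used. It is a sketch only: the scratch directory and the sample name are hypothetical, and whether the two hex dumps match depends on the OS, the file system and the PHP version.

```php
<?php
// Sketch only: the result depends on the OS, the file system and the PHP
// version; nothing here is specific to the asker's Windows 2000 server.

// Hypothetical scratch directory so the demo does not touch real data.
$dir = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'utf8-name-demo-' . uniqid();
mkdir($dir);

// Alef and bet as explicit UTF-8 bytes, followed by an ASCII extension.
$name = "\xD7\x90\xD7\x91.dat";
file_put_contents($dir . DIRECTORY_SEPARATOR . $name, '123');

// Collect the names the file system reports, skipping "." and "..".
$entries = array_values(array_diff(scandir($dir), ['.', '..']));

// Print both as hex so the terminal's own encoding cannot hide a mismatch.
echo 'requested: ', bin2hex($name), "\n";
foreach ($entries as $entry) {
    echo 'on disk:   ', bin2hex($entry), "\n";
}

// Clean up the scratch directory.
foreach ($entries as $entry) {
    unlink($dir . DIRECTORY_SEPARATOR . $entry);
}
rmdir($dir);
```

If the two hex strings differ, the bytes were reinterpreted somewhere between PHP and the file system, which is the symptom described in the question.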
 
LVL 25

Expert Comment

by:Marcus Bointon
ID: 22894251
Well there's a surprise - EE can't cope with UTF-8, which makes this kind of hard to show...