  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 41

PHP & MySQL encoding problem

I am selecting Arabic and English file names from a MySQL database.

However, the PHP function is_file() does not recognize the Arabic file names, although it finds the English ones.

I checked the encoding of the selected names and it is UTF-8.

$q = mysql_query("select `file` from `my_table`");

while ($data = mysql_fetch_assoc($q)) {
    $file_path = "path_to_file" . $data['file'];

    if (is_file($file_path)) {
        echo $data['file'] . " found<br/>";
    }
}


Asked by: darroosh
 
Dave Baldwin (Fixer of Problems) commented:
 
Ray Paseur commented:
A couple of thoughts... PHP was built on the assumption that one character equals one byte.  Perhaps this made sense in a 1990s sort of way for Western languages, where every character fit in 8 bits and a single-byte encoding such as ISO-8859-1 could represent all 256 of them, but it ignored most of the world, where literally thousands of characters may be needed to communicate meaning.  Palpably something was amiss.

Enter UTF-8 encoding.  Now each character takes from one to four bytes.  Below code point 128 the characters are plain ASCII and take one byte each, so ISO-8859-1 and UTF-8 look the same there.  From code point 128 upward, UTF-8 characters are multi-byte.  This article explains the details and shows the symptoms of character set collisions.
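
As a small illustration of the byte counts described above, here is a minimal sketch that compares strlen(), which counts bytes, with mb_strlen(), which counts characters. It assumes the mbstring extension is loaded. The sample strings are made up for the example, and the Arabic word is written as explicit UTF-8 bytes so the result does not depend on how the script file itself is saved.

```php
<?php
// Sample names, assumed for illustration only.
$english = "file";
// The Arabic word for "file" (three letters), spelled out as UTF-8 bytes.
$arabic = "\xD9\x85\xD9\x84\xD9\x81";

// strlen() counts bytes; mb_strlen() counts characters in the named encoding.
$englishBytes = strlen($english);
$englishChars = mb_strlen($english, 'UTF-8');
$arabicBytes  = strlen($arabic);
$arabicChars  = mb_strlen($arabic, 'UTF-8');

echo "English: $englishBytes bytes, $englishChars characters\n"; // 4 bytes, 4 characters
echo "Arabic:  $arabicBytes bytes, $arabicChars characters\n";   // 6 bytes, 3 characters
```

The English name keeps the 1:1 ratio of bytes to characters, while each Arabic letter takes two bytes.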

PHP's assumptions about character sets are changing at Release 5.4, so you may be in for some surprises as you upgrade.  However, the server file system probably still has the 1:1 ratio of byte:character.  I recommend that you use only ASCII characters in file names.  This guarantees that your names keep the 1:1 ratio, because ASCII text is byte-for-byte identical in UTF-8, so the two cannot collide.  The recommendation extends only to the file names, not to the contents of the files or the SQL tables.  The internal encoding of the data can be UTF-8, no matter what characters are used to name the files.
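
One possible way to follow the ASCII-only recommendation, offered as a sketch and not as the only approach, is to keep the UTF-8 name in the database for display and derive an ASCII-only name for the file on disk. The helper ascii_file_name() below is hypothetical. It relies on rawurlencode(), which writes every non-ASCII byte as %XX and can be reversed with rawurldecode(). It assumes the files were saved under the encoded names in the first place.

```php
<?php
// Hypothetical helper: turns a UTF-8 display name into an ASCII-only disk name.
function ascii_file_name($displayName)
{
    // rawurlencode() leaves letters, digits, "-", "_" and "." untouched and
    // writes every non-ASCII byte as %XX, so the result is pure ASCII.
    return rawurlencode($displayName);
}

// Sample name, assumed for illustration: an Arabic word plus ".txt",
// spelled out as UTF-8 bytes.
$displayName = "\xD9\x85\xD9\x84\xD9\x81.txt";

$diskName  = ascii_file_name($displayName);
$roundTrip = rawurldecode($diskName); // recovers the original UTF-8 name

echo $diskName, "\n"; // %D9%85%D9%84%D9%81.txt
```

With this scheme, the is_file() check in the question would test the path built from ascii_file_name($data['file']) instead of the raw database value.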