adznon
asked on
Problem displaying foreign characters on the website from a MS SQL Server
The name in the SQL database is: Lukáš Kovdsanda
It displays on the website as: Luk� Kovdsanda
However, if I save the name from the website to the database, it is stored in the database as "Lukáš Kovdsanda".
SELECT SERVERPROPERTY('collation') gives the MS SQL collation as: Latin1_General_CI_AS
What do I need to change in the HTML/PHP code to fix this?
Which version of MS SQL Server are you using?
Try changing your character set to UTF-8 (Unicode) encoding.
You can find more detail at the link below:
https://docs.microsoft.com/en-us/sql/relational-databases/collations/collation-and-unicode-support?view=sql-server-2017
ASKER
Hi,
Thanks for the reply. The database is SQL Server 13.0.
I have tried changing the character set in the HTML to utf-8, but that doesn't help.
When you check the data in the database, is it in the correct format?
ASKER
Yes, all names display correctly in the database
So first, make sure you provide us with the code you're using to pull the values from the database and display them.
Now, if it looks correct in the database but not in the browser, then that almost always means that the browser is not using the same character set as what's in the database, and so it doesn't know what to do with the data except try to display it how it THINKS it should.
There are several character sets that can display special characters, and they usually store the data in different ways, each with pros and cons compared to the others. For example, if your data is truly stored under Latin1_General_CI_AS (which uses code page 1252 for non-Unicode columns), then special characters only take up one byte each, but you don't have very many special characters to choose from (for example, you couldn't store a Japanese character with Latin1_General_CI_AS). If you use UTF-8, then you can use almost any character in the world, but some of them take two to four bytes of storage.
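To make the storage difference concrete, here is a small sketch that compares byte counts for the same name in both encodings. The byte strings are typed out by hand so the example does not depend on the file's own encoding, and the variable names are only for illustration.

```php
<?php
// "Lukáš Kovdsanda" written as raw bytes in two encodings.
$nameSingleByte = "Luk\xE1\x9A Kovdsanda";         // Windows-1252: á = E1, š = 9A
$nameUtf8       = "Luk\xC3\xA1\xC5\xA1 Kovdsanda"; // UTF-8: á = C3 A1, š = C5 A1

// strlen() counts bytes, not characters.
echo strlen($nameSingleByte), "\n"; // 15
echo strlen($nameUtf8), "\n";       // 17
```

Both strings represent the same 15 characters; the two accented letters account for the two extra bytes in the UTF-8 version.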
So it's really a question of how you are collecting and storing the data. It's not uncommon for people to collect data in UTF-8 format and store it in a non-UTF-8 database (which works, but the database usually has trouble displaying or sorting it if it doesn't know the right character set).
My guess is that you're working with UTF-8-encoded data and that your output screen just isn't properly set up to tell the browser to read the page as UTF-8.
If you want to be certain of what data you're using, you'll have to look at the hex codes for your data. To do this, drop this function into your code somewhere:
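function hexDump($data)
{
    // Return the bytes of $data as space-separated, two-digit uppercase hex codes.
    $out = [];
    $len = strlen($data);
    for ($i = 0; $i < $len; $i++) {
        // sprintf('%02X') zero-pads, so byte 0x0A prints as "0A" rather than "A".
        $out[] = sprintf('%02X', ord($data[$i]));
    }
    return implode(' ', $out);
}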
...and then use it on the variable that contains your name:
$name = $row_from_database["name"]; // Lukáš Kovdsanda
echo "NAME IN HEX = " . hexDump($name);
...and then run that and give us the results. If you get this:
4C 75 6B E1 9A 20 4B 6F 76 64 73 61 6E 64 61
...then your data IS NOT in UTF-8 format. But if you get this:
4C 75 6B C3 A1 C5 A1 20 4B 6F 76 64 73 61 6E 64 61
...then your data IS in UTF-8 format.
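If you'd rather not compare hex codes by eye, the same check can be roughly automated. This is only a sketch: looksLikeUtf8() is a made-up helper name, and it relies on the fact that preg_match() with the /u modifier returns false when the subject is not well-formed UTF-8.

```php
<?php
// Hypothetical helper: true when $data is a well-formed UTF-8 byte sequence.
// With the /u modifier, PCRE validates the subject as UTF-8 first and
// preg_match() returns false (not 0 or 1) on invalid input.
function looksLikeUtf8(string $data): bool
{
    return preg_match('//u', $data) === 1;
}

$singleByte = "Luk\xE1\x9A Kovdsanda";         // the first hex dump above
$utf8       = "Luk\xC3\xA1\xC5\xA1 Kovdsanda"; // the second hex dump above

var_dump(looksLikeUtf8($singleByte)); // bool(false)
var_dump(looksLikeUtf8($utf8));       // bool(true)
```

Keep in mind that plain ASCII text is also valid UTF-8, so this check only tells you something when the value actually contains special characters.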
Again, based on what you showed in your original post, I'm pretty sure you ARE using UTF-8, so it's just a matter of setting up your output page to properly tell the browser to read the page as UTF-8.
This is normally accomplished by just adding this line somewhere between your <head> and </head> tags in your page:
<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  </head>
  <body>
    ...
  </body>
</html>
There is also optionally an HTTP header that might have to be updated, but I would just try the meta tag first.
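For reference, the HTTP header can be set from PHP roughly like this. It's a minimal sketch: contentTypeHeader() is just a helper name invented for the example, and the call has to happen before the page sends any output.

```php
<?php
// Hypothetical helper: builds the Content-Type header line for an HTML page.
function contentTypeHeader(string $charset = 'utf-8'): string
{
    return 'Content-Type: text/html; charset=' . $charset;
}

// header() only works before any output has been sent to the browser.
if (!headers_sent()) {
    header(contentTypeHeader());
}
```

If the web server is already sending its own charset in the Content-Type header, that value usually wins over the meta tag, which is why the header sometimes has to be updated as well.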
Finally, make sure you read my article on UTF-8 so you can understand what's going on (and make sure you don't improperly use functions like utf8_encode or utf8_decode):
https://www.experts-exchange.com/articles/25999/Unicode-UTF-8-and-Multibyte-in-Plain-English.html
Hi!
Have you tried to change all fields/params of type char/varchar/text to nchar/nvarchar/ntext yet?
They should look like: (N'Lukáš Kovdsanda') afterwards!
With Greek language (for example) that does work!
Best regards,
Raisor
ASKER CERTIFIED SOLUTION
ASKER
Thank you all for your help.
LauraGB, you nailed it: the connection strings were missing the UTF-8 setting.
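The accepted answer isn't shown in this thread, so the following is only a guess at what such a fix might look like, assuming the site uses Microsoft's sqlsrv driver for PHP. The function names sqlsrvOptions() and connectUtf8() are made up for this sketch; the 'CharacterSet' => 'UTF-8' connection option is the part that tells the driver to hand strings to PHP as UTF-8.

```php
<?php
// Hypothetical helper: connection options for the sqlsrv driver.
// 'CharacterSet' => 'UTF-8' asks the driver to exchange character data
// with PHP as UTF-8 instead of the system code page.
function sqlsrvOptions(string $database, string $user, string $password): array
{
    return [
        'Database'     => $database,
        'UID'          => $user,
        'PWD'          => $password,
        'CharacterSet' => 'UTF-8',
    ];
}

// Hypothetical helper, not called here: it needs the sqlsrv extension
// and a reachable SQL Server instance.
function connectUtf8(string $server, array $options)
{
    $conn = sqlsrv_connect($server, $options);
    if ($conn === false) {
        throw new RuntimeException(print_r(sqlsrv_errors(), true));
    }
    return $conn;
}
```

With a setting like this in place, the bytes coming back from the database should match a page served as UTF-8, and names saved from the website should no longer end up as "Lukáš".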