Chinese characters - Internationalization

Posted on 2004-09-21
Last Modified: 2009-01-07
Currently we have Norwegian and English(UK) customers. We are going international and need support for all languages. What is the best way to do this? I am currently doing testing with Chinese.

I have tried using UTF-8 in asp and html. Chinese characters are displayed, but when I try to update the database the characters are converted to ???. If i save directly in enterprise manager, it works.

According to:
Unicode values must have N in front of the string. I.e UPDATE Table SET Field = N'value';
That means we have to change ALL sql queries in our application. That's a lot of changes! There must be some other way.
Using N works though!

But according to:
Using N resuls in poor performance.

What are my alternatives, what is the best thing to do?

1. Use N.

2. Set up new server for each new language. Do I still have to use N for chinese characters?

I know that when using Norwegian/Danish collation on SQL server (as we do) then the index will not be correct if a user inserts chinese characters. That, with the point of using N results in poor performance, suggest one server for each language.

I really can't believe it is this hard to support different languages. To have one server for each language and use N in SQL queries seems ridicolous.

According to:
When web-application and database both use Unicode, then conversion should be handled automatically. This is the case for us, but why isn't it working? Could it be because I have a norwegian keyboard and all characters have been copied/pasted from chinese sites.

Database: Microsoft SQL Server 2000. Set up with Norwegian/Danish collation.
Fields: All text fields are using unicode - nchar/nvarchar/ntext

Application: ASP/VbScript/COM with 65001 codepage and UTF-8 charset
HTML: UTF-8 charset

      Response.AddHeader "Content-Type","text/html;charset=UTF-8"
      Response.CodePage = 65001
      session.CodePage = 65001
      Response.CharSet = "utf-8"

      <meta http-equiv="Content-Type" content="text/html;charset=UTF-8">
Question by:Sleepyhead_NO
  • 2
  • 2
LVL 12

Expert Comment

ID: 12112930
It is always complicated mixup of response encoding, request encoding and SQL codepages. Also installed MDAC can play a role.

Check this for details and known issues:

Unfortunately you'll have to rewrite parts of system.

Author Comment

ID: 12113812
I have read that article earlier. We are using Windows Server 2003, so MDAC version is not a problem.

If we set up a new database with chinese collation & codepage and use BIG-5 as charset in HTML, will we still need to use the N prefix? If not, then I think we will go for that option instead.

Accepted Solution

davehilditch earned 200 total points
ID: 12114323
The N prefix is only used to specify a string is nvarchar inside a query.  If you declared a variable type then you will not need this as presumably you will have the nvarchar variable as a parameter in a stored procedure.  The use of N is for:

select N'this is unicode text'
select 'this is ascii text'

But if you have your variables as parameters and they are nvarchar then you will not to perform the above.  Yes, unicode is going to be slower than regular varchars because 2 bytes are used per character instead of 1, so you have double the amount of data to read and write for your text columns.

When you say ??? is being displayed, is this only in Query Analyzer?  I have seen QA multiple times displaying ??? because the internationalisation options haven't been set for QA but the data is still stored correctly and when it finally reaches the ASP page should display correctly.

You don't need to have a seperate database for each language - at a company i worked for before we had greek, norweigan, english, french, german, russian, and (tested but not used) japanese in the same database.  Worked perfectly fine.  Cant' remember how to get QA to display the characters properly, but there is a way.  Looks pretty cool returning greek from a select statement.  (ok, i'm easily pleased)

Dave Hilditch.

Author Comment

ID: 12115559
Thanks for your comments.

we already use unicode fields (nchar/nvarchar/ntext), i know it takes up more space and thus is slower.

for SELECT this does not seem to be a problem, except I haven't got SELECT WHERE to work yet (tried with and without N prefix), thought not testet much. The problem seems to be mainly with INSERT/UPDATE's where I have to specify N'value' to store the value correctly. Without N'value' it stores ???.

Like I said, display is no problem. Was you application a web-application and were you storing data from the web-application as well? Not just displaying it? And what about collation? It will be different with for example Norwegian and English.

In stored procedures where we cast variable as nvarchar, then I assume this will not be a problem. But in allmost 100% of the queries we don't use stored procedures, just generate the sql in ASP and execute it with ADO.

QA and Enterprise Manager displays the chinese characters correctly now, not sure what happend earlier...

Expert Comment

ID: 12121691
My application was a survey system, so data was stored from the web page in any of the languages, and displayed in any of the languages.  If you are using dynamic SQL in your insert/updates then you will definitely have to use N to ensure they are treated as unicode.

If on the other hand, you have something like:

create procedure insertSomething(@param1 nvarchar(100))
insert into mytable
select @param1

Then there is no need for N anywhere, unless you are inserting from Query Analzer.

Dave Hilditch.

Featured Post

Windows Server 2016: All you need to know

Learn about Hyper-V features that increase functionality and usability of Microsoft Windows Server 2016. Also, throughout this eBook, you’ll find some basic PowerShell examples that will help you leverage the scripts in your environments!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Load balancing is the method of dividing the total amount of work performed by one computer between two or more computers. Its aim is to get more work done in the same amount of time, ensuring that all the users get served faster.
The Delta outage: 650 cancelled flights, more than 1200 delayed flights, thousands of frustrated customers, tens of millions of dollars in damages – plus untold reputational damage to one of the world’s most trusted airlines. All due to a catastroph…
Via a live example, show how to extract information from SQL Server on Database, Connection and Server properties
Viewers will learn how to use the INSERT statement to insert data into their tables. It will also introduce the NULL statement, to show them what happens when no value is giving for any given column.

914 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

15 Experts available now in Live!

Get 1:1 Help Now