Chinese characters - Internationalization

Posted on 2004-09-21
Medium Priority
Last Modified: 2009-01-07
Currently we have Norwegian and English(UK) customers. We are going international and need support for all languages. What is the best way to do this? I am currently doing testing with Chinese.

I have tried using UTF-8 in asp and html. Chinese characters are displayed, but when I try to update the database the characters are converted to ???. If i save directly in enterprise manager, it works.

According to: http://seba.studentenweb.org/thesis/db-sql.php
Unicode values must have N in front of the string. I.e UPDATE Table SET Field = N'value';
That means we have to change ALL sql queries in our application. That's a lot of changes! There must be some other way.
Using N works though!

But according to: http://www.sql-server-performance.com/datatypes.asp
Using N resuls in poor performance.

What are my alternatives, what is the best thing to do?

1. Use N.

2. Set up new server for each new language. Do I still have to use N for chinese characters?

I know that when using Norwegian/Danish collation on SQL server (as we do) then the index will not be correct if a user inserts chinese characters. That, with the point of using N results in poor performance, suggest one server for each language.

I really can't believe it is this hard to support different languages. To have one server for each language and use N in SQL queries seems ridicolous.

According to: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnsql2k/html/sql_dataencoding.asp
When web-application and database both use Unicode, then conversion should be handled automatically. This is the case for us, but why isn't it working? Could it be because I have a norwegian keyboard and all characters have been copied/pasted from chinese sites.

Database: Microsoft SQL Server 2000. Set up with Norwegian/Danish collation.
Fields: All text fields are using unicode - nchar/nvarchar/ntext

Application: ASP/VbScript/COM with 65001 codepage and UTF-8 charset
HTML: UTF-8 charset

      Response.AddHeader "Content-Type","text/html;charset=UTF-8"
      Response.CodePage = 65001
      session.CodePage = 65001
      Response.CharSet = "utf-8"

      <meta http-equiv="Content-Type" content="text/html;charset=UTF-8">
Question by:Sleepyhead_NO
  • 2
  • 2
LVL 12

Expert Comment

ID: 12112930
It is always complicated mixup of response encoding, request encoding and SQL codepages. Also installed MDAC can play a role.

Check this for details and known issues:

Unfortunately you'll have to rewrite parts of system.

Author Comment

ID: 12113812
I have read that article earlier. We are using Windows Server 2003, so MDAC version is not a problem.

If we set up a new database with chinese collation & codepage and use BIG-5 as charset in HTML, will we still need to use the N prefix? If not, then I think we will go for that option instead.

Accepted Solution

davehilditch earned 400 total points
ID: 12114323
The N prefix is only used to specify a string is nvarchar inside a query.  If you declared a variable type then you will not need this as presumably you will have the nvarchar variable as a parameter in a stored procedure.  The use of N is for:

select N'this is unicode text'
select 'this is ascii text'

But if you have your variables as parameters and they are nvarchar then you will not to perform the above.  Yes, unicode is going to be slower than regular varchars because 2 bytes are used per character instead of 1, so you have double the amount of data to read and write for your text columns.

When you say ??? is being displayed, is this only in Query Analyzer?  I have seen QA multiple times displaying ??? because the internationalisation options haven't been set for QA but the data is still stored correctly and when it finally reaches the ASP page should display correctly.

You don't need to have a seperate database for each language - at a company i worked for before we had greek, norweigan, english, french, german, russian, and (tested but not used) japanese in the same database.  Worked perfectly fine.  Cant' remember how to get QA to display the characters properly, but there is a way.  Looks pretty cool returning greek from a select statement.  (ok, i'm easily pleased)

Dave Hilditch.

Author Comment

ID: 12115559
Thanks for your comments.

we already use unicode fields (nchar/nvarchar/ntext), i know it takes up more space and thus is slower.

for SELECT this does not seem to be a problem, except I haven't got SELECT WHERE to work yet (tried with and without N prefix), thought not testet much. The problem seems to be mainly with INSERT/UPDATE's where I have to specify N'value' to store the value correctly. Without N'value' it stores ???.

Like I said, display is no problem. Was you application a web-application and were you storing data from the web-application as well? Not just displaying it? And what about collation? It will be different with for example Norwegian and English.

In stored procedures where we cast variable as nvarchar, then I assume this will not be a problem. But in allmost 100% of the queries we don't use stored procedures, just generate the sql in ASP and execute it with ADO.

QA and Enterprise Manager displays the chinese characters correctly now, not sure what happend earlier...

Expert Comment

ID: 12121691
My application was a survey system, so data was stored from the web page in any of the languages, and displayed in any of the languages.  If you are using dynamic SQL in your insert/updates then you will definitely have to use N to ensure they are treated as unicode.

If on the other hand, you have something like:

create procedure insertSomething(@param1 nvarchar(100))
insert into mytable
select @param1

Then there is no need for N anywhere, unless you are inserting from Query Analzer.

Dave Hilditch.

Featured Post

Get free NFR key for Veeam Availability Suite 9.5

Veeam is happy to provide a free NFR license (1 year, 2 sockets) to all certified IT Pros. The license allows for the non-production use of Veeam Availability Suite v9.5 in your home lab, without any feature limitations. It works for both VMware and Hyper-V environments

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

This article shows gives you an overview on SQL Server 2016 row level security. You will also get to know the usages of row-level-security and how it works
It is possible to export the data of a SQL Table in SSMS and generate INSERT statements. It's neatly tucked away in the generate scripts option of a database.
This video shows, step by step, how to configure Oracle Heterogeneous Services via the Generic Gateway Agent in order to make a connection from an Oracle session and access a remote SQL Server database table.
Using examples as well as descriptions, and references to Books Online, show the documentation available for datatypes, explain the available data types and show how data can be passed into and out of variables.
Suggested Courses

850 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question