SQL Collation with support for Japanese

Hello, I have an SQL Server 2008 database using SQL_Latin1_General_CP1_CI_AS right now. It seems that Japanese text shows up as question marks...so I was going to change it to Japanese_CI_AS or Japanese_Unicode_CI_AS, but I'm not sure what the differences are, or if these are even the ones I should choose. It's mostly going to be filled with Latin-based text, but there are also some fields that should accept Japanese.

Any advice would be appreciated.

Thanks~
Asked by YoungBonzi

James Murrell (Product Specialist) commented:
We had this a while ago, and the DBA at the time used http://developer.mimer.com/collations/index.tml

YoungBonzi (Author) commented:
Thank you for the option, but I want to stick with Microsoft technology.

Mark Wills (Topic Advisor) commented:
Well, Unicode is multi-byte and is needed for characters that fall outside the code page of your current collation, such as Japanese. So you will want to move to a Unicode basis. That also means making sure the datatypes are Unicode as well, e.g. instead of varchar, it becomes nvarchar.

There is some reasonable documentation about "international considerations [SQL Server]" in Books Online, and I highly recommend you research that before you change anything.
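
As a rough sketch of the datatype change described above: the first query lists the non-Unicode character columns in the current database, and the second converts one of them. The table and column names (dbo.Customers, DisplayName) are hypothetical and do not come from this thread; the length and nullability would need to match the real column.

```sql
-- List non-Unicode character columns and their collations.
SELECT TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME, DATA_TYPE, COLLATION_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE DATA_TYPE IN ('char', 'varchar', 'text');

-- Convert one column to its Unicode equivalent.
-- dbo.Customers and DisplayName are hypothetical names.
ALTER TABLE dbo.Customers
    ALTER COLUMN DisplayName nvarchar(200) NULL;
```

Values that were already stored as question marks are not recovered by this change; they would need to be entered again. If the column is indexed or constrained, the ALTER may be refused until those objects are dropped and re-created.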

Mark Wills (Topic Advisor) commented:
Sorry, might have given the wrong impression there... Have been a bit too brief with some of my answers lately...

The database collation is not as important as the datatype being Unicode, and then making sure SQL Server "knows" it is dealing with Unicode (the N prefix on string literals).

For example, try this quick test:
-- Example 1: the literal has no N prefix, so it is treated as non-Unicode
create table tbl_japanese_example_1 (place nvarchar(200))

insert tbl_japanese_example_1 values ('ۇ osaka')

select * from tbl_japanese_example_1


-- Example 2: the N prefix marks the literal as Unicode
create table tbl_japanese_example_2 (place nvarchar(200))

insert tbl_japanese_example_2 values (N'ۇ osaka')

select * from tbl_japanese_example_2

Mark Wills (Topic Advisor) commented:
Guess what, this website is not Unicode!
Japanese-Test.zip

YoungBonzi (Author) commented:
Sorry, I haven't been getting any email notifications so I never bothered checking back. Thank you for the solution, Mark Wills, that's good to know about the datatypes.

What I wound up doing was, in Management Studio, changing the collation on the individual columns that could possibly receive Japanese to the Japanese_Unicode collation (I was unaware this could be done). I noticed that text indeed becomes ntext, and varchar becomes nvarchar, like you recommended.

I will probably just leave things as they are, because it's working fine...but are you in fact saying that I don't have to change the collation and can just change the datatype?

Mark Wills (Topic Advisor) commented:
Yes. The collation affects things like sort sequences and comparisons, so you might want to consider the most appropriate collation, but the secret to handling those wonderful character sets is in being Unicode-enabled.
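
To illustrate the point about sort sequences, here is a minimal sketch. It reuses tbl_japanese_example_2 from the earlier example; Japanese_CI_AS is one of the collations mentioned in the question, and whether it is the most appropriate choice is left open.

```sql
-- See which Japanese collations this server offers.
SELECT name, description
FROM sys.fn_helpcollations()
WHERE name LIKE 'Japanese%';

-- Apply a collation to a single query without changing the column or database.
SELECT place
FROM tbl_japanese_example_2
ORDER BY place COLLATE Japanese_CI_AS;
```

The stored data is the same either way; only the ordering and comparison rules differ.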

Mark Wills (Topic Advisor) commented:
Oh, and that attachment a couple of postings back does show the database (Latin) being able to correctly render Kanji because the columns are Unicode datatypes.

YoungBonzi (Author) commented:
Ahhhhhh...thank you, I wish I'd read your reply earlier.

Mark Wills (Topic Advisor) commented:
Except the top one should have been just varchar and the second one nvarchar - sorry about that.
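
One reading of that correction, as a sketch: put a varchar and an nvarchar column side by side and insert the same Unicode literal into both. The table name tbl_japanese_example_3 and the sample text are chosen here for illustration (the characters in the original post were mangled by the site), and the database is assumed to use a Latin collation such as SQL_Latin1_General_CP1_CI_AS.

```sql
-- Hypothetical table: same value stored in a non-Unicode and a Unicode column.
CREATE TABLE tbl_japanese_example_3 (
    place_varchar  varchar(200),
    place_nvarchar nvarchar(200)
);

-- The N prefix keeps the literal as Unicode until it reaches each column.
INSERT INTO tbl_japanese_example_3 (place_varchar, place_nvarchar)
VALUES (N'大阪 osaka', N'大阪 osaka');

-- Under a Latin code page, the varchar column is expected to show
-- question marks in place of the kanji; the nvarchar column keeps them.
SELECT place_varchar, place_nvarchar
FROM tbl_japanese_example_3;
```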

YoungBonzi (Author) commented:
Yep, the unicode is preserved. I figured that's what you were showing me...I played it out in my head because I didn't want to execute it on my DB. Still a bit skittish about playing around with it.

Thanks again~

Mark Wills (Topic Advisor) commented:
Yep, you got it... and thank you too...
Microsoft SQL Server 2008

Question has a verified solution.
