Identify if a string contains Chinese characters

Hi,
I am looking for a way to identify whether a string contains Chinese characters.
I am using SQL Server 2005.
Cheers

Asked by: Nickie27

PaultheBroker commented:
OK - here is a little function which will parse a string out into its ASCII and Unicode character codes.  The numbers table it relies on covers positions up to 9999, but the nvarchar(4000) parameter limits the input to 4000 characters, which should cover you.

My idea is to first find by trial and error where the Chinese character set starts.  I don't have Chinese on my system, so I can't tell - but it seems that the ASCII and UNICODE values are the same for the Latin character set.
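
As an alternative to trial and error, here is a minimal sketch that tests one character against a fixed range. It assumes that "Chinese" means the CJK Unified Ideographs block, U+4E00 to U+9FFF (19968 to 40959 in decimal). That block is shared with Japanese kanji and Korean hanja, and it does not include the extension blocks or CJK punctuation, so treat the range as a starting point rather than a complete definition. The variable name is just for illustration.

```sql
-- Assumption: "Chinese" = CJK Unified Ideographs, U+4E00..U+9FFF (19968..40959).
DECLARE @ch nchar(1)
SET @ch = N'中'   -- the N prefix keeps the literal as Unicode

SELECT UNICODE(@ch) AS code_point,
       CASE WHEN UNICODE(@ch) BETWEEN 19968 AND 40959
            THEN 1 ELSE 0 END AS is_cjk_ideograph
```

If that range suits your data, the same BETWEEN test could replace a trial-and-error threshold.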

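So the test usage is:
declare @mystring as nvarchar(100)   -- wide enough for the 62-character sample
set @mystring = N'1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'
select * from dbo.unidecoder(@mystring)

And you might use it in your own code like this:
declare @mystring as nvarchar(100)
set @mystring = N'1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'
--select * from dbo.unidecoder(@mystring)

-- 256 is a provisional threshold; adjust it once you know where the Chinese range starts
IF EXISTS(SELECT * FROM dbo.unidecoder(@mystring) WHERE [unicode] > 256)
    PRINT 'String contains characters outside the Latin range'

The setup, run once beforehand, is the numbers table and the function: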

CREATE TABLE seqNums (
    seqNum SMALLINT,
    CONSTRAINT seqNums_CI
        UNIQUE CLUSTERED (seqNum)
    )
INSERT INTO seqNums
SELECT ones + tens + hundreds + thousands
FROM (
    SELECT 0 AS ones UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL 
    SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL 
    SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9
) AS ones
CROSS JOIN (
    SELECT 00 AS tens UNION ALL SELECT 10 UNION ALL SELECT 20 UNION ALL 
    SELECT 30 UNION ALL SELECT 40 UNION ALL SELECT 50 UNION ALL SELECT 60 UNION ALL 
    SELECT 70 UNION ALL SELECT 80 UNION ALL SELECT 90
) AS tens
CROSS JOIN (
    SELECT 000 AS hundreds UNION ALL SELECT 100 UNION ALL SELECT 200 UNION ALL 
    SELECT 300 UNION ALL SELECT 400 UNION ALL SELECT 500 UNION ALL SELECT 600 UNION ALL 
    SELECT 700 UNION ALL SELECT 800 UNION ALL SELECT 900
) AS hundreds
CROSS JOIN (
    SELECT 0000 AS thousands UNION ALL SELECT 1000 UNION ALL SELECT 2000 UNION ALL 
    SELECT 3000 UNION ALL SELECT 4000 UNION ALL SELECT 5000 UNION ALL SELECT 6000 UNION ALL 
    SELECT 7000 UNION ALL SELECT 8000 UNION ALL SELECT 9000
) AS thousands
ORDER BY ones + tens + hundreds + thousands
------------------------------------------
GO
-- Returns one row per character with its ASCII and Unicode values.
CREATE FUNCTION dbo.unidecoder
	(@unicode 	nvarchar(4000))
RETURNS @table table
	( [character] 	nvarchar(1)
	 ,[ascii]	integer
	 ,[unicode] 	integer)
AS 
BEGIN
	INSERT INTO @table
	SELECT 	 [character] 	= SUBSTRING(@unicode, seqNum, 1)
		,[ascii]	= ASCII(SUBSTRING(@unicode, seqNum, 1))
		,[unicode] 	= UNICODE(SUBSTRING(@unicode, seqNum, 1))
	FROM 	seqNums b
	-- LEN(@unicode) rather than LEN(@unicode)-1, so the last character is not skipped
	WHERE 	seqNum BETWEEN 1 AND LEN(@unicode)
	RETURN
END
GO

PaultheBroker commented:
Hmm - not sure (good question!). What happens if you ask for the ASCII value of a Chinese character?

select ASCII(mychinesecharacter)

Nickie27 (author) commented:
I get 63, which is not good as that's a question mark.  I do have the character set installed, and when I perform a select on the table I see the Chinese characters.
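
A small sketch of why 63 comes back: ASCII() works on non-Unicode character data, so the nchar value is first converted to the database's code page, and a character that code page cannot represent becomes '?'. UNICODE() reads the code point directly. The exact ASCII() result depends on the database's default collation, so the comments below describe the likely behaviour on a Latin code page rather than a guaranteed one.

```sql
DECLARE @ch nchar(1)
SET @ch = N'中'   -- without the N prefix the literal itself may already be '?'

SELECT ASCII(@ch)   AS ascii_value,    -- converted to the database code page first
       UNICODE(@ch) AS unicode_value   -- Unicode code point of the character
```

So for this problem the [unicode] column is the one to test, not [ascii].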

PaultheBroker commented:
Obviously you don't need to implement this via a function - the SELECT statement from the function would also quite happily live inside your main SQL statement as a subquery or similar.
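
For example, the inline form could look roughly like the sketch below. The table dbo.myTable and its columns id and description are hypothetical names for illustration; seqNums is the numbers table from the earlier post, and 256 is the same provisional threshold.

```sql
-- Hypothetical table and column names; only seqNums comes from the thread.
SELECT t.id, t.description
FROM   dbo.myTable AS t
WHERE  EXISTS (
           SELECT 1
           FROM   seqNums AS n
           WHERE  n.seqNum BETWEEN 1 AND LEN(t.description)
             AND  UNICODE(SUBSTRING(t.description, n.seqNum, 1)) > 256
       )
```

This returns only the rows whose description holds at least one character above the threshold, without building the per-character table for every row.
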
Question has a verified solution.
