Identify if a string contains Chinese characters

Hi,
I am looking for a way to identify whether a string contains Chinese characters.
I am using SQL Server 2005.
Cheers

Asked by: Nickie27
 
PaultheBroker Commented:
Hmm, not sure (good question!). What happens if you ask for the ASCII code of a Chinese character?

select ASCII(mychinesecharacter)
 
Nickie27 (Author) Commented:
I get 63, which is no good as that's a question mark. I do have the character set installed, and when I perform a select on the table I see the Chinese characters.
 
PaultheBroker Commented:
OK - here is a little function which will parse a string out into the ASCII and Unicode codes of its characters. It will work for strings up to 4000 characters long (the limit of its nvarchar(4000) parameter), which should cover you.

My idea is to first find by trial and error where the Chinese character set starts. I don't have Chinese on my system, so I can't tell - but it seems that ASCII() and UNICODE() return the same value for the basic Latin characters.
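
As a quick check on the trial-and-error idea, here is a minimal sketch comparing the two functions on a single character. The sample character N'中' is only an illustration and does not come from the thread. The range in the final comment is the main CJK Unified Ideographs block of the Unicode standard; it does not cover the extension blocks, and the same block is shared with Japanese and Korean text, so treat it as a starting point rather than a complete test.

```sql
-- Assumption: N'中' (U+4E2D) is just a sample character for illustration.
DECLARE @c nchar(1)
SET @c = N'中'

SELECT ASCII(@c)   AS ascii_code    -- depends on the database code page; 63 ('?') where the character cannot be represented
      ,UNICODE(@c) AS unicode_code  -- 20013, i.e. 0x4E2D

-- Main CJK Unified Ideographs block: U+4E00 to U+9FFF = 19968 to 40959
```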

So the test usage is:
-- nvarchar(100) so that the 62-character test string is not truncated
declare @mystring as nvarchar(100)
set @mystring = N'1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'
select * from dbo.unidecoder(@mystring)

and you might use it in your code like this:
declare @mystring as nvarchar(100)
set @mystring = N'1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'
--select * from dbo.unidecoder(@mystring)

-- Codes 0 to 255 cover the single-byte Latin range; anything above needs Unicode
IF EXISTS(SELECT * FROM dbo.unidecoder(@mystring) WHERE [unicode] > 255)
    PRINT 'String contains at least one character with a code above 255'

-- One-off setup: a numbers table holding 0 to 9999, used to index into the string
CREATE TABLE seqNums (
    seqNum SMALLINT,
    CONSTRAINT seqNums_CI
        UNIQUE CLUSTERED (seqNum)
    )
INSERT INTO seqNums
SELECT ones + tens + hundreds + thousands
FROM (
    SELECT 0 AS ones UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL 
    SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL 
    SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9
) AS ones
CROSS JOIN (
    SELECT 00 AS tens UNION ALL SELECT 10 UNION ALL SELECT 20 UNION ALL 
    SELECT 30 UNION ALL SELECT 40 UNION ALL SELECT 50 UNION ALL SELECT 60 UNION ALL 
    SELECT 70 UNION ALL SELECT 80 UNION ALL SELECT 90
) AS tens
CROSS JOIN (
    SELECT 000 AS hundreds UNION ALL SELECT 100 UNION ALL SELECT 200 UNION ALL 
    SELECT 300 UNION ALL SELECT 400 UNION ALL SELECT 500 UNION ALL SELECT 600 UNION ALL 
    SELECT 700 UNION ALL SELECT 800 UNION ALL SELECT 900
) AS hundreds
CROSS JOIN (
    SELECT 0000 AS thousands UNION ALL SELECT 1000 UNION ALL SELECT 2000 UNION ALL 
    SELECT 3000 UNION ALL SELECT 4000 UNION ALL SELECT 5000 UNION ALL SELECT 6000 UNION ALL 
    SELECT 7000 UNION ALL SELECT 8000 UNION ALL SELECT 9000
) AS thousands
ORDER BY ones + tens + hundreds + thousands
GO	-- end the batch: CREATE FUNCTION must be the first statement in its batch
------------------------------------------
CREATE FUNCTION dbo.unidecoder
	(@unicode 	nvarchar(4000))
RETURNS @table table
	( [character] 	nvarchar(1)
	 ,[ascii]	integer
	 ,[unicode] 	integer)
AS 
BEGIN
	-- One row per character: the character, its ASCII code and its Unicode code point
	INSERT INTO @table
	SELECT 	 [character] 	= SUBSTRING(@unicode, seqNum, 1)
		,[ascii]	= ASCII(SUBSTRING(@unicode, seqNum, 1))
		,[unicode] 	= UNICODE(SUBSTRING(@unicode, seqNum, 1))
	FROM 	seqNums b
	WHERE 	seqNum BETWEEN 1 AND LEN(@unicode)	-- up to and including the last character
RETURN
END
GO

 
PaultheBroker Commented:
Obviously you don't need to implement this via a function - the select statement from the function would also quite happily live inside your main SQL statement as a subquery or whatnot.
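
A minimal sketch of that inline form, assuming a hypothetical table dbo.MyTable with columns id and txt (nvarchar); neither name comes from the thread. It reuses the seqNums table and tests each character against the main CJK Unified Ideographs block (U+4E00 to U+9FFF, i.e. 19968 to 40959), which is an assumption about what "Chinese" should mean here and leaves out the extension blocks.

```sql
-- Hypothetical table and column names: dbo.MyTable(id, txt)
SELECT t.id, t.txt
FROM   dbo.MyTable AS t
WHERE  EXISTS (
           SELECT 1
           FROM   seqNums AS n
           WHERE  n.seqNum BETWEEN 1 AND LEN(t.txt)
             -- 19968 to 40959 = U+4E00 to U+9FFF (main CJK Unified Ideographs block)
             AND  UNICODE(SUBSTRING(t.txt, n.seqNum, 1)) BETWEEN 19968 AND 40959
       )
```

Because seqNums only holds 0 to 9999, this checks at most the first 9999 characters of each value.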