Remove Chinese characters in SQL

I am working with data from Hong Kong and getting alot of Chinese characters mixed in. The software I am importing into does not read these. Is there a simple way in SQL to clear out the Chinese characters from the fields?

Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

Presumably that the acceptable characters for the software you are using are the standard alpanumerics, as opposed to languages using pictorial script such as Chinese/Japanese/Korean etc.

The 'acceptable' characters will have character codes, just as the unacceptable characters will have character codes...  so filter your data to remove the characters that are not in the acceptable 'range'.  It means a lot of processing as you will be checking every character, but it should work.  

Alternatively you you could some kind of regular expression/pattern matching.  As to how to implement that, I found a useful article on StackOverflow Strip unwanted characters from string

I modified it slightly for your needs, so give this a go:

Create Function [dbo].[RemoveNonAlphaCharacters](@Temp VarChar(1000))
Returns VarChar(1000)

    Declare @KeepValues as varchar(100) = '%[^a-zA-Z0-9]%'
    While PatIndex(@KeepValues, @Temp) > 0
        Set @Temp = Stuff(@Temp, PatIndex(@KeepValues, @Temp), 1, '')

    Return @Temp

Open in new window

charkerrAuthor Commented:
So far, a manual search and replace is sounding much simpler.  Typically the data is just a name in alphanumeric followed by the name in Chinese characters, like:

Kobe Huang ¿¿¿

I'd be willing to do the work in Excel if that makes things simpler.

charkerrAuthor Commented:
Haha... those symbols were Chinese characters until I hit submit... :)
OK, so you want an Excel VBA solution, not a SQL one?

How about the following:

Function RemoveNonAlphaNumerics(strOriginal As String) As String 
    Dim intLoop        As Integer
    Dim strCurrentChar As String 
    Dim strProcessed   As String   

    For intLoop = 1 To Len(strOriginal) 
        strCurrentChar = Mid(strOriginal, intLoop, 1) 
        If strCurrentChar Like "[A-Z,a-z,0-9]" Then 
            strProcessed = strProcessed & strCurrentChar
        End If 
    Next intLoop

    RemoveNonAlphaNumerics = strProcessed
End Function

Open in new window

I've attached an example spreadsheet using the above function.  Click the button to call the function that will process all of the cells in Column A from cell A1 until it finds a blank cell.

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
charkerrAuthor Commented:
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Microsoft SQL Server 2008

From novice to tech pro — start learning today.