  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 3287

How to find (and replace) characters like ö, ü, ä, etc

Hi,
Some of the fields in our SQL Server database contain characters like ö, Ç, ä, Ö, etc.
One of our applications that has to work with this data really doesn't like these.
I am looking for a way to find these characters. Let's say I want to see all records in the name column of the customer table that contain non-ASCII characters like these. How do I do that?
And if I'd like to replace these characters with 'normal' characters (like Ö -> O, ä -> a, etc.), does anyone know of an elegant way to do that?

Thanks
Asked by: dready
 
chapmandew commented:
First, you have to find the ASCII codes for the characters...then you can do this:

select ascii('ü')

select replace('aosidmfüasdfom', 'ü', 'X')

with a table it would be:

select replace(fieldvalue, 'ü', '')
from tablename

 
chapmandew commented:
Actually, my example doesn't take the ASCII code into account.

 
chapmandew commented:
So just omit the first line from my example.
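
For the "find" half of the question, one option is a pattern match under a binary collation, so that character ranges are compared by code value rather than by dictionary order. This is only a sketch: it assumes the customer table and name column from the question, a non-Unicode (varchar) column, and Latin1_General_BIN as the binary collation, which is just one possible choice.

```sql
-- List rows whose name contains any character outside the printable
-- ASCII range, i.e. outside space (code 32) through tilde (code 126).
-- Note that tabs and line breaks fall outside that range and match too.
-- Without the binary collation, a range like [a-z] can also match
-- accented letters, because they sort between a and z.
select name
from customer
where name collate Latin1_General_BIN like '%[^ -~]%'
```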

 
Scott Pletcher (Senior DBA) commented:
At the lowest level, you will probably need a function to replace the individual characters.  For example:


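create function dbo.replace_extended_char (
    @char char(1)
)
returns char(1)
as
begin
-- Plain ASCII passes through unchanged; anything else is mapped, or becomes '?'.
-- The binary collation keeps the comparison case-sensitive, so that 'Ö' is not
-- caught by the 'ö' branch under a case-insensitive default collation.
return (
    select case when ascii(@char) < 127 then @char else
           case @char collate Latin1_General_BIN
                when 'Ç' then 'C'
                when 'ö' then 'o'
                when 'ä' then 'a'
                when 'Ö' then 'O'
                else '?' end end
    )
end
go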


Then you need "driver code" that sends each extended character (only) to the function.  More on that later if needed :-) .
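
One possible shape for that driver code is sketched below. It is not Scott's own driver: the name dbo.replace_extended_string and the varchar(8000) size are made up for illustration, and it assumes dbo.replace_extended_char above has already been created.

```sql
-- Hypothetical driver: walks the string one position at a time and sends
-- only the extended characters (code above 126) to dbo.replace_extended_char.
create function dbo.replace_extended_string (
    @input varchar(8000)
)
returns varchar(8000)
as
begin
    declare @i int
    set @i = 1
    -- len() ignores trailing spaces, which never need replacing anyway
    while @i <= len(@input)
    begin
        if ascii(substring(@input, @i, 1)) > 126
            -- stuff() overwrites exactly the one character at position @i
            set @input = stuff(@input, @i, 1,
                               dbo.replace_extended_char(substring(@input, @i, 1)))
        set @i = @i + 1
    end
    return @input
end
go
```

It could then be tried with something like select dbo.replace_extended_string(name) from customer before any update is run.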
 
Mark Wills (Topic Advisor) commented:
This can be a problem to replace... There have been a couple of these questions on EE, one of which had a reasonable function...

The normal "printable" characters range from 32 to 126 inclusive. Before that you have carriage returns, tabs, line feeds, etc. The ANSI character set goes up to 255, so there are potentially 255 - 126 = 129 characters to check. Ouch.

The challenge is which code set, language and binaries are being used. Or are you assuming that only ASCII characters and the English language are being used?

In that case, using physical character representations as acperkins does above will work OK...

The first step is then to create a character map. Normally you would create a table for that:


create table uCharMap (AsciiNumber int primary key, AsciiCharacter char(1), Printable char(1))
GO
declare @int int
set @int= 127
while @int < 256
begin
  insert uCharMap (AsciiNumber,AsciiCharacter) values (@int, char(@int))
  set @int = @int + 1
end
GO


Then open the table and manually decide the most appropriate character to substitute for each entry (a CSV of one prepared earlier is attached)...
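
If editing the table by hand is awkward, the Printable column could also be filled in with a few updates. The mappings below are illustrative only and are not taken from the attached CSV. They also assume that the script and the database use the same single-byte code page, so that ascii('Ç') yields the code stored in uCharMap.

```sql
-- Illustrative mappings; extend the list after reviewing the table.
update uCharMap set Printable = 'C' where AsciiNumber = ascii('Ç')
update uCharMap set Printable = 'o' where AsciiNumber = ascii('ö')
update uCharMap set Printable = 'a' where AsciiNumber = ascii('ä')
update uCharMap set Printable = 'O' where AsciiNumber = ascii('Ö')
update uCharMap set Printable = 'u' where AsciiNumber = ascii('ü')

-- Give every entry that is still unmapped a visible placeholder.
-- replace() returns NULL if any of its arguments is NULL, so a NULL
-- Printable value would wipe out the whole string being fixed.
update uCharMap set Printable = '?' where Printable is null

-- Review the result before relying on it.
select AsciiNumber, AsciiCharacter, Printable
from uCharMap
order by AsciiNumber
```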

You can then call the function (created below) as part of a select, an update, or whatever, e.g.

select dbo.ufix_characters('ABCDefg hij 1233128¬E134 +140RÈÉÊËÌÍÏÐÑÒÓÔÕÖ×ØÙÚÛÝÞ')



create function uFix_Characters(@incoming varchar(max))
returns varchar(max)
as
begin
declare @AsciiNumber int
declare @c char(1)
declare @p char(1)
declare @i int
set @i = 0
 
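-- Nothing to do if every character is printable ASCII (space through tilde).
-- The binary collation matters: under a dictionary-order collation a range
-- such as a-z can also match accented letters, and the function would exit early.
if patindex('%[^ -~]%', @incoming collate Latin1_General_BIN) = 0
return @incoming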
 
while @i < len(rtrim(@incoming)) 
begin
 
  set @i = @i + 1
  set @asciinumber = ascii(substring(@incoming,@i,1)) 
  if @asciinumber > 126
  begin
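     select @c = asciicharacter, @p = printable from ucharmap where asciinumber = @asciinumber
     -- Binary collation so that only this exact character is replaced, not its
     -- other-case variant; isnull guards against an unmapped entry, because
     -- replace() returns NULL when the replacement is NULL.
     set @incoming = replace (@incoming collate Latin1_General_BIN, @c, isnull(@p, '?'))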
  end
 
end
 
return @incoming
end
go


ucharmap.csv.txt
 
susanys commented:
What code should I use exactly? (I've created the function and the table with extended characters and appropriate replacements.)

I want to run it on a table called products, on a field called name, for all possible replacements.

Thanks!
 
Mark Wills (Topic Advisor) commented:
Well...

First you test it with

select name as oldname, dbo.ufix_characters(name) as newname from products

then if you are happy...

update products set name = dbo.ufix_characters(name)

You could probably add a "where" clause, something like a pattern match (patindex) for characters not in the range of 0-9 and A-Z, but if it is a one-off job, just choose a quiet time...

But test first !!
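
A "where" clause along those lines might look like the sketch below. It uses the products table and name column from the question; the binary collation (Latin1_General_BIN here, as one possible choice) is an assumption added so that the range is compared by character code rather than by dictionary order.

```sql
-- Preview only the rows that contain something outside printable ASCII.
select name as oldname, dbo.ufix_characters(name) as newname
from products
where patindex('%[^ -~]%', name collate Latin1_General_BIN) > 0

-- Once the preview looks right, update just those rows.
update products
set name = dbo.ufix_characters(name)
where patindex('%[^ -~]%', name collate Latin1_General_BIN) > 0
```

Restricting the update this way leaves rows that are already clean untouched, which keeps the number of rows written as small as possible.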