Solved

tsql clean non printable chars using xml path

Posted on 2010-08-24
Last Modified: 2012-05-10
Hi guys -

I have a bunch of tables that have non-printable characters in them, and I need to get them out. Here's what I have so far, which works great to clean the bad stuff. The problem shows up when I run it: the original name comes back with a bunch of non-printables (I'd paste it, but it won't show up), and the cleaned name comes back as "Johnson &amp; Company" instead of "Johnson & Company                       " like it was in the first place.

My select statement at the bottom is to select only the records which have the bad chars. (I'll use that to update my tables.)

Any clues as to what I'm doing wrong/not doing where I'm getting this? (Doing it both for SQL2005 and 2008)

Thanks!
;WITH Num1 (n) AS (SELECT 1 UNION ALL SELECT 1),
Num2 (n) AS (SELECT 1 FROM Num1 AS X, Num1 AS Y),
Num3 (n) AS (SELECT 1 FROM Num2 AS X, Num2 AS Y),
Num4 (n) AS (SELECT 1 FROM Num3 AS X, Num3 AS Y),
Nums (n) AS (SELECT ROW_NUMBER() OVER(ORDER BY n) FROM Num4),

CleanCTE
AS
(SELECT CompanyID, CompanyName,
       CAST ((SELECT CASE WHEN ASCII(SUBSTRING(CompanyName, n, 1))
                         BETWEEN 0x00 AND 0x1F 
                         OR ASCII(SUBSTRING(CompanyName, n, 1)) BETWEEN 0x80 AND 0xBF
                    THEN ''
                    ELSE SUBSTRING(CompanyName, n, 1)
               END + ''
        FROM Company AS B
        JOIN Nums
          ON n <= LEN(CompanyName)
        WHERE B.CompanyID = A.CompanyID
        Order by Nums.N
        FOR XML PATH(''), TYPE) AS VARCHAR(256)) AS CleanName
 FROM Company AS A)

SELECT TOP 100 o.CompanyName, CleanName
FROM
	CleanCTE cte
JOIN
	Company o on o.CompanyID = cte.CompanyID
WHERE
	LEN(CleanName) <> LEN(o.OperatorName)

Question by:rmm2001
2 Comments

Accepted Solution

by:
cyberkiwi earned 500 total points
ID: 33517141

-- Stand-in for the real Company table: a few sample rows so the query runs on its own
;With Company as
(select *, 'operator' as OperatorName from (
 select CompanyID=1, CompanyName=cast('Johnson & Company' as varchar(256)) union all
 select 2, 'McDonalds' union all select 3, 'Jim''s Towing' union all
 select 4, 'Wingdings' + Char(252) + '<<tick') X)
,CleanCTE
AS
(SELECT CompanyID, CompanyName,
       (SELECT CASE WHEN ASCII(SUBSTRING(CompanyName, v.number, 1))
                         BETWEEN 0x00 AND 0x1F 
                         OR ASCII(SUBSTRING(CompanyName, v.number, 1)) BETWEEN 0x80 AND 0xBF
                    THEN ''
                    ELSE SUBSTRING(CompanyName, v.number, 1)
               END + ''
        FROM Company AS B
        JOIN master..spt_values v on type='P' and number between 1 and 256
          and v.number <= LEN(CompanyName)
        WHERE B.CompanyID = A.CompanyID
        Order by v.number
        -- .value() reads the XML text node, so entities such as &amp; come back as plain characters
        FOR XML PATH(''), root('r'), TYPE).value('.','VARCHAR(256)') AS CleanName
 FROM Company AS A)

SELECT TOP 100 o.CompanyName, CleanName
FROM
	CleanCTE cte
JOIN
	Company o on o.CompanyID = cte.CompanyID
WHERE
	LEN(CleanName) <> LEN(o.OperatorName)
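
The change that matters for the "&amp;" problem appears to be how the XML result is turned back into a string. CAST on the XML value serializes it, so the escaped entity stays in the text, while .value() returns the content of the text node with the entity decoded. A minimal sketch of the difference, using a hypothetical variable @s in place of the CompanyName column (DECLARE and SET are kept separate so it also runs on SQL 2005):

```sql
-- @s is a hypothetical stand-in for one CompanyName value
DECLARE @s VARCHAR(256);
SET @s = 'Johnson & Company';

SELECT
    -- CAST serializes the XML, so the ampersand stays escaped as &amp;
    CAST((SELECT @s + '' FOR XML PATH(''), TYPE) AS VARCHAR(256)) AS CastResult,
    -- .value() reads the text node, so the ampersand comes back unescaped
    (SELECT @s + '' FOR XML PATH(''), TYPE).value('.', 'VARCHAR(256)') AS ValueResult;
```

CastResult should come back as "Johnson &amp; Company" and ValueResult as "Johnson & Company".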


Author Comment

by:rmm2001
ID: 33517210
That works! Thank you! I didn't know about the spt_values table
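
For anyone else who hasn't come across it: master..spt_values is an undocumented system table, and its type = 'P' rows hold a run of consecutive integers starting at 0. The upper bound is an assumption worth checking on your own instance before relying on it. A minimal sketch of using it as an ad hoc numbers table:

```sql
-- List the integers 1 through 10 from the type 'P' rows of the undocumented table
SELECT v.number
FROM master..spt_values AS v
WHERE v.type = 'P'
  AND v.number BETWEEN 1 AND 10
ORDER BY v.number;
```

Because the table is undocumented, a dedicated numbers table or a CTE like the Nums one in the question is probably the safer choice for production code.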