?
Solved

tsql clean non printable chars using xml path

Posted on 2010-08-24
2
Medium Priority
?
663 Views
Last Modified: 2012-05-10
Hi guys -

I have a bunch of tables that have non printable characters in them and I need to get them out. Here's what I have so far...which works great to clean the bad stuff. Until I run it and get the original name as a name with a bunch of non printables (I'd paste but it wont' show up) and then the cleaned name as "Johnson & Company" instead of "Johnson & Company                       " like it did in the first place.

My select statement at the bottom is to select only the records which have the bad chars. (I'll use that to update my tables.)

Any clues as to what I'm doing wrong/not doing where I'm getting this? (Doing it both for SQL2005 and 2008)

Thanks!
;WITH Num1 (n) AS (SELECT 1 UNION ALL SELECT 1),
Num2 (n) AS (SELECT 1 FROM Num1 AS X, Num1 AS Y),
Num3 (n) AS (SELECT 1 FROM Num2 AS X, Num2 AS Y),
Num4 (n) AS (SELECT 1 FROM Num3 AS X, Num3 AS Y),
Nums (n) AS (SELECT ROW_NUMBER() OVER(ORDER BY n) FROM Num4),

CleanCTE
AS
(SELECT CompanyID, CompanyName,
       CAST ((SELECT CASE WHEN ASCII(SUBSTRING(CompanyName, n, 1))
                         BETWEEN 0x00 AND 0x1F 
                         OR ASCII(SUBSTRING(CompanyName, n, 1)) BETWEEN 0x80 AND 0xBF
                    THEN ''
                    ELSE SUBSTRING(CompanyName, n, 1)
               END + ''
        FROM Company AS B
        JOIN Nums
          ON n <= LEN(CompanyName)
        WHERE B.CompanyID = A.CompanyID
        Order by Nums.N
        FOR XML PATH(''), TYPE) AS VARCHAR(256)) AS CleanName
 FROM Company AS A)

SELECT TOP 100 o.CompanyName, CleanName
FROM
	CleanCTE cte
JOIN
	Company o on o.CompanyID = cte.CompanyID
WHERE
	LEN(CleanName) <> LEN(o.OperatorName)

Open in new window

0
Comment
Question by:rmm2001
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
2 Comments
 
LVL 58

Accepted Solution

by:
cyberkiwi earned 2000 total points
ID: 33517141

;With Company as
(select *, 'operator' as OperatorName from (
 select CompanyID=1, CompanyName=cast('Johnson & Company' as varchar(256)) union all
 select 2, 'McDonalds' union all select 3, 'Jim''s Towing' union all
 select 4, 'Wingdings' + Char(252) + '<<tick') X)
,CleanCTE
AS
(SELECT CompanyID, CompanyName,
       (SELECT CASE WHEN ASCII(SUBSTRING(CompanyName, v.number, 1))
                         BETWEEN 0x00 AND 0x1F 
                         OR ASCII(SUBSTRING(CompanyName, v.number, 1)) BETWEEN 0x80 AND 0xBF
                    THEN ''
                    ELSE SUBSTRING(CompanyName, v.number, 1)
               END + ''
        FROM Company AS B
        JOIN master..spt_values v on type='P' and number between 1 and 256
          and v.number <= LEN(CompanyName)
        WHERE B.CompanyID = A.CompanyID
        Order by v.number
        FOR XML PATH(''), root('r'), TYPE).value('.','VARCHAR(256)') AS CleanName
 FROM Company AS A)

SELECT TOP 100 o.CompanyName, CleanName
FROM
	CleanCTE cte
JOIN
	Company o on o.CompanyID = cte.CompanyID
WHERE
	LEN(CleanName) <> LEN(o.OperatorName)

Open in new window

0
 
LVL 7

Author Comment

by:rmm2001
ID: 33517210
That works! Thank you! I didn't know about the spt_values table
0

Featured Post

[Webinar] Lessons on Recovering from Petya

Skyport is working hard to help customers recover from recent attacks, like the Petya worm. This work has brought to light some important lessons. New malware attacks like this can take down your entire environment. Learn from others mistakes on how to prevent Petya like worms.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

In this article I will describe the Detach & Attach method as one possible migration process and I will add the extra tasks needed for an upgrade when and where is applied so it will cover all.
This article describes how to use the timestamp of existing data in a database to allow Tableau to calculate the prior work day instead of relying on case statements or if statements to calculate the days of the week.
Sometimes it takes a new vantage point, apart from our everyday security practices, to truly see our Active Directory (AD) vulnerabilities. We get used to implementing the same techniques and checking the same areas for a breach. This pattern can re…
In this video, Percona Solutions Engineer Barrett Chambers discusses some of the basic syntax differences between MySQL and MongoDB. To learn more check out our webinar on MongoDB administration for MySQL DBA: https://www.percona.com/resources/we…

765 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question