• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 669
  • Last Modified:

tsql clean non printable chars using xml path

Hi guys -

I have a bunch of tables that have non printable characters in them and I need to get them out. Here's what I have so far...which works great to clean the bad stuff. Until I run it and get the original name as a name with a bunch of non printables (I'd paste but it wont' show up) and then the cleaned name as "Johnson & Company" instead of "Johnson & Company                       " like it did in the first place.

My select statement at the bottom is to select only the records which have the bad chars. (I'll use that to update my tables.)

Any clues as to what I'm doing wrong/not doing where I'm getting this? (Doing it both for SQL2005 and 2008)

Thanks!
;WITH Num1 (n) AS (SELECT 1 UNION ALL SELECT 1),
Num2 (n) AS (SELECT 1 FROM Num1 AS X, Num1 AS Y),
Num3 (n) AS (SELECT 1 FROM Num2 AS X, Num2 AS Y),
Num4 (n) AS (SELECT 1 FROM Num3 AS X, Num3 AS Y),
Nums (n) AS (SELECT ROW_NUMBER() OVER(ORDER BY n) FROM Num4),

CleanCTE
AS
(SELECT CompanyID, CompanyName,
       CAST ((SELECT CASE WHEN ASCII(SUBSTRING(CompanyName, n, 1))
                         BETWEEN 0x00 AND 0x1F 
                         OR ASCII(SUBSTRING(CompanyName, n, 1)) BETWEEN 0x80 AND 0xBF
                    THEN ''
                    ELSE SUBSTRING(CompanyName, n, 1)
               END + ''
        FROM Company AS B
        JOIN Nums
          ON n <= LEN(CompanyName)
        WHERE B.CompanyID = A.CompanyID
        Order by Nums.N
        FOR XML PATH(''), TYPE) AS VARCHAR(256)) AS CleanName
 FROM Company AS A)

SELECT TOP 100 o.CompanyName, CleanName
FROM
	CleanCTE cte
JOIN
	Company o on o.CompanyID = cte.CompanyID
WHERE
	LEN(CleanName) <> LEN(o.OperatorName)

Open in new window

0
rmm2001
Asked:
rmm2001
1 Solution
 
cyberkiwiCommented:

;With Company as
(select *, 'operator' as OperatorName from (
 select CompanyID=1, CompanyName=cast('Johnson & Company' as varchar(256)) union all
 select 2, 'McDonalds' union all select 3, 'Jim''s Towing' union all
 select 4, 'Wingdings' + Char(252) + '<<tick') X)
,CleanCTE
AS
(SELECT CompanyID, CompanyName,
       (SELECT CASE WHEN ASCII(SUBSTRING(CompanyName, v.number, 1))
                         BETWEEN 0x00 AND 0x1F 
                         OR ASCII(SUBSTRING(CompanyName, v.number, 1)) BETWEEN 0x80 AND 0xBF
                    THEN ''
                    ELSE SUBSTRING(CompanyName, v.number, 1)
               END + ''
        FROM Company AS B
        JOIN master..spt_values v on type='P' and number between 1 and 256
          and v.number <= LEN(CompanyName)
        WHERE B.CompanyID = A.CompanyID
        Order by v.number
        FOR XML PATH(''), root('r'), TYPE).value('.','VARCHAR(256)') AS CleanName
 FROM Company AS A)

SELECT TOP 100 o.CompanyName, CleanName
FROM
	CleanCTE cte
JOIN
	Company o on o.CompanyID = cte.CompanyID
WHERE
	LEN(CleanName) <> LEN(o.OperatorName)

Open in new window

0
 
rmm2001Author Commented:
That works! Thank you! I didn't know about the spt_values table
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

Join & Write a Comment

Featured Post

The 14th Annual Expert Award Winners

The results are in! Meet the top members of our 2017 Expert Awards. Congratulations to all who qualified!

Tackle projects and never again get stuck behind a technical roadblock.
Join Now