Link to home
Start Free TrialLog in
Avatar of Lee R Liddick Jr
Lee R Liddick JrFlag for United States of America

asked on

Remove Letters/Symbols from String

I have an old table that displays an ID number as AA110000 or BBB110000 or CCC11-000.  What I need to do and extract the leading letters and - from the number...so all that displays is a 4 digit number.  So in the last case where the number is only three digits, I need it to lead with a 0.

Basically I am exporting this data from its old table to a new table as an INTEGER only field.  Now in that field I want to mask the number so it looks like 11-0000 when displayed.
Avatar of Lowfatspread
Lowfatspread
Flag of United Kingdom of Great Britain and Northern Ireland image

if your going to treat it as an integer then cast/convert it to an integer....

otherwise concatenate it with leading zeros....


like


select  right('0000'+right(yourcolumn,case when charindex('-',yourcolumn)> 0 then len(yourcolumn) - charindex('-',yourcolumn)
                                              else 4 end),4)
  from (select rtrim(youcolumn) as yourcolumn from yourtable) as x
Is it always the right-most 4 digits, with or without a dash?  So that:
AA110000 --> 0000
BBB110000 --> 0000
CCC11-000 --> 0000

Then:

new_int_column = RIGHT('0000' + REPLACE(RIGHT(column, 4), '-', ''), 4)

In the new table, make the display value a computed column, like so:

CREATE TABLE ...
    new_alphanum_column varchar(30),
    new_int_column int,
    display_column AS new_alphanum_column + '-' + RIGHT('0000' + CAST(new_int_column AS varchar(4)), 4)

Then in the SELECT, you can use "display_column" just like any other column.  It's no overhead, because SQL will only construct the column if it's actually used in the query.

That way the definition is in ONE place if it ever needs to change.
If the "-" can leave fewer than 3 digits to its right, then you need to make a slight adjustment:

new_int_column = RIGHT('0000' + REPLACE(RIGHT(column, 4), '-', '000'), 4)
Avatar of Lee R Liddick Jr

ASKER

Okay well how do i get that incorporated in my current query:  

SELECT ID, NAME
FROM MYTABLE
4 Numbers are always to the right Scott...
Actually using '000' on the REPLACE doesn't hurt even no matter how many digits are to the right, so it's best to leave it as a precaution against fewer digits after the -.

ALTER TABLE MyTable
ADD
    masked_column AS ID '-' + RIGHT('0000' + CAST(int_column AS varchar(10)), 4)
(Assuming "ID" contains the varchar part of the value, and "int_column" contains the new int-only part):

ALTER TABLE MyTable
ADD
    masked_column AS ID + '-' + RIGHT('0000' + CAST(int_column AS varchar(10)), 4)
Scott...your example worked as written but I am finding I have more variables in here than I thought.  I have:

AA9##
AA09###
9####
10###
10####
AA10###
AA10####
11####
AA11####
TCR11-###
NOTE:  The AA is either IN or FD

I'm guessing that is way too many variables and I'm going to have to manually fix this field.


if that is the full range then its manageable



; with cte as (
select 'AA9' as starter
union all select 'AA09'
union all select '9'
union all select '10'
union all select '10'
union all select 'AA10'
union all select 'AA10'
union all select '11'
union all select 'AA11'
union all select 'TCR11-')
,cte1 as (select case x.n when 1 then replace(starter,'AA','IN')
                      when 2 then replace(starter,'AA','FD')
           from (select distinct starter from cte) as c
            cross join (select 1 as n union all select 2) as x
           where x.n =1 or
             (x.n=2 and starter like 'AA%')
         )
SELECT stuff(ID,1,len(cte1.starter),'') as ID
    , NAME
FROM MYTABLE
inner join cte1
on id like cte1.starter +'%'

SELECT convert(integer,stuff(ID,1,len(cte1.starter),'')) as ID
Below is one possibility.  If those prefixes are *always* to be excluded from the result, you can set the minDataLength to 1 for every entry.

It may need some tweaking, but you should be able to get to a programmatical solution.  NO reason you should have to do this by hand.
SELECT
    CASE WHEN CHARINDEX('-', yt.dataColumn) > 0 
         THEN RIGHT('0000' + REPLACE(yt.dataColumn, '-', '0000'), 4)
         WHEN prefixes.prefix IS NOT NULL
         THEN SUBSTRING(yt.dataColumn, LEN(prefix) + 1, 100)
         ELSE SUBSTRING(yt.dataColumn, PATINDEX('%[0-9]%', yt.dataColumn), 4)
    END
FROM yourTable yt
LEFT OUTER JOIN (
    SELECT '9', 4 AS prefix, minDataLength UNION ALL
    SELECT '10', 4 UNION ALL
    SELECT '11', 4 UNION ALL
    SELECT 'FD09', 6  UNION ALL
    SELECT 'FD10', 6 UNION ALL
    SELECT 'FD11', 6 UNION ALL
    SELECT 'FD9', 5 UNION ALL
    SELECT 'IN09', 6  UNION ALL
    SELECT 'IN10', 6 UNION ALL
    SELECT 'IN11', 6 UNION ALL
    SELECT 'IN9', 5
) AS prefixes ON 
    LEFT(yt.dataColumn, LEN(prefixes.prefix)) = prefixes.prefix AND
    LEN(yt.dataColumn) >= prefixes.minDataLength

Open in new window

In these examples, how can incooperate all of that into my current query?  I guess I should have added the query in my original question...
SELECT
	MAN.intTempID,
	MAN.intID,
	intBU_ID =
	   CASE MAN.strSubmitBy
                WHEN 'cb1166' then '3'
                WHEN 'as3199' then '2'
                ELSE '1'
        END,
    intCATEGORY_ID =
       CASE
           WHEN MAN.intID LIKE 'IN%' then '3'
           WHEN MAN.intID LIKE 'FD%' then '2'
           WHEN MAN.intID LIKE 'DD%' then '4'
           WHEN MAN.intID LIKE 'TC%' then '5'
           ELSE '1'
       END,
    intCHANGETYPE_ID =
       CASE MAN.strCatItem
                WHEN 'New' then '1'
                WHEN 'Change' then '2'
                WHEN 'Other' then '3'
                WHEN 'Remove' then '4'
                ELSE '5'
        END,
    STA.intStatID, SAP.intAprvlID,
    STA.dtiUpdated,
    STA.strUpdateBy,
    DAT.dtiOpen,
    MAN.strSubmitBy
FROM
	dbo.main MAN INNER JOIN
	dbo.data DAT ON MAN.intTempID = DAT.intTempID INNER JOIN
        dbo.status STA ON MAN.intTempID = STA.intTempID INNER JOIN
        dbo.approval SAP ON MAN.intTempID = SAP.intTempID

WHERE
    STA.intID = (select max(b.IntID) from status B
         where b.intTempID = STA.intTempID and
         b.dtiUpdated >= STA.dtiUpdated
    )

Open in new window

YES

you've lost me , please relate you query components to the initial question you asked...

which columns are being stripped, joined , and where is the display element...
So sorry...I should know you can't leave out the smallest of details and in this case, the entire query.  The 'intID' in the query is the field/column that holds all of the multiple variations that I need to strip.  What I am needing to do is to create a query that will provide me with the exact data (that I will export the query results to a spreadsheet) that I need to then import into the new table.
try this

....

could you explain what this is trying to achieve

as its not quite gelling with me...

WHERE
    STA.intID = (select max(b.IntID) from status B
         where b.intTempID = STA.intTempID and
         b.dtiUpdated >= STA.dtiUpdated
    )


are you expecting multiple rows to be generated by the status table?
; with cte as (
select 'AA9' as starter
union all select 'AA09'
union all select '9'
union all select '10'
union all select 'AA10'
union all select '11'
union all select 'AA11'
union all select 'TCR11-')
,cte1 as (select case x.n when 1 then replace(starter,'AA','IN') 
                      when 2 then replace(starter,'AA','FD')
           from (select distinct starter from cte) as c
            cross join (select 1 as n union all select 2) as x
           where x.n =1 or
             (x.n=2 and starter like 'AA%')
         )
SELECT
	MAN.intTempID,
	convert(integer,stuff(MAN.intID,1,len(cte1.starter),'')) as intID,
	intBU_ID =
	   CASE MAN.strSubmitBy
                WHEN 'cb1166' then '3'
                WHEN 'as3199' then '2'
                ELSE '1'
        END,
    intCATEGORY_ID =
       CASE
           WHEN MAN.intID LIKE 'IN%' then '3'
           WHEN MAN.intID LIKE 'FD%' then '2'
           WHEN MAN.intID LIKE 'DD%' then '4'
           WHEN MAN.intID LIKE 'TC%' then '5'
           ELSE '1'
       END,
    intCHANGETYPE_ID =
       CASE MAN.strCatItem
                WHEN 'New' then '1'
                WHEN 'Change' then '2'
                WHEN 'Other' then '3'
                WHEN 'Remove' then '4'
                ELSE '5'
        END,
    STA.intStatID, SAP.intAprvlID,
    STA.dtiUpdated,
    STA.strUpdateBy,
    DAT.dtiOpen,
    MAN.strSubmitBy

FROM
	dbo.main MAN 
INNER JOIN
	dbo.data DAT 
ON MAN.intTempID = DAT.intTempID 
INNER JOIN
        dbo.status STA 
ON MAN.intTempID = STA.intTempID 
INNER JOIN
        dbo.approval SAP 
ON MAN.intTempID = SAP.intTempID
inner join cte1 
on man.intid like cte1.starter +'%'

WHERE
    STA.intID = (select max(b.IntID) from status B
         where b.intTempID = STA.intTempID and
         b.dtiUpdated >= STA.dtiUpdated
    )

Open in new window

The WHERE clause is created because the STATUS table I get the 'status id' from is a one-to-many relationship with the main record (intTempID)...so this gets the 'last' known status for that main record.
I got an incorrect syntax near ';' and other incorrect syntax errors.
WHERE
    STA.intID = (select max(b.IntID) from status B
         where b.intTempID = STA.intTempID and
         b.dtiUpdated >= STA.dtiUpdated
    )



yes i expected it to be something like that, so why the >= on the dtiupdated column?

so i think i'd do it with just a subquery join  
e.g.
this without the where clause

FROM
      dbo.main MAN
INNER JOIN
      dbo.data DAT
ON MAN.intTempID = DAT.intTempID
INNER JOIN
      ( select x.* from
       (select ¿s.*
          ,row_number() over (partition by s.inttempid order by intid desc) as rn
        from dbo.status
       ) as x
       where rn=1
       )  STA

ON MAN.intTempID = STA.intTempID
INNER JOIN
        dbo.approval SAP
ON MAN.intTempID = SAP.intTempID
inner join cte1
on man.intid like cte1.starter +'%'



















ok you need to post the errors or i can't debug....

i saw a missing END from the case in the cte...


confirm how you run this please.

?
; with cte as (
select 'AA9' as starter
union all select 'AA09'
union all select '9'
union all select '10'
union all select 'AA10'
union all select '11'
union all select 'AA11'
union all select 'TCR11-')
,cte1 as (select case x.n when 1 then replace(starter,'AA','IN') 
                      when 2 then replace(starter,'AA','FD')
                      end
           from (select distinct starter from cte) as c
            cross join (select 1 as n union all select 2) as x
           where x.n =1 or
             (x.n=2 and starter like 'AA%')
         )
SELECT
	MAN.intTempID,
	convert(integer,stuff(MAN.intID,1,len(cte1.starter),'')) as intID,
	intBU_ID =
	   CASE MAN.strSubmitBy
                WHEN 'cb1166' then '3'
                WHEN 'as3199' then '2'
                ELSE '1'
        END,
    intCATEGORY_ID =
       CASE
           WHEN MAN.intID LIKE 'IN%' then '3'
           WHEN MAN.intID LIKE 'FD%' then '2'
           WHEN MAN.intID LIKE 'DD%' then '4'
           WHEN MAN.intID LIKE 'TC%' then '5'
           ELSE '1'
       END,
    intCHANGETYPE_ID =
       CASE MAN.strCatItem
                WHEN 'New' then '1'
                WHEN 'Change' then '2'
                WHEN 'Other' then '3'
                WHEN 'Remove' then '4'
                ELSE '5'
        END,
    STA.intStatID, SAP.intAprvlID,
    STA.dtiUpdated,
    STA.strUpdateBy,
    DAT.dtiOpen,
    MAN.strSubmitBy

FROM
	dbo.main MAN 
INNER JOIN
	dbo.data DAT 
ON MAN.intTempID = DAT.intTempID 
INNER JOIN
      ( select x.* from
       (select ¿s.*
          ,row_number() over (partition by s.inttempid order by intid desc) as rn 
        from dbo.status
       ) as x
       where rn=1
       )  STA 
ON MAN.intTempID = STA.intTempID 
INNER JOIN
        dbo.approval SAP 
ON MAN.intTempID = SAP.intTempID
inner join cte1 
on man.intid like cte1.starter +'%'

Open in new window

Msg170, Level 15, State 1, Line 1
Line 1:  Incorrect syntax near ';'.
Msg170, Level 15, State 1, Line 10
Line 10:  Incorrect syntax near ','.

Its just a query in SSMS that I execute to get the results.
ok sorry  the missing end also required an AS Starter

please retry
; with cte as (
select 'AA9' as starter
union all select 'AA09'
union all select '9'
union all select '10'
union all select 'AA10'
union all select '11'
union all select 'AA11'
union all select 'TCR11-')
,cte1 as (select case x.n when 1 then replace(starter,'AA','IN') 
                      when 2 then replace(starter,'AA','FD')
                      end as starter
           from (select distinct starter from cte) as c
            cross join (select 1 as n union all select 2) as x
           where x.n =1 or
             (x.n=2 and starter like 'AA%')
         )
SELECT
	MAN.intTempID,
	convert(integer,stuff(MAN.intID,1,len(cte1.starter),'')) as intID,
	intBU_ID =
	   CASE MAN.strSubmitBy
                WHEN 'cb1166' then '3'
                WHEN 'as3199' then '2'
                ELSE '1'
        END,
    intCATEGORY_ID =
       CASE
           WHEN MAN.intID LIKE 'IN%' then '3'
           WHEN MAN.intID LIKE 'FD%' then '2'
           WHEN MAN.intID LIKE 'DD%' then '4'
           WHEN MAN.intID LIKE 'TC%' then '5'
           ELSE '1'
       END,
    intCHANGETYPE_ID =
       CASE MAN.strCatItem
                WHEN 'New' then '1'
                WHEN 'Change' then '2'
                WHEN 'Other' then '3'
                WHEN 'Remove' then '4'
                ELSE '5'
        END,
    STA.intStatID, SAP.intAprvlID,
    STA.dtiUpdated,
    STA.strUpdateBy,
    DAT.dtiOpen,
    MAN.strSubmitBy

FROM
	dbo.main MAN 
INNER JOIN
	dbo.data DAT 
ON MAN.intTempID = DAT.intTempID 
INNER JOIN
      ( select x.* from
       (select s.*
          ,row_number() over (partition by s.inttempid order by intid desc) as rn 
        from dbo.status
       ) as x
       where rn=1
       )  STA 
ON MAN.intTempID = STA.intTempID 
INNER JOIN
        dbo.approval SAP 
ON MAN.intTempID = SAP.intTempID
inner join cte1 
on man.intid like cte1.starter +'%'

Open in new window

Hmmmm, it still doesn't like the ; or the , ...
ok i opened up my management studio (2008 r2 )

and have confirmed that the code syntax checks ok, and also
that the first part of the cte works correctly...


which version of sql server are you using?
do you have anything else in the query window?

if you highlight the whole query  does it syntax check ok?
does it execute?
Im using SSMS 2005 and it appears the dbase is in SQL 2000.
ASKER CERTIFIED SOLUTION
Avatar of Lowfatspread
Lowfatspread
Flag of United Kingdom of Great Britain and Northern Ireland image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
That worked LowFat...that didn't Scott.

Msg 207, Level 16, State 3, Line 1
Invalid column name 'minDataLength'.
Msg 8155, Level 16, State 2, Line 1
No column was specified for column 1 of 'prefixes'.
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial