Link to home
Start Free TrialLog in
Avatar of idealaero
idealaeroFlag for United States of America

asked on

Sql Extract Between Parentheses

have a table(Item) with a field (ItemName)that I need to extract every thing between parentheses and put into an alias field.  I.e.

ItemName                                        Alias
part123 (12-1) (23-1)                       12-1, 23-1
part456 (45) (45-2345) (12)                45, 45-2345, 12
Avatar of Mike Eghtebas
Mike Eghtebas
Flag of United States of America image

Declare @v varchar(20)='part123 (12-1) (23-1)';
Select Replace(Replace(substring(@v,charindex('(',@v)+1,1000),')',''),'(','') As AliasName
Avatar of Scott Pletcher
If the values will always be contiguous, this is easy:

SELECT ItemName,
    REPLACE(REPLACE(SUBSTRING(ItemName, open_paren_byte, close_paren_byte - open_paren_byte + 1), '(', ''), ')', ',')
FROM (
    SELECT 'part123 (12-1) (23-1)' as itemname union all
    select 'part456 (45) (45-2345) (12)' union all
    select 'part789 (56) (56-5678) more text'
) AS test_data
cross apply (
    SELECT CHARINDEX('(', itemname) AS open_paren_byte,
        LEN(itemname) - CHARINDEX(')', REVERSE(itemname)) AS close_paren_byte
) as assign_alias_names
Avatar of idealaero

ASKER

I should clarify…

I have a table Named (Item)
In table (Item)  I have a column Named (ItemName)
This table has 89000 + records.

In the ItemName column  there maybe one alias name contained in parentheses there maybe 10 they may be separated by a comma, space or not at all.

Essentially I need to extract everything contained inside an open and closed parentheses and put into a field called alias.
SOLUTION
Avatar of Mike Eghtebas
Mike Eghtebas
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Eghtebas,

Very close.

The script is returning the entire ItemName field if it doesn't find a parentheses where it should return Null.  I can handle this with a case statement, but I think there are more efficient ways. also if there is text after the parentheses it is displayed where it shouldn't.
Is there a way to list all variations possible for ItemName so we can code accordingly?
Unfortunately there isn't

The data I'm work with is 20 + years old with no controls placed on it.

The best I can come up with is everything between an open and close parentheses and nothing else.
The project I'm working is implementing a new system and I don't want to drag garbage data into the new system.
We need ScottPletcher's attention to handle this.

It probably is best to use in inline function considering you will not use this routinely but for data migration.

Mike
My code above gets everything from the first ( to the last ), ignoring any other leading or trailing data.  I show sample data and the results above.

If you have "in-between" data, such as "abc (123) def (456)" and you want to get "123, 456" then that will take additional code.
What I have is "abc (123) def (456) ghi" and I need "123, 456"

This is exactly what I have

Here is the code I have developed so far with Mike's help.

Select
Case
When Replace(Replace(substring(IMA_ItemName ,charindex('(',IMA_ItemName )+1,1000),')',''),'(',' ,') = IMA_ItemName then Null
Else
Replace(Replace(substring(IMA_ItemName ,charindex('(',IMA_ItemName )+1,1000),')',''),'(','') end As AliasName

FROM         Item
I took this query and gave it to the owner of the data and they are cleaning it up manually.  The farther I dug into the data the dirtier it became.
Wow, I wish I had known you were abandoning this, I wouldn't have spent time coding up a solution that worked for "abc (123) def (456) ghi (789) jkl ...".
Hi idealaero,

Because the solution from Scott is more accurate and much closer to what originally you were asking for, his solution ought to get recognition and the points. I understand there was a change of plan in the office for that you may give me 100 point for assist.
 
Because EE makes a repository of good questions for the future readers, it is important the accepted answer match the original question.

Thank you,

Mike
Hi idealaero,

I forgot the last part. Please put a request via "Request Attention" button below your original question asking this question to be opened to be regraded.

Thanks,

Mike
Here's code that uses a "standard" tally table -- a table of sequential numbers -- to pull out all values enclosed in ().  [I tried to post code to create a tally table but my server at worked blocked it -- I'll try again in a bit.]

SELECT IMA_ItemName, STUFF(CAST((
    SELECT ', ' + SUBSTRING(IMA_ItemName, tally + 1, CHARINDEX(')',  IMA_ItemName, tally + 1) - tally - 1)
    FROM dbo.tally tally
    WHERE SUBSTRING(IMA_ItemName, tally, 1) = '('
    FOR XML PATH('')
    ) AS varchar(8000)), 1, 2, '')
FROM ( --dbo.your_table_name_here
    SELECT 'part123 (12-1) (23-1) abc' as IMA_ItemName union all
    select 'part456 (45) ab(45-2345)xq zprl(12)' union all
    select 'abc (123) def (456) ghi' union all
    select 'part789'
) AS test_data_as_an_example
ASKER CERTIFIED SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial