Solved

regexp_replace - further filtering needed

Posted on 2011-03-08
18
496 Views
Last Modified: 2012-05-11
I'm currently using:

regexp_replace(ee.sku,' ?\([^)]+\)') as sku_cleansed

which makes the string before:
MC HDMIB 2m (6.56 ft.)

look like:
MC HDMIB 2m

Is there a way to further filter the "2m"  - or anything matching: ##.##m ? I would also want to remove "12.4m"  (and ft.).

Thanks!
0
Comment
Question by:globalwm2
  • 10
  • 8
18 Comments
 
LVL 76

Expert Comment

by:slightwv (䄆 Netminder)
ID: 35073448
Why did you close your previous question if that part of the requirement wasn't complete?

It is also why I asked what I did in:
http://www.experts-exchange.com/Database/Oracle/Q_26869077.html?#a35060315

We need to know more about the entire set of 'good' and 'bad' data or we'll solve one then you'll need to continue posting new questions.

For example can the following be in the data:  'QWE 2m' and you only want QWE?
0
 

Author Comment

by:globalwm2
ID: 35073506
Sorry Expert, at the time in my validation it was workable. Then the specs changed from the Business Unit. Originally we thought we satisfied the requirement.

Back to your q:

yes, want to strip of references to meters and feet (measurements).
0
 
LVL 76

Expert Comment

by:slightwv (䄆 Netminder)
ID: 35073693
>>Then the specs changed

removing "2m" was part of the original question "would also like to strip out the "0.6m" if possible too."

>>meters and feet (measurements).

So the Feet will ALWAYS be in paranthesis.  Meters will ALWAYS be numbers, a possible decimal point, no space and a lowercase m ( ###.###m ).

You can never have 1,000.98m?

It's also safe to remove ANYTHING in that format or will the meters always appear to the left of the feet in parans?
0
PRTG Network Monitor: Intuitive Network Monitoring

Network Monitoring is essential to ensure that computer systems and network devices are running. Use PRTG to monitor LANs, servers, websites, applications and devices, bandwidth, virtual environments, remote systems, IoT, and many more. PRTG is easy to set up & use.

 
LVL 76

Expert Comment

by:slightwv (䄆 Netminder)
ID: 35073745
Try this:

with mytab as (
select 'Chocolate 0.6m (1.96 ft.) Braided' myCol from dual
union all
select 'MC HDMIB 2m (6.56 ft.)' from dual
)
select regexp_replace(regexp_replace(myCol,'\(.*\)|[0-9]*.[0-9]+m',''),'[ ]+',' ') from mytab
/
0
 

Author Comment

by:globalwm2
ID: 35073770
I have all kinds of information within the parens - here are some examples:

AP4N Bamboo (Pr)
SoundWare White (Ea)
Pearl 0.6m (1.96 ft.) Black/White
Solo HD (PRODUCT)RED
SoundDock (R) Portable Black
ML MTH (2 pr)
FPTAG01(U)
CFBFAG (2 Pr)

but when the parens has "ft" - then preceding it is some metric measurement like '0.6m'

0
 

Author Comment

by:globalwm2
ID: 35073795
That looks promising.

So, as a newbie here -- you suggesting I don't close this until I complete my script?

I had posted some examples above while you were also replying....

Sorry, just trying to be fair and not have an endless string of comments...

0
 
LVL 76

Expert Comment

by:slightwv (䄆 Netminder)
ID: 35073828
'ft' with no space ALWAYS?

Also your sample data says 'ft.'.  Is it 'ft' or 'ft.' or both?

Try this one

with mytab as (
select 'Chocolate 0.6m (1.96 ft.) Braided' myCol from dual
union all
select 'MC HDMIB 2m (6.56 ft.)' from dual
union all
select 'AP4N Bamboo (Pr)' from dual
union all
select 'SoundWare White (Ea)' from dual
union all
select 'Pearl 0.6m (1.96 ft.) Black/White' from dual
union all
select 'Solo HD (PRODUCT)RED' from dual
union all
select 'SoundDock (R) Portable Black' from dual
union all
select 'ML MTH (2 pr)' from dual
union all
select 'FPTAG01(U)' from dual
union all
select 'CFBFAG (2 Pr)' from dual
)
select regexp_replace(regexp_replace(myCol,'\([0-9]*\.[0-9]+ ?ft\.?\)|[0-9]*.[0-9]+m',''),'[ ]+',' ') from mytab
/

Open in new window

0
 
LVL 76

Expert Comment

by:slightwv (䄆 Netminder)
ID: 35073879
>>So, as a newbie here -- you suggesting I don't close this until I complete my script?

First: Welcome to Experts-Exchange!

Close it when you feel the question asked, has been answered.  If we didn't answer it, then you shouldn't close.  If you aren't getting the response you feel you should, click the 'Request Attention' link above and a Moderator will do everything they can to find you the right Experts.

Occasionally people will ask more than one question per question and the Experts will ask you to close the current one and as a related one.  That's OK.  

You should just try to not close it until you are satisfied it is fully answered.  Asking to closely related questions might cause you to violate the rules of 500 points for a single question.  You might get a warning from a Moderator or Zone Advisor and as long as you do not abuse it, everything will be fine.
0
 
LVL 76

Expert Comment

by:slightwv (䄆 Netminder)
ID: 35073917
I should add:  that it is perfectly fine to ask related questions like:
In previous question <paste link here> I thought it was ZZZ but now I find it's XXX and need further assistance.

Just try to not make a habit out of it.  ;)

You can also ask related questions like:
In previous question <paste link here> my question was answered but an Expert also mentioned ZZZ and I would like to get additional information.  Can someone provide any information on ZZZ?
0
 

Author Comment

by:globalwm2
ID: 35073955
Thank you!

> Occasionally people will ask more than one question per question and the Experts will ask you to close the current one and as a related one.  That's OK.  

ok...  I was trying to be aware of that not wanting to keep "piling on" so to speak.  :)

Thanks for the welcome!

Back to your previous response, so I may learn to "speak" regexp - can you explain your string:

regexp_replace(regexp_replace(myCol,'\([0-9]*\.[0-9]+ ?ft\.?\)|[0-9]*.[0-9]+m',''),'[ ]+',' ')

??

0
 

Author Comment

by:globalwm2
ID: 35073993
Oh... got it!

> ou can also ask related questions like:
In previous question <paste link here> my question was answered but an Expert also mentioned ZZZ and I would like to get additional information.  Can someone provide any information on ZZZ?
0
 

Author Comment

by:globalwm2
ID: 35074044
I just modified your line to be:

select regexp_replace(regexp_replace(regexp_replace(myCol,'\([0-9]*\.[0-9]+ ?ft\.?\)|[0-9]*.[0-9]+m',''),'[ ]+',' '),' ?\([^)]+\)') from mytab

(combined with the first statement)

Looking good - I need to incorporate into my script and further validate. brb
0
 

Author Comment

by:globalwm2
ID: 35074125
One refinement I see is when the column begins with a "mm" value - for example:  

67mm UV Haze is becoming: m UV Haze

EF 70200mm f/2.8L IS II USM is becoming: EFm f/2.8L IS II USM

Is there a way to 'delay' the replacement or only replace when a '(' character is found?
0
 

Author Comment

by:globalwm2
ID: 35074377
Okay, looks like I'm not going to have to go this deep anymore.

So far, they are liking the results with just:

regexp_replace(ee.sku,' ?\([^)]+\)') as sku_cleansed

0
 
LVL 76

Expert Comment

by:slightwv (䄆 Netminder)
ID: 35074732
Again, this is why we ask for the possibilities/examples.

Regex is a powerful pattern matching program.  Once you define the pattern(s), it's just coming up expression to meet that pattern.
0
 

Author Comment

by:globalwm2
ID: 35074778
It was a good learning event nonetheless.  Thank you for your help!

I'm going to go with your statement containing:

select regexp_replace(regexp_replace(myCol,'\(.*\)|[0-9]*.[0-9]+m',''),'[ ]+',' ') from mytab

as a correct response to what I originally asked for?
0
 
LVL 76

Accepted Solution

by:
slightwv (䄆 Netminder) earned 125 total points
ID: 35074941
Accept whichever post or posts helped you get the answer.  Since this question really had no answer, you can award points for the learning experience.

>>One refinement I see is when the column begins with a "mm" value - for example:  

For what it's worth, you can get around this with some 'space' tricks (may not work in all instances but appears to for the current test cases).

You keep throwing them out, we'll try to keep tweaking the regex or close it out.  It's up to you.  We'll play as long as you want (maybe...).


with mytab as (
select 'Chocolate 0.6m (1.96 ft.) Braided' myCol from dual
union all
select 'MC HDMIB 2m (6.56 ft.)' from dual
union all
select 'AP4N Bamboo (Pr)' from dual
union all
select 'SoundWare White (Ea)' from dual
union all
select 'Pearl 0.6m (1.96 ft.) Black/White' from dual
union all
select 'Solo HD (PRODUCT)RED' from dual
union all
select 'SoundDock (R) Portable Black' from dual
union all
select 'ML MTH (2 pr)' from dual
union all
select 'FPTAG01(U)' from dual
union all
select 'CFBFAG (2 Pr)' from dual
union all
select '67mm UV Haze is becoming: m UV Haze' from dual
union all
select 'EF 70200mm f/2.8L IS II USM is becoming: EFm f/2.8L IS II USM' from dual
)
select trim(regexp_replace(regexp_replace(myCol || ' ','\([0-9]*\.[0-9]+ ?ft\.?\)|[0-9]*.[0-9]+m ',''),'[ ]+',' ')) from mytab
/

Open in new window

0
 

Author Closing Comment

by:globalwm2
ID: 35074974
Pts awarded on overall learning experience with reg expressions and EE etiquette.

Thank you!
0

Featured Post

Free Tool: Subnet Calculator

The subnet calculator helps you design networks by taking an IP address and network mask and returning information such as network, broadcast address, and host range.

One of a set of tools we're offering as a way of saying thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Title # Comments Views Activity
Oracle Insert not working 10 32
performance query 4 32
Update one table with results from another table in SQL 6 24
SQL Syntax Grouping Sum question 7 24
How to Unravel a Tricky Query Introduction If you browse through the Oracle zones or any of the other database-related zones you'll come across some complicated solutions and sometimes you'll just have to wonder how anyone came up with them.  …
If you find yourself in this situation “I have used SELECT DISTINCT but I’m getting duplicates” then I'm sorry to say you are using the wrong SQL technique as it only does one thing which is: produces whole rows that are unique. If the results you a…
This video shows setup options and the basic steps and syntax for duplicating (cloning) a database from one instance to another. Examples are given for duplicating to the same machine and to different machines
Video by: Steve
Using examples as well as descriptions, step through each of the common simple join types, explaining differences in syntax, differences in expected outputs and showing how the queries run along with the actual outputs based upon a simple set of dem…

856 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question