[Last Call] Learn about multicloud storage options and how to improve your company's cloud strategy. Register Now

x
?
Solved

Replace all non numerical characters in a varchar field

Posted on 2013-05-21
12
Medium Priority
?
1,498 Views
Last Modified: 2013-05-24
I'm trying to strip out everything except numbers from ptmorphcast. I have a select case statement and sort by ptmorphcast, ptmorphraw
SELECT CASE 
WHEN a.post_title LIKE '%Single Stream%' THEN REPLACE( a.post_title, 'Single Stream', '1 Stream' ) 
WHEN a.post_title LIKE '%Double Stream%' THEN REPLACE( a.post_title, 'Double Stream', '2 Stream' ) 
WHEN a.post_title LIKE '%Triple Stream%' THEN REPLACE( a.post_title, 'Triple Stream', '3 Stream' ) 
WHEN a.post_title LIKE '%Quad Stream%' THEN REPLACE( a.post_title, 'Quad Stream', '4 Stream' ) 
WHEN a.post_title LIKE '%Single Unit%' THEN REPLACE( a.post_title, 'Single Unit', '1 Unit' ) 
WHEN a.post_title LIKE '%Double Unit%' THEN REPLACE( a.post_title, 'Double Unit', '2 Unit' ) 
WHEN a.post_title LIKE '%Triple Unit%' THEN REPLACE( a.post_title, 'Triple Unit', '3 Unit' ) 
WHEN a.post_title LIKE '%Single Unit%' THEN REPLACE( a.post_title, 'Quad Unit', '4 Unit' ) 
WHEN a.post_title LIKE '%Single Series%' THEN REPLACE( a.post_title, 'Single Series', '1 Series' ) 
WHEN a.post_title LIKE '%Double Series%' THEN REPLACE( a.post_title, 'Double Series', '2 Series' ) 
WHEN a.post_title LIKE '%Triple Series%' THEN REPLACE( a.post_title, 'Triple Series', '3 Series' ) 
WHEN a.post_title LIKE '%Quad Series%' THEN REPLACE( a.post_title, 'Quad Series', '4 Series' ) 
ELSE a.post_title END AS ptmorphraw, 
REPLACE(a.post_title, '[^0-9]+', '') as ptmorphcast, a.post_title as ptmorph, a.* 
FROM posts a 
ORDER By ptmorphcast, ptmorphraw

Open in new window


Replace() function for ptmorphcast isn't updating any values.
0
Comment
Question by:m2ew
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
12 Comments
 
LVL 22

Expert Comment

by:Ivo Stoykov
ID: 39186704
there is MySQL user-defined function REGEXP_REPLACE.

HTH

Ivo Stoykov
0
 
LVL 35

Expert Comment

by:James0628
ID: 39186815
I haven't used MySQL, but from a little checking, it appears that the standard REPLACE doesn't support regexp's, so you would need another function, possibly user-created, like the one that ivostoykov mentioned.

 However, I think your regexp is also wrong.  I believe that's looking for two character strings that consist of a single character that's not a number, followed by a "+" ("a+", "B+", ".+", "++", etc.).  That doesn't sound like what you wanted, but I guess it would depend on your data.  Maybe that was supposed to be '[^0-9+]'  ?

 James
0
 

Author Comment

by:m2ew
ID: 39187936
@James0628 yes I just wanted to replace the numerical values; its really an issue where I have a bunch of products that include 100+ in the title (i.e. Product 102 Gallons) as they have to be moved so they appear after all the smaller sizes (i.e. Product 45 Gallons). I've considered either adding padding to smaller sizes.

Not sure which would be better,
Right now I'm working on a user defined function (below) to pull only numbers out and then reworking it into the replace function

DELIMITER $$
CREATE FUNCTION wwpSTRIP_NON_DIGIT(input VARCHAR(255))
   RETURNS VARCHAR(255)
BEGIN
  DECLARE output   VARCHAR(255) DEFAULT '';
  DECLARE iterator INT          DEFAULT 1;
  WHILE iterator < (LENGTH(input) + 1) DO
     IF SUBSTRING(input, iterator, 1) IN ( '0', '1', '2', '3', '4', '5', '6', '7', '8', '9' ) THEN
        SET output = CONCAT(output, SUBSTRING(input, iterator, 1));
     END IF;
     SET iterator = iterator + 1;
  END WHILE;
  RETURN output;
END;
$$
DELIMITER ;

Open in new window

0
Concerto's Cloud Advisory Services

Want to avoid the missteps to gaining all the benefits of the cloud? Learn more about the different assessment options from our Cloud Advisory team.

 
LVL 46

Expert Comment

by:aikimark
ID: 39188065
I think the easiest and quickest solution might involve a tally table (number table).  If you aren't familiar with these, read this article:
http:A_3573-A-MySQL-Tidbit-Quick-Numbers-Table-Generation.html

=========
Basically, you need a small (0-9 values only) table of numbers that you join up with your posts table in an Update query.  You use the number values in the Replace() function in the SET clause.

You could improve the performance of the query if you add a WHERE clause condition that looks for the numeric digit character in the post_title field.
0
 

Author Comment

by:m2ew
ID: 39188140
@aikimark I read over the query. I can try that method. It gets complex because there is the overall A-Z sort, Then Size (Gallons), and Stream count

Ie. Product 102 Gallon 4 Stream
0
 
LVL 46

Expert Comment

by:aikimark
ID: 39188200
@ m2ew

I don't know what your data looks like and don't understand how you are trying to sort.  I suspect that your data is not sufficiently normalized or tables improperly designed.  Some more detailed information would be helpful to the participating experts.
0
 

Author Comment

by:m2ew
ID: 39188228
@aikimark I'm working within wordpress setup, so I was referring to post_title field for sorting. The format "General Name ### Gallon # of Streams" is a general format based of the that field. I could split it into strings as the sorting priority goes

A-Z on General Name
Then ### Gallon
Then # Streams
0
 
LVL 46

Accepted Solution

by:
aikimark earned 2000 total points
ID: 39188350
@m2ew

This now seems like a parsing problem, since you need to extract different bits of the title in order to sort in the orders you described.  First, start by reading Kevin's related article, where he uses the tally/numbers table to parse a string.
http:A_3622-A-MySQL-Tidbit-In-line-CSV-Parsing.html

Do you have any other fields you can work with?  It would be much better if we were able to do this parsing once and save the parsed data in some other fields on the row.

If you have three fields, my approach would be to first split the title into two fields, based on the string before and after "Gallons".  Trim the result.  Populate a Gallons field with the number string following the last space in the first parsed string. Populate a Streams field with the number string before the first space in the second parsed string.  The raw General Name field is the trimmed string before the Gallons digits in the first parsed string.

========
Another way to do this is to go back to my first post and double the number of 'number' items by prepending and appending a space to our 0-9 values.  What you do, in essence, is to add your own delimiters to the title that you can split on with the Replace() function.

"Product Name 102 Gallon 45 of Streams" becomes
"Product Name^102^Gallon^45^of Streams"

Then you can use the splitting technique in Kevin's second article to split the string based on the "^" delimiter.  His article uses a comma delimiter, but I didn't want to take a chance that your general names might include a comma character.
0
 

Author Comment

by:m2ew
ID: 39188422
@aikimark Thanks for pointing me in the right direction.

I was thinking along the second option.

Concat() function (within replace()) actually came in handy with adding a leading character for the gallon size. I simplified the WHEN to check the digit length > 2 and exclude or filter streams (compartments) over 5; otherwise, wwpSTRIP_NON_DIGIT function would add that value. The majority of items are listed as Single Stream, Double Stream, Triple Stream & Quad Stream so its not an issue. None the less anything 5 & greater was creating a potential problem. Once I had the LENGTH() check, I set either "Z" or "0" as padding to the Gallon number. Query ran fast at  0.0348 sec as I'm limiting to 16 records on the pagination. I have to tweek this a bit more for 5 -8 streams, but works prefect for smaller sizes.

CASE 
   WHEN 
   LENGTH(wwpSTRIP_NON_DIGIT(a.post_title)) > 2 AND a.post_title NOT LIKE '%5 Stream%' AND a.post_title NOT LIKE '%6 Stream%'  AND a.post_title NOT LIKE '%7 Stream%'  AND a.post_title NOT LIKE '%8 Stream%'    
   || LENGTH(wwpSTRIP_NON_DIGIT(REPLACE( a.post_title, '5 Stream', 'Stream' ))) > 2 
   || LENGTH(wwpSTRIP_NON_DIGIT(REPLACE( a.post_title, '6 Stream', 'Stream' ))) > 2 
   || LENGTH(wwpSTRIP_NON_DIGIT(REPLACE( a.post_title, '7 Stream', 'Stream' ))) > 2 
   || LENGTH(wwpSTRIP_NON_DIGIT(REPLACE( a.post_title, '8 Stream', 'Stream' ))) > 2
  THEN 
    REPLACE 
       ( REPLACE 
           ( REPLACE
               (REPLACE 
                  (REPLACE (a. post_title, wwpSTRIP_NON_DIGIT (REPLACE ( a. post_title, 'Quad Stream', 'Stream' )), CONCAT("Z",wwpSTRIP_NON_DIGIT(REPLACE( a.post_title, 'Quad Stream', 'Stream' )))), 
                  'Quad Stream', '4 Stream'),
                'Triple Stream', '3 Stream'),
             'Double Stream', '2 Stream'),
          'Single Stream', '1 Stream')
   ELSE
     REPLACE 
       ( REPLACE 
           ( REPLACE
               (REPLACE 
                  (REPLACE (a. post_title, wwpSTRIP_NON_DIGIT (REPLACE ( a. post_title, 'Quad Stream', 'Stream' )), CONCAT("0",wwpSTRIP_NON_DIGIT(REPLACE( a.post_title, 'Quad Stream', 'Stream' )))), 
                  'Quad Stream', '4 Stream'),
                'Triple Stream', '3 Stream'),
             'Double Stream', '2 Stream'),
          'Single Stream', '1 Stream')
END as ptmorphraw

Open in new window

0
 
LVL 111

Expert Comment

by:Ray Paseur
ID: 39194943
I got a neglected question alert on this, and even though it's already closed, I wanted to share my thoughts.  Considering this:

General Name ### Gallon # of Streams

I see a "name" - just a string field that should not carry any inherent meaning.  The meaning of the field could be normalized out of this by taking the information in the ## components and putting it into a separate column.

The problem with trying to get "102 gallon" to come out after "45 gallon" is that the natural sorting order for character strings sees 1 as less than 4.  But if you have this number in a column of type INT, your WHERE clause will work to your advantage.
0
 
LVL 46

Expert Comment

by:aikimark
ID: 39194957
@Ray

I agree.  That was the gist of my table-normalization comment http:#a39188200
0

Featured Post

Looking for a new Web Host?

Lunarpages' assortment of hosting products and solutions ensure a perfect fit for anyone looking to get their vision or products to market. Our award winning customer support and 30-day money back guarantee show the pride we take in being the industry's premier MSP.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Instead of error trapping or hard-coding for non-updateable fields when using QODBC, let VBA automatically disable them when forms open. This way, users can view but not change the data. Part 1 explained how to use schema tables to do this. Part 2 h…
One of the most important things in an application is the query performance. This article intends to give you good tips to improve the performance of your queries.
In this video, Percona Director of Solution Engineering Jon Tobin discusses the function and features of Percona Server for MongoDB. How Percona can help Percona can help you determine if Percona Server for MongoDB is the right solution for …
In this video, Percona Solution Engineer Rick Golba discuss how (and why) you implement high availability in a database environment. To discuss how Percona Consulting can help with your design and architecture needs for your database and infrastr…

656 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question