Solved

MySQL: extract data from BLOB, modify, place into another table

Posted on 2011-10-01
9
399 Views
Last Modified: 2012-06-27
In a particular DB table, I need to locate a URL in one field, extract part of it, and copy it to another field.

The source field is (seemingly for no good reason) a MEDIUMBLOB, however it is simply html. In it is:

http://domain.com/blog/?p=XXXX

In the rows containing this URL in the field, how can I extract XXXX from the URL and copy it into another field?
0
Comment
Question by:SonicVoom
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 4
  • 3
  • 2
9 Comments
 
LVL 24

Expert Comment

by:johanntagle
ID: 36898209
Think you're better of selecting the needed columns into a file then processing it using Perl, Sed, Awk, etc.  MySQL has regular expressions, but afaik it's only for matching, not extracting.  Do something like:

select primary_key_column, medium_blob_column from tablename
where medium_blob_column regex 'http:\/\/domain.com\/blog';

Note: obviously untested :)

0
 
LVL 23

Expert Comment

by:nemws1
ID: 36898221
From what you describe, use SUBSTR and INSTR:
UPDATE my_table
SET new_field=SUBSTR(html_field, INSTR(html_field, 'p=')+2)
;

Open in new window

0
 
LVL 24

Expert Comment

by:johanntagle
ID: 36898228
newws1 - that might work, but I think you need to chop off also any text after xxxx since I think the URL is not the only text in the column.

SonicVoom - would help if we know what comes before and after the url, i.e. is it part of an href?
0
Get Database Help Now w/ Support & Database Audit

Keeping your database environment tuned, optimized and high-performance is key to achieving business goals. If your database goes down, so does your business. Percona experts have a long history of helping enterprises ensure their databases are running smoothly.

 
LVL 23

Expert Comment

by:nemws1
ID: 36898547
SonicVoom didn't say there was anything after, so I assume they want everything after the 'p=' portion of the URL.  If there is more, I can provide more. ;-)
0
 
LVL 2

Author Comment

by:SonicVoom
ID: 36900374
yes, it is an html link.

<a Href =" URL " >text<a>
0
 
LVL 23

Accepted Solution

by:
nemws1 earned 500 total points
ID: 36900439
Ah... still doable in a single SQL statement, just gets a slightly more complicated.

So, this grabs a portion from "html_field" and puts it into "new_field" (SUBSTR).  It does that by finding the first occurrence of "p=" in the field and starts looking 2 characters afterwards (to skip the "p=") - this value is stored in "@start".

Then it figures out the ending point by starting at the @start position (3rd argument to the 2nd LOCATE()) and looking for a double-quote.  However, this is an end point, and SUBSTR wants a length, so we just subtract our starting point (@start) and we have the length of part we want.
UPDATE my_table
SET new_field=SUBSTR(html_field,
                     @start:=LOCATE('p=', html_field)+2,
                     (LOCATE('"', html_field, @start) - @start)
                     )
;

Open in new window

0
 
LVL 2

Author Comment

by:SonicVoom
ID: 36906386
OK, I tried it and it works sometimes.

html_field = text and new_field=blog_ID

I failed to mention that this URL is buried in other html, and sometimes it might not exist. So sometimes it properly finds what I want, and sometimes not.

I modified the third line to include more of the actual url incase there was another p= somewhere in the database.
@start:=LOCATE('writingitreal.com/blog/?p=', `text`)+26,

It seems that when it couldn't find the search string, it just started with the 26th character of the first line.

The first line of almost every `text` field is:
<html xmlns:o="urn:schemas-microsoft-com:office:office"
so the result in many rows was:
s-microsoft-com:office:office

So I first need to make sure the string exists, and just abort if it's not present. What do I need to add?

Thanks
0
 
LVL 23

Assisted Solution

by:nemws1
nemws1 earned 500 total points
ID: 36906849
Just add a WHERE statement:
WHERE 'text' LIKE '%writingitreal.com/blog/?p=%'

Open in new window

0
 
LVL 2

Author Closing Comment

by:SonicVoom
ID: 36907235
Thank you. It took me a few minutes where to put WHERE, but it did exactly what I needed.

final:

WHERE 'text' LIKE '%writingitreal.com/blog/?p=%'
UPDATE articles
SET `post_ID`=SUBSTR(`text`,
                     @start:=LOCATE('writingitreal.com/blog/?p=', `text`)+26,
                     (LOCATE('"', `text`, @start) - @start)
                     ) WHERE `text` LIKE '%writingitreal.com/blog/?p=%'
;
0

Featured Post

Get Database Help Now w/ Support & Database Audit

Keeping your database environment tuned, optimized and high-performance is key to achieving business goals. If your database goes down, so does your business. Percona experts have a long history of helping enterprises ensure their databases are running smoothly.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

When crafting your “Why Us” page, there are a plethora of pitfalls to avoid. Follow these five tips, and you’ll be well on your way to creating an effective page.
Recently I was talking with Tim Sharp, one of my colleagues from our Technical Account Manager team about MongoDB’s scalability. While doing some quick training with some of the Percona team, Tim brought something to my attention...
The viewer will learn how to dynamically set the form action using jQuery.
Any person in technology especially those working for big companies should at least know about the basics of web accessibility. Believe it or not there are even laws in place that require businesses to provide such means for the disabled and aging p…

739 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question