MySQL: extract data from BLOB, modify, place into another table

In a particular DB table, I need to locate a URL in one field, extract part of it, and copy it to another field.

The source field is (seemingly for no good reason) a MEDIUMBLOB, however it is simply html. In it is:

http://domain.com/blog/?p=XXXX

In the rows containing this URL in the field, how can I extract XXXX from the URL and copy it into another field?
LVL 2
SonicVoomAsked:
Who is Participating?
 
nemws1Connect With a Mentor Database AdministratorCommented:
Ah... still doable in a single SQL statement, just gets a slightly more complicated.

So, this grabs a portion from "html_field" and puts it into "new_field" (SUBSTR).  It does that by finding the first occurrence of "p=" in the field and starts looking 2 characters afterwards (to skip the "p=") - this value is stored in "@start".

Then it figures out the ending point by starting at the @start position (3rd argument to the 2nd LOCATE()) and looking for a double-quote.  However, this is an end point, and SUBSTR wants a length, so we just subtract our starting point (@start) and we have the length of part we want.
UPDATE my_table
SET new_field=SUBSTR(html_field,
                     @start:=LOCATE('p=', html_field)+2,
                     (LOCATE('"', html_field, @start) - @start)
                     )
;

Open in new window

0
 
johanntagleCommented:
Think you're better of selecting the needed columns into a file then processing it using Perl, Sed, Awk, etc.  MySQL has regular expressions, but afaik it's only for matching, not extracting.  Do something like:

select primary_key_column, medium_blob_column from tablename
where medium_blob_column regex 'http:\/\/domain.com\/blog';

Note: obviously untested :)

0
 
nemws1Database AdministratorCommented:
From what you describe, use SUBSTR and INSTR:
UPDATE my_table
SET new_field=SUBSTR(html_field, INSTR(html_field, 'p=')+2)
;

Open in new window

0
Improve Your Query Performance Tuning

In this FREE six-day email course, you'll learn from Janis Griffin, Database Performance Evangelist. She'll teach 12 steps that you can use to optimize your queries as much as possible and see measurable results in your work. Get started today!

 
johanntagleCommented:
newws1 - that might work, but I think you need to chop off also any text after xxxx since I think the URL is not the only text in the column.

SonicVoom - would help if we know what comes before and after the url, i.e. is it part of an href?
0
 
nemws1Database AdministratorCommented:
SonicVoom didn't say there was anything after, so I assume they want everything after the 'p=' portion of the URL.  If there is more, I can provide more. ;-)
0
 
SonicVoomAuthor Commented:
yes, it is an html link.

<a Href =" URL " >text<a>
0
 
SonicVoomAuthor Commented:
OK, I tried it and it works sometimes.

html_field = text and new_field=blog_ID

I failed to mention that this URL is buried in other html, and sometimes it might not exist. So sometimes it properly finds what I want, and sometimes not.

I modified the third line to include more of the actual url incase there was another p= somewhere in the database.
@start:=LOCATE('writingitreal.com/blog/?p=', `text`)+26,

It seems that when it couldn't find the search string, it just started with the 26th character of the first line.

The first line of almost every `text` field is:
<html xmlns:o="urn:schemas-microsoft-com:office:office"
so the result in many rows was:
s-microsoft-com:office:office

So I first need to make sure the string exists, and just abort if it's not present. What do I need to add?

Thanks
0
 
nemws1Connect With a Mentor Database AdministratorCommented:
Just add a WHERE statement:
WHERE 'text' LIKE '%writingitreal.com/blog/?p=%'

Open in new window

0
 
SonicVoomAuthor Commented:
Thank you. It took me a few minutes where to put WHERE, but it did exactly what I needed.

final:

WHERE 'text' LIKE '%writingitreal.com/blog/?p=%'
UPDATE articles
SET `post_ID`=SUBSTR(`text`,
                     @start:=LOCATE('writingitreal.com/blog/?p=', `text`)+26,
                     (LOCATE('"', `text`, @start) - @start)
                     ) WHERE `text` LIKE '%writingitreal.com/blog/?p=%'
;
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.