Solved

MySQL: extract data from BLOB, modify, place into another table

Posted on 2011-10-01
9
383 Views
Last Modified: 2012-06-27
In a particular DB table, I need to locate a URL in one field, extract part of it, and copy it to another field.

The source field is (seemingly for no good reason) a MEDIUMBLOB, however it is simply html. In it is:

http://domain.com/blog/?p=XXXX

In the rows containing this URL in the field, how can I extract XXXX from the URL and copy it into another field?
0
Comment
Question by:SonicVoom
  • 4
  • 3
  • 2
9 Comments
 
LVL 24

Expert Comment

by:johanntagle
Comment Utility
Think you're better of selecting the needed columns into a file then processing it using Perl, Sed, Awk, etc.  MySQL has regular expressions, but afaik it's only for matching, not extracting.  Do something like:

select primary_key_column, medium_blob_column from tablename
where medium_blob_column regex 'http:\/\/domain.com\/blog';

Note: obviously untested :)

0
 
LVL 23

Expert Comment

by:nemws1
Comment Utility
From what you describe, use SUBSTR and INSTR:
UPDATE my_table
SET new_field=SUBSTR(html_field, INSTR(html_field, 'p=')+2)
;

Open in new window

0
 
LVL 24

Expert Comment

by:johanntagle
Comment Utility
newws1 - that might work, but I think you need to chop off also any text after xxxx since I think the URL is not the only text in the column.

SonicVoom - would help if we know what comes before and after the url, i.e. is it part of an href?
0
 
LVL 23

Expert Comment

by:nemws1
Comment Utility
SonicVoom didn't say there was anything after, so I assume they want everything after the 'p=' portion of the URL.  If there is more, I can provide more. ;-)
0
Microsoft Certification Exam 74-409

Veeam® is happy to provide the Microsoft community with a study guide prepared by MVP and MCT, Orin Thomas. This guide will take you through each of the exam objectives, helping you to prepare for and pass the examination.

 
LVL 2

Author Comment

by:SonicVoom
Comment Utility
yes, it is an html link.

<a Href =" URL " >text<a>
0
 
LVL 23

Accepted Solution

by:
nemws1 earned 500 total points
Comment Utility
Ah... still doable in a single SQL statement, just gets a slightly more complicated.

So, this grabs a portion from "html_field" and puts it into "new_field" (SUBSTR).  It does that by finding the first occurrence of "p=" in the field and starts looking 2 characters afterwards (to skip the "p=") - this value is stored in "@start".

Then it figures out the ending point by starting at the @start position (3rd argument to the 2nd LOCATE()) and looking for a double-quote.  However, this is an end point, and SUBSTR wants a length, so we just subtract our starting point (@start) and we have the length of part we want.
UPDATE my_table
SET new_field=SUBSTR(html_field,
                     @start:=LOCATE('p=', html_field)+2,
                     (LOCATE('"', html_field, @start) - @start)
                     )
;

Open in new window

0
 
LVL 2

Author Comment

by:SonicVoom
Comment Utility
OK, I tried it and it works sometimes.

html_field = text and new_field=blog_ID

I failed to mention that this URL is buried in other html, and sometimes it might not exist. So sometimes it properly finds what I want, and sometimes not.

I modified the third line to include more of the actual url incase there was another p= somewhere in the database.
@start:=LOCATE('writingitreal.com/blog/?p=', `text`)+26,

It seems that when it couldn't find the search string, it just started with the 26th character of the first line.

The first line of almost every `text` field is:
<html xmlns:o="urn:schemas-microsoft-com:office:office"
so the result in many rows was:
s-microsoft-com:office:office

So I first need to make sure the string exists, and just abort if it's not present. What do I need to add?

Thanks
0
 
LVL 23

Assisted Solution

by:nemws1
nemws1 earned 500 total points
Comment Utility
Just add a WHERE statement:
WHERE 'text' LIKE '%writingitreal.com/blog/?p=%'

Open in new window

0
 
LVL 2

Author Closing Comment

by:SonicVoom
Comment Utility
Thank you. It took me a few minutes where to put WHERE, but it did exactly what I needed.

final:

WHERE 'text' LIKE '%writingitreal.com/blog/?p=%'
UPDATE articles
SET `post_ID`=SUBSTR(`text`,
                     @start:=LOCATE('writingitreal.com/blog/?p=', `text`)+26,
                     (LOCATE('"', `text`, @start) - @start)
                     ) WHERE `text` LIKE '%writingitreal.com/blog/?p=%'
;
0

Featured Post

Do You Know the 4 Main Threat Actor Types?

Do you know the main threat actor types? Most attackers fall into one of four categories, each with their own favored tactics, techniques, and procedures.

Join & Write a Comment

Load balancing is the method of dividing the total amount of work performed by one computer between two or more computers. Its aim is to get more work done in the same amount of time, ensuring that all the users get served faster.
Envision that you are chipping away at another e-business site with a team of pundit developers and designers. Everything seems, by all accounts, to be going easily.
This tutorial walks through the best practices in adding a local business to Google Maps including how to properly search for duplicates, marker placement, and inputing business details. Login to your Google Account, then search for "Google Mapmaker…
The viewer will learn how to create and use a small PHP class to apply a watermark to an image. This video shows the viewer the setup for the PHP watermark as well as important coding language. Continue to Part 2 to learn the core code used in creat…

771 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

10 Experts available now in Live!

Get 1:1 Help Now