Still celebrating National IT Professionals Day with 3 months of free Premium Membership. Use Code ITDAY17

x
?
Solved

MySQL: extract data from BLOB, modify, place into another table

Posted on 2011-10-01
9
Medium Priority
?
407 Views
Last Modified: 2012-06-27
In a particular DB table, I need to locate a URL in one field, extract part of it, and copy it to another field.

The source field is (seemingly for no good reason) a MEDIUMBLOB, however it is simply html. In it is:

http://domain.com/blog/?p=XXXX

In the rows containing this URL in the field, how can I extract XXXX from the URL and copy it into another field?
0
Comment
Question by:SonicVoom
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 4
  • 3
  • 2
9 Comments
 
LVL 24

Expert Comment

by:johanntagle
ID: 36898209
Think you're better of selecting the needed columns into a file then processing it using Perl, Sed, Awk, etc.  MySQL has regular expressions, but afaik it's only for matching, not extracting.  Do something like:

select primary_key_column, medium_blob_column from tablename
where medium_blob_column regex 'http:\/\/domain.com\/blog';

Note: obviously untested :)

0
 
LVL 23

Expert Comment

by:nemws1
ID: 36898221
From what you describe, use SUBSTR and INSTR:
UPDATE my_table
SET new_field=SUBSTR(html_field, INSTR(html_field, 'p=')+2)
;

Open in new window

0
 
LVL 24

Expert Comment

by:johanntagle
ID: 36898228
newws1 - that might work, but I think you need to chop off also any text after xxxx since I think the URL is not the only text in the column.

SonicVoom - would help if we know what comes before and after the url, i.e. is it part of an href?
0
RHCE - Red Hat OpenStack Prep Course

This course will provide in-depth training so that students who currently hold the EX200 & EX210 certifications can sit for the EX310 exam. Students will learn how to deploy & manage a full Red Hat environment with Ceph block storage, & integrate Ceph into other OpenStack service

 
LVL 23

Expert Comment

by:nemws1
ID: 36898547
SonicVoom didn't say there was anything after, so I assume they want everything after the 'p=' portion of the URL.  If there is more, I can provide more. ;-)
0
 
LVL 2

Author Comment

by:SonicVoom
ID: 36900374
yes, it is an html link.

<a Href =" URL " >text<a>
0
 
LVL 23

Accepted Solution

by:
nemws1 earned 2000 total points
ID: 36900439
Ah... still doable in a single SQL statement, just gets a slightly more complicated.

So, this grabs a portion from "html_field" and puts it into "new_field" (SUBSTR).  It does that by finding the first occurrence of "p=" in the field and starts looking 2 characters afterwards (to skip the "p=") - this value is stored in "@start".

Then it figures out the ending point by starting at the @start position (3rd argument to the 2nd LOCATE()) and looking for a double-quote.  However, this is an end point, and SUBSTR wants a length, so we just subtract our starting point (@start) and we have the length of part we want.
UPDATE my_table
SET new_field=SUBSTR(html_field,
                     @start:=LOCATE('p=', html_field)+2,
                     (LOCATE('"', html_field, @start) - @start)
                     )
;

Open in new window

0
 
LVL 2

Author Comment

by:SonicVoom
ID: 36906386
OK, I tried it and it works sometimes.

html_field = text and new_field=blog_ID

I failed to mention that this URL is buried in other html, and sometimes it might not exist. So sometimes it properly finds what I want, and sometimes not.

I modified the third line to include more of the actual url incase there was another p= somewhere in the database.
@start:=LOCATE('writingitreal.com/blog/?p=', `text`)+26,

It seems that when it couldn't find the search string, it just started with the 26th character of the first line.

The first line of almost every `text` field is:
<html xmlns:o="urn:schemas-microsoft-com:office:office"
so the result in many rows was:
s-microsoft-com:office:office

So I first need to make sure the string exists, and just abort if it's not present. What do I need to add?

Thanks
0
 
LVL 23

Assisted Solution

by:nemws1
nemws1 earned 2000 total points
ID: 36906849
Just add a WHERE statement:
WHERE 'text' LIKE '%writingitreal.com/blog/?p=%'

Open in new window

0
 
LVL 2

Author Closing Comment

by:SonicVoom
ID: 36907235
Thank you. It took me a few minutes where to put WHERE, but it did exactly what I needed.

final:

WHERE 'text' LIKE '%writingitreal.com/blog/?p=%'
UPDATE articles
SET `post_ID`=SUBSTR(`text`,
                     @start:=LOCATE('writingitreal.com/blog/?p=', `text`)+26,
                     (LOCATE('"', `text`, @start) - @start)
                     ) WHERE `text` LIKE '%writingitreal.com/blog/?p=%'
;
0

Featured Post

Learn how to optimize MySQL for your business need

With the increasing importance of apps & networks in both business & personal interconnections, perfor. has become one of the key metrics of successful communication. This ebook is a hands-on business-case-driven guide to understanding MySQL query parameter tuning & database perf

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

By, Vadim Tkachenko. In this article we’ll look at ClickHouse on its one year anniversary.
In this article, we’ll look at how to deploy ProxySQL.
The is a quite short video tutorial. In this video, I'm going to show you how to create self-host WordPress blog with free hosting service.
In this video, Percona Solution Engineer Dimitri Vanoverbeke discusses why you want to use at least three nodes in a database cluster. To discuss how Percona Consulting can help with your design and architecture needs for your database and infras…

688 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question