Solved

I need to replace similar text in 4600+ rows

Posted on 2011-03-13
11
773 Views
Last Modified: 2012-05-11
Dear experts i have 10000 posts that i'm manually editing, but i need a faster way.

The problem:
4600+ posts contain has a string that i need to be removed.



How am i proceeding at the moment:

in phpMyAdmin i run the query
SELECT post_content FROM wp_posts WHERE post_content like '%. Written by %'

Open in new window


And i get what you can see in this image
http://i.imgur.com/iNeu5.png

Then i edit one of these posts in order to copy-paste the string "Written by Author"
http://i.imgur.com/tEYXF.png

so i can create for example the sql instruction
update wp_posts set post_content = replace(post_content,' Written by Alfred Jingle','');

Open in new window


and run it in the hope that i can remove as more text strings as possible (each author usually wrote 1 to 50 posts). if i'm lucky i can remove 50 text strings in a sql run, else just 7 like in this case (that's frustrating)
http://i.imgur.com/Mo6g5.png

Luckily the content of all posts has the same structure: after the name of the author i find the text
"</span></p><p>

Open in new window

so i think that there could be a way to find the name of all authors and create a query to remove them.



What i need:

A way to obtain a sql query containing all the instructions ready to be run like this (hand made) example:

update wp_posts set `post_content` = replace(post_content,' Written by Alfred Jingle','');
update wp_posts set `post_content` = replace(post_content,' Written by Steve Rahaman','');
update wp_posts set `post_content` = replace(post_content,' Written by Corey Hatch','');
update wp_posts set `post_content` = replace(post_content,' Written by Huggo','');
update wp_posts set `post_content` = replace(post_content,' Written by Leo &#x3C;mleonar1@ic3.ithaca.edu&#x3E;','');
update wp_posts set `post_content` = replace(post_content,' Written by Tad Dibbern &#x3C;DIBBERN_D@a1.mscf.upenn.edu&#x3E;','');
update wp_posts set `post_content` = replace(post_content,' Written by &#x3C;jhailey@hotmail.com&#x3E;','');
update wp_posts set `post_content` = replace(post_content,' Written by Steve Richer &#x3C;sricher@sympatico.ca&#x3E;','');
update wp_posts set `post_content` = replace(post_content,' Written by Murray Chapman &#x3C;muzzle@cs.uq.oz.au&#x3E;','');
update wp_posts set `post_content` = replace(post_content,' Written by Anonymous','');
update wp_posts set `post_content` = replace(post_content,' Written by Abatoli Maksimov','');
update wp_posts set `post_content` = replace(post_content,' Written by Orange','');
update wp_posts set `post_content` = replace(post_content,' Written by Ed Sutton &#x3C;esutton@mindspring.com&#x3E;','');
update wp_posts set `post_content` = replace(post_content,' Written by Jim Beaver &#x3C;jumblejim@prodigy.net&#x3E;','');
update wp_posts set `post_content` = replace(post_content,' Written by John Vogel &#x3C;jlvogel@comcast.net&#x3E;','');
update wp_posts set `post_content` = replace(post_content,' Written by FilmFanUK','');
update wp_posts set `post_content` = replace(post_content,' Written by Film_Fan','');
update wp_posts set `post_content` = replace(post_content,' Written by Sujit R. Varma','');
update wp_posts set `post_content` = replace(post_content,' Written by Sujit R. Varma','');
update wp_posts set `post_content` = replace(post_content,' Written by Anthony Pereyra {hypersonic91@yahoo.com}','');
update wp_posts set `post_content` = replace(post_content,' Written by KGF Vissers','');
update wp_posts set `post_content` = replace(post_content,' Written by Jes Beard &#x3C;jesbeard@bellsouth.net&#x3E;','');
update wp_posts set `post_content` = replace(post_content,' Written by anonymous','');
update wp_posts set `post_content` = replace(post_content,' Written by IMDb Editors','');
update wp_posts set `post_content` = replace(post_content,' Written by Ryan McIntosh &#x3C;Save_Ferris85@hotmail.com&#x3E;','');
update wp_posts set `post_content` = replace(post_content,' Written by Sean Daly','');
update wp_posts set `post_content` = replace(post_content,' Written by matt-282','');
update wp_posts set `post_content` = replace(post_content,' Written by Grand Ilusions','');
update wp_posts set `post_content` = replace(post_content,' Written by Jonathan Ruskin &#x3C;JonRuskin@aol.com&#x3E;','');
update wp_posts set `post_content` = replace(post_content,' Written by PhatBleek','');
update wp_posts set `post_content` = replace(post_content,' Written by Julian Reischl &#x3C;julianreischl@mac.com&#x3E;','');
update wp_posts set `post_content` = replace(post_content,' Written by Jeff Hole &#x3C;jeffhole@aol.com&#x3E;','');
update wp_posts set `post_content` = replace(post_content,' Written by Michael Feller &#x3C;reb@magna.com.au&#x3E;','');
update wp_posts set `post_content` = replace(post_content,' Written by Tim Kretschmann &#x3C;Tim.K@VirComm.com&#x3E;','');
update wp_posts set `post_content` = replace(post_content,' Written by Tony B','');
update wp_posts set `post_content` = replace(post_content,' Written by Stephen Currence &#x3C;billyc@erols.com&#x3E;','');
update wp_posts set `post_content` = replace(post_content,' Written by wb','');
update wp_posts set `post_content` = replace(post_content,' Written by MGM Studios','');

Open in new window



Thank you for reading



0
Comment
Question by:fario
  • 5
  • 3
  • 2
  • +1
11 Comments
 
LVL 8

Assisted Solution

by:dmeeren
dmeeren earned 50 total points
ID: 35121234
I always program a little php file for this kind of actions, here is what i use:

## Read the database
#
$sql = "SELECT `url`, `gal_id` FROM `" . $tabel . "` WHERE `url` LIKE '%&amp;%'";

$progress .=$lan_sqlcommand . ":" . $sql . "<br>\n";
$res3 = mysql_query($sql) or die(mysql_error());
$num3 = mysql_num_rows($res3);

if (empty($num3))
  {
  	$progress .="No Gallery's found";
  	## Nothing found
	#
  }
else
  {
	while($row = mysql_fetch_object($res3))
	  {
		$url="$row->url";
		$gal_id="$row->gal_id";
		
		$new_url = str_replace("&amp;", "&", $url);
		
		$sql = "UPDATE `" . $tabel . "` SET `url`='" . $new_url . "' WHERE `gal_id`='" . $gal_id . "'";
		
		$progress .=$lan_sqlcommand . ":" . $sql . "<br>\n";	
		$res2 = mysql_query($sql) or die(mysql_error());
	  }
  }

print $progress;

Open in new window

0
 
LVL 34

Assisted Solution

by:Beverley Portlock
Beverley Portlock earned 50 total points
ID: 35122106
I would a regular expression for this. The key point is to establish the end of pattern. It clearly starts with "Written by" and it seems to end with a double quote. On that basis we can take code like this sample
<?php

$data = 'adasdsadsad. sadd sa . Written by Ryan McIntosh &#x3C;Save_Ferris85@hotmail.com&#x3E;" dsdsadsd';

$pattern = '#(.*?)Written by[^"]+(.*?)#s';

$newData = preg_replace( $pattern, '$1$2', $data );

echo $newData;
?>

Open in new window


and adapt it along the lines of this UNTESTED code. Always remeber to have a backup of data before doing this sort of update

$pattern = '#(.*?)Written by[^"]+(.*?)#s';

$rs = mysql_query("Select * from wp_posts where post_content REGEXP '.*Written by [^"]+.*' ");

if ( $rs )
     while ( $rw = mysql_fetch_assoc( $rs ) ) {

          $newData = preg_replace( $pattern, '$1$2', $rw['post_content'] );

          mysql_query("update wp_posts set post_content = '$newData' where ....some unique key....");
     }

Open in new window

0
 
LVL 1

Author Comment

by:fario
ID: 35137448
Parse error: syntax error, unexpected ']' in /home/watchon3/public_html/correzione-autori/correzione2.php on line 34

This is my code

<?php
include_once 'master_config.inc.php';
include_once 'db.class.inc.php';
$db=new db($DB_LOCALHOST,$DB_DATABASE,$DB_USERNAME,$DB_PASSWORD);

/*
$data = 'adasdsadsad. sadd sa . Written by Ryan McIntosh &#x3C;Save_Ferris85@hotmail.com&#x3E;" dsdsadsd';
echo 'starting string:<br>'.$data.'<br><br>';

$pattern = '#(.*?)Written by[^"]+(.*?)#s';

$newData = preg_replace( $pattern, '$1$2', $data );

echo 'new text:<br>'.$newData.'<br>';
echo $newData;


$pattern = '#(.*?)Written by[^"]+(.*?)#s';

$rs = mysql_query("Select * from wp_posts where post_content REGEXP '.*Written by [^"]+.*' ");

if ( $rs )
     while ( $rw = mysql_fetch_assoc( $rs ) ) {

          $newData = preg_replace( $pattern, '$1$2', $rw['post_content'] );

          mysql_query("update wp_posts set post_content = '$newData' where ....some unique key....");
     }
*/


$pattern = '#(.*?)Written by[^"]+(.*?)#s';

$query = "Select * from wp_posts where post_content REGEXP '.*Written by [^"]+.*' LIMIT 100";

$rs = mysql_query($query);

if ( $rs )
     while ( $rw = mysql_fetch_assoc( $rs ) ) {

          $newData = preg_replace( $pattern, '$1$2', $rw['post_content'] );

          //mysql_query("update wp_posts set post_content = '$newData' where ....some unique key....");
          echo("update wp_posts set post_content = '$newData' where ....some unique key....");
     }


?>

Open in new window

0
 
LVL 3

Expert Comment

by:CarlsbergFTW
ID: 35137995
I'VE Tried developing an sql statement to help you with your update unfortunately i don't have a mysql database available and worked with oracle one.

I'm not sure if the fuctions have the same correspondence in mysql but i'm sure that if they don't have the same name ,  they exist in some kind.

here's what i've came to till now, maybe it will serve as an example.

and here is the code i've used:

SELECT   REPLACE (
            text,
            (SUBSTR (
                text,
                INSTR (text, 'Written by'),
                (INSTR (text, '"</span></p><p>'))
                - (INSTR (text, 'Written by'))
             )),
            'THIS TO BE REPLACED'
         )
  FROM   POST

Open in new window



Again this is tested under oracle SQL.

Have a nice day.

sql-example.JPG
0
 
LVL 34

Expert Comment

by:Beverley Portlock
ID: 35138423
It's the double quote. Change line 34 from

$query = "Select * from wp_posts where post_content REGEXP '.*Written by [^"]+.*' LIMIT 100";

to

$query = "Select * from wp_posts where post_content REGEXP '.*Written by [^\"]+.*' LIMIT 100";
0
Why You Should Analyze Threat Actor TTPs

After years of analyzing threat actor behavior, it’s become clear that at any given time there are specific tactics, techniques, and procedures (TTPs) that are particularly prevalent. By analyzing and understanding these TTPs, you can dramatically enhance your security program.

 
LVL 3

Expert Comment

by:CarlsbergFTW
ID: 35138518
i'd like to add thet the example i've provided will not work for all possibilities, i guess that the "written by part is in most cases at the end of the article -as per your example- so then we begin searching from the end and should only update the last "written by bla bla bla "</span></p><p>" in your table :

SELECT   REPLACE (
            post_content,
            (SUBSTR (
                post_content,
                INSTR (post_content, 'Written by',-1),
                (INSTR (post_content, '"</span></p><p>',-1))
                - (INSTR (post_content, 'Written by',-1))
             )),
            ''
         )
  FROM   wp_posts

Open in new window


The update code should be:
update wp_posts set post_content=(  REPLACE (
            post_content,
            (SUBSTR (
                post_content,
                INSTR (post_content, 'Written by',-1),
                (INSTR (post_content, '"</span></p><p>',-1))
                - (INSTR (post_content, 'Written by',-1))
             )),
            ''
         ))

Open in new window

Done some research on google and mysql's substr is equivalent to "substring()"

IMPORTANT: please test  -back up your table / database- this before you run it

sql-example.JPG
0
 
LVL 1

Author Comment

by:fario
ID: 35138903
#1305 - FUNCTION watchon3_wrdp1.SUBSTR does not exist



removing the spaces after the SUBSTR and INSTR i get

#1064 - You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '-1)) - (INSTR(post_content, '. Written by',-1)) )), ''))' at line 1
0
 
LVL 1

Author Comment

by:fario
ID: 35138984
Forget about the automatic update of the DB

How can i use the REGEXP and my Query

SELECT post_content FROM wp_posts WHERE post_content like '%. Written by %'

Open in new window



to simply list all the author name ?
0
 
LVL 3

Accepted Solution

by:
CarlsbergFTW earned 400 total points
ID: 35139127
use substring for mysql
0
 
LVL 1

Assisted Solution

by:fario
fario earned 0 total points
ID: 35139689
OK thanks

You helped me to find out this

SELECT SUBSTRING(post_content FROM LOCATE('. Written by ', post_content) FOR (LOCATE('"</span></p><p>', post_content) - LOCATE('. Written by ', post_content)))  FROM wp_posts WHERE post_content like '%. Written by %'

Open in new window

0
 
LVL 1

Author Closing Comment

by:fario
ID: 35174536
It's not an accurate solution, but helped thanks
0

Featured Post

Find Ransomware Secrets With All-Source Analysis

Ransomware has become a major concern for organizations; its prevalence has grown due to past successes achieved by threat actors. While each ransomware variant is different, we’ve seen some common tactics and trends used among the authors of the malware.

Join & Write a Comment

Part of the Global Positioning System A geocode (https://developers.google.com/maps/documentation/geocoding/) is the major subset of a GPS coordinate (http://en.wikipedia.org/wiki/Global_Positioning_System), the other parts being the altitude and t…
This article discusses four methods for overlaying images in a container on a web page
Learn how to match and substitute tagged data using PHP regular expressions. Demonstrated on Windows 7, but also applies to other operating systems. Demonstrated technique applies to PHP (all versions) and Firefox, but very similar techniques will w…
This tutorial will teach you the core code needed to finalize the addition of a watermark to your image. The viewer will use a small PHP class to learn and create a watermark.

758 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

20 Experts available now in Live!

Get 1:1 Help Now