Mike Miller

Rapidshare Link Regular Expression

I am working on a project and the client wants to be able to check rapidshare links to tell if they are valid. So I need some help with a regular expression function for PHP that will pull all the rapidshare.com or rapidshare.de links out of the page source it gets as a variable, and return an array with them.

Does that make sense?
Regular Expressions, PHP

Last comment by b0lsc0tt on 8/22/2022 (Mon)

b0lsc0tt

qlogix,

You just need help getting the links with a regex, right?  If so, please provide a sample of the HTML you would get that contains the links.

Let me know if you have any questions or need more information.

b0lsc0tt
Mike Miller

ASKER
Here is the example:
<td valign="top" class="postbody"><div class="postbody_div">
<hr />
<img src="http://i213.photobucket.com/albums/cc98/warezsharez_december/i-sound.gif" alt="Image" title="Image" border="0" />
<br />
 
<br />
 
<br />
<span style="font-weight:bold">i-Sound WMA MP3 Recorder</span> turn your computer into complete home recording studio. You can record streaming audio into MP3, OGG, WMA, APE, WAV format sound file directly without costing any other disk space. Built-in scheduler allows you to record streaming audio from specified URL at predefined time. VOX system automatically monitors the input source and activates streaming recording when the input volume reaches a specified level. The recording automatically stops once the audio level drops below a specified threshold. Typical applications:
 
<br />
 
<br />
    * Convert <span style="font-weight:bold">Cassette</span> or <span style="font-weight:bold">LP</span> to MP3
<br />
    * <span style="font-weight:bold">Record Radio</span> with built-in scheduler
 
<br />
    * Streaming Audio Recorder:<span style="font-weight:bold">Capture Streaming Audio</span> to MP3,WMA,OGG
<br />
    * Record Lectures with <span style="font-weight:bold">Voice-Activation</span>
<br />
    * Real-time noise reduction
<br />
    * <span style="font-weight:bold">Record Skype Calls</span> (both sides)
 
<br />
    * Record protected <span style="font-weight:bold">M4P, WMA</span> and <span style="font-weight:bold">AAC</span> files to MP3 format legally.
<br />
    * Record <span style="font-weight:bold">MIDI to MP3</span> format
<br />
 
<br />
 
<br />
<span style="font-size:18px; line-height:normal"><span style="font-weight:bold">Download:</span></span>
<br />
<table width="90%" cellspacing="1" cellpadding="3" border="0" align="center">
<tr>
<td><span class="genmed"><strong>Code:</strong></span></td>
</tr>
<tr>
<td class="code">http&#58;//rapidshare.com/files/80236549/iSound_MP3_WMA_Recorder_Pro_v6.8.2.0.rar</td></tr></table>
<br />
<span style="font-weight:bold"><span style="font-size:10px; line-height:normal">Link checked on Mon Jul 21, 2008 11:38 am [WBB_Linkchecker_Bot]</span></span></div></td>
 
</tr>
<tr>
<td height="40" valign="bottom" class="genmed"><br />_________________<br />Removed, Signature May Not Be Bigger Than 500 x 200 ~ ashmo ~
<br />
 
<br />
 
<br />
<span style="font-weight:bold">Please Download This As A Free User:</span>
<br />
<table width="90%" cellspacing="1" cellpadding="3" border="0" align="center">
<tr>
<td><span class="genmed"><strong>Code:</strong></span></td>
</tr>
 
<tr>
<td class="code">http&#58;//rapidshare.com/files/79526037/Thank_You.mp3</td></tr></table><span class="postdetails"></span></td>
</tr>
</table>


b0lsc0tt

Thanks for the sample?  A complex regex to identify ANY URL could be more than we need.  Will all the URLs you want be http?  Do you want all URLs, or just those in an anchor tag, just those in the table cell, or somewhere else?  Will they all be rapidshare.com?

bol
b0lsc0tt

Oops.  I didn't need a question mark on that first sentence.  Sorry. :)

bol
Mike Miller

ASKER
Yes, I only want the rapidshare.com links. They will all be http://; however, they may be either rapidshare.com or rapidshare.de links.  I would like them to be matched anywhere in the source, since a lot of people don't place them in the "Code" tag on the forum.
b0lsc0tt

OK.  Thanks.  I realize I overlooked the answer to one of my questions.  The code below should do it.

preg_match_all('/http(?::|&#58;)\/\/rapidshare\.(?:com|de)\/[-A-Z0-9.]+\/[-A-Z0-9+&@#\/%=~_|!:,.;]*/i', $subject, $result, PREG_PATTERN_ORDER);
$result = $result[0];

It assumes the HTML is in the variable $subject.  It then puts the array of matched links in a variable named $result.  Let me know if you have a question or need more info.

bol
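
For readers who want the pattern above wrapped in a reusable function that returns an array, as the original question asked, here is a minimal sketch. The function name `extract_rapidshare_links` is hypothetical, and two steps are assumptions that go beyond what was posted in the thread: decoding `&#58;` back to `:` and removing duplicate links.

```php
<?php
/**
 * Return all rapidshare.com / rapidshare.de links found in an HTML string.
 * Hypothetical helper; not part of the original thread.
 */
function extract_rapidshare_links(string $html): array
{
    // Same pattern as posted above: accepts ":" or its entity form "&#58;".
    $pattern = '/http(?::|&#58;)\/\/rapidshare\.(?:com|de)\/[-A-Z0-9.]+\/[-A-Z0-9+&@#\/%=~_|!:,.;]*/i';

    // preg_match_all() returns 0 for no matches and false on a regex error.
    if (!preg_match_all($pattern, $html, $matches, PREG_PATTERN_ORDER)) {
        return [];
    }

    // Assumption: callers want a plain "http://" prefix, so decode the entity.
    $links = str_replace('&#58;', ':', $matches[0]);

    // Assumption: duplicates are not useful; array_values() reindexes from 0.
    return array_values(array_unique($links));
}

// Usage with a fragment of the sample HTML from this thread.
$sample = '<td class="code">http&#58;//rapidshare.com/files/79526037/Thank_You.mp3</td>';
print_r(extract_rapidshare_links($sample));
```

Decoding the entity is optional. Without it, links taken from the forum's "Code" boxes would still start with `http&#58;//`, which a later validity check would probably need to normalize anyway.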
SOLUTION
ddrudik

THIS SOLUTION ONLY AVAILABLE TO MEMBERS.
Mike Miller

ASKER
Umm, neither of those seems to work. I tested them on another site that is similar to my client's, and it said that there were no matches for either of them.

The page I tested on was: http://www.warez-bb.org/viewtopic.php?t=1061942
ASKER CERTIFIED SOLUTION
b0lsc0tt

THIS SOLUTION ONLY AVAILABLE TO MEMBERS.
Mike Miller

ASKER
OK, that worked. Apparently my server is having a problem accessing the site. Thanks!!
Mike Miller

ASKER
Thanks for your help!!
b0lsc0tt

You're welcome!  If the problem doesn't go away once the site is reachable again, then it may be the contents of that page.  If the HTML differs a lot from page to page, the script may fail to find a match.

I'm glad I could help.  Thanks for the grade, the points and the fun question.

bol
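
The "no matches" result in this thread turned out to be a fetch problem rather than a regex problem. The sketch below shows one way the two cases might be told apart. The function name `classify_scan` and the status strings are hypothetical, and it assumes the page is fetched with `file_get_contents()`, which the thread does not state.

```php
<?php
/**
 * Classify the outcome of scanning a fetched page for RapidShare links.
 * $html is whatever the fetch returned; file_get_contents() yields false on failure.
 * Hypothetical helper; the status strings are illustrative only.
 */
function classify_scan($html): array
{
    // A failed or empty fetch is reported separately from "no links found".
    if ($html === false || $html === '') {
        return ['status' => 'fetch_failed', 'links' => []];
    }

    // Same pattern as posted earlier in the thread.
    $pattern = '/http(?::|&#58;)\/\/rapidshare\.(?:com|de)\/[-A-Z0-9.]+\/[-A-Z0-9+&@#\/%=~_|!:,.;]*/i';
    $count = preg_match_all($pattern, $html, $matches);

    if ($count === false) {
        return ['status' => 'regex_error', 'links' => []];
    }
    if ($count === 0) {
        return ['status' => 'no_links', 'links' => []];
    }
    return ['status' => 'ok', 'links' => $matches[0]];
}

// Usage (assumed fetch method; $url is a placeholder for the page to check):
// $result = classify_scan(@file_get_contents($url));
// echo $result['status'], "\n";
```

Reporting `fetch_failed` separately makes it clear when the server could not reach the site at all, as opposed to the page loading but containing HTML the pattern does not match.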