Solved

PHP Changing url in text to html link

Posted on 2008-11-01
Last Modified: 2012-08-14
If I have 200 characters of text that contain a url, what is the fastest way to search the text for a url and, if there is one, wrap it in an anchor tag?

Actually I'll have an array of text snippets that I need to loop through, if that will matter.

foreach ($temptext as $key => $node) {
  $text[$key] = $node->textContent;
}

At some point, I need to search for urls in each $node->textContent and convert them to clickable links.

I can figure out how to do it with a series of string functions that search for http, extract the URL, wrap it in an anchor tag, etc., but I'm sure there is a better (more efficient) way to do it.
Question by:gwkg
 
LVL 39

Expert Comment

by:Roger Baklund
ID: 22859545
Can you show a sample?
0
 
LVL 39

Expert Comment

by:Roger Baklund
ID: 22859549
Also, is this http/https urls only?
0
 
LVL 31

Author Comment

by:gwkg
ID: 22859561
Yes, http only.

I'm not sure if you want to process the text before or after it is assigned to the $text array, but here is a sample after it is assigned.

$text[1] = "I'm sitting at my desk typing.";
$text[2] = "http://mydomain.com";
$text[3] = "Check out this site I found http://thissiteifound.com";

I want to change $text[2] into <a href="http://mydomain.com">http://mydomain.com</a>
and $text[3] into Check out this site I found <a href="http://thissiteifound.com">http://thissiteifound.com</a>


0
 
LVL 31

Author Comment

by:gwkg
ID: 22859649
I'm pulling the nodes from an XML file I grab with cURL, in case it matters where $temptext is coming from.

$xml = curl_exec($ch);
$dom = new DOMDocument();
@$dom->loadXML($xml);
$xpath = new DOMXPath($dom);
$temptext = $xpath->evaluate("//status/text");
0
 
LVL 39

Accepted Solution

by:
Roger Baklund earned 500 total points
ID: 22859659
OK, I took a shot at it. Note that this regexp is not properly tested; I just wrote it now. Test it and see if it does what you need.

foreach ($temptext as $key => $node) {
  $text[$key] = preg_replace(
    '!(.*?)(https?://[a-z0-9\.\-/\?&\+~_=%#;@$]+)([ ]|$)(.*)!',
    '\1<a href="\2">\2</a>\3\4',
    $node->textContent);
}
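
A note on the pattern above: because it ends in (.*), the first match swallows the rest of the line, so only one URL per line ends up linked. If several URLs per snippet are possible, a minimal sketch that matches only the URL itself might look like the following. The function name linkify_all is hypothetical, and the character class is simply the one from the answer with the i modifier added; it has had no more testing than the original.

```php
<?php
// Hypothetical helper: links every http/https URL in $text, not just the first.
// The character class is the one used in the answer above.
function linkify_all(string $text): string
{
    return preg_replace(
        '!https?://[a-z0-9.\-/?&+~_=%#;@$]+!i', // match the URL only, nothing around it
        '<a href="$0">$0</a>',                   // $0 is the whole matched URL
        $text
    );
}

echo linkify_all('First http://mydomain.com then http://thissiteifound.com'), "\n";
```

Since the surrounding text is never captured, preg_replace can carry on scanning after each URL and pick up the next one.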


 
LVL 39

Assisted Solution

by:Roger Baklund
Roger Baklund earned 500 total points
ID: 22859665
...and I already found a weakness: it does not match UPPERCASE letters. Add an i at the end:

'!(.*?)(https?://[a-z0-9\.\-/\?&\+~_=%#;@$]+)([ ]|$)(.*)!i'
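
To see what the trailing i changes, here is a small sketch. The pattern is a shortened, illustrative version of the one above, and the sample string is made up.

```php
<?php
// Illustrative pattern only: a shortened version of the character class above.
$pattern = '!https?://[a-z0-9.\-/]+!';
$subject = 'See HTTP://EXAMPLE.COM/PAGE';

// Without the i modifier, "http" and "a-z" match lowercase only.
$caseSensitive = preg_match($pattern, $subject);

// Appending i after the closing delimiter makes the whole pattern case-insensitive.
$caseInsensitive = preg_match($pattern . 'i', $subject);

var_dump($caseSensitive, $caseInsensitive); // int(0) then int(1)
```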
0
 
LVL 31

Author Comment

by:gwkg
ID: 22859710
Worked just like I need it to, thanks!

Quick question: what's the difference between your expression and this one produced by RegexBuddy?

\b(https?)://[-A-Z0-9+&@#/%?=~_|!:,.;]*[-A-Z0-9+&@#/%=~_|]

Their comment is "The final character class makes sure that if an URL is part of some text, punctuation such as a comma or full stop after the URL is not interpreted as part of the URL."

I know the basics, like [a-z] and wildcards, so maybe I should be more specific.

How is theirs case-insensitive without the trailing i?

Will yours include the punctuation in the link if the text is "Check out http://mydomain.com."?


I ask so that next time I can try to get it right on my own.
0
 
LVL 31

Author Closing Comment

by:gwkg
ID: 31512405
I asked a follow-up question if you have the time. Thanks!
0
 
LVL 39

Expert Comment

by:Roger Baklund
ID: 22859791
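The RegexBuddy version does not include delimiters or pattern modifiers. It has to be enclosed in delimiters with a trailing i when used, for example { ... }i. Note that ! ... !i will not work as-is for that pattern, because it contains a literal ! that PHP would take as the closing delimiter.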
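
As a sketch of how the RegexBuddy pattern could be used from PHP: it contains !, #, /, ~, % and @ as literal characters, so none of those can serve as the delimiter without escaping. Brace delimiters avoid the problem. The function name linkify_regexbuddy is hypothetical.

```php
<?php
// Hypothetical helper name. The pattern is the RegexBuddy one quoted in this
// thread, wrapped in { } delimiters with the i modifier appended.
function linkify_regexbuddy(string $text): string
{
    $pattern = '{\b(https?)://[-A-Z0-9+&@#/%?=~_|!:,.;]*[-A-Z0-9+&@#/%=~_|]}i';

    // $0 in the replacement stands for the whole matched URL.
    return preg_replace($pattern, '<a href="$0">$0</a>', $text);
}

echo linkify_regexbuddy('Check out http://mydomain.com.'), "\n";
// Check out <a href="http://mydomain.com">http://mydomain.com</a>.
```

The final character class leaves out the dot, comma, colon and similar punctuation, so the match backs off until it ends on a character that is allowed there.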

My version was weak when it comes to catching the end of the url. It requires a space, or to be at the end of the string: ([ ]|$). You can easily add the dot: ([ \.]|$), but specifying any character NOT in the list of valid chars is probably best: ([^a-z0-9\.\-/\?&\+~_=%#;@$]|$).
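
One caveat with the terminators suggested here: the dot and the semicolon are themselves in the list of valid URL characters, so the greedy character class still absorbs a trailing "." before any terminator is checked, and "Check out http://mydomain.com." would link the dot as well. A possible workaround, sketched under the assumption that a URL never legitimately ends in ".", ";" or "?", is to trim those characters in a callback. The function name linkify_trimmed is hypothetical.

```php
<?php
// Hypothetical helper: same character class as the answer, but trailing
// punctuation is moved out of the link and placed after the closing tag.
function linkify_trimmed(string $text): string
{
    return preg_replace_callback(
        '!https?://[a-z0-9.\-/?&+~_=%#;@$]+!i',
        function (array $m): string {
            // Assumption: a trailing ".", ";" or "?" belongs to the sentence, not the URL.
            $url  = rtrim($m[0], '.;?');
            $tail = (string) substr($m[0], strlen($url));

            return '<a href="' . $url . '">' . $url . '</a>' . $tail;
        },
        $text
    );
}

echo linkify_trimmed('Check out http://mydomain.com.'), "\n";
// Check out <a href="http://mydomain.com">http://mydomain.com</a>.
```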
0
 
LVL 31

Author Comment

by:gwkg
ID: 22859814
I'm using it to pull my Twitter feed, so I can control whether or not there is a space.

I'll need that extra info if I want to pull someone else's feed, though.

Thanks again
0


Question has a verified solution.
