Solved

PHP Changing url in text to html link

Posted on 2008-11-01
Last Modified: 2012-08-14
If I have 200 characters of text that contain a url, what is the fastest way to search the text for a url and, if there is one, wrap it in an anchor tag?

Actually I'll have an array of text snippets that I need to loop through, if that will matter.

foreach($temptext as $key => $node) {

$text[$key] = $node->textContent;

}

At some point, I need to search for urls in each $node->textContent and convert them to clickable links.

I can figure out how to do it with a series of string functions that search for http, extract the url, wrap it in an anchor tag, etc., but I'm sure there is a better (more efficient) way to do it.
Question by:gwkg
LVL 39

Expert Comment

by:Roger Baklund
ID: 22859545
Can you show a sample?
 
LVL 39

Expert Comment

by:Roger Baklund
ID: 22859549
Also, is this for http/https urls only?
 
LVL 31

Author Comment

by:gwkg
ID: 22859561
Yes, http only.

I'm not sure if you want to process the text before or after it is assigned to the $text array, but here is a sample after it is assigned.

$text[1] = "I'm sitting at my desk typing.";
$text[2] = "http://mydomain.com";
$text[3] = "Check out this site I found http://thissiteifound.com";

I want to change $text[2] into <a href="http://mydomain.com">http://mydomain.com</a>
and $text[3] into Check out this site I found <a href="http://thissiteifound.com">http://thissiteifound.com</a>


 
LVL 31

Author Comment

by:gwkg
ID: 22859649
I'm pulling the nodes from an xml file I grab with curl, if it matters where $temptext is coming from:

$xml = curl_exec($ch);
$dom = new DOMDocument();
@$dom->loadXML($xml);
$xpath = new DOMXPath($dom);
$temptext = $xpath->evaluate("//status/text");
 
LVL 39

Accepted Solution

by:
Roger Baklund earned 500 total points
ID: 22859659
Ok, I took a shot at it. Note that this regexp is not properly tested; I just wrote it now. Test it and see if it does what you need.
foreach($temptext as $key => $node) {
  $text[$key] = preg_replace(
    '!(.*?)(https?://[a-z0-9\.\-/\?&\+~_=%#;@$]+)([ ]|$)(.*)!',
    '\1<a href="\2">\2</a>\3\4',
    $node->textContent);
}

 
LVL 39

Assisted Solution

by:Roger Baklund
Roger Baklund earned 500 total points
ID: 22859665
...and I already found a weakness... it does not match UPPERCASE letters... add an i at the end:

'!(.*?)(https?://[a-z0-9\.\-/\?&\+~_=%#;@$]+)([ ]|$)(.*)!i'
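A minimal sketch of a variant that links every url in a snippet, in case a snippet can hold more than one. With the pattern above, the leading (.*?) and trailing (.*) make a single match swallow the rest of the line, so a second url on the same line would stay unlinked. The function name linkify_all() is hypothetical; the character class is the one from the answer, and like the answer it does no HTML escaping.

```php
<?php
// Hypothetical helper: wraps every http/https url found in $text in an anchor tag.
function linkify_all($text)
{
    // No (.*?) prefix or (.*) suffix: preg_replace() already replaces every
    // non-overlapping match, so each url is handled on its own.
    // $0 in the replacement stands for the whole matched url.
    return preg_replace(
        '!https?://[a-z0-9\.\-/\?&\+~_=%#;@$]+!i',
        '<a href="$0">$0</a>',
        $text
    );
}

// Usage: both urls are wrapped, and the i modifier covers the uppercase one.
echo linkify_all("See http://a.com and HTTP://B.com/x?y=1 now");
```

Inside the original loop this would be called as $text[$key] = linkify_all($node->textContent);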
 
LVL 31

Author Comment

by:gwkg
ID: 22859710
Worked just like I need it to, thanks!

Quick question... what's the difference between your expression and this one produced by RegexBuddy?

\b(https?)://[-A-Z0-9+&@#/%?=~_|!:,.;]*[-A-Z0-9+&@#/%=~_|]

Their comment is "The final character class makes sure that if an URL is part of some text, punctuation such as a comma or full stop after the URL is not interpreted as part of the URL."

I know the basics, like [a-z] and wildcards, so maybe I should be more specific.

How is theirs case insensitive without the trailing i?

Will yours include the punctuation in the link if the text is "Check out http://mydomain.com."?


I ask so next time I can try and get it right on my own.
 
LVL 31

Author Closing Comment

by:gwkg
ID: 31512405
I asked a follow up question if you have the time.  Thanks!
 
LVL 39

Expert Comment

by:Roger Baklund
ID: 22859791
The RegexBuddy version does not include pattern modifiers. It could be enclosed in ! ... !i or similar when used.

My version was weak when it comes to catching the end of the url. It requires a space, or to be at the end of the string: ([ ]|$). You can easily add the dot: ([ \.]|$), but specifying any character NOT in the list of valid chars is probably best: ([^a-z0-9\.\-/\?&\+~_=%#;@$]|$).
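On the trailing-punctuation question, here is a rough sketch that uses the RegexBuddy pattern quoted earlier, untested beyond the sample sentence. Two assumptions to flag: the function name linkify_trimmed() is made up for illustration, and the delimiters are { } rather than ! because that pattern contains a literal ! inside its character class. As far as I can tell, changing only the terminator group is not enough on its own, since \. is in the url character class and the greedy + takes a final dot before the terminator is tried; the separate last character class without the dot is what keeps it out.

```php
<?php
// Hypothetical helper: links urls but leaves trailing punctuation outside the link.
function linkify_trimmed($text)
{
    // RegexBuddy pattern from the thread, wrapped in { } delimiters with the
    // i modifier. The last character class omits . , : ; ! and ? so a url
    // cannot end on one of those characters.
    return preg_replace(
        '{\b(https?)://[-A-Z0-9+&@#/%?=~_|!:,.;]*[-A-Z0-9+&@#/%=~_|]}i',
        '<a href="$0">$0</a>',
        $text
    );
}

echo linkify_trimmed("Check out http://mydomain.com.");
// prints: Check out <a href="http://mydomain.com">http://mydomain.com</a>.
```

The dots inside the url still match through the first character class; only the final character is restricted.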
 
LVL 31

Author Comment

by:gwkg
ID: 22859814
I'm using it to pull my Twitter feed, so I can control whether or not there is a space.

I'll need that extra info if I want to pull someone else's feed, though.

Thanks again