Solved

PHP Changing url in text to html link

Posted on 2008-11-01
Last Modified: 2012-08-14
If I have 200 characters of text that contain a url, what is the fastest way to search the text for a url and, if there is one, wrap it in an anchor tag?

Actually I'll have an array of text snippets that I need to loop through, if that will matter.

foreach($temptext as $key => $node) {

$text[$key] = $node->textContent;

}

At some point, I need to search for urls in each $node->textContent and convert them to clickable links.

I can figure out how to do it with a series of string functions that searches for http, extracts it, wraps it in an anchor tag, etc. but I'm sure there is a better (more efficient) way to do it.
Question by:gwkg
 
LVL 39

Expert Comment

by:Roger Baklund
ID: 22859545
Can you show a sample?
 
LVL 39

Expert Comment

by:Roger Baklund
ID: 22859549
Also, is this http/https urls only?
 
LVL 31

Author Comment

by:gwkg
ID: 22859561
Yes, http only.

I'm not sure if you want to process the text before or after it is assigned to the $text array, but here is a sample after it is assigned.

$text[1] = "I'm sitting at my desk typing.";
$text[2] = "http://mydomain.com";
$text[3] = "Check out this site I found http://thissiteifound.com";

I want to change $text[2] into <a href="http://mydomain.com">http://mydomain.com</a>
and $text[3] into Check out this site I found <a href="http://thissiteifound.com">http://thissiteifound.com</a>


 
LVL 31

Author Comment

by:gwkg
ID: 22859649
I'm pulling the nodes from an XML file I grab with curl, in case it matters where $temptext is coming from:

$xml = curl_exec($ch);
$dom = new DOMDocument();
@$dom->loadXML($xml);
$xpath = new DOMXPath($dom);
$temptext = $xpath->evaluate("//status/text");
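For anyone who wants to try this without a live feed, here is a minimal sketch of the same DOM/XPath steps. The inline XML string is an assumption standing in for the curl response (the real feed's structure is not shown in the thread beyond the //status/text path).

```php
<?php
// Assumed stand-in for the curl response; only the status/text nesting comes from the thread.
$xml = '<statuses>'
     . '<status><text>Check out http://example.com</text></status>'
     . '<status><text>No link here</text></status>'
     . '</statuses>';

$dom = new DOMDocument();
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);

// Returns a DOMNodeList of every <text> element that sits inside a <status> element.
$temptext = $xpath->evaluate('//status/text');

// Copy the text of each node into a plain array, as in the question.
$text = [];
foreach ($temptext as $key => $node) {
    $text[$key] = $node->textContent;
}

print_r($text);
```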
 
LVL 39

Accepted Solution

by:
Roger Baklund earned 500 total points
ID: 22859659
Ok, I took a shot at it. Note that this regexp is not properly tested, I just wrote it now. Test it and see if it does what you need.
foreach($temptext as $key => $node) {
  $text[$key] = preg_replace(
    '!(.*?)(https?://[a-z0-9\.\-/\?&\+~_=%#;@$]+)([ ]|$)(.*)!',
    '\1<a href="\2">\2</a>\3\4',
    $node->textContent);
}
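As a quick check, the pattern above can be run against the three sample strings from the question. This is only a sketch: the $samples array is an assumption that stands in for the $node->textContent values.

```php
<?php
// Sample strings copied from the question; they stand in for $node->textContent.
$samples = [
    1 => "I'm sitting at my desk typing.",
    2 => "http://mydomain.com",
    3 => "Check out this site I found http://thissiteifound.com",
];

// The pattern and replacement exactly as posted in the answer.
$pattern     = '!(.*?)(https?://[a-z0-9\.\-/\?&\+~_=%#;@$]+)([ ]|$)(.*)!';
$replacement = '\1<a href="\2">\2</a>\3\4';

$text = [];
foreach ($samples as $key => $snippet) {
    // Strings without a URL come back unchanged.
    $text[$key] = preg_replace($pattern, $replacement, $snippet);
}

print_r($text);
```

One thing to be aware of: because the trailing (.*) consumes the rest of the string, only the first URL in each snippet gets wrapped. That is fine for the samples here, where each snippet has at most one URL.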

 
LVL 39

Assisted Solution

by:Roger Baklund
Roger Baklund earned 500 total points
ID: 22859665
...and I already found a weakness: it does not match UPPERCASE letters. Add an i at the end:

'!(.*?)(https?://[a-z0-9\.\-/\?&\+~_=%#;@$]+)([ ]|$)(.*)!i'
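A small sketch of what the i modifier changes, using preg_match on an uppercase URL. The variable names and the sample string are illustrative assumptions, not part of the thread.

```php
<?php
// The same pattern, without and with the trailing i modifier.
$caseSensitive   = '!(.*?)(https?://[a-z0-9\.\-/\?&\+~_=%#;@$]+)([ ]|$)(.*)!';
$caseInsensitive = $caseSensitive . 'i';

// Hypothetical input containing an all-uppercase URL.
$sample = 'See HTTP://MYDOMAIN.COM';

// preg_match returns 1 when the pattern matches and 0 when it does not.
$withoutI = preg_match($caseSensitive, $sample);
$withI    = preg_match($caseInsensitive, $sample);

var_dump($withoutI, $withI);
```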
 
LVL 31

Author Comment

by:gwkg
ID: 22859710
Worked just like I need it to, thanks!

Quick question... what's the difference between your expression and this one produced by RegexBuddy?

\b(https?)://[-A-Z0-9+&@#/%?=~_|!:,.;]*[-A-Z0-9+&@#/%=~_|]

Their comment is "The final character class makes sure that if an URL is part of some text, punctuation such as a comma or full stop after the URL is not interpreted as part of the URL."

I know the basics, like [a-z] and wildcards, so maybe I should be more specific.

How is theirs case-insensitive without the trailing i?

Will yours include the punctuation in the link if the text is "Check out http://mydomain.com."?


I ask so next time I can try and get it right on my own.
 
LVL 31

Author Closing Comment

by:gwkg
ID: 31512405
I asked a follow up question if you have the time.  Thanks!
 
LVL 39

Expert Comment

by:Roger Baklund
ID: 22859791
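The RegexBuddy version does not include pattern modifiers, so it is not case-insensitive on its own. It could be enclosed in ! ... !i or similar when used (since that pattern itself contains a !, the ! would need escaping, or a different delimiter could be chosen).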

My version was weak when it comes to catching the end of the url. It requires a space, or to be at the end of the string: ([ ]|$). You can easily add the dot: ([ \.]|$), but specifying any character NOT in the list of valid chars is probably best: ([^a-z0-9\.\-/\?&\+~_=%#;@$]|$).
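Here is a hedged sketch of the RegexBuddy pattern from the question, put to use in PHP. The function name linkify_urls and the sample strings are assumptions for illustration. Curly braces are used as delimiters because the pattern contains most of the usual delimiter characters. Note that when . is also in the URL character class, a greedy match can still swallow a full stop at the end of the string; the RegexBuddy pattern avoids this because its final character class leaves out punctuation such as . , ; : and !.

```php
<?php
// Hypothetical helper name; not from the thread.
function linkify_urls(string $text): string
{
    // RegexBuddy's pattern from the question, wrapped in {} delimiters with the
    // i modifier. The last character class excludes . , ; : ! so punctuation
    // that follows a URL stays outside the link.
    $pattern = '{\b(https?)://[-A-Z0-9+&@#/%?=~_|!:,.;]*[-A-Z0-9+&@#/%=~_|]}i';

    // $0 is the whole match. Only the URL itself is matched, with no leading or
    // trailing groups, so every URL in the string is replaced, not just the first.
    return preg_replace($pattern, '<a href="$0">$0</a>', $text);
}

$single   = linkify_urls('Check out http://mydomain.com.');
$multiple = linkify_urls('See http://a.com, then HTTP://B.ORG/x?y=1!');

echo $single, "\n", $multiple, "\n";
```

In both sample strings the trailing full stop, comma, and exclamation mark end up after the closing anchor tag rather than inside the href.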
 
LVL 31

Author Comment

by:gwkg
ID: 22859814
I'm using it to pull my twitter feed so I can control whether or not there is a space.  

I'll need that extra info if I want to pull someone else's feed, though.

Thanks again