Fernanditos

Function to convert Text URL into HTML URL

I recently opened this question here:
https://www.experts-exchange.com/questions/27447649/Converting-TEXT-urls-into-HTML-urls.html

But after testing several situations I noticed that the function had some gaps.

In the previous solution, the link breaks if the URL contains white space. This seems hard to handle, because even the EE editor has this issue.

Example: I have these text URLs in a variable:

$urls = "
http://www.filesonic.com/file/2185085531/TEST Voice 640-461 Test Cert Guide.epub
http://www.filesonic.com/file/2185085481/TEST Voice (640)+461 Test Cert Guide.pdf
http://www.wupload.com/file/209058300/TEST Voice 640-461 Test Cert Guide.epub
http://www.wupload.com/file/209003757/TE+ST Voice 6_40-461 Test Cert Guide.pdf

"

The result should be one link per line:
http://www.filesonic.com/file/2185085531/TEST Voice 640-461 Test Cert Guide.epub
http://www.filesonic.com/file/2185085481/TEST Voice (640)+461 Test Cert Guide.pdf
etc...

Can an expert help me out with this function?
amigura

$urls = "
http://www.filesonic.com/file/2185085531/TEST Voice 640-461 Test Cert Guide.epub
http://www.filesonic.com/file/2185085481/TEST Voice (640)+461 Test Cert Guide.pdf
https://www.wupload.com/file/209058300/TEST Voice 640-461 Test Cert Guide.epub
http://www.wupload.com/file/209003757/TE+ST Voice 6_40-461 Test Cert Guide.pdf

";

$urls = str_replace('https://', 'http://', $urls);
$urls = explode('http://', $urls);

foreach ($urls as $formatted_str)
{
    $formatted_str = trim($formatted_str);
    if (!empty($formatted_str))
    {
        echo "<a href=\"http://$formatted_str\">http://$formatted_str</a><br>";
    }
}


Fernanditos

ASKER

Thank you. The https: URLs are being ignored and replaced by http:, which is not good. Is there a fix for that? Also, could you please wrap this into a function text_to_link()? That would be great!

Thank you.
$urls = "
https://www.filesonic.com/file/2185085531/TEST Voice 640-461 Test Cert Guide.epub
http://www.filesonic.com/file/2185085481/TEST Voice (640)+461 Test Cert Guide.pdf
https://www.wupload.com/file/209058300/TEST Voice 640-461 Test Cert Guide.epub
http://www.wupload.com/file/209003757/TE+ST Voice 6_40-461 Test Cert Guide.pdf

";

function text_to_link($urls)
{
    // Mark the start of each URL with '##' so the scheme survives the split
    $fd = array('http://', 'https://');
    $rp = array('##http://', '##https://');
    $urls = str_replace($fd, $rp, $urls);
    $urls = explode('##', $urls);

    foreach ($urls as $formatted_str)
    {
        $formatted_str = trim($formatted_str);
        if (!empty($formatted_str))
        {
            echo "<a href=\"$formatted_str\">$formatted_str</a><br>";
        }
    }
}

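As an alternative to inserting a '##' marker, the split can be done with a regular-expression lookahead, which leaves both http:// and https:// untouched. The following is a minimal sketch, not code from this thread: the function name split_urls() is hypothetical, and it assumes that no URL embeds a second "http://" inside itself.

```php
<?php
// Hypothetical helper (not from the thread): split a blob of text into
// URL chunks at every "http://" or "https://" without altering the scheme.
function split_urls(string $text): array
{
    // The lookahead (?=...) matches the position just before each scheme,
    // so the scheme itself stays attached to the chunk that follows it.
    $chunks = preg_split('#(?=https?://)#', $text, -1, PREG_SPLIT_NO_EMPTY);

    // Trim whitespace and drop chunks that are empty after trimming.
    $chunks = array_map('trim', $chunks);
    return array_values(array_filter($chunks, 'strlen'));
}

// Two lines borrowed from the test data above.
$urls = "
https://www.filesonic.com/file/2185085531/TEST Voice 640-461 Test Cert Guide.epub
http://www.wupload.com/file/209003757/TE+ST Voice 6_40-461 Test Cert Guide.pdf
";

$links = split_urls($urls);
print_r($links);
```

Text that is not a URL is not removed by this approach: it either forms its own chunk or stays attached to the URL before it, so mixed input would still need filtering.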

Although DaveBaldwin mentioned it in the other question, conspicuous by its absence on this page is any mention of this function:
http://php.net/manual/en/function.urlencode.php

Please read the description of that function.  You were lucky enough to get away without using it in the other question because you gave us an incomplete test data set.  It only had URL strings that did not contain spaces or special characters.  The most important thing a programmer can have is a robust test data set.
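To make the point about urlencode() concrete, here is a small sketch that encodes one of the file names from the test data. It also shows rawurlencode(), which is not mentioned in this thread but is the standard PHP counterpart that writes a blank as "%20" instead of "+". Which form a given file host expects is an assumption you would have to check.

```php
<?php
// File name taken from the test data in this thread.
$name = 'TEST Voice (640)+461 Test Cert Guide.pdf';

// urlencode() follows form-encoding rules: a blank becomes "+", and a
// literal "+" becomes "%2B" so the two cannot be confused when decoding.
$form = urlencode($name);

// rawurlencode() follows RFC 3986: a blank becomes "%20".
$raw = rawurlencode($name);

echo $form, PHP_EOL; // TEST+Voice+%28640%29%2B461+Test+Cert+Guide.pdf
echo $raw, PHP_EOL;  // TEST%20Voice%20%28640%29%2B461%20Test%20Cert%20Guide.pdf
```

Both forms decode back to the original name with urldecode() and rawurldecode() respectively.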

Please use the code snippet and re-post the data string you want us to operate on.  I will try to use what is posted above, but the way EE creates links from client text may cause troubles.  The code snippet is a much more dependable way to post code.

Thanks, ~Ray
Given what we have to work with, this is what I would do.  You can use your browser's view source to see the completed links.  It should be fairly easy for you to encapsulate this code into a function or class method.
http://www.laprbass.com/RAY_temp_fernanditos.php

A few notes...

In the fourth URL you will see TE+ST.  The plus sign has a special meaning in URL-encoded data: urldecode() turns it into a blank.  That may or may not be what you want.  Line 10.

The regular expression is designed to work on a single, trimmed, line of text.  Note the use of trim() in the function call to preg_match() on line 32.

We use urlencode() on the part of the URL that needs to be encoded (line 35).

We use htmlentities() to escape the part of the output that might be dangerous if we wrote it to the browser in clear text.  Generally speaking, you want to escape any browser output that might have come from an external source like a database, a cookie, or client input (line 38).

At this point we have an array with the correctly encoded URLs in the key positions and the escaped link text in the corresponding value positions.  We construct the links in the $new array by combining HTML tags, the URLs, and the link text (line 42).

Hope that helps, ~Ray
<?php // RAY_temp_fernanditos.php
error_reporting(E_ALL);
echo "<pre>";

// TEST DATA FROM THE POST AT EE
$urls = "
http://www.filesonic.com/file/2185085531/TEST Voice 640-461 Test Cert Guide.epub
http://www.filesonic.com/file/2185085481/TEST Voice (640)+461 Test Cert Guide.pdf
http://www.wupload.com/file/209058300/TEST Voice 640-461 Test Cert Guide.epub
http://www.wupload.com/file/209003757/TE+ST Voice 6_40-461 Test Cert Guide.pdf

";

// MAKE A REGEX TO EXTRACT THE INTACT URL STRINGS
$rgx
= '#'            // REGEX DELIMITER
. '^'            // START OF STRING
. '('            // START GROUP
. 'https?.*? '   // PROTOCOL PLUS THROUGH BLANK
. ')'            // END GROUP
. '(.*?)'        // GROUP - ANYTHING OR NOTHING
. '$'            // END OF STRING
. '#'            // REGEX DELIMITER
;

// TRIM AWAY THE NOISE AND MAKE AN ARRAY FROM THE LINES
$arr = explode(PHP_EOL, trim($urls));

// ITERATE OVER THE ARRAY TO EXTRACT THE URL PARTS
foreach ($arr as $str)
{
    preg_match($rgx, trim($str), $mat);

    // URLENCODE THE URL ARGUMENTS (PLUS = BLANK); RTRIM DROPS THE CAPTURED BLANK
    $url = rtrim($mat[1]) . '+' . urlencode($mat[2]);

    // ESCAPE THE TEXT VERSION THAT GOES TO THE BROWSER
    $out[$url] = htmlentities($mat[2]);
}

// ITERATE OVER THE URL PARTS TO CREATE LINKS
foreach ($out as $url => $str)
{
    $new[]
    = '<a '
    . 'href="'
    . $url
    . '">'
    . $str
    . '</a>'
    ;
}

// SHOW THE WORK PRODUCT
print_r($new);


Hi Ray,

This is the best solution I've tried. It works perfectly with my test data and all the types of links I tried. Also, your explanation was super!

Just one thing, and please accept my apology, because I forgot something in the test data that I had included in my other question.

See my exact test data attached. Would it be possible to allow single lines of text?

Thank you for the great support!
$urls = "
filesocinc
http://www.filesonic.com/file/2185085531/TEST Voice 640-461 Test Cert Guide.epub
http://www.filesonic.com/file/2185085481/TEST Voice (640)+461 Test Cert Guide.pdf
wpupload
http://www.wupload.com/file/209058300/TEST Voice 640-461 Test Cert Guide.epub
https://www.wupload.com/file/209003757/TE+ST Voice 6_40-461 Test Cert Guide.pdf

";


Not sure I understand the question.  What would you want to do with a line that said "filesocinc" ?
I just want to ignore "filesonic" or any other non-URL string.

I tried my last attached test data and I got:
Notice:  Undefined offset: 1 in Untitled-1.php on line 37

I hope it is clear now.

Thank you for your help!
The result I mean would look like this:
result.jpg
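The accepted answer is not shown in this thread, so the following is only a sketch of one way to get the behavior requested above: lines that do not start with http:// or https:// are skipped rather than triggering an "Undefined offset" notice. The function body, the choice of rawurlencode() over urlencode(), and the assumption that the URLs carry no query string are all mine, not the thread's.

```php
<?php
// Hypothetical sketch: returns one HTML link per line that starts with
// http:// or https://; any other line (e.g. a bare label) is skipped.
// Assumes the URLs have no query string, as in the test data.
function text_to_link(string $text): array
{
    $links = array();

    // Split on any newline style rather than relying on PHP_EOL.
    foreach (preg_split('/\R/', trim($text)) as $line) {
        $line = trim($line);

        // Group 1: scheme and host. Group 2 (optional): the path.
        if (!preg_match('#^(https?://[^/\s]+)(/.*)?$#', $line, $mat)) {
            continue; // not a URL: ignore it instead of raising a notice
        }

        // Encode each path segment separately so the slashes survive.
        $path = $mat[2] ?? '';
        $href = $mat[1]
              . implode('/', array_map('rawurlencode', explode('/', $path)));

        // Escape both the attribute value and the visible link text.
        $links[] = '<a href="' . htmlspecialchars($href) . '">'
                 . htmlspecialchars($line) . '</a>';
    }

    return $links;
}

$urls = "
filesocinc
http://www.filesonic.com/file/2185085531/TEST Voice 640-461 Test Cert Guide.epub
wpupload
https://www.wupload.com/file/209003757/TE+ST Voice 6_40-461 Test Cert Guide.pdf
";

$links = text_to_link($urls);
echo implode("<br>\n", $links), PHP_EOL;
```

Checking the return value of preg_match() before touching $mat is what avoids the notice. Note that this sketch encodes a literal "+" as "%2B", which keeps it a plus sign on decoding; whether the file host wants that is something to verify against real links.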
ASKER CERTIFIED SOLUTION
Ray Paseur
This solution is only available to members.

Thank you. The https: URLs are being ignored and replaced by http:, which is not good. Is there a fix for that? Also, could you please wrap this into a function text_to_link()? That would be great!

Thank you.

-------------------
Mine does as you asked, but Ray's is the best solution.

You need to be specific instead of moving the goalposts.

Awesome solution as usual! Thank you.