RegExp to extract a URL from a string in ASP

robrodp
I need classic ASP code that uses a regular expression to extract a URL from a string.
Big Monty, Web Ninja at large

Commented:
Can you give us an example of what you're looking to do?
robrodp, Programmer

Author

Commented:
I have a long string (and I have to do this for many of them).

I need a regular expression that will extract the URL following url=

For example:

<html>
<head>
    <title>Omgili Redirection</title>
    <meta http-equiv="content-type" content="text/html;charset=utf-8">
    <meta http-equiv="refresh" content="5; url=https://www.elsoldemexico.com.mx/mexico/619673-advierten-a-empresas-mexicanas-por-costo-social-si-van-por-el-muro" />
    <script>
        (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
            (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
                m=s.get

I need to extract the URL:

https://www.elsoldemexico.com.mx/mexico/619673-advierten-a-empresas-mexicanas-por-costo-social-si-van-por-el-muro

The URL may start with http:// or https://
Big Monty, Web Ninja at large

Commented:
try the following:

"http(s)?://([\w+?\.\w+])+([a-zA-Z0-9\~\!\@\#\$\%\^\&\*\(\)_\-\=\+\\\/\?\.\:\;\'\,]*)?"
robrodp, Programmer

Author

Commented:
Hi, thanks.

I already have the expression; what I need is working ASP code:
Set regEx = New RegExp
    regEx.Pattern = "http(s)?://([\w+?\.\w+])+([a-zA-Z0-9\~\!\@\#\$\%\^\&\*\(\)_\-\=\+\\\/\?\.\:\;\'\,]*)?"
    regEx.Global = true
    Set RegExResults = regEx.Execute(strTarget)
    Set regEx = Nothing

Set arrResults = RegExResults(pagina)


I need to extract the exact URL in ASP.
Big Monty, Web Ninja at large

Commented:
Clarifying exactly what you need will usually help in the long run...

function getURL( str ) 
    Set regEx = New RegExp
    regEx.Pattern = "http(s)?://([\w+?\.\w+])+([a-zA-Z0-9\~\!\@\#\$\%\^\&\*\(\)_\-\=\+\\\/\?\.\:\;\'\,]*)?"
    regEx.Global = true
    Set RegExResults = regEx.Execute( str )
    Set regEx = Nothing

   getURL = RegExResults
end function

url = getURL( strToCheck )

robrodp, Programmer

Author

Commented:
Thanks.

I am getting this:

Wrong number of arguments or invalid property assignment
/xstandard/httpreg.asp, line 23

Any ideas?
robrodp, Programmer

Author

Commented:
This is the string that contains the URL I want to extract.

<html>
<head>
    <title>Omgili Redirection</title>
    <meta http-equiv="content-type" content="text/html;charset=utf-8">
    <meta http-equiv="refresh" content="5; url=https://www.elsoldemexico.com.mx/mexico/619673-advierten-a-empresas-mexicanas-por-costo-social-si-van-por-el-muro" />
    <script>
        (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
            (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
                m=s.get

What am I doing wrong?
Big Monty, Web Ninja at large

Commented:
Can you post your full code, including a string to test on? Something I can drop into a page for testing, so code without any dependencies.
robrodp, Programmer

Author

Commented:
pagina is the string from which http://www.lasalud.mx/permalink/18633.html is to be extracted:
<%
pagina="<html>'<head>'    <title>Omgili Redirection</title>'    <meta http-equiv='content-type' content='text/html;charset=utf-8'>'    <meta http-equiv='' content='5; url=http://www.lasalud.mx/permalink/18633.html' />' "


Set regEx = New RegExp
    regEx.Pattern = "/^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?$/"
    regEx.Global = true
    Set RegExResults = regEx.Execute(pagina)
    Set regEx = Nothing

Set arrResults = RegExResults

response.write arrResults
%>
Big Monty, Web Ninja at large
Commented:
Give this a shot; it works for me. It IS leaving a trailing single quote (not sure why, as it's listed in the pattern), but you should be able to figure that part out :)

<%
pagina="<html>'<head>'    <title>Omgili Redirection</title>'    <meta http-equiv='content-type' content='text/html;charset=utf-8'>'    <meta http-equiv='' content='5; url=http://www.lasalud.mx/permalink/18633.html' />' "
    
Function RegExResults(strTarget)

    Set regEx = New RegExp
    regEx.Pattern = "http(s)?://([\w+?\.\w+])+([a-zA-Z0-9\~\!\@\#\$\%\^\&\*\(\)_\-\=\+\\\/\?\.\:\;\'\,]*)?"
    regEx.Global = true
    Set RegExResults = regEx.Execute(strTarget)
    Set regEx = Nothing

End Function

'Pass the original string and pattern into the function and get a collection object back'
Set arrResults = RegExResults(pagina)

'In your pattern the answer is the first group, so all you need is'
For each result in arrResults
    Response.Write(result.Value)
Next

Set arrResults = Nothing
%>
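
On the trailing single quote: it is most likely picked up because \' sits inside the pattern's last character class, so the quote that closes the content attribute is treated as part of the URL. Below is a minimal sketch of one way around it, assuming the URL always follows url= and ends at the first quote, whitespace, or ">". The function name ExtractRefreshUrl and the variable sample are hypothetical, not from the thread.

```vbscript
<%
' Hypothetical helper (not from the thread): returns the URL that follows
' "url=" in strTarget, or an empty string when nothing matches.
Function ExtractRefreshUrl(strTarget)
    Dim regEx, matches

    ExtractRefreshUrl = ""

    Set regEx = New RegExp
    ' Capture http:// or https:// up to the first quote, whitespace, or ">".
    ' Inside a VBScript string literal, "" stands for one double quote.
    regEx.Pattern = "url=(https?://[^'""\s>]+)"
    regEx.IgnoreCase = True
    regEx.Global = False   ' only the first match is needed

    Set matches = regEx.Execute(strTarget)
    If matches.Count > 0 Then
        ' SubMatches(0) is the first capturing group: the URL without "url="
        ExtractRefreshUrl = matches(0).SubMatches(0)
    End If

    Set matches = Nothing
    Set regEx = Nothing
End Function

Dim sample
sample = "<meta http-equiv='refresh' content='5; url=http://www.lasalud.mx/permalink/18633.html' />"

' Writes http://www.lasalud.mx/permalink/18633.html
Response.Write ExtractRefreshUrl(sample)
%>
```

Because the helper returns a plain string rather than the match collection, the caller assigns the result without Set.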

robrodp, Programmer

Author

Commented:
Thanks a million!
