psmithphil asked:

Can't get Microsoft Regular Expression to work

I have URLs that may start with "http" or "https", and I want to extract the last part of each URL.

For example, from "http://pear.nadd.tron.com/engg/resources/dwglogs/sum/sumpdf/joesl.pdf" I want just "resources/dwglogs/sum/sumpdf/joesl.pdf". The URL always begins with either "http" or "https", followed by "://pear.nadd.tron.com/engg/", and I want everything after that. I couldn't see a way to extract that part with Substring and/or IndexOf, so I tried regular expressions. However, I can't get the example in the Microsoft help file to work (http://msdn.microsoft.com/library/default.asp?url=/library/en-us/cpguide/html/cpconexampleextractingurlinformation.asp).

In the example below, I want the function to return "resources/dwglogs/sum/sumpdf/joesl.pdf". Can you spot what I'm missing? Thanks!


Function Extension(ByVal Url As String) As String
    Dim r As New Regex("^(http[s]?://)pear.nadd.tron.com/engrdocs/(?<port>:.+)$", RegexOptions.Compiled)
    Return r.Match(url).Result("${port}")
End Function
jjardine

Why couldn't you get Substring to work? Would something like this do the job?

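Function Extension(ByVal url As String) As String
  ' "https://pear.nadd.tron.com/engg/" is 32 characters long
  If url.StartsWith("https://") Then
    Return url.Substring(32)
  ' "http://pear.nadd.tron.com/engg/" is 31 characters long
  ElseIf url.StartsWith("http://") Then
    Return url.Substring(31)
  End If
  ' Neither prefix matched
  Return Nothing
End Function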

Check the starting indexes against your real URLs; they might be off by a few.
ASKER CERTIFIED SOLUTION
ZeonFlash

SOLUTION
Fernando Soto

Hi psmithphil;

I ran a small benchmark: I placed the body of the function inside a For loop and executed it 1000 times, rather than calling the function 1000 times.

The code you posted, with ZeonFlash's correction, ran in 5.148 seconds:
Function Extension(ByVal Url As String) As String
    Dim r As New Regex("^(http[s]?://)pear.nadd.tron.com/engrdocs/(?<port>.+)$", RegexOptions.Compiled)
    Return r.Match(url).Result("${port}")
End Function

The code I posted, where output is a String, ran 1000 times in a loop in 1 millisecond:

     output = Url.Substring(url.IndexOf("engg/") + 5)
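
The one-liner above assumes "engg/" is always present; if it is missing, IndexOf returns -1 and the result silently starts at index 4 of the URL. A minimal sketch of the same idea wrapped in a function with that case guarded follows. The module name, the function name ExtensionAfterMarker, and the choice to return an empty string on a miss are assumptions made for illustration, not part of the original post.

```vbnet
Imports System

Module UrlTailSketch
    ' Hypothetical helper: returns everything after the first "engg/" in the URL,
    ' or an empty string when the marker is not found.
    Function ExtensionAfterMarker(ByVal url As String) As String
        Const marker As String = "engg/"
        Dim pos As Integer = url.IndexOf(marker, StringComparison.OrdinalIgnoreCase)
        If pos < 0 Then
            ' Marker absent: avoid Substring(-1 + 5) returning a misleading tail
            Return String.Empty
        End If
        Return url.Substring(pos + marker.Length)
    End Function
End Module
```

Because it searches for the marker rather than counting characters, the same call handles both the "http" and the "https" form of the URL.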

Now if you want to make the Regex run faster you can do the following.

' Make the Regex a class-level variable
Private r As New Regex("^(http[s]?://)pear.nadd.tron.com/engrdocs/(?<port>.+)$", RegexOptions.Compiled)

Function Extension(ByVal Url As String) As String
    Return r.Match(url).Result("${port}")
End Function

With this change, the 1000 executions run in 7 milliseconds. Again, that time covers executing the body of the function 1000 times in a loop, not 1000 calls to the function.

The difference between your code and the modified code above comes from compiling the Regex once and reusing it, whereas your version re-compiles it every time the function is called.
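
To reproduce that comparison, a rough sketch along these lines may help. It is not the benchmark used above: the module name, the function names, the group name "tail", and the pattern (dots escaped, "engg" taken from the example URL in the question) are all assumptions, and it times function calls rather than an inlined loop body, so absolute numbers will differ.

```vbnet
Imports System
Imports System.Diagnostics
Imports System.Text.RegularExpressions

Module RegexReuseSketch
    ' Assumed pattern: dots escaped, "engg" taken from the example URL in the question.
    Private Const UrlPattern As String = "^https?://pear\.nadd\.tron\.com/engg/(?<tail>.+)$"

    ' Built once when the module is first used, then shared by every call.
    Private ReadOnly SharedRegex As New Regex(UrlPattern, RegexOptions.Compiled)

    ' Returns the named group, or Nothing when the URL does not match.
    Private Function TailOf(ByVal m As Match) As String
        If m.Success Then Return m.Groups("tail").Value
        Return Nothing
    End Function

    ' Constructs (and compiles) a new Regex on every call.
    Function ExtractPerCall(ByVal url As String) As String
        Dim r As New Regex(UrlPattern, RegexOptions.Compiled)
        Return TailOf(r.Match(url))
    End Function

    ' Reuses the shared instance.
    Function ExtractShared(ByVal url As String) As String
        Return TailOf(SharedRegex.Match(url))
    End Function

    ' Milliseconds taken by the given number of calls to one of the two variants.
    Function TimeCalls(ByVal useShared As Boolean, ByVal url As String, ByVal iterations As Integer) As Long
        Dim sw As Stopwatch = Stopwatch.StartNew()
        For i As Integer = 1 To iterations
            If useShared Then
                ExtractShared(url)
            Else
                ExtractPerCall(url)
            End If
        Next
        sw.Stop()
        Return sw.ElapsedMilliseconds
    End Function
End Module
```

Calling TimeCalls(False, url, 1000) and TimeCalls(True, url, 1000) with the same URL gives the two timings to compare. Checking Match.Success before reading the group also avoids the exception that Match.Result throws when the URL does not match.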

Fernando
psmithphil (ASKER)

ZeonFlash, you had the answer to my problem: I just took the colon out and it worked. I looked in the Help and online and never found how to use this. I will use this in many applications - thank you!

FernandoSoto, I like the way you use the substring and I am actually going to use your way in this particular application.

You both helped immensely so I am splitting the points.  Thank you both!
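
For readers following the colon fix mentioned above: inside a named group such as (?<port>:.+), the colon is not part of the group syntax but a literal character that must appear in the input, so the original pattern could only match if the captured text began with ":". A small sketch of the difference follows. The module name, constant names, function name, and group name "tail" are hypothetical, and the patterns use "engg" so that they match the example URL from the question.

```vbnet
Imports System.Text.RegularExpressions

Module ColonSketch
    ' The group requires a literal ":" right after "engg/".
    Public Const WithColon As String = "^https?://pear\.nadd\.tron\.com/engg/(?<tail>:.+)$"

    ' Colon removed: the group captures whatever follows "engg/".
    Public Const WithoutColon As String = "^https?://pear\.nadd\.tron\.com/engg/(?<tail>.+)$"

    ' Returns the captured tail, or Nothing when the pattern does not match.
    Function TailOrNothing(ByVal pattern As String, ByVal url As String) As String
        Dim m As Match = Regex.Match(url, pattern)
        If m.Success Then Return m.Groups("tail").Value
        Return Nothing
    End Function
End Module
```

With the example URL, TailOrNothing(WithColon, url) returns Nothing, while TailOrNothing(WithoutColon, url) returns "resources/dwglogs/sum/sumpdf/joesl.pdf".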