• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 2236
  • Last Modified:

Microsoft Access Extract Domain Name from Hyperlink or text field

Hi,

I'm working on an access database that stores lists of URL's

My challenge (and I'm really hoping you can help) is to append 2 additional columns to the table that contains the URL's and in that column extract the domain and domain suffix.
.
For example, the table may contain the following column
http://www.abc.com
www.xyz.com
123.mno.net
http://www.abc.com/category
www.abc.net/category/sub-category/index.html
xyz.info/index.php
www.abc.com:8082/

What I'd like as the output for each of these rows is
Domain                             Suffix
www.abc.com                   com
www.xyz.com                   com
123.mno.net                        net
www.abc.com                   com
www.abc.net                     net
xyz.info                               info
www.abc.com                    com

Thanks!
M Gandhi
0
fixyfoxy
Asked:
fixyfoxy
1 Solution
 
Patrick MatthewsCommented:
Hello fixyfoxy,

Add the UDFs below to a regular VBA module, and run a query like this:

SELECT URL, RegExpReplace(URL, "http://|[:/].*", "", True, False) AS Domain,
      Mid(RegExpFind(RegExpReplace(URL, "http://|[:/].*", "", True, False), "\.[a-z]{2,}$", 1, False), 2) AS Suffix
FROM SomeTable

Regards,

Patrick
Function RegExpFind(LookIn As String, PatternStr As String, Optional Pos, _
    Optional MatchCase As Boolean = True) 
    ' For more info see: http://vbaexpress.com/kb/getarticle.php?kb_id=841 
    ' This function uses Regular Expressions to parse a string (LookIn), and return matches to a
    ' pattern (PatternStr).  Use Pos to indicate which match you want:
    ' Pos omitted               : function returns a zero-based array of all matches
    ' Pos = 0                   : the last match
    ' Pos = 1                   : the first match
    ' Pos = 2                   : the second match
    ' Pos = <positive integer>  : the Nth match
    ' If Pos is greater than the number of matches, is negative, or is non-numeric, the function
    ' returns an empty string.  If no match is found, the function returns an empty string
    
    ' If MatchCase is omitted or True (default for RegExp) then the Pattern must match case (and
    ' thus you may have to use [a-zA-Z] instead of just [a-z] or [A-Z]).
    
    ' If you use this function in Excel, you can use range references for any of the arguments.
    ' If you use this in Excel and return the full array, make sure to set up the formula as an
    ' array formula.  If you need the array formula to go down a column, use TRANSPOSE()
    
    Static RegX As Object
    Dim TheMatches As Object
    Dim Answer() As String
    Dim Counter As Long
    
    ' Evaluate Pos.  If it is there, it must be numeric and converted to Long
    If Not IsMissing(Pos) Then
        If Not IsNumeric(Pos) Then
            RegExpFind = ""
            Exit Function
        Else
            Pos = CLng(Pos)
        End If
    End If
    
    ' Create instance of RegExp object
    If RegX Is Nothing Then Set RegX = CreateObject("VBScript.RegExp")
    With RegX
        .Pattern = PatternStr
        .Global = True
        .IgnoreCase = Not MatchCase
    End With
        
    ' Test to see if there are any matches
    If RegX.test(LookIn) Then
        
        ' Run RegExp to get the matches, which are returned as a zero-based collection
        Set TheMatches = RegX.Execute(LookIn)
        
        ' If Pos is missing, user wants array of all matches.  Build it and assign it as the
        ' function's return value
        If IsMissing(Pos) Then
            ReDim Answer(0 To TheMatches.Count - 1) As String
            For Counter = 0 To UBound(Answer)
                Answer(Counter) = TheMatches(Counter)
            Next
            RegExpFind = Answer
        
        ' User wanted the Nth match (or last match, if Pos = 0).  Get the Nth value, if possible
        Else
            Select Case Pos
                Case 0                          ' Last match
                    RegExpFind = TheMatches(TheMatches.Count - 1)
                Case 1 To TheMatches.Count      ' Nth match
                    RegExpFind = TheMatches(Pos - 1)
                Case Else                       ' Invalid item number
                    RegExpFind = ""
            End Select
        End If
    
    ' If there are no matches, return empty string
    Else
        RegExpFind = ""
    End If
    
    ' Release object variables
    Set TheMatches = Nothing
    
End Function
 
Function RegExpReplace(LookIn As String, PatternStr As String, Optional ReplaceWith As String = "", _
    Optional ReplaceAll As Boolean = True, Optional MatchCase As Boolean = True) 
    ' This function uses Regular Expressions to parse a string, and replace parts of the string
    ' matching the specified pattern with another string.  The optional argument ReplaceAll controls
    ' whether all instances of the matched string are replaced (True) or just the first instance (False)
    
    ' By default, RegExp is case-sensitive in pattern-matching.  To keep this, omit MatchCase or
    ' set it to True
    
    ' If you use this function from Excel, you may substitute range references for all the arguments
    
    Static RegX As Object
    
    If RegX Is Nothing Then Set RegX = CreateObject("VBScript.RegExp")
    With RegX
        .Pattern = PatternStr
        .Global = ReplaceAll
        .IgnoreCase = Not MatchCase
    End With
    
    RegExpReplace = RegX.Replace(LookIn, ReplaceWith)
    
    Set RegX = Nothing
    
End Function

Open in new window

0
 
fixyfoxyAuthor Commented:
Patrick,

Super solution! Worked exactly as I needed it to....

MGandhi
0

Featured Post

Keep up with what's happening at Experts Exchange!

Sign up to receive Decoded, a new monthly digest with product updates, feature release info, continuing education opportunities, and more.

Tackle projects and never again get stuck behind a technical roadblock.
Join Now