Solved

RegEx Help

Posted on 2009-02-23
Last Modified: 2012-05-06
I need a regex to identify an image tag like the one below:

Image tag on the website:
<img width="142" height="185" alt="Agent Info" src="http://domain.com/image.jpg">

Using this code:
Dim mch As System.Text.RegularExpressions.Match = _
            System.Text.RegularExpressions.Regex.Match(mystring, "<img.*id=\x22Master_Leftnav1_AgentContactInfo_AgentPhoto\x22*.?src=\x22(.*?)\x22")
Question by:azyet24
 
LVL 8

Expert Comment

by:theplonk
ID: 23718363
Try the following regex:
Dim mch As System.Text.RegularExpressions.Match = _
                    System.Text.RegularExpressions.Regex.Match(Me.TextBox1.Text, "(?<=img.*src\=[\x27\x22])(?<Url>[^\x27\x22]*)(?=[\x27\x22])")


Author Comment

by:azyet24
ID: 23718385
alt="Agent Info" is what I'll have to look for, so this should be in the expression.  What I want to get is the URL so I can download the image.
 

Author Comment

by:azyet24
ID: 23718406
I was trying something like this but it doesn't work and I don't know regex:

"<img.*alt=\x22Agent.*Info\x22*.?src=\x22(.*?)\x22"
LVL 8

Expert Comment

by:theplonk
ID: 23718462
Sorry, try this:
System.Text.RegularExpressions.Regex.Match(Me.TextBox1.Text, "(?<=img.*alt\=[\x27\x22]Agent\sInfo[\x27\x22].*src\=[\x27\x22])(?<Url>[^\x27\x22]*)(?=[\x27\x22])")

 
LVL 8

Expert Comment

by:theplonk
ID: 23718490
Otherwise, if the attribute order in the img tag is inconsistent (i.e. "alt" and "src" do not always appear in the same order), you can try the solution posted here:
http://www.experts-exchange.com/Programming/Languages/Regular_Expressions/Q_23964786.html
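
For illustration, one way to cope with a changing attribute order is to split the work into two steps: pull out each whole img tag first, then test alt and src separately inside that tag. This is only a minimal sketch, not the solution from the linked question; FindImageSrc and the sample HTML are hypothetical names and data made up for this example. It assumes attribute values are quoted and contain no ">" or quote characters.

```vbnet
Imports System.Text.RegularExpressions

Module ImgTagSketch
    ' Hypothetical helper (not from the thread): returns the src of the first
    ' <img> tag whose alt attribute equals altText, or Nothing if none matches.
    Function FindImageSrc(ByVal html As String, ByVal altText As String) As String
        ' Step 1: grab each complete <img ...> tag; [^>]* keeps the match inside one tag.
        For Each tag As Match In Regex.Matches(html, "<img\b[^>]*>", RegexOptions.IgnoreCase)
            ' Step 2: look at alt and src independently, so their order does not matter.
            Dim alt As Match = Regex.Match(tag.Value, _
                "\balt\s*=\s*[\x22\x27]([^\x22\x27]*)[\x22\x27]", RegexOptions.IgnoreCase)
            If alt.Success AndAlso alt.Groups(1).Value = altText Then
                Dim src As Match = Regex.Match(tag.Value, _
                    "\bsrc\s*=\s*[\x22\x27]([^\x22\x27]*)[\x22\x27]", RegexOptions.IgnoreCase)
                If src.Success Then Return src.Groups(1).Value
            End If
        Next
        Return Nothing
    End Function

    Sub Main()
        ' Invented sample data: src comes before alt in both tags.
        Dim html As String = "<img src=""http://domain.com/logo.gif"" alt=""Logo"">" & _
                             "<img src=""http://domain.com/image.jpg"" width=""142"" alt=""Agent Info"">"
        Console.WriteLine(FindImageSrc(html, "Agent Info"))   ' prints http://domain.com/image.jpg
    End Sub
End Module
```

The two-step approach trades a single clever pattern for three simple ones, which tends to be easier to adjust when the page markup changes.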
 
LVL 64

Expert Comment

by:Fernando Soto
ID: 23718502
Hi azyet24;

The Regex pattern in the code snippet should do what you want.

Fernando
Imports System.Text.RegularExpressions
 
'======================================================================================================
' Test Data
Dim inputData As String = "Image tag on the website:" & _
        "<img width=""142"" height=""185"" alt=""Agent Info"" src=""http://domain.com/image.jpg""> " & _
        "Using this code:"
'======================================================================================================
 
Dim pattern As String = "<img.*?alt=\x22([^\x22]+)\x22.*?src=\x22([^\x22]+)\x22"
Dim m As Match = Regex.Match(inputData, pattern)
 
'======================================================================================================
' Display results
If m.Success Then MessageBox.Show("alt = " & m.Groups(1).Value & "  URL = " & m.Groups(2).Value)
'======================================================================================================

 

Author Comment

by:azyet24
ID: 23722049
Fernando, your code works with your sample data but not with the website, so I'm not sure what the problem is.  Here is an example link:
"http://www.domain.com/Websites/transform.php?agent=SomeName"
The code should download the agent's picture.
Please grab this b/c I will need to have this removed from EE.
 
LVL 64

Expert Comment

by:Fernando Soto
ID: 23734686
Hi azyet24;

Can you post the code you used?

Also, how many of these <img ... > tags will ever be in the data?

Fernando
 

Author Comment

by:azyet24
ID: 23734749
There will be tons of img tags in the data, but only one will have alt="Agent Info"
Try
 
            Dim client As New System.Net.WebClient()
            Dim mystring As String = client.DownloadString(agenturl)
 
            Console.WriteLine(name & " " & counter)
            Dim mch As System.Text.RegularExpressions.Match = _
                  System.Text.RegularExpressions.Regex.Match(mystring, "<img.*?alt=\x22([^\x22]+)\x22.*?src=\x22([^\x22]+)\x22")
 
            Dim pattern As String = "<img.*?alt=\x22([^\x22]+)\x22.*?src=\x22([^\x22]+)\x22"
            Dim m As Match = Regex.Match("http://SomeName.domain.com/Websites/transform.php?agent=SomeName", pattern)
 
            If m.Success Then Console.WriteLine("alt = " & m.Groups(1).Value & "  URL = " & m.Groups(2).Value)
 
            If mch.Success Then
                If mch.Groups(1).Value <> "" Then
                    Dim WebC As New WebClient
                    Try
 
                        WebC.DownloadFile(mch.Groups(2).Value, localfolder & "RBI_" & name & id & ".jpg")
                        Console.WriteLine(mch.Groups(2).Value & " photo downloaded")
                       
                    Catch ex As Exception
                        Console.WriteLine("Error " & ex.Message)
                    End Try
                End If
            End If
        Catch ex As Exception
            Console.WriteLine("Error " & ex.Message)
        End Try

 
LVL 64

Expert Comment

by:Fernando Soto
ID: 23735118
Hi azyet24;

I have modified your code and it works for me, try it this way.

Fernando
Dim pattern As String = "<img.*?alt=\x22([^\x22]+)\x22.*?src=\x22([^\x22]+)\x22"
Try
 
    Dim client As New System.Net.WebClient()
    Dim mystring As String = client.DownloadString(agenturl)
 
    Console.WriteLine(Name & " " & counter)
    Dim mch As System.Text.RegularExpressions.Match = _
          System.Text.RegularExpressions.Regex.Match(mystring, pattern)
 
    If mch.Success Then Console.WriteLine("alt = " & mch.Groups(1).Value & "  URL = " & mch.Groups(2).Value)
 
    If mch.Success Then
        If mch.Groups(1).Value <> "" Then
            Dim WebC As New WebClient
            Try
 
                WebC.DownloadFile(mch.Groups(2).Value, localfolder & "RBI_" & Name & id & ".jpg")
                Console.WriteLine(mch.Groups(2).Value & " photo downloaded")
 
            Catch ex As Exception
                Console.WriteLine("Error " & ex.Message)
            End Try
        End If
    End If
Catch ex As Exception
    Console.WriteLine("Error " & ex.Message)
End Try

 

Author Comment

by:azyet24
ID: 23735283
In your code, where are you isolating alt="Agent Info"?
 
LVL 64

Accepted Solution

by:
Fernando Soto earned 2000 total points
ID: 23735500
In the regex pattern below, the part alt=\x22([^\x22]+)\x22 matches an img tag that has an alt attribute with any non-empty value.

Dim pattern As String = "<img.*?alt=\x22([^\x22]+)\x22.*?src=\x22([^\x22]+)\x22"

If the string in the alt attribute is always going to be "Agent Info", then you can use this regex pattern in place of the one above.

Dim pattern As String = "<img.*?alt=\x22(Agent Info)\x22.*?src=\x22([^\x22]+)\x22"
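
As a rough illustration of the difference between the two patterns, the sketch below runs both against made-up HTML containing several img tags. The sample tags, their URLs, and the FirstSrc helper are invented for this example; only the two patterns come from this thread.

```vbnet
Imports System.Text.RegularExpressions

Module AgentPhotoSketch
    ' Hypothetical helper: returns capture group 2 (the src value) of the first match,
    ' or an empty string when the pattern does not match.
    Function FirstSrc(ByVal html As String, ByVal pattern As String) As String
        Return Regex.Match(html, pattern).Groups(2).Value
    End Function

    Sub Main()
        ' Invented sample: one tag per line, only one of them has alt="Agent Info".
        Dim html As String = _
            "<img alt=""Logo"" src=""http://domain.com/logo.gif"">" & vbCrLf & _
            "<img width=""142"" height=""185"" alt=""Agent Info"" src=""http://domain.com/image.jpg"">" & vbCrLf & _
            "<img alt=""Banner"" src=""http://domain.com/banner.png"">"

        ' General pattern: stops at the first img tag that has any alt text.
        Dim anyAlt As String = "<img.*?alt=\x22([^\x22]+)\x22.*?src=\x22([^\x22]+)\x22"
        ' Specific pattern: only alt="Agent Info" is accepted.
        Dim agentAlt As String = "<img.*?alt=\x22(Agent Info)\x22.*?src=\x22([^\x22]+)\x22"

        Console.WriteLine(FirstSrc(html, anyAlt))     ' prints http://domain.com/logo.gif
        Console.WriteLine(FirstSrc(html, agentAlt))   ' prints http://domain.com/image.jpg
    End Sub
End Module
```

Both patterns expect alt to come before src. Also, "." does not match a newline by default, so here each match stays on one line; if a page puts several img tags on the same line, ".*?" can run across tag boundaries, and replacing it with "[^>]*?" is one possible way to keep the match inside a single tag.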

 

Author Comment

by:azyet24
ID: 23735580
Not working for me.  Complete code below.
Getrbiimages("http://www.domain.com/Websites/transform.php?agent=SomeName", "User Name", 1)
 
    Sub Getrbiimages(ByVal agenturl As String, ByVal name As String, ByVal id As Integer)
        Dim pattern As String = "<img.*?alt=\x22(Agent Info)\x22.*?src=\x22([^\x22]+)\x22"
        Try
 
            Dim client As New System.Net.WebClient()
            Dim mystring As String = client.DownloadString(agenturl)
 
            Console.WriteLine(name & " " & counter)
            Dim mch As System.Text.RegularExpressions.Match = _
                  System.Text.RegularExpressions.Regex.Match(mystring, pattern)
 
            If mch.Success Then Console.WriteLine("alt = " & mch.Groups(1).Value & "  URL = " & mch.Groups(2).Value)
 
            If mch.Success Then
                If mch.Groups(1).Value <> "" Then
                    Dim WebC As New WebClient
                    Try
 
                        WebC.DownloadFile(mch.Groups(2).Value, localfolder & "RBI_" & name & id & ".jpg")
                        Console.WriteLine(mch.Groups(2).Value & " photo downloaded")
 
                    Catch ex As Exception
                        Console.WriteLine("Error " & ex.Message)
                    End Try
                End If
            End If
        Catch ex As Exception
            Console.WriteLine("Error " & ex.Message)
        End Try
 
        Console.ReadLine()
    End Sub

 

Author Comment

by:azyet24
ID: 23735591
Sorry, that did work