Solved

Trying to extract data from a web page

Posted on 2013-11-12
Last Modified: 2013-11-13
The company uses a third party to process its credit card payments. I am using an MS Access program with the Web Browser control and would like to extract the transaction ID from the third party's confirmation web page. I have been able to get all the other data I need, but I am not sure how to reference the Transaction ID so that I can grab it.
Attached is a PDF file of the source, with the information I am trying to grab highlighted.
EECRS3-1-.pdf
Question by:dlord54
LVL 85
ID: 39644690
There's really no way to reliably get this information from the webpage, since there is no "id" assigned to the table cell where the value resides. You could perhaps parse the source file for the value of "TransactionDetail.aspx?toNet2", and then grab everything from that point to the next "&" (which seems to indicate the end of that value, plus a few extraneous characters), but that's tenuous at best, and it is likely to break if they change the way their website manages these things.

In fact, given that this is a CC payment situation, your processor should provide you with some sort of "return" value regarding the status of the payment. This should include the Transaction Number.

Better yet - don't do this from Access, and instead create the program in .NET, which can send/receive web-based communications much more easily than Access.
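
The string-search idea from the first paragraph of this comment can be sketched in VBA roughly as follows. This is only an illustration: ExtractTransId is a hypothetical helper name, sHtml is assumed to already hold the page source, and the "=" after "toNet2" is taken from the regular expression used later in this thread.

```vba
' Hypothetical helper: returns the text between the marker and the next "&".
' sHtml is assumed to hold the page source (for example, Body.innerHTML).
Public Function ExtractTransId(ByVal sHtml As String) As String
    Const MARKER As String = "TransactionDetail.aspx?toNet2="
    Dim lStart As Long
    Dim lEnd As Long

    lStart = InStr(1, sHtml, MARKER, vbTextCompare)
    If lStart = 0 Then Exit Function          ' marker not found: returns ""

    lStart = lStart + Len(MARKER)             ' first character of the value
    lEnd = InStr(lStart, sHtml, "&")          ' value ends at the next "&"
    If lEnd = 0 Then lEnd = Len(sHtml) + 1    ' no "&": take the rest of the text

    ExtractTransId = Mid$(sHtml, lStart, lEnd - lStart)
End Function
```

As noted above, the returned text may still carry a few extraneous characters, so the result should be checked (for example with IsNumeric) before it is stored.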
 
LVL 46

Expert Comment

by:aikimark
ID: 39645030
Once you have the HTML, parse it with the following regular expression to get the value you seek.
TransactionDetail.aspx\?toNet2=(\d*)


Example:
Dim oRE As Object
Dim oMatches As Object
Set oRE = CreateObject("vbscript.regexp")
oRE.Global = True
oRE.Pattern = "TransactionDetail.aspx\?toNet2=(\d*).*"
Set oMatches = oRE.Execute(yourHTMLtextvariablenamegoeshere)
Debug.Print "transaction id: " & oMatches(0).Submatches(0)

 

Author Comment

by:dlord54
ID: 39645150
Thanks, aikimark, for your suggestion. I am at a loss to find the HTML text variable name. Is that something that I have assigned, is it in the source, or am I just not understanding? Excuse my ignorance, but I don't work much with web pages or the WebBrowser control.
LVL 46

Expert Comment

by:aikimark
ID: 39645181
You wrote that you are using a web browser control and you posted a PDF of the HTML source.  I assume you know how to get the HTML of the web page you are accessing.  That name is meant to refer to the variable or object property or method invocation result that contains the HTML.

How did you post the HTML in the PDF document you attached with your question text?
 

Author Comment

by:dlord54
ID: 39645226
When I viewed the source of the web page within the browser, that was the name that it brought up in Notepad. It is also the name of the PDF that I sent. Does it need to be enclosed in quotation marks, or should it also include the .html extension?
 
LVL 46

Expert Comment

by:aikimark
ID: 39645239
Please post the code you are using with the web browser control.
 

Author Comment

by:dlord54
ID: 39645271
I have attached the code that I am using to retrieve the data from that web page. I have included the code that you recommended.
Code.txt
 
LVL 46

Expert Comment

by:aikimark
ID: 39645293
Please try this:
        Set oRE = CreateObject("vbscript.regexp")
        oRE.Global = True
        oRE.Pattern = "TransactionDetail.aspx\?toNet2=(\d*).*"
        Set oMatches = oRE.Execute(Me.ocxWebBrowser.Document.Frames(2).Document.Body.innertext)
        TransID = oMatches(0).Submatches(0)
        sText = "Transaction ID: " & TransID

 

Author Comment

by:dlord54
ID: 39645328
I am getting an "Invalid procedure call or argument" error message on the "TransID = oMatches(0).Submatches(0)" line.
 
LVL 46

Expert Comment

by:aikimark
ID: 39645380
This version of the code should not experience that error, but I am concerned that the HTML you posted is not the HTML that is in the innertext property.
        Set oRE = CreateObject("vbscript.regexp")
        oRE.Global = True
        oRE.Pattern = "TransactionDetail.aspx\?toNet2=(\d*)"
        If oRE.Test(Me.ocxWebBrowser.Document.Frames(2).Document.Body.innertext) Then
            Set oMatches = oRE.Execute(Me.ocxWebBrowser.Document.Frames(2).Document.Body.innertext)
            TransID = oMatches(0).Submatches(0)
            sText = "Transaction ID: " & TransID
        Else
            MsgBox "Transaction ID not found"
        End If

 

Author Comment

by:dlord54
ID: 39645420
When I execute the expression "Me.ocxWebBrowser.Document.Frames(2).Document.Body.innertext" while debugging, the result is:

Confirmation/Receipt
Payer Information:
Name: Test Test
Address: 555 W 6000 N
  Layton, UT 84041
Phone: 801-555-5555
Additional Information:
Account #: 123456
Payment Detail:
Payment DatePayment AmountTransaction IDStatus
11/13/2013 $125.00 4192557Approved

The 4192557 is the data I am trying to get. It should be the result of the pattern that we are using.
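
The output above shows only the rendered text, with no "TransactionDetail.aspx" URL in it, which would explain why the pattern finds nothing there. Purely as an illustration, the ID could also be read from that rendered text. The sketch below assumes the detail row always looks like the sample (a dollar amount, whitespace, then the ID); TransIdFromText is a hypothetical name and the pattern is a guess based on this single sample.

```vba
' Hypothetical helper: reads the ID from the visible text of the receipt.
' Assumes a row shaped like "11/13/2013 $125.00 4192557Approved".
Public Function TransIdFromText(ByVal sText As String) As String
    Dim oRE As Object
    Dim oMatches As Object

    Set oRE = CreateObject("VBScript.RegExp")
    ' A dollar amount with two decimals, whitespace, then the digits of the ID.
    oRE.Pattern = "\$[\d,]+\.\d{2}\s+(\d+)"

    Set oMatches = oRE.Execute(sText)
    If oMatches.Count > 0 Then
        TransIdFromText = oMatches(0).SubMatches(0)
    End If                                    ' otherwise returns ""
End Function
```

This depends entirely on the page layout staying as shown, so it is at least as fragile as matching the URL.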
 
LVL 46

Accepted Solution

by:
aikimark earned 2000 total points
ID: 39645449
There is bound to be some property of some part of (or all of) the document that contains the HTML you posted.

Please look at .innerHTML instead of .innertext properties and then work your way up the document object tree until you get to the HTML that includes the transaction ID data.
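
A minimal sketch of that suggestion, assuming the transaction ID appears in the markup as part of a "TransactionDetail.aspx?toNet2=..." URL, as in the posted source. TransIdFromMarkup is a hypothetical name, and oDoc is assumed to be the frame document (for example Me.ocxWebBrowser.Document.Frames(2).Document).

```vba
' Hypothetical helper: searches the frame's HTML markup, not its visible text.
Public Function TransIdFromMarkup(ByVal oDoc As Object) As String
    Dim oRE As Object
    Dim oMatches As Object

    Set oRE = CreateObject("VBScript.RegExp")
    oRE.IgnoreCase = True
    ' The dot is escaped here so that it matches only a literal ".".
    oRE.Pattern = "TransactionDetail\.aspx\?toNet2=(\d+)"

    ' innerHTML keeps tags and attribute values (such as link URLs);
    ' innerText returns only the rendered text.
    Set oMatches = oRE.Execute(oDoc.Body.innerHTML)
    If oMatches.Count > 0 Then
        TransIdFromMarkup = oMatches(0).SubMatches(0)
    End If                                    ' otherwise returns ""
End Function
```

Checking oMatches.Count before reading oMatches(0) avoids indexing an empty match collection, which is the likely cause of the "Invalid procedure call or argument" error reported earlier.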
 

Author Comment

by:dlord54
ID: 39645539
That worked. Thank you so much for your help.
 
LVL 46

Expert Comment

by:aikimark
ID: 39645719
Thanks for the points.

1. Please note that I have posted a simpler version of the regular expression pattern than the one you are currently using.  It might be faster.

2. I do not know what change you made to get the posted code to work, but you might want to consider tweaking the code to resemble the following example:
Dim oDoc As Object

Set oRE = CreateObject("vbscript.regexp")
oRE.Global = True

If Me.ocxWebBrowser.Document Is Nothing Then
    MsgBox "No Text to display"
Else
    'cmdGetText.Caption = "Create Memo entry"
    Set oDoc = Me.ocxWebBrowser.Document.Frames(2).Document
    If vTest = 0 Then
        PmtDate = oDoc.GetElementByID("ctl00_ContentPlaceHolder1_grdReports1_ctl02_lblDate").innertext
        AMT = oDoc.GetElementByID("ctl00_ContentPlaceHolder1_grdReports1_ctl02_lblAmount").innertext
        oRE.Pattern = "TransactionDetail.aspx\?toNet2=(\d*)"
        If oRE.Test(oDoc.Body.innerhtml) Then
            Set oMatches = oRE.Execute(oDoc.Body.innerhtml)
            TransID = oMatches(0).Submatches(0)
            sText = "Transaction ID: " & TransID
        Else
            MsgBox "Transaction ID not found"
        End If
    Else
        'Use the regexp object to do your parsing here
        PmtDate = StringBetweenStrings(oDoc.Body.innertext, "Date (Pacific Time)", vbCrLf)
        AMT = StringBetweenStrings(oDoc.Body.innertext, "Amount", vbCrLf)
        TransID = StringBetweenStrings(oDoc.Body.innertext, "E-Check ID Number: ", ".")
        sText = "E-Check ID Number: " & TransID
    End If
End If

Changes:
* defined and instantiated an oDoc variable to reference the document
* used GetElementByID().innertext directly rather than assigning the element object to a variable
* moved the instantiation of the regular expression object earlier so that it can be used in both the If and Else sections
* suggested that the regular expression object might replace the StringBetweenStrings() function

Since I didn't have an example of the HTML when vTest <> 0, I am not able to show you the patterns for that parsing.
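
For what it is worth, a generic helper along the following lines could stand in for StringBetweenStrings(). It is a sketch only: FirstSubMatch is a hypothetical name, and the patterns in the usage comments are guesses built from the literal strings in the code above, not from the actual E-Check page.

```vba
' Hypothetical helper: returns the first capture group of sPattern found
' in sText, or "" when nothing matches.
' sPattern must contain at least one capture group "( ... )".
Public Function FirstSubMatch(ByVal sText As String, ByVal sPattern As String) As String
    Dim oRE As Object
    Dim oMatches As Object

    Set oRE = CreateObject("VBScript.RegExp")
    oRE.Pattern = sPattern

    Set oMatches = oRE.Execute(sText)
    If oMatches.Count > 0 Then
        FirstSubMatch = oMatches(0).SubMatches(0)
    End If
End Function

' Possible usage (patterns untested against the real page):
'   TransID = FirstSubMatch(oDoc.Body.innerText, "E-Check ID Number: ([^.]*)\.")
'   AMT = FirstSubMatch(oDoc.Body.innerText, "Amount([^\r\n]*)")
```

Returning an empty string on no match keeps the calling code free of error handling, at the cost of not distinguishing "not found" from a genuinely empty value.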
 

Author Comment

by:dlord54
ID: 39645734
Thanks for the added changes.
