dlord54
asked on
Trying to extract data from a web page
The company uses a third party to process their credit card payments. I am using an MS Access program with the Web Browser control and would like to extract the transaction ID from the third party's confirmation web page. I have been able to get the other data I need, but I am not sure how to reference the Transaction ID so that I can grab it.
Attached is a pdf file of the source; the information I am trying to grab is highlighted.
EECRS3-1-.pdf
Once you have the HTML, parse it with the following regular expression to get the value you seek.
Example:
TransactionDetail.aspx\?toNet2=(\d*)
Example:
Dim oRE As Object
Dim oMatches As Object
Set oRE = CreateObject("vbscript.regexp")
oRE.Global = True
oRE.Pattern = "TransactionDetail.aspx\?toNet2=(\d*).*"
Set oMatches = oRE.Execute(yourHTMLtextvariablenamegoeshere)
Debug.Print "transaction id: " & oMatches(0).Submatches(0)
ASKER
Thanks aikimark for your suggestion. I am at a loss to find the HTMLtextvariablename. Is that something that I have assigned, is it in the source, or am I just not understanding? Excuse my ignorance, but I don't work with web pages or the WebBrowser control much.
You wrote that you are using a web browser control and you posted a PDF of the HTML source. I assume you know how to get the HTML of the web page you are accessing. That name is meant to refer to the variable or object property or method invocation result that contains the HTML.
How did you post the HTML in the PDF document you attached with your question text?
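To make the placeholder concrete, here is a minimal sketch of reading the page HTML into a string and handing it to the pattern above. The control name ocxWebBrowser, the procedure name ShowTransactionID, and the choice of Document.body.innerHTML are assumptions for illustration only; substitute whatever your form actually exposes.

```vba
' Sketch only. Assumes an Access form hosting a WebBrowser control named
' ocxWebBrowser (hypothetical name) whose page has finished loading.
Private Sub ShowTransactionID()
    Dim sHTML As String
    Dim oRE As Object
    Dim oMatches As Object

    ' The "HTML text variable" is simply a string holding the page markup.
    sHTML = Me.ocxWebBrowser.Document.body.innerHTML

    Set oRE = CreateObject("vbscript.regexp")
    oRE.Pattern = "TransactionDetail\.aspx\?toNet2=(\d+)"

    ' Check the match count before indexing into the collection.
    Set oMatches = oRE.Execute(sHTML)
    If oMatches.Count > 0 Then
        Debug.Print "transaction id: " & oMatches(0).SubMatches(0)
    Else
        Debug.Print "transaction id not found"
    End If
End Sub
```

The name is not quoted and has no file extension; it is an ordinary VBA string variable, not the name of a saved file.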
ASKER
When I viewed the source of the web page within the browser, that was the name that it brought up in Notepad. It is also the name of the pdf that I sent. Does it need to be enclosed in quotation marks, or should it also have the .html extension included?
Please post the code you are using with the web control.
ASKER
I have attached the code that I am using to retrieve the data from that web page. I have included the code that you recommended.
Code.txt
Please try this:
Set oRE = CreateObject("vbscript.regexp")
oRE.Global = True
oRE.Pattern = "TransactionDetail.aspx\?toNet2=(\d*).*"
Set oMatches = oRE.Execute(Me.ocxWebBrowser.Document.Frames(2).Document.Body.innertext)
TransID = oMatches(0).Submatches(0)
sText = "Transaction ID: " & TransID
ASKER
I am getting an "Invalid procedure call or argument" error message on this line:
TransID = oMatches(0).Submatches(0)
This version of the code should not experience that error, but I am concerned that the HTML you posted is not the HTML that is in the innertext property.
Set oRE = CreateObject("vbscript.regexp")
oRE.Global = True
oRE.Pattern = "TransactionDetail.aspx\?toNet2=(\d*)"
If oRE.Test(Me.ocxWebBrowser.Document.Frames(2).Document.Body.innertext) Then
    Set oMatches = oRE.Execute(Me.ocxWebBrowser.Document.Frames(2).Document.Body.innertext)
    TransID = oMatches(0).Submatches(0)
    sText = "Transaction ID: " & TransID
Else
    MsgBox "Transaction ID not found"
End If
ASKER
When debugging, I execute the line "Me.ocxWebBrowser.Document.Frames(2).Document.Body.innertext" and the result is:
Confirmation/Receipt
Payer Information:
Name: Test Test
Address: 555 W 6000 N
Layton, UT 84041
Phone: 801-555-5555
Additional Information:
Account #: 123456
Payment Detail:
Payment DatePayment AmountTransaction IDStatus
11/13/2013 $125.00 4192557Approved
The 4192557 is the data I am trying to get to. It is the result we want from the pattern that we are using.
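The output above contains only the visible text of the page, and the string TransactionDetail.aspx does not appear anywhere in it, so the pattern has nothing to match there. The code posted later in this thread tests Body.innerhtml instead. The sketch below illustrates the difference with made-up sample strings; the anchor markup in sHtml is an assumption inferred from the pattern, not copied from the real page, and the procedure name is hypothetical.

```vba
' Sketch with made-up sample strings; the anchor markup in sHtml is an
' assumption inferred from the pattern, not taken from the real page.
Sub CompareTextAndHtml()
    Dim oRE As Object
    Dim oMatches As Object
    Dim sText As String
    Dim sHtml As String

    Set oRE = CreateObject("vbscript.regexp")
    oRE.Pattern = "TransactionDetail\.aspx\?toNet2=(\d+)"

    ' Roughly what innertext returns: visible text only, no link targets.
    sText = "11/13/2013 $125.00 4192557Approved"
    ' Roughly what innerhtml might return: the link target is in the markup.
    sHtml = "<a href=""TransactionDetail.aspx?toNet2=4192557"">4192557</a>"

    Debug.Print "innertext matches: " & oRE.Test(sText)
    Debug.Print "innerhtml matches: " & oRE.Test(sHtml)

    ' Only index into the matches once a match is known to exist.
    Set oMatches = oRE.Execute(sHtml)
    If oMatches.Count > 0 Then
        Debug.Print "transaction id: " & oMatches(0).SubMatches(0)
    End If
End Sub
```

With these sample strings the first test should report no match and the second should report a match, which is consistent with the "Transaction ID not found" path being taken when innertext is searched.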
ASKER CERTIFIED SOLUTION
This solution is only available to members.
ASKER
That worked. Thank you so much for your help.
Thanks for the points.
1. Please note that the regular expression pattern I posted is simpler than the one you are currently using. It might be faster.
2. I do not know what change you made to get the posted code to work, but you might want to consider tweaking the code to resemble the following example:
Dim oDoc As Object
Set oRE = CreateObject("vbscript.regexp")
oRE.Global = True
If Me.ocxWebBrowser.Document Is Nothing Then
    MsgBox "No Text to display"
Else
    'cmdGetText.Caption = "Create Memo entry"
    Set oDoc = Me.ocxWebBrowser.Document.Frames(2).Document
    If vTest = 0 Then
        PmtDate = oDoc.GetElementByID("ctl00_ContentPlaceHolder1_grdReports1_ctl02_lblDate").innertext
        AMT = oDoc.GetElementByID("ctl00_ContentPlaceHolder1_grdReports1_ctl02_lblAmount").innertext
        'The link target only appears in the markup, so search innerhtml
        oRE.Pattern = "TransactionDetail.aspx\?toNet2=(\d*)"
        If oRE.Test(oDoc.Body.innerhtml) Then
            Set oMatches = oRE.Execute(oDoc.Body.innerhtml)
            TransID = oMatches(0).Submatches(0)
            sText = "Transaction ID: " & TransID
        Else
            MsgBox "Transaction ID not found"
        End If
    Else
        'Use the regexp object to do your parsing here
        PmtDate = StringBetweenStrings(oDoc.Body.innertext, "Date (Pacific Time)", vbCrLf)
        AMT = StringBetweenStrings(oDoc.Body.innertext, "Amount", vbCrLf)
        TransID = StringBetweenStrings(oDoc.Body.innertext, "E-Check ID Number: ", ".")
        sText = "E-Check ID Number: " & TransID
    End If
End If
Changes:
* defined and instantiated an oDoc variable to reference the document
* used GetElementByID().innertext directly rather than assigning the element object to a variable
* moved the instantiation of the regular expression object earlier so that it can be used in both the If and Else sections
* suggested that the regular expression object might replace the StringBetweenStrings() function
Since I didn't have an example of the HTML when vTest <> 0, I am not able to show you the patterns for that parsing.
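As a rough illustration of what replacing StringBetweenStrings() with the regular expression object could look like, here is a sketch for the E-Check ID only. The pattern is guessed purely from the two delimiters passed to StringBetweenStrings() in the code above ("E-Check ID Number: " and "."), not from the actual page text, and the function name ExtractECheckID is hypothetical.

```vba
' Sketch only: the pattern is guessed from the delimiters used with
' StringBetweenStrings ("E-Check ID Number: " and "."), not from real page text.
Function ExtractECheckID(ByVal sBody As String) As String
    Dim oRE As Object
    Dim oMatches As Object

    Set oRE = CreateObject("vbscript.regexp")
    ' Capture everything after the label up to, but not including, the next period.
    oRE.Pattern = "E-Check ID Number: ([^.]+)\."

    Set oMatches = oRE.Execute(sBody)
    If oMatches.Count > 0 Then
        ExtractECheckID = oMatches(0).SubMatches(0)
    Else
        ExtractECheckID = ""   ' caller decides how to report a miss
    End If
End Function
```

In the Else branch it could be called as TransID = ExtractECheckID(oDoc.Body.innertext), with an empty string treated as "not found". The date and amount patterns would need the real page text before they can be written with any confidence.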
ASKER
Thanks for the added changes.
In fact, given that this is a CC payment situation, your processor should provide you with some sort of "return" value regarding the status of the payment. This should include the Transaction Number.
Better yet - don't do this from Access, and instead create the program in .NET, which can send/receive web-based communications much more easily than Access.