ReneGe
asked on
Extract data from a specific website
Hi there,
I urgently need a VBScript that will extract data from the following factual website.
https://aDomain.com/data/X
X is a consective number from 0 to 200000
DATA TO BE EXTRACTED ARE THE FOLLOWINGS (For example, CompanyName1 is the data value, not a variable):
-CompanyName1
-CompanyName2
-CompanyName2City
-CompanyName2State
-CompanyName2Zip
-CompanyName2Country
FROM:
<div id="companyName_id" class="box">CompanyName1</ div>
<label>Name:</label><div class="box">CompanyName2</ div>
<label>City:</label><div class="box">CompanyName2Ci ty</div>
<label>State:</label><div class="box">CompanyName2St ate</div>
<label>Zip:</label><div class="box">CompanyName2Zi p</div>
<label>Country:</label><di v class="box">CompanyName2Co untry</div >
The loop is to be executed at randum times between .5 and 2 seconds.
The output I need is:
"CompanyName1","CompanyNam e2","Compa nyName2Cit y","Compan yName2Stat e","Compan yName2Zip" ,"CompanyN ame2Countr y"
The output file is to be in the same folder where the script is located, and with the name "data.csv".
Thanks for your help and your support.
Cheers,
Rene
I urgently need a VBScript that will extract data from the following factual website.
https://aDomain.com/data/X
X is a consective number from 0 to 200000
DATA TO BE EXTRACTED ARE THE FOLLOWINGS (For example, CompanyName1 is the data value, not a variable):
-CompanyName1
-CompanyName2
-CompanyName2City
-CompanyName2State
-CompanyName2Zip
-CompanyName2Country
FROM:
<div id="companyName_id" class="box">CompanyName1</
<label>Name:</label><div class="box">CompanyName2</
<label>City:</label><div class="box">CompanyName2Ci
<label>State:</label><div class="box">CompanyName2St
<label>Zip:</label><div class="box">CompanyName2Zi
<label>Country:</label><di
The loop is to be executed at randum times between .5 and 2 seconds.
The output I need is:
"CompanyName1","CompanyNam
The output file is to be in the same folder where the script is located, and with the name "data.csv".
Thanks for your help and your support.
Cheers,
Rene
You can modify this VBA code:
/gustav
Private Declare Function URLDownloadToFile Lib "urlmon" Alias "URLDownloadToFileA" ( _
ByVal pCaller As Long, _
ByVal szURL As String, _
ByVal szFileName As String, _
ByVal dwReserved As Long, _
ByVal lpfnCB As Long) _
As Long
Public Function DownloadFile( _
ByVal strURL As String, _
ByVal strLocalFilename As String) _
As Long
' Download file or page with public access from the web.
' 2004-12-17. Cactus Data ApS, CPH.
' Usage, download a file:
' lngRet = DownloadFile("http://www.databaseadvisors.com/Graphics/conf2002/2002ConferencePicsbySmolin/images/dba02smolin27.jpg", "c:\happybassett.jpg")
'
' Usage, download a page:
' lngRet = DownloadFile("http://www.databaseadvisors.com/conf2002/conf200202.asp", "c:\dbaconference.htm")
' Returns 0 if success, error code if not.
' Error codes:
' -2146697210 "file not found".
' -2146697211 "domain not found".
' Limitation.
' Does not check if local file was created successfully.
Dim lngRetVal As Long
lngRetVal = URLDownloadToFile(0, strURL & vbNullChar, strLocalFilename & vbNullChar, 0, 0)
DownloadFile = lngRetVal
End Function
It should be quite easy to adopt for VBscript./gustav
Are the pages ending in "html" or "htm" on the website?
Also, is the data to be appended to the text file each time?
Also, is the data to be appended to the text file each time?
I get an SSL error message when I try to look at that domain.
Forget my earlier question....I am wrting code now and see what you are after.
Stand by....
Cheers Chris
Stand by....
Cheers Chris
ASKER
Hi guys,
Thanks for helping.
By the way, the website I used in my example is a fake one.
Cheers
Thanks for helping.
By the way, the website I used in my example is a fake one.
Cheers
That's what I thought with "a domain". But to clarify, you could use the normal "example.com".
/gustav
/gustav
@ReneGe
What is the real URL?
What is the real URL?
Almost finished...just testing.
Cheers
Chris
Cheers
Chris
Why is the loop run at random times (between 0.5 and 2 seconds?
(If for processing purposes by external processes on your PC, that can be handles by giving way to the processes using "DoEvents" call.)
If you have 200000 files, 0.5 seconds would not be enough time to process them I would think (or even 2 seconds!)
Why not just have it run continuously?
If it is to run continuously, we will need to put in a way of stopping the process (via keyboard input say by a button displayed on a small form).
Also, after going through loop once, surely you do not want the same data output appended to the ever growing output file....I gather you would want to write to the file only data that is not already in it.
This would mean scanning the output file prior to writing to it
to ensure no "doubling up"
If a record is to be deleted that couldnot be handled by this process
(unless there was a "Delete" glag of some sort placed in a html file,
and an "edit" wouldsimply produce a new record in the output.
I gather all "adds" and "edits" will need to be done by processing the output file rather than ging via the input files.
Please clarify...It is working but I need to know exactlyhow the porocessing of the output file is handled.
Also,(and this is important) is there only one record per html file or could there be multiple records?
Cheers
Chris (Australia - 4,53am and heavens.....am I tired! :-)
(If for processing purposes by external processes on your PC, that can be handles by giving way to the processes using "DoEvents" call.)
If you have 200000 files, 0.5 seconds would not be enough time to process them I would think (or even 2 seconds!)
Why not just have it run continuously?
If it is to run continuously, we will need to put in a way of stopping the process (via keyboard input say by a button displayed on a small form).
Also, after going through loop once, surely you do not want the same data output appended to the ever growing output file....I gather you would want to write to the file only data that is not already in it.
This would mean scanning the output file prior to writing to it
to ensure no "doubling up"
If a record is to be deleted that couldnot be handled by this process
(unless there was a "Delete" glag of some sort placed in a html file,
and an "edit" wouldsimply produce a new record in the output.
I gather all "adds" and "edits" will need to be done by processing the output file rather than ging via the input files.
Please clarify...It is working but I need to know exactlyhow the porocessing of the output file is handled.
Also,(and this is important) is there only one record per html file or could there be multiple records?
Cheers
Chris (Australia - 4,53am and heavens.....am I tired! :-)
ASKER
Hi Chris,
Thanks for all you help.
The delay between each execution is not that important. This delay is more to be applied between each loop.
It is indeed favorable that the script runs continously, between 0 to 200000. At 200000, it would stop executing automatically. I will not need to stop it while it is runnung. However, it would be nice that I see what number it is at.
All of the data needs to be output to the csv file. However, you have a point. That csv file will be huge. Making a different csv file for every "companyName_id" should resolve this, where the file name would be the data contained in "companyName_id". So therefore, double-ups are a GOOD thing for me.
No delete flag required. However, I dont need the data to be output if the page does not contain the data. For example: IF "%companyName%" == "" Dont export data and go to the next loop.
If a record is to be deleted that couldnot be handled by this process
(unless there was a "Delete" glag of some sort placed in a html file,
and an "edit" wouldsimply produce a new record in the output.
I gather all "adds" and "edits" will need to be done by processing the output file rather than ging via the input files.
I dont understand what you mean here ;(
Please clarify...It is working but I need to know exactly how the porocessing of the output file is handled.
I'm not sure I understand what you are asking here. The script loads the page, reads the data and exports it to the CSV file.
Also,(and this is important) is there only one record per html file or could there be multiple records?
Good question, there is indeed only one record per HTML pages.
Cheers
Chris (Australia - 4,53am and heavens.....am I tired! :-)
Go to bed ;)
Thanks for all you help.
The delay between each execution is not that important. This delay is more to be applied between each loop.
It is indeed favorable that the script runs continously, between 0 to 200000. At 200000, it would stop executing automatically. I will not need to stop it while it is runnung. However, it would be nice that I see what number it is at.
All of the data needs to be output to the csv file. However, you have a point. That csv file will be huge. Making a different csv file for every "companyName_id" should resolve this, where the file name would be the data contained in "companyName_id". So therefore, double-ups are a GOOD thing for me.
No delete flag required. However, I dont need the data to be output if the page does not contain the data. For example: IF "%companyName%" == "" Dont export data and go to the next loop.
If a record is to be deleted that couldnot be handled by this process
(unless there was a "Delete" glag of some sort placed in a html file,
and an "edit" wouldsimply produce a new record in the output.
I gather all "adds" and "edits" will need to be done by processing the output file rather than ging via the input files.
I dont understand what you mean here ;(
Please clarify...It is working but I need to know exactly how the porocessing of the output file is handled.
I'm not sure I understand what you are asking here. The script loads the page, reads the data and exports it to the CSV file.
Also,(and this is important) is there only one record per html file or could there be multiple records?
Good question, there is indeed only one record per HTML pages.
Cheers
Chris (Australia - 4,53am and heavens.....am I tired! :-)
Go to bed ;)
ASKER
Hi Gustav Brock,
Thanks for sharing. does it do what I need or it needs to be tailored?
Cheers,
Rene
Thanks for sharing. does it do what I need or it needs to be tailored?
Cheers,
Rene
I have the code working, but it is taking AGES to loop 20000 while allowing the PC to handle processes externally.
It currently works on 2 dummy files I set up at website "www.ecbc.net.au/data/" called "0.html" and "1.html"
(See code below)
The "if then" code could perhaps be handled differently, but it is easy to read and amend if required.
We could create an array of webpages by looping 20000 times (if that can be thought of as quick) and collecting
the names of the URLs that actually exist, then process the "If then" testing on each existing web page, but it will effectively be the same as testing if each page exists (one by one) then processing the existing ones. The way the code is at the moment, it saves having to visit the "site" file twice.
I have not put any 0.5 or 2 second timer in yet until I clarify exactly want you envisage the process will be.
I have put in "DoEvents" in the loop so the program does not "hog" all the processing when it runs.
Without that, your PC is locked within the loop and will not free up the PC until 20,000 is reached
(which takes ages).
It would be great if the loop cold be cut short when an input file is not found. This assumes all files are numered consecutively and none will be missing. If that was the case it would stop as soon as a URL was invalid (not found).
As it is, it now loops 20000 times (even though there are only 2 files) and runs for far too long.
Unfortunately there is no quick way to test if a url exists or not - you have to attempt to connect to the URL first to see whether it exists (and connection attempts, whether successful or not, slow things down)
I am now off to bed for a bit after about 5 hours work on this (which was interrupted by a visit to a scheduled 2 hour online seminar on MediaShout run at 3.00am (Melbourne time) in Los Angeles)....yawn :-)
Chris (Australia 6.02am
It currently works on 2 dummy files I set up at website "www.ecbc.net.au/data/" called "0.html" and "1.html"
(See code below)
The "if then" code could perhaps be handled differently, but it is easy to read and amend if required.
We could create an array of webpages by looping 20000 times (if that can be thought of as quick) and collecting
the names of the URLs that actually exist, then process the "If then" testing on each existing web page, but it will effectively be the same as testing if each page exists (one by one) then processing the existing ones. The way the code is at the moment, it saves having to visit the "site" file twice.
I have not put any 0.5 or 2 second timer in yet until I clarify exactly want you envisage the process will be.
I have put in "DoEvents" in the loop so the program does not "hog" all the processing when it runs.
Without that, your PC is locked within the loop and will not free up the PC until 20,000 is reached
(which takes ages).
It would be great if the loop cold be cut short when an input file is not found. This assumes all files are numered consecutively and none will be missing. If that was the case it would stop as soon as a URL was invalid (not found).
As it is, it now loops 20000 times (even though there are only 2 files) and runs for far too long.
Unfortunately there is no quick way to test if a url exists or not - you have to attempt to connect to the URL first to see whether it exists (and connection attempts, whether successful or not, slow things down)
I am now off to bed for a bit after about 5 hours work on this (which was interrupted by a visit to a scheduled 2 hour online seminar on MediaShout run at 3.00am (Melbourne time) in Los Angeles)....yawn :-)
Chris (Australia 6.02am
Option Explicit
'In VBA editor you must include references to the following (under "Tools/References" in the VBA editor):
' 1. Visual Basic for Applications
' 2. Microsoft Excel 14.0 Object Library (or otehr Office library for the application which is running VBA)
' 3. Microsoft Internet Controls
' 4. Microsoft HTML Object Library
Public Sub PickUpData()
Dim ie As Object
Dim nLine As Integer
Dim strLine As String
Dim oDoc As HTMLDocument
Dim strDivs() As String
Dim strEle() As String
Dim strSiteBase As String
Dim strPage As String
Dim nPage As Long
Dim nDiv As Integer
Dim nFile As Integer
Dim StrFileOut As String
Dim strUrl As String
Dim strOutPut(6) As String
Dim x As Integer
Const CompanyName1 = 1
Const CompanyName2 = 2
Const CompanyName2City = 3
Const CompanyName2State = 4
Const CompanyName2Zip = 5
Const CompanyName2Country = 6
Set ie = New InternetExplorerMedium
ie.Visible = False
StrFileOut = "c:\data.csv"
'strSiteBase = "https://aDomain.com/data/"
'the following is a testing site that will work! (It only has file 0and 1 at the moment but more can be put there
'upon request.
strSiteBase = "http://www.ecbc.net.au/data/"
nFile = FreeFile
Open StrFileOut For Append As #nFile
For nPage = 0 To 20000
DoEvents 'to allow normal processes outside this program to run.
strUrl = strSiteBase & CStr(nPage) & ".html"
If URLExists(strUrl) Then
ie.Navigate strUrl
While ie.Busy
DoEvents
Wend
If ie.LocationName = strUrl Then
Set oDoc = ie.Document
strPage = oDoc.documentElement.innerHTML
strDivs = Split(strPage, "</div>")
For nDiv = 0 To UBound(strDivs)
strLine = LCase(strDivs(nDiv))
If InStr(strLine, "companyname_id") > 0 Then
strEle = Split(strLine, ">")
strEle(UBound(strEle)) = Replace(strEle(UBound(strEle)), "</div", "")
strOutPut(CompanyName1) = strEle(UBound(strEle))
Else
If InStr(strLine, "<label>name:</label>") > 0 Then
strEle = Split(strLine, ">")
strEle(UBound(strEle)) = Replace(strEle(UBound(strEle)), "</div", "")
strOutPut(CompanyName2) = strEle(UBound(strEle))
Else
If InStr(strLine, "<label>city:</label>") > 0 Then
strEle = Split(strLine, ">")
strEle(UBound(strEle)) = Replace(strEle(UBound(strEle)), "</div", "")
strOutPut(CompanyName2City) = strEle(UBound(strEle))
Else
If InStr(strLine, "<label>state:</label>") > 0 Then
strEle = Split(strLine, ">")
strEle(UBound(strEle)) = Replace(strEle(UBound(strEle)), "</div", "")
strOutPut(CompanyName2State) = strEle(UBound(strEle))
Else
If InStr(strLine, "<label>zip:</label>") > 0 Then
strEle = Split(strLine, ">")
strEle(UBound(strEle)) = Replace(strEle(UBound(strEle)), "</div", "")
strOutPut(CompanyName2Zip) = strEle(UBound(strEle))
Else
If InStr(strLine, "<label>country:</label>") > 0 Then
strEle = Split(strLine, ">")
strEle(UBound(strEle)) = Replace(strEle(UBound(strEle)), "</div", "")
strOutPut(CompanyName2Country) = strEle(UBound(strEle))
End If
End If
End If
End If
End If
End If
Next
strLine = ""
For nLine = 1 To 6
strLine = strLine & strOutPut(nLine) & IIf(nLine < 6, ",", "")
Next
Print #nFile, strLine
End If
End If
NextFile:
Next
Close #nFile
MsgBox "This will not be in final code" & vbCrLf & "Just here to let you know the process is finished!"
End Sub
Function URLExists(url As String) As Boolean
Dim Request As Object
Dim ff As Integer
Dim rc As Variant
On Error GoTo EndNow 'if error occurs, then site will be considered un-reachable at the moment
'(even if it does exist)
Set Request = CreateObject("WinHttp.WinHttpRequest.5.1")
With Request
.Open "GET", url, False
.send
URLExists = InStr(LCase(.responseText), "unavailable link") = 0
End With
Set Request = Nothing
EndNow:
End Function
I just saw your few observations but I am too tired to fix the code just now...will get about 6 hours sleep before resuming. I guess having a differnt output file for each different company name (companyname1)
I gather by your notes that this is a "once off" procedure to gather all your data (in which case my mention of "edits" and "deletes" does not apply. I will amend the code when I get up to send to different output files, and put a few more dummy files on the "test" site with the campanyname1 appearing more than once.
I will need to tweak the code, then to look open the output file that applies to companyname1, rather than just
"data.csv"
I will also put in place some indiocator so you can see how far through the process it is.
Stand by.....
(I suyppose it is almost your bed time now!) :-)
Cheers
Chris (Australia)....6.50am (snore)
I gather by your notes that this is a "once off" procedure to gather all your data (in which case my mention of "edits" and "deletes" does not apply. I will amend the code when I get up to send to different output files, and put a few more dummy files on the "test" site with the campanyname1 appearing more than once.
I will need to tweak the code, then to look open the output file that applies to companyname1, rather than just
"data.csv"
I will also put in place some indiocator so you can see how far through the process it is.
Stand by.....
(I suyppose it is almost your bed time now!) :-)
Cheers
Chris (Australia)....6.50am (snore)
ASKER
Thanks Chris!
Go to bed!!
;)
Go to bed!!
;)
I can't help myse;lf (I'm addicted).
I just changed the code a bit for the multiple outputs. Will look at the "progess" indicator later in the day (LOL)
I will definately go off now (at 65 years of age, these late nights are getting to me.......)
Cheers
Chris (Australia)....6.30am
I just changed the code a bit for the multiple outputs. Will look at the "progess" indicator later in the day (LOL)
I will definately go off now (at 65 years of age, these late nights are getting to me.......)
Cheers
Chris (Australia)....6.30am
Option Explicit
'In VBA editor you must include references to the following (under "Tools/References" in the VBA editor):
' 1. Visual Basic for Applications
' 2. Microsoft Excel 14.0 Object Library (or otehr Office library for the application which is running VBA)
' 3. Microsoft Internet Controls
' 4. Microsoft HTML Object Library
Public Sub PickUpData()
Dim ie As Object
Dim nLine As Integer
Dim strLine As String
Dim oDoc As HTMLDocument
Dim strDivs() As String
Dim strEle() As String
Dim strSiteBase As String
Dim strPage As String
Dim nPage As Long
Dim nDiv As Integer
Dim nFile As Integer
Dim StrFileOut As String
Dim strUrl As String
Dim strOutPut(6) As String
Dim x As Integer
Const CompanyName1 = 1
Const CompanyName2 = 2
Const CompanyName2City = 3
Const CompanyName2State = 4
Const CompanyName2Zip = 5
Const CompanyName2Country = 6
Set ie = New InternetExplorerMedium
ie.Visible = False
'StrFileOut = "c:\data.csv"
'the following is a testing site that will work! (It only has file 0and 1 at the moment but more can be put there
'upon request.
'strSiteBase = "https://aDomain.com/data/"
strSiteBase = "http://www.ecbc.net.au/data/"
For nPage = 0 To 20000
DoEvents 'to allow normal processes outside this program to run.
strUrl = strSiteBase & CStr(nPage) & ".html"
If URLExists(strUrl) Then
ie.Navigate strUrl
While ie.Busy
DoEvents
Wend
If ie.LocationName = strUrl Then
Set oDoc = ie.Document
strPage = oDoc.documentElement.innerHTML
strDivs = Split(strPage, "</div>")
For nDiv = 0 To UBound(strDivs)
strLine = LCase(strDivs(nDiv))
If InStr(strLine, "companyname_id") > 0 Then
strEle = Split(strLine, ">")
strEle(UBound(strEle)) = Replace(strEle(UBound(strEle)), "</div", "")
strOutPut(CompanyName1) = strEle(UBound(strEle))
Else
If InStr(strLine, "<label>name:</label>") > 0 Then
strEle = Split(strLine, ">")
strEle(UBound(strEle)) = Replace(strEle(UBound(strEle)), "</div", "")
strOutPut(CompanyName2) = strEle(UBound(strEle))
Else
If InStr(strLine, "<label>city:</label>") > 0 Then
strEle = Split(strLine, ">")
strEle(UBound(strEle)) = Replace(strEle(UBound(strEle)), "</div", "")
strOutPut(CompanyName2City) = strEle(UBound(strEle))
Else
If InStr(strLine, "<label>state:</label>") > 0 Then
strEle = Split(strLine, ">")
strEle(UBound(strEle)) = Replace(strEle(UBound(strEle)), "</div", "")
strOutPut(CompanyName2State) = strEle(UBound(strEle))
Else
If InStr(strLine, "<label>zip:</label>") > 0 Then
strEle = Split(strLine, ">")
strEle(UBound(strEle)) = Replace(strEle(UBound(strEle)), "</div", "")
strOutPut(CompanyName2Zip) = strEle(UBound(strEle))
Else
If InStr(strLine, "<label>country:</label>") > 0 Then
strEle = Split(strLine, ">")
strEle(UBound(strEle)) = Replace(strEle(UBound(strEle)), "</div", "")
strOutPut(CompanyName2Country) = strEle(UBound(strEle))
End If
End If
End If
End If
End If
End If
Next
strLine = ""
For nLine = 1 To 6
strLine = strLine & strOutPut(nLine) & IIf(nLine < 6, ",", "")
Next
StrFileOut = Application.Path & "\" & strOutPut(CompanyName1) & ".csv"
nFile = FreeFile
Open StrFileOut For Append As #nFile 'if file does not exist it will be created, else it will append
Print #nFile, strLine
Close #nFile
End If
End If
NextFile:
Next
MsgBox "This will not be in final code" & vbCrLf & "Just here to let you know the process is finished!"
End Sub
Function URLExists(url As String) As Boolean
Dim Request As Object
Dim ff As Integer
Dim rc As Variant
On Error GoTo EndNow 'if error occurs, then site will be considered un-reachable at the moment
'(even if it does exist)
Set Request = CreateObject("WinHttp.WinHttpRequest.5.1")
With Request
.Open "GET", url, False
.send
URLExists = InStr(LCase(.responseText), "unavailable link") = 0
End With
Set Request = Nothing
EndNow:
End Function
ASKER
Hi aikimark,
I am sorry but I cannot disclose the web site address.
I know its not very practical but I'm not authorized to. Also, in order to access it, a login is required.
Oops, I forgot to mention this! I hope it will not be a problem.
Cheers
I am sorry but I cannot disclose the web site address.
I know its not very practical but I'm not authorized to. Also, in order to access it, a login is required.
Oops, I forgot to mention this! I hope it will not be a problem.
Cheers
For testing purposes I am using my own website and I have created a subfolder on the site that contains just two test files (at the moment).
I run a Bridge club called ECBC and so I am taking advantahge of being able to post data to it as I wish.
The files are: www.ecbc.net.au/data/o.html and
www.ecbc.net.au/data/1.html
The main website is www.ecbc.net.au and you can see my name there as owner/manager of the bridge club (privately owned)....I hope that is alright with the Zone advisor :-)
Cheers
Chris
I run a Bridge club called ECBC and so I am taking advantahge of being able to post data to it as I wish.
The files are: www.ecbc.net.au/data/o.html and
www.ecbc.net.au/data/1.html
The main website is www.ecbc.net.au and you can see my name there as owner/manager of the bridge club (privately owned)....I hope that is alright with the Zone advisor :-)
Cheers
Chris
ASKER
@aikimark:
-Following Chris' answer, I think I understand that there may be a concern.
-Please be assured that the data collected by this script is owned by my client. We just dont have it in our database.
@Chris:
-I just realized that in "www.ecbc.net.au/data/o.html", we find the letter "o" not the number "0"
Cheers,
Rene
-Following Chris' answer, I think I understand that there may be a concern.
-Please be assured that the data collected by this script is owned by my client. We just dont have it in our database.
@Chris:
-I just realized that in "www.ecbc.net.au/data/o.html", we find the letter "o" not the number "0"
Cheers,
Rene
OK...I looked at it, and filedefinately ends in "0.html" (i.e. the number zero). I just mistyped the link I gave you (sorry...sleepy eyes)
(Yes I am still awake) :-)
Chris.....7.40am
(Yes I am still awake) :-)
Chris.....7.40am
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
I'm a bit confused. You asked for VBscript code, but now it is VBA and Excel. That makes a huge difference.
Could you clarify, please?
Also, I don't understand the delay. Wouldn't you just download the files at maximum speed?
Using URLDownloadToFile is in general 4-5 times faster than any other method, but it cannot do a login except, of course, if your account or a token somehow can be included in the URL.
/gustav
Could you clarify, please?
Also, I don't understand the delay. Wouldn't you just download the files at maximum speed?
Using URLDownloadToFile is in general 4-5 times faster than any other method, but it cannot do a login except, of course, if your account or a token somehow can be included in the URL.
/gustav
The original question asked for data to be extracted from website pages, not downloaded and then extracted (which actually would take longer). The web pages may not exist, which would add to problems in file transfer and retries by the downloader (upon failure) would add to time.
I overlooked the fact that VB script was requested, but since the output is in comma seperated values (.CSV) format, I assumed the originator could handle VBA. It can be converted to an executable of course in VB by using VBA references, therby avoiding the VBA or VBScript interface completely.
I assume that if VBScript MUST be used for some reason then the originator of the question should have no trouble in moving it into VBScript. (I never use it myself) :-)
Cheers
Chris (Australia)
I overlooked the fact that VB script was requested, but since the output is in comma seperated values (.CSV) format, I assumed the originator could handle VBA. It can be converted to an executable of course in VB by using VBA references, therby avoiding the VBA or VBScript interface completely.
I assume that if VBScript MUST be used for some reason then the originator of the question should have no trouble in moving it into VBScript. (I never use it myself) :-)
Cheers
Chris (Australia)
@Chris
2. If ReneGe did not know how to convert your VBA solution into (the required) VBScript, could you help him if you've never used VBScript?
the originator of the question should have no trouble in moving it into VBScript. (I never use it myself)1. That's quite an assumption of skill level.
2. If ReneGe did not know how to convert your VBA solution into (the required) VBScript, could you help him if you've never used VBScript?
I did not say I have never used it, I just say I dont use it now. Gustuv Brock made the same assumption with his response which supplied VBA code and stated "it should be quite easy to adopt for VBscript."
I will leave it up to Rene if he want to take me up on my offer to walk through the code for him.
Cheers
Chris
I will leave it up to Rene if he want to take me up on my offer to walk through the code for him.
Cheers
Chris
ASKER
I would have prefered VBscript because I just started Learning it, and I know how to run it. However VBA would be ok because the main focus here is to have it do it's thing.
You have spent much effort in helping me here, so if you just show me how to execute it, that would be great.
For the sake of Learning, I would open another question to convert the script to VBScript.
Thanks and cheers
You have spent much effort in helping me here, so if you just show me how to execute it, that would be great.
For the sake of Learning, I would open another question to convert the script to VBScript.
Thanks and cheers
OK...I will put together a video for you.
Will that be OK (provided you can understand my Australian accent)
Chris
Will that be OK (provided you can understand my Australian accent)
Chris
Just curious, but how can any code/script be written/work without knowing the actual URL of interest?
I've done quite a bit of thus stuff myself in the past and one of the main things I've learned is there is no 'one size fits all' code.
I've done quite a bit of thus stuff myself in the past and one of the main things I've learned is there is no 'one size fits all' code.
*thus = this.
ASKER
Hi Chris,
I ran it as per your instructions.
Here is what i get. See attachment.
Also, the data is not extracted and not either exported in the Excel spread sheet nor output to a csv file.
companyName_id field name is not displayed.
Am I missing something?
Thanks and cheers,
Rene
EE1.jpg
I ran it as per your instructions.
Here is what i get. See attachment.
Also, the data is not extracted and not either exported in the Excel spread sheet nor output to a csv file.
companyName_id field name is not displayed.
Am I missing something?
Thanks and cheers,
Rene
EE1.jpg
ASKER
Hi Norie,
I understand your concern. The fact the Chris created test pages on his website is good enough for me.
Anyway, my client's website is secured so he would not be able to test it there anyway.
Thanks and cheers,
Rene
I understand your concern. The fact the Chris created test pages on his website is good enough for me.
Anyway, my client's website is secured so he would not be able to test it there anyway.
Thanks and cheers,
Rene
ASKER
Hi aikimark,
Thanks for your concerns. As long it works, I'm fine. Once this question is resolved, I'll create a new question to convert this script to VBscript just for the sake of Learning.
Thanks and cheers.
Thanks for your concerns. As long it works, I'm fine. Once this question is resolved, I'll create a new question to convert this script to VBscript just for the sake of Learning.
Thanks and cheers.
The browser should not be showing. I suspect that you are missing a reference in your VB Editor.
Can you hand five while Iput together this video for you?
Cheers
Chris
Can you hand five while Iput together this video for you?
Cheers
Chris
ASKER
Hi Chris,
What does "hand five" mean?
Thanks and cheers,
Rene
What does "hand five" mean?
Thanks and cheers,
Rene
@ReneGe
I assume he meant to type "hang five" - hang on about five time units (seconds, minutes, hours, days, etc.)
I assume he meant to type "hang five" - hang on about five time units (seconds, minutes, hours, days, etc.)
ASKER
Makes sense :)
I meant to say "Hang Five"
It's an old surfing phrase (from the 1960's) which was when you came in on your baord on one foot
(hanging on with 5 toes I guess)! (Well, that is my interpretation anyway....LOL)
It actually is slang in Australia for waiting for a small time (like 5 minutes), although I better make it a bit longer, since it is now 10 minutes to 1 in the morning and I have to start teaching later today at 9.30am I also teaching another class in the afternoon at 1.30pm so I am not really free again until 4.00pm. (In about 15 hours). I have to sleep in that time too, so I hope the wait until then is OK. The video will probably run for about 5 minutes or so. It will be just showing how to run a macro within an Office VBA environment (Excel in this case).
If you want a detailed walkthrough the code for a learning experience I could do that, but it will take a bit more preparation and time.
Cheers
Chris
It's an old surfing phrase (from the 1960's) which was when you came in on your baord on one foot
(hanging on with 5 toes I guess)! (Well, that is my interpretation anyway....LOL)
It actually is slang in Australia for waiting for a small time (like 5 minutes), although I better make it a bit longer, since it is now 10 minutes to 1 in the morning and I have to start teaching later today at 9.30am I also teaching another class in the afternoon at 1.30pm so I am not really free again until 4.00pm. (In about 15 hours). I have to sleep in that time too, so I hope the wait until then is OK. The video will probably run for about 5 minutes or so. It will be just showing how to run a macro within an Office VBA environment (Excel in this case).
If you want a detailed walkthrough the code for a learning experience I could do that, but it will take a bit more preparation and time.
Cheers
Chris
ASKER
Hi Chris,
No worries. Takes all the time you need!
Maybe I can spare you the video!
To run the script:
-I opened your xlsm file
-Disabled security features
-Enabled Developer Ribbon
-Click on Macros (Alt F8)
-Selected the macro, the click Run
Was there something else/more to do?
Thanks and cheers,
Rene
No worries. Takes all the time you need!
Maybe I can spare you the video!
To run the script:
-I opened your xlsm file
-Disabled security features
-Enabled Developer Ribbon
-Click on Macros (Alt F8)
-Selected the macro, the click Run
Was there something else/more to do?
Thanks and cheers,
Rene
You must make sure you have added the references under "Tools/References" in the VBE menu
Read the code in module 1 to see what you need to add. You will need to include Microsoft Frms 2.0 Object library as well (I forgot to include that in the code because we were not using a form at that stage).
I suspect you already have the form reference added since a form has appeared.
I must be off nows orry...two nights without sleep would be too much.
Cheers
Chris
Read the code in module 1 to see what you need to add. You will need to include Microsoft Frms 2.0 Object library as well (I forgot to include that in the code because we were not using a form at that stage).
I suspect you already have the form reference added since a form has appeared.
I must be off nows orry...two nights without sleep would be too much.
Cheers
Chris
ASKER
Hi Chris,
Microsoft Frms 2.0 Object library was indeed included.
Please find attached the selected references.
Sleep well mate :)
Cheers,
Rene
Screenshot-2015-03-14-17.45.21.png
Microsoft Frms 2.0 Object library was indeed included.
Please find attached the selected references.
Sleep well mate :)
Cheers,
Rene
Screenshot-2015-03-14-17.45.21.png
Earlier you said "companyName_id field name is not displayed."
It is displayed: It is (in the sample caode) simply called "Company0".
Your html code does not have a Label tag before CompanyName_id.
It just does not have a label (as indicated by your data definition).
I will put a few more items up with more realistic data items.
(Why is your browser showing in the screen capture? It should not appear when you run the macro).
It is displayed: It is (in the sample caode) simply called "Company0".
Your html code does not have a Label tag before CompanyName_id.
It just does not have a label (as indicated by your data definition).
I will put a few more items up with more realistic data items.
(Why is your browser showing in the screen capture? It should not appear when you run the macro).
ASKER
Oh I think I should feel stupid right now.
I'm currently in bed.
I'll review tomorrow Sunday and update you.
No need to add more data.
Cheers
I'm currently in bed.
I'll review tomorrow Sunday and update you.
No need to add more data.
Cheers
ASKER
Hi Chris,
Sorry for my very late follow-up.
I tried a few times and I am still am unable to generate the csv file, containing the required data.
I value all the time, effort, dedication and heart you invested in this project.
If you feel that we are almost there, please let me know. We'll continue from here.
Or else, I will not mind creating a new question and be more precise in the requirements.
Don't worry, I'll be honored to give you all the credits for your contributions and efforts.
What do you think?
Thanks for everything :)
Cheers,
Rene
Sorry for my very late follow-up.
I tried a few times and I am still am unable to generate the csv file, containing the required data.
I value all the time, effort, dedication and heart you invested in this project.
If you feel that we are almost there, please let me know. We'll continue from here.
Or else, I will not mind creating a new question and be more precise in the requirements.
Don't worry, I'll be honored to give you all the credits for your contributions and efforts.
What do you think?
Thanks for everything :)
Cheers,
Rene
Let's stick with it since it is generating at my end.
Where are you having errors (if any) or explain exactly what you are doing, with what output (if any).
I think we are almost there :-)
Cheers
Chris
Where are you having errors (if any) or explain exactly what you are doing, with what output (if any).
I think we are almost there :-)
Cheers
Chris
ASKER
Hi Chris,
Sorry for taking so long to reply.
I should come back to this question in a few days.
Cheerz
Sorry for taking so long to reply.
I should come back to this question in a few days.
Cheerz
ASKER
Hi Chris,
I will close this question and give you the points.
I'll create a new one when ready to conclude it and refer to this one.
I'll let you know through this one when.
Thanks and cheers,
Rene
I will close this question and give you the points.
I'll create a new one when ready to conclude it and refer to this one.
I'll let you know through this one when.
Thanks and cheers,
Rene
OK, thanks Rene
Cheers
Chris
Cheers
Chris
Cheers
Chris (Australia)