Need to download the contents of a URL into a string with MS Access

I need a new way to read an URL into a variable.

This would be on a Win10 computer running MS Access 2016
Does anyone have any code to share?
pcalabriaAsked:
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

Scott McDaniel (Microsoft Access MVP - EE MVE )Infotrakker SoftwareCommented:
Do you mean read the actual URL into a variable? Or retrieve something from that URL into a variable?
0
Jim Dettman (Microsoft MVP/ EE MVE)President / OwnerCommented:
You mean the web page that results?  One way is to use automation with IE and load the page through that:

Here's an example of that:

    Dim i As Long
    Dim URL As String
    Dim IE As Object
    Dim objElement As Object
    Dim objCollection As Object
 
    'Create InternetExplorer Object
    Set IE = CreateObject("InternetExplorer.Application")
 
    'Set IE.Visible = True to make IE visible, or False for IE to run in the background
    IE.Visible = True
 
    'Define URL
    URL = "https://www.automateexcel.com/excel/"
 
    'Navigate to URL
    IE.Navigate URL
 
    ' Statusbar let's user know website is loading
    Application.StatusBar = URL & " is loading. Please wait..."
 
    ' Wait while IE loading...
    'IE ReadyState = 4 signifies the webpage has loaded (the first loop is set to avoid inadvertently skipping over the second loop)
    Do While IE.ReadyState = 4: DoEvents: Loop   'Do While
    Do Until IE.ReadyState = 4: DoEvents: Loop   'Do Until
 
    'Webpage Loaded
    Application.StatusBar = URL & " Loaded"
    
    'Unload IE
    Set IE = Nothing
    Set objElement = Nothing
    Set objCollection = Nothing

Open in new window


 Once you have that, you can work with the document object in various ways depending on what your trying to do (ie. use GetElementxxxx calls)

Jim.
0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
pcalabriaAuthor Commented:
Yes.. the webpage that results.

@Jim... all I need to do is load the html code from the resulting webpage into a text variable that I can txtURLpage... how would I do that?

Also... do you think this code will work when compiled with Access 2K as mde files and run on XP machines which have Access 2K loaded?

I also have a few Wn10 machines on the network now... but we still compile the source with Access 2K then copy the mde file to the Win10 machine.

If necessary.. I can compile with access 2016, but I don't  hunk that code would run on the computer machines.
0
Ultimate Tool Kit for Technology Solution Provider

Broken down into practical pointers and step-by-step instructions, the IT Service Excellence Tool Kit delivers expert advice for technology solution providers. Get your free copy now.

Jim Dettman (Microsoft MVP/ EE MVE)President / OwnerCommented:
@Jim... all I need to do is load the html code from the resulting webpage into a text variable that I can txtURLpage... how would I do that?

 With the above code, when the web page was done loading, you would do:

  txtURLpage = IE.Document.DocumentElement.innerHTML

  to get the HTML for the page (think I got that right).

Also... do you think this code will work when compiled with Access 2K as mde files and run on XP machines which have Access 2K loaded?

  As far as Access, yes but this code relies on using IE (Internet Explorer).   I would use late binding with IE to avoid version problems.

Jim.
0
Jim Dettman (Microsoft MVP/ EE MVE)President / OwnerCommented:
What is it exactly that your doing?   If going after one site, it's possible that it has an API that you could work with directly and avoid IE.

Jim.
0
pcalabriaAuthor Commented:
@Jim Thanks... but how can I tell... or otherwise kill time if I have to wait... and then how do I know when the webpage is loaded?
0
Jim Dettman (Microsoft MVP/ EE MVE)President / OwnerCommented:
and then how do I know when the webpage is loaded?

 That's already in that code sample I posted:

    ' Wait while IE loading...
    'IE ReadyState = 4 signifies the webpage has loaded (the first loop is set to avoid inadvertently skipping over the second loop)
    Do While IE.ReadyState = 4: DoEvents: Loop   'Do While
    Do Until IE.ReadyState = 4: DoEvents: Loop   'Do Until

Open in new window


 Please read through the code I posted.

 and BTW that's not my code, but an example that I swiped off the net.   Was faster doing that then me digging out what I had done in the past.

Jim.
0
Gustav BrockCIOCommented:
The simplest and fastest method is to use a function like DownloadFile found here:

Show pictures directly from URLs in Access forms and reports
0
pcalabriaAuthor Commented:
@Gustav I'm happy to hear from you.  Although I suspect you may attached the wrong link in this message, you provided me with instructions to use DownloadFile many years ago. The code you provided still works well on our XP machines, however, XP does not allow us to install any version of Internet Explorer higher than 8.0, and the real problem is that IE8 no longer works with the website we are scraping.

When we execute our Access code (that includes the DownloadFile code) as an mde file on a Win10 machine, the DownloadFile routine does not work.  I don't know why.  Every other line of code in the 300MB source code module works on the Win10 machine... except DownloadFile.

So that's the real problem.  We need a version of DownloadFile, or equivalent program, that we can run on the Win10 machine, or a completely new way to duplicate the function of the program, such as what Jim has suggested.

I believe you may have attached the wrong link.  Would this me new code, or the same code you sent me ten years ago?


@Jim  We are visiting a website, and then downloading the resulting page to a file on the disk.  Then we read the file into a string variable (text) and then call a very complicated subroutine that extracts the info we need an stores it in appropriate tables.

Basically:

Call ProcessOurData(txtURLPageText)

ProcessOurData is about 100 pages of code and ten different suproutines.  All I need to do is sent in the HTML code that we download from the URL and everything is happy.
0
Gustav BrockCIOCommented:
Yes, the link is correct - the function is the very first code block. It was revised in 2017, so it may be brushed up a little compared to your version.

However, I see no reason why any version shouldn't work with an mde, though I have only tested it with a normal accdb on Windows 10.
Could it be, that you run 64-bit Access, not 32-bit? It is not tested with 64-bit Access.
0
Jim Dettman (Microsoft MVP/ EE MVE)President / OwnerCommented:
<< Then we read the file into a string variable (text) and then call a very complicated subroutine that extracts the info we need an stores it in appropriate tables.>>

 With the example I posted, all your doing is skipping the disk file part.   You can get the page into a variable.   However, the solution also depends on having IE installed.  

<<ProcessOurData is about 100 pages of code and ten different suproutines.  All I need to do is sent in the HTML code that we download from the URL and everything is happy.>>

 That would still be doable.   I think though if you can avoid using IE, I would and since you are already using gustav's approach, I would look at getting that to work rather than going to a new approach.

Jim.
0
pcalabriaAuthor Commented:
I plan to write new code today to see if I can get it to work on the Win10 Machine. Yes 32-bit A2016.

The real problem here are the XP machines as the latest IE I can install is IE8 and IE8 no longer works with many of our websites.
What I really need is an solution that is browser independent... perhaps something that works with Chrome?
Gustav's approach requires adding a reference to Microsoft Internet Controls... I'm not sure buy I suspect this is the core of IE8 on these machines.


So both solutions would only work on our two Win10 mahcines... Whereas we have an office full of XP machines.
Any other thoughts before I dive into this?  If I get it working on Win10 we will need to only run this code from specific mahines...or upgrade machines.
Thanks
0
Gustav BrockCIOCommented:
My function doesn't require anything special.
You can fetch and run the flag demo right away.
0
pcalabriaAuthor Commented:
@Gustav Good morning.  I just finished crawling through your code and I understand the new revision, which should not effect me.

As I recall, the problem we are having on the new machine is that this is not a public website.  We pay a membership fee and must log in.  With this said, once we log in using IE8 the first time, we do not need to log in again.  Apparently it uses a cookie.

On our xo machines we log in using IE8, and then your code works perfectly.

On the Win10 machine, if we log in with the version if ie installed by win10, then try to use the code, the code does not recognize that we are already logged in.  That’s the problem that has prevented us from using the code on win10 machines.

Perhaps the code is using a different IE engine, and all we need is to log in with a different browser?
0
Gustav BrockCIOCommented:
Oh, that explains.

Windows 10 is much more strict regarding security than XP, so this may be the single reason.

Another could be, that the browser will run in its 64-bit version, while Access is running 32-bit.
If you have Access 64-bit running on the Windows 10 machines, you could check out the revised code (the attached PictureUrl.zip file) at the link above to the article. I have updated it a few minutes ago.
0
pcalabriaAuthor Commented:
Gustav, keep in mind we are running an MDE file that was compiled using Access2K on an XP machine, and then copied to the Win10 machine.

I"m wondering what's going on with the references.  I can open the MDB file with Access 2016 and take a look... the XP version points Microsoft Internet Controls to ieframe
0
pcalabriaAuthor Commented:
@Gustav  Are you saying that the code you provided requires using the 64-bit version of Access with win10 because the browser used with Win10 is 64 bit?  It seems like if that was the case you would have already run into it, am I missing something?
0
Gustav BrockCIOCommented:
No. It will work in both 32- and 64-bit Access.
Windows 64-bit will run both, while Windows 32-bit only will run Access 32-bit.
0
pcalabriaAuthor Commented:
I'm running 64-bit Win 10 Pro with 32-bit access and the browsers installed by windows.. ie and edge... your routine should work... correct?
0
Gustav BrockCIOCommented:
Yes, but it will only download "what is" ... it will not force an update or something else of the web page.
0
pcalabriaAuthor Commented:
@Gustav Okay.. I'll try it today on my Win10 machine.  Thanks
0
pcalabriaAuthor Commented:
@Gustav  I have your code working win the Win10 machine... but I still have a problem.

One of our sites requires that we first log in, however, we only have to log in once.  Even if we close the browser, we do not have to log in a second time.

If I go to this site with IE8 on my XP machine and log in, I can then use Access and the code provided (by Gustave) to automatically download as if the site did not require password access.  Everything works fine in XP.  If I view the Access references, the Microsoft Internet Controls points to ieframe.dll.

On the Win10 machine, when I log into the site, IE 11 is used, because this is the latest version of IE on my machine.  However, when I use the code provided (Gustav's code) the system does not let me get past the log in screen.  If I view the references, the Microsoft Internet Controls points to the ieframe.dll in the sysWOW64 folder.

I suspect the reference used is a different version of IE... perhaps IE 10???  I'm pretty sure the code needs to be using the same IE as we use in the browser.
I'm using the 64 bit version of Windows but the 32 bit version of Access 2016.  What would you try?  Its very important that we be able to access this website programmatically.
0
Gustav BrockCIOCommented:
I don't think you have control over the bit-version of the browser in Windows 10 64-bit. It will use, whatever it finds best.
0
pcalabriaAuthor Commented:
I suspect my problem is that I invoke a different Internet Explorer when I start IE manually from Win10 Pro and when I start IE from Access 2016 running in 32-bit mode.

Access 2016 is set with the reference  to Microsoft Internet Control set to ieFrame.dll in the sysWow64 folder.

Can anyone help?  All I want to do at this point is to be able to start the same browser manually as I start programmatically when I use Access 2016 on Win10.
0
Jim Dettman (Microsoft MVP/ EE MVE)President / OwnerCommented:
<<On the Win10 machine, if we log in with the version if ie installed by win10, then try to use the code, the code does not recognize that we are already logged in. >>

<<I suspect the reference used is a different version of IE... perhaps IE 10???  I'm pretty sure the code needs to be using the same IE as we use in the browser.>>

  I would think it would have more to do with settings in IE. i.e. not being able to run Active X controls, not a trusted site, not accepting cookies, etc.  Some setting related to security  more than anything.

 Reading back through, I'm a little lost on where things are at.    Gustav's solution is using an API call and not relying on IE; is that not a good approach?

 I'm not sure why were talking about versions of IE.

Jim.
0
pcalabriaAuthor Commented:
@Jim  Gustavs solution works great on XP as well as Win10 with the First website we need to access... which is one that does not require sign in... so it gets me half way there.  The second website is the problem.  The second website requires that we first log in.   If we log into the website with Chrome of Safari using XP, and then use Gustav's routine, we can not get past the Please Sign In page.  However, on XP if when we log into the Second website with IE 8, everything works perfectly, which leads me to believe Gustav's routine is also in some way using IE 8???... I noticed a reference for Microsoft Internet Controls is set to ieframe.dll so I assumed Gustav's code was using this reference???

Using Win10 Gustaves  routine captures the please log in page regardless of whether we are logged in with IE 11 or not.  My guess is that a different IE 11 is launched manual compared to whatever Gustav's routine is doing...

Since the Microsoft Internet Controls reference is set for ieFrame in the sysWow64 folder, my theory is that Gustav's rouine is using the 32 bit version of IE, and clicking the link launches the 64 bit version, and that's why the website does nor recognize the sign in?

Do you think I'm barking up the wrong tree?

,
0
Jim Dettman (Microsoft MVP/ EE MVE)President / OwnerCommented:
<<Do you think I'm barking up the wrong tree?>>

 Yes.   I think it has more to do with security settings in place for each of the browsers.

<<Since the Microsoft Internet Controls reference is set for ieFrame in the sysWow64 folder, my theory is that Gustav's rouine is using the 32 bit version of IE, and clicking the link launches the 64 bit version, and that's why the website does nor recognize the sign in?>>

  You don't need this reference.   Gustav's call is a direct API call.

Jim.
0
pcalabriaAuthor Commented:
@Jim   <<Yes.   I think it has more to do with security settings in place for each of the browsers.>>

Thanks for getting back to me so quickly, I'm actively working on this now.  I'm not sure which security settings I can change???
If I open Internet Explorer manually, and then log into the website, it DOES remember me the next time I open the browser.
I can close the browser, even reboot the machine, and when I return manually, it remembers my creditilals...  Apparently the security settins for the browser are fine... when started manually...

As far as Gustav's code... is there any way to change security settings?  I'm not aware of any way to make any changes.
0
Jim Dettman (Microsoft MVP/ EE MVE)President / OwnerCommented:
<<Thanks for getting back to me so quickly, I'm actively working on this now.  I'm not sure which security settings I can change???>>

  I'm not sure either.  I would suspect it would deal with the saving of cookies, or in running scripts.   For a clue on that, I would check the web site your logging into and see if they list any requirements.

  The other issue that your bumping into is executing things outside of a browser.   Many of the settings you find in the browser for "Internet settings" affects how Windows behaves with certain other functions.   But in this case, it sounds like URLDownloadToFile () doesn't look at those settings at all.   So if this web site relies on cookies for the login, that may or may not work with URLDownloadToFile ()   (and it seems like not).

   I would suspect that to get this working the way you want, you're going to need to resort to using the Web Browser control on a form.   I would do this rather than trying to control IE through automation, as you won't know what's installed.

   One problem with the Web Browser control though is that default mode is for IE8 emulation.  You'll need to add some registry entries to get IE11 emulation, and that setting will apply to all applications, not just your individual DB.   there are docs/blogs all over the place covering this.  i.e.:

https://weblog.west-wind.com/posts/2011/May/21/Web-Browser-Control-Specifying-the-IE-Version

Jim.
1
pcalabriaAuthor Commented:
Thanks for the help.  I never did get Gustav’s code working with the site that required a password.  The best guess is that the api call was not working correctly with cookies.  The code provided by Jim however did work with the website that requires sign in.  I first meet to sign in with IE but the code then allows me to grab the correct pages.  I split the points because Gustav code will be used in the routine that does not require web sign in.
0
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Microsoft Access

From novice to tech pro — start learning today.

Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.