VeeVan

asked on

Need to programmatically capture information from a website.

I am trying to gather information from a website that I don't own.

I log in manually and then, using EFGrabber and WinBatch, run a macro I created that scrolls through the records from my search and extracts the information I am looking for into an Excel file (so I don't have to type each record one by one).

The challenge is this: because it's a click-location-based macro, if the position of the Next button is even slightly off, the macro hangs.

I am wondering if there is a better way to do this. Can I use XML, or write a script, or something like that?

Any direction would be greatly appreciated. I'm kind of at a loss on how to proceed.

FYI: I am proficient in ASP and .NET, VB, and VBScript, but could use another language :sigh: if absolutely necessary.

Thanks.
V
rdivilbiss

Do you have permission?
SOLUTION
fruhj

This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
VeeVan

ASKER

Can I use XML to jump from one page to the next?

Here's the scenario:
1.  I manually do a search for the information that I want.
2.  It then comes up in a list format that has partial information.
3.  I click on one of the detail records and pull out a couple of fields into Excel using EFGrabber.
4.  Then, I click next to go to the next detail page.
5.  I repeat steps 3 and 4 until I have all the info that I want.

BTW: The data I'm searching is in the public domain (the County Property Appraiser's website). I'm just trying to avoid the manual process of copying, pasting, or typing, which takes a really long time.

So what I need is a method to both grab the information (for which it seems XML should work nicely) and move to the next page programmatically, and I don't know if that's possible.
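
A rough Python sketch of the loop in steps 3 to 5, assuming the Next control is an ordinary link whose visible text is "Next" and that the logged-in session (cookies) is handled by whatever `fetch` function is passed in. `NextLinkFinder`, `find_next_url`, and `walk_records` are hypothetical names used for illustration only; a site that pages through a JavaScript form post would need a different approach.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class NextLinkFinder(HTMLParser):
    """Remembers the href of the first <a> whose text is 'Next'."""

    def __init__(self):
        super().__init__()
        self.next_href = None
        self._href = None  # href of the <a> currently open, if any
        self._text = ""

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = ""

    def handle_data(self, data):
        if self._href is not None:
            self._text += data

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            if self.next_href is None and self._text.strip().lower() == "next":
                self.next_href = self._href
            self._href = None


def find_next_url(page_url, html):
    """Return the absolute URL of the 'Next' link, or None if there is none."""
    finder = NextLinkFinder()
    finder.feed(html)
    finder.close()
    return urljoin(page_url, finder.next_href) if finder.next_href else None


def walk_records(start_url, fetch, max_pages=500):
    """Yield (url, html) for each detail page, following 'Next' links.

    `fetch` is any callable that takes a URL and returns the page source.
    The loop stops at the last page, on a repeated URL, or at max_pages.
    """
    url, seen = start_url, set()
    while url and url not in seen and len(seen) < max_pages:
        seen.add(url)
        html = fetch(url)
        yield url, html
        url = find_next_url(url, html)
```

Because the loop follows the link's address rather than its position on screen, it does not depend on where the button is drawn. Each yielded page can then be handed to a field extractor and written out as a CSV row that Excel can open.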

Thanks again for all your assistance.

Vee
ASKER CERTIFIED SOLUTION
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
VeeVan

ASKER

That's exactly what I was looking for. Thanks for your help.

I have dabbled in XML in the past, and had a sneaking suspicion that it would do what I wanted, but wasn't sure.

One last simple question: do you know whether I can use XML in .NET? I think I can. (I think I can, I think I can....)

I appreciate your help!

V
Okay, this is an object that happens to be able to pull a document via HTTP like a browser. That could include an XML document, but in this case you are only retrieving the HTML source of the page, not XML.

Just to clarify.
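
To illustrate the distinction, here is a minimal Python sketch using only the standard library: a plain HTTP GET returns the page as HTML text, which is then read with an HTML parser rather than an XML one. The two-column table layout, the sample values, and the helper names (`CellCollector`, `extract_fields`, `fetch_html`) are assumptions for illustration, not the real site's markup.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class CellCollector(HTMLParser):
    """Collects the text of every <td> cell, in document order."""

    def __init__(self):
        super().__init__()
        self.cells = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._in_cell = True
            self.cells.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.cells[-1] += data


def extract_fields(html):
    """Pair cells up as label/value, assuming a two-column detail table."""
    parser = CellCollector()
    parser.feed(html)
    parser.close()
    cells = [c.strip() for c in parser.cells]
    return dict(zip(cells[0::2], cells[1::2]))


def fetch_html(url):
    """Plain HTTP GET; returns the page source as text (HTML, not XML)."""
    with urlopen(url) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


# Hypothetical detail page, inlined so the sketch runs without a network.
SAMPLE = """
<html><body><table>
<tr><td>Parcel ID</td><td>12-34-56</td></tr>
<tr><td>Owner</td><td>Jane Doe</td></tr>
</table></body></html>
"""
```

Calling `extract_fields(SAMPLE)` gives a dictionary keyed by the label cells; against a live page the same function would be applied to the text returned by `fetch_html`. Real-world HTML is often not well-formed XML, which is why a tolerant HTML parser is used here.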

Yes, you can use it in .NET, PHP, classic ASP, JScript, JavaScript, etc. It has become quite ubiquitous.

Have fun,
Rod
VeeVan

ASKER

Thanks!
V