Manipulate Remote Dropdown

I am working on a project for school. I need to change the selected value of a dropdown menu on a remote page (on another domain) and then fetch the resulting page contents with PHP.

The page I am trying to get this from is: http://cad.chp.ca.gov/ The dropdown is the one on the far left; clicking it shows a list of cities. See the attached screenshot.

In the end, I will scrape the information from the table for each city region and insert it into a MySQL database so I can manipulate the data later.

Thanks in advance.
dropdownpic.JPG
paulpp Asked:
Ray Paseur Commented:
I want to be respectful of your in-school status here.  The rules of engagement at EE prohibit us from "doing" school assignments, so all I can do is tell you what I see and how I might go about it if it were my school assignment.

The <form> on this page makes an AJAX POST-method request to the action script, then dynamically reloads the body of the page.  Like so many government web sites, this is a technically incompetent design.  Years ago they were convinced to use Microsoft web structures to build the sites, and like all things in the Government they will never get rid of the old, bad practices.  The correct design for data retrieval is to use a GET method request so you can simply put the query arguments into the URL.  Since this site does not update the government data model, a GET method request would have given you an easy way to make use of the information.  This is information that the taxpayers paid for, and the design makes it very difficult for taxpayers to get access to the information.  In addition, the site fails validation.
http://validator.w3.org/check?uri=cad.chp.ca.gov&charset=%28detect+automatically%29&doctype=Inline&group=0

But why complain?  It won't do any good.  It's the government.

So here is what you will have to do.  You will have to write a script that acts like a well-behaved web browser, reading the HTML stream, accepting and returning cookies, following redirect headers, parsing the HTML, setting the raw post strings and retrieving the contents of the browser output stream.  If you've never done it before, you should give yourself perhaps as much as a week to write this script.  I don't know your level of PHP expertise, but if it were my assignment I would want two or three days of uninterrupted concentration.  At least that is what it took me the last time I had to write one of these scrapers against a POST method, tokenized dynamic page.  It's a big project.
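To make the "well-behaved browser" part a little more concrete, here is a minimal sketch of the cURL settings I would start from. It assumes PHP's cURL extension is installed. The function names, the cookie file path, the user-agent string, and the timeout values are all made up for illustration; none of this has been tested against the CHP site.

```php
<?php
// Sketch only: build cURL options that make a script behave like a polite browser.
// browser_like_options() and fetch_page() are hypothetical helper names.

function browser_like_options(string $url, string $cookieFile, ?string $postBody = null): array
{
    $options = [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,        // return the page as a string
        CURLOPT_FOLLOWLOCATION => true,        // follow redirect headers
        CURLOPT_MAXREDIRS      => 5,           // but not forever
        CURLOPT_COOKIEJAR      => $cookieFile, // save cookies the server sends
        CURLOPT_COOKIEFILE     => $cookieFile, // send saved cookies back
        CURLOPT_USERAGENT      => 'school-project-scraper/0.1', // placeholder value
        CURLOPT_TIMEOUT        => 30,
    ];
    if ($postBody !== null) {
        // A raw, already URL-encoded string is sent as
        // application/x-www-form-urlencoded, like a normal form post.
        $options[CURLOPT_POST]       = true;
        $options[CURLOPT_POSTFIELDS] = $postBody;
    }
    return $options;
}

function fetch_page(string $url, string $cookieFile, ?string $postBody = null): string
{
    $ch = curl_init();
    curl_setopt_array($ch, browser_like_options($url, $cookieFile, $postBody));
    $html  = curl_exec($ch);
    $error = curl_error($ch);
    unset($ch); // release the handle so cURL can write the cookie jar
    if ($html === false) {
        throw new RuntimeException("Request failed: $error");
    }
    return $html;
}

// Building the options does not touch the network, so it is safe to inspect them.
$cookieFile = sys_get_temp_dir() . '/cad_cookies.txt';
$getOptions = browser_like_options('http://cad.chp.ca.gov/', $cookieFile);
echo isset($getOptions[CURLOPT_POST]) ? "POST request\n" : "GET request\n";
```

The idea is to call fetch_page() once with no post body to pick up cookies and the form, then call it again with a post body built from that first response.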

There is something in the HTML that looks like this:
<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="/wEPDwUKLTQ2MDU1MDEzOQ9kFgICAw9kFiZm...

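That is the ASP.NET view state, a base64-encoded hidden field the server uses to carry the state of the page's controls from one request to the next.  It is not there specifically to keep clients out, but it has that effect, because the server expects to get it back with every POST.  It can be dealt with, but doing so requires a bit of effort.  You have to extract the value in that tag and return it to the server along with your dropdown selection.  The value may be different on every submission of the data, so read it fresh from each response.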
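Here is a rough sketch of the "extract the value and return it" step, using PHP's DOMDocument. The sample HTML and its token value are invented for the demonstration, the function names are hypothetical, and I am assuming the dropdown's field name is ddlComCenter with GGCC as one of its values. Check the real page source before relying on any of that.

```php
<?php
// Sketch only: collect every hidden <input> (such as __VIEWSTATE) from a page
// and build a POST body that returns those values with a dropdown selection.

function extract_hidden_fields(string $html): array
{
    $doc = new DOMDocument();
    // The page fails validation, so keep the parser's warnings quiet.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $fields = [];
    foreach ($doc->getElementsByTagName('input') as $input) {
        $name = $input->getAttribute('name');
        if ($name !== '' && strtolower($input->getAttribute('type')) === 'hidden') {
            $fields[$name] = $input->getAttribute('value');
        }
    }
    return $fields;
}

function build_post_body(string $html, string $dropdownName, string $dropdownValue): string
{
    $fields = extract_hidden_fields($html);
    $fields[$dropdownName] = $dropdownValue; // the selection we want to make
    return http_build_query($fields, '', '&'); // URL-encodes names and values
}

// Invented sample markup; the real token is much longer and changes.
$sample = '<form method="post">'
    . '<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="/wEPabc+123=" />'
    . '<select name="ddlComCenter"><option value="GGCC">Golden Gate</option></select>'
    . '</form>';

$body = build_post_body($sample, 'ddlComCenter', 'GGCC');
echo $body, PHP_EOL;
```

Collecting all of the hidden inputs, rather than only __VIEWSTATE, is deliberate: if the form carries other hidden fields, they get sent back too without the script needing to know their names.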

The way it should work would be something like this (perhaps with some additional options to specify the level of detail in the response):
http://cad.chp.ca.gov/?ddlComCenter=GGCC

But unfortunately you can't just do that.  So here is my recommendation.  The page has a link that says, "Please forward Comments and Suggestions to CAD Web Master" and the email address in the link is CADPAGE@chp.ca.gov.  My recommendation is to click that link, write to the "Web Master" and ask for a RESTful API so the citizens can query the government database.  You may need to file a Freedom-of-Information request.

Suggest you leave this question open a little longer and maybe one of the other experts can offer some help.  Best of luck with the project, ~Ray
paulpp (Author) Commented:
Thanks Ray,

That was all very helpful, and it sounds like I am reaching well beyond what I really need to do. I don't have that much time to do this project. :)

Thanks again.
Ray Paseur Commented:
Thanks for the points.  If only they had an API or a URL you could populate with the request data.  :-(

Best of luck with it, ~Ray
Question has a verified solution.
