Solved

Manipulate Remote Dropdown

Posted on 2012-03-26
Last Modified: 2012-03-27
I am working on a project for school and I need to access and change the value of a dropdown menu on a remote page before I get the file contents of the page with PHP. I need to do this to a remote page on another domain.

The page I am trying to get this from is: http://cad.chp.ca.gov/ The dropdown is the one at the far left that shows a list of cities when clicked. See also the attached screenshot.

In the end, I will scrape the information from the table for each city region and insert it into a MySQL database so I can manipulate the data later.

Thanks in advance.
dropdownpic.JPG
Question by:paulpp
LVL 110

Accepted Solution

by:Ray Paseur (earned 500 total points)
ID: 37771215
I want to be respectful of your in-school status here.  The rules of engagement at EE prohibit us from "doing" school assignments, so all I can do is tell you what I see and how I might go about it if it were my school assignment.

The <form> on this page makes an AJAX POST-method request to the action script, then dynamically reloads the body of the page.  Like so many government web sites, this is a technically incompetent design.  Years ago they were convinced to use Microsoft web structures to build the sites, and like all things in the Government they will never get rid of the old, bad practices.  The correct design for data retrieval is to use a GET method request so you can simply put the query arguments into the URL.  Since this site does not update the government data model, a GET method request would have given you an easy way to make use of the information.  This is information that the taxpayers paid for, and the design makes it very difficult for taxpayers to get access to the information.  In addition, the site fails validation.
http://validator.w3.org/check?uri=cad.chp.ca.gov&charset=%28detect+automatically%29&doctype=Inline&group=0

But why complain?  It won't do any good.  It's the government.

So here is what you will have to do.  You will have to write a script that acts like a well-behaved web browser, reading the HTML stream, accepting and returning cookies, following redirect headers, parsing the HTML, setting the raw post strings and retrieving the contents of the browser output stream.  If you've never done it before, you should give yourself perhaps as much as a week to write this script.  I don't know your level of PHP expertise, but if it were my assignment I would want two or three days of uninterrupted concentration.  At least that is what it took me the last time I had to write one of these scrapers against a POST method, tokenized dynamic page.  It's a big project.
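A minimal sketch of the "well-behaved browser" part (cookies, redirects, raw POST strings), assuming PHP's cURL extension is available.  The function names, the user-agent string, and the timeout and redirect limits are my own illustrative choices, not anything the site requires.

```php
<?php
// Sketch only: helper names and option values are illustrative assumptions.

/**
 * Build cURL options for a browser-like request.
 * A null $postBody means a GET request; otherwise the string is sent as the raw POST body.
 */
function buildBrowserOptions(string $cookieFile, ?string $postBody = null): array
{
    $options = [
        CURLOPT_RETURNTRANSFER => true,        // return the response body as a string
        CURLOPT_FOLLOWLOCATION => true,        // follow redirect headers
        CURLOPT_MAXREDIRS      => 5,           // but not forever
        CURLOPT_COOKIEFILE     => $cookieFile, // send back cookies stored earlier
        CURLOPT_COOKIEJAR      => $cookieFile, // store cookies when the handle is released
        CURLOPT_USERAGENT      => 'Mozilla/5.0 (compatible; SchoolProjectScraper/0.1)',
        CURLOPT_TIMEOUT        => 30,
    ];
    if ($postBody !== null) {
        $options[CURLOPT_POST]       = true;
        $options[CURLOPT_POSTFIELDS] = $postBody; // already URL-encoded string
    }
    return $options;
}

/** Fetch a page, keeping cookies in $cookieFile between calls. */
function fetchPage(string $url, string $cookieFile, ?string $postBody = null): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, buildBrowserOptions($cookieFile, $postBody));
    $html = curl_exec($ch);
    if ($html === false) {
        throw new RuntimeException('Request failed: ' . curl_error($ch));
    }
    // The handle is released when $ch goes out of scope, which writes the cookie jar.
    return $html;
}
```

A first call such as fetchPage('http://cad.chp.ca.gov/', sys_get_temp_dir() . '/chp_cookies.txt') would retrieve the form page; a second call with a POST body would submit it using the same cookie file.  This covers only the transport; parsing the HTML is a separate step.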

There is something in the HTML that looks like this:
<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="/wEPDwUKLTQ2MDU1MDEzOQ9kFgICAw9kFiZm...


That is a base64-encoded form token (ASP.NET's view state), and its effect is to make it hard for clients to get at the data in any way other than by using the web page.  It can be handled, but doing so requires a bit of effort.  You have to extract the value in that tag and return it to the server with your POST request.  The value may be different on every submission of the data.
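Here is a rough sketch of the extract-and-return step, assuming PHP's DOM extension.  The helper names are hypothetical.  It collects every hidden input rather than only __VIEWSTATE, on the assumption that the page may carry other hidden ASP.NET fields (for example __EVENTVALIDATION) that the server also expects back.  The field name ddlComCenter and the value GGCC are taken from the example URL in this thread; I have not verified what the form actually posts.

```php
<?php
// Sketch only: helper names are hypothetical; ddlComCenter / GGCC come from this thread.

/** Collect name => value for every <input type="hidden"> in the page. */
function extractHiddenFields(string $html): array
{
    $doc = new DOMDocument();
    // The page fails validation, so collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $fields = [];
    foreach ($doc->getElementsByTagName('input') as $input) {
        $name = $input->getAttribute('name');
        if ($name !== '' && strtolower($input->getAttribute('type')) === 'hidden') {
            $fields[$name] = $input->getAttribute('value');
        }
    }
    return $fields;
}

/** Merge the hidden fields with the dropdown selection into a URL-encoded POST body. */
function buildPostBody(string $html, array $selection): string
{
    $data = array_merge(extractHiddenFields($html), $selection);
    return http_build_query($data, '', '&');
}
```

Calling buildPostBody($html, ['ddlComCenter' => 'GGCC']) on freshly fetched HTML gives a string you can send as the raw POST body.  Because the token may change on every submission, fetch the page and rebuild the body before each request rather than reusing an old one.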

The way it should work would be something like this (perhaps with some additional options to specify the level of detail in the response):
http://cad.chp.ca.gov/?ddlComCenter=GGCC

But unfortunately you can't just do that.  So here is my recommendation.  The page has a link that says, "Please forward Comments and Suggestions to CAD Web Master" and the email address in the link is CADPAGE@chp.ca.gov.  Click that link, write to the "Web Master" and ask for a RESTful API so the citizens can query the government database.  You may need to file a Freedom-of-Information request.

Suggest you leave this question open a little longer and maybe one of the other experts can offer some help.  Best of luck with the project, ~Ray
 
LVL 2

Author Comment

by:paulpp
ID: 37772936
Thanks Ray,

That was all very helpful, and it sounds like this goes well beyond what I really need to do. I don't have that much time to do this project. :)

Thanks again.
 
LVL 110

Expert Comment

by:Ray Paseur
ID: 37773811
Thanks for the points.  If only they had an API or a URL you could populate with the request data.  :-(

Best of luck with it, ~Ray