Solved

Manipulate Remote Dropdown

Posted on 2012-03-26
Last Modified: 2012-03-27
I am working on a project for school and I need to access and change the value of a dropdown menu on a remote page before I get the file contents of the page with PHP. I need to do this to a remote page on another domain.

The page I am trying to get this from is: http://cad.chp.ca.gov/ The dropdown is the one on the far left that, when clicked, shows a list of cities. Also see attached.

In the end, I will scrape the information from the table for each city region and insert it into a MySQL database to manipulate the data later.

Thanks in advance.
dropdownpic.JPG
Question by:paulpp
LVL 108

Accepted Solution

by:
Ray Paseur earned 500 total points
ID: 37771215
I want to be respectful of your in-school status here.  The rules of engagement at EE prohibit us from "doing" school assignments, so all I can do is tell you what I see and how I might go about it if it were my school assignment.

The <form> on this page makes an AJAX POST-method request to the action script, then dynamically reloads the body of the page.  Like so many government web sites, this is a technically incompetent design.  Years ago they were convinced to use Microsoft web structures to build the sites, and like all things in the government they will never get rid of the old, bad practices.  The correct design for data retrieval is to use a GET method request so you can simply put the query arguments into the URL.  Since this site does not update the government data model, a GET method request would have given you an easy way to make use of the information.  This is information that the taxpayers paid for, and the design makes it very difficult for taxpayers to get access to the information.  In addition, the site fails validation.
http://validator.w3.org/check?uri=cad.chp.ca.gov&charset=%28detect+automatically%29&doctype=Inline&group=0

But why complain?  It won't do any good.  It's the government.

So here is what you will have to do.  You will have to write a script that acts like a well-behaved web browser, reading the HTML stream, accepting and returning cookies, following redirect headers, parsing the HTML, setting the raw post strings and retrieving the contents of the browser output stream.  If you've never done it before, you should give yourself perhaps as much as a week to write this script.  I don't know your level of PHP expertise, but if it were my assignment I would want two or three days of uninterrupted concentration.  At least that is what it took me the last time I had to write one of these scrapers against a POST method, tokenized dynamic page.  It's a big project.
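To give a feel for the "well-behaved browser" part, here is a minimal sketch using PHP's cURL extension. It only covers cookies, redirects and the raw POST string; it is not a finished scraper. The function names `browserCurlOptions()` and `fetchPage()`, the user-agent string and the cookie file path are my own placeholders, not anything the site requires.

```php
<?php
// Sketch only. Function names, user-agent string and cookie file are illustrative assumptions.

// Build the option set for a cURL handle that behaves like a simple browser.
function browserCurlOptions(string $url, string $cookieFile, ?array $postFields = null): array
{
    $options = [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,        // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,        // follow redirect headers
        CURLOPT_MAXREDIRS      => 5,
        CURLOPT_COOKIEJAR      => $cookieFile, // cookies are saved here when the handle is freed
        CURLOPT_COOKIEFILE     => $cookieFile, // cookies are read from here on the next request
        CURLOPT_USERAGENT      => 'Mozilla/5.0 (compatible; school-project-scraper)',
        CURLOPT_TIMEOUT        => 30,
    ];

    if ($postFields !== null) {
        // Send a form-encoded POST body, the way a browser form submit would.
        $options[CURLOPT_POST]       = true;
        $options[CURLOPT_POSTFIELDS] = http_build_query($postFields);
    }

    return $options;
}

// Perform one request and return the response body.
function fetchPage(string $url, string $cookieFile, ?array $postFields = null): string
{
    $ch = curl_init();
    curl_setopt_array($ch, browserCurlOptions($url, $cookieFile, $postFields));

    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('Request failed: ' . curl_error($ch));
    }

    // The handle is freed when $ch goes out of scope; cURL writes the cookie jar then.
    return $body;
}
```

You would call `fetchPage()` once with no POST fields to read the page and pick up its cookies, then call it again with the form fields to submit the dropdown selection. Whether the server accepts the second request depends on what you send back, which is where most of the work is.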

There is something in the HTML that looks like this:
<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="/wEPDwUKLTQ2MDU1MDEzOQ9kFgICAw9kFiZm...


That is probably a base-64 encoded form token, a thing devised to make it hard for clients to get at the data in any way other than by using the web page.  It can be defeated, but to do so requires a bit of effort.  You have to extract the value in that tag and return it to the server.  The value may be different on every submission of the data.
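Extracting that value and handing it back could look roughly like the sketch below. It collects every hidden input on the page (pages like this often carry more than one) and adds the dropdown selection. The helper names `extractHiddenFields()` and `buildPostBody()` are hypothetical, and I am assuming the dropdown posts under the name `ddlComCenter`; you would need to confirm the real field names, and whether the server wants anything else, by looking at the form in the page source.

```php
<?php
// Sketch only. Helper names are illustrative; the field name ddlComCenter is an assumption.

// Collect the name => value pairs of all hidden <input> fields in the HTML.
function extractHiddenFields(string $html): array
{
    $doc = new DOMDocument();

    // Real-world HTML is rarely valid, so keep the parser warnings quiet.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $fields = [];
    $xpath  = new DOMXPath($doc);

    // Note: this matches type="hidden" in lowercase only.
    foreach ($xpath->query('//input[@type="hidden"][@name]') as $input) {
        $fields[$input->getAttribute('name')] = $input->getAttribute('value');
    }

    return $fields;
}

// Return the hidden fields plus the chosen dropdown value as a form-encoded string.
function buildPostBody(array $hiddenFields, string $centerCode): string
{
    $fields = $hiddenFields;
    $fields['ddlComCenter'] = $centerCode;

    return http_build_query($fields);
}
```

Because the token may change on every submission, read the page fresh and extract the hidden fields again before each POST rather than reusing an old value.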

The way it should work would be something like this (perhaps with some additional options to specify the level of detail in the response):
http://cad.chp.ca.gov/?ddlComCenter=GGCC

But unfortunately you can't just do that.  So here is my recommendation.  The page has a link that says, "Please forward Comments and Suggestions to CAD Web Master" and the email address in the link is CADPAGE@chp.ca.gov.  My recommendation is to click that link, write to the "Web Master" and ask for a RESTful API so the citizens can query the government database.  You may need to file a Freedom-of-Information request.

Suggest you leave this question open a little longer and maybe one of the other experts can offer some help.  Best of luck with the project, ~Ray
 
LVL 2

Author Comment

by:paulpp
ID: 37772936
Thanks Ray,

That was all very helpful, and it sounds like I am overachieving what I really need to do. I don't have that much time to do this project. :)

Thanks again.
 
LVL 108

Expert Comment

by:Ray Paseur
ID: 37773811
Thanks for the points.  If only they had an API or a URL you could populate with the request data.  :-(

Best of luck with it, ~Ray