rnicholus

POST data to a .NET website (simple screen-scraper)

I am currently working on a simple screen scraper to obtain some data from a government website that runs on the .NET platform.

----------------------------
RadAJAXControlID      ctl00_RadAjaxManager1
__EVENTARGUMENT            GETNEWDATA
__EVENTTARGET            ctl00:RadAjaxManager1
__VIEWSTATE            .........
httprequest            true
----------------------------

Then I embedded the values above in the URL as query parameters, but it doesn't work:

http://www.kcscout.net/Default.aspx?__EVENTARGUMENT=GETNEWDATA&RadAJAXControlID=ctl00_RadAjaxManager1&__EVENTTARGET=ctl00:RadAjaxManager1&__VIEWSTATE=......

I'm not sure what I'm doing wrong here.
Thanks in advance for all the advice.
Java

CEHJ

You need to encode the POST parameters with URLEncoder.
rnicholus

ASKER

Why is that?

All the parameters?
rnicholus

ASKER

I can see the data that I want using Firebug, but not from my Java program.
I'm looking for the part below.


..............
..............
_RadAjaxResponseScript_try{_ajaxManager =ctl00_RadAjaxManager1;_upDating=false;var bShSpd=false;TgSpd(0);TgDetSt();bShSpd=true;
var _mdaSpd=new MultiDimensionalArray(212,5);_aPolyLines=[];
_mdaSpd[0][0]='I70 E @ W OF WOODS CHAPEL';_mdaSpd[0][1]='39.034287,-94.30769 39.033824,-94.303797';_mdaSpd[0][2]='39.034287,-94.30769';_mdaSpd[0][3]=1;_mdaSpd[0][4]=62;
_mdaSpd[1][0]='I70 E @ NW 50TH ST';_mdaSpd[1][1]='39.035237,-94.315453 39.034287,-94.30769';_mdaSpd[1][2]='39.035237,-94.315453';_mdaSpd[1][3]=1;_mdaSpd[1][4]=55;
_mdaSpd[2][0]='I70 E @ NW SCRIMSHAW RD';
..............
..............

CEHJ

>>Why is that?

Because it won't post properly if the params need encoding

>>All the parameters?

If in doubt, encode

See http://exampledepot.com/egs/java.net/Post.html
rnicholus

ASKER

I had already tried encoding before my second post. I still don't get the data in the response from the server. I'm not sure what's wrong.
CEHJ

Can you show your code?
rnicholus

ASKER

Here you go
			String viewstate = "";
			URLConnection conn = (new URL("http://www.kcscout.net/Default.aspx")).openConnection();
			BufferedReader reader = null;
			String line = "";
			
			reader = new BufferedReader(new InputStreamReader(conn.getInputStream()));
			while ( (line = reader.readLine()) != null )
			{
				if ( line.indexOf("VIEWSTATE") >= 0 )
				{
					viewstate = line;	
					break;
				}
			} // end while
			viewstate = viewstate.replaceFirst(".*?\\s+value=\"", "");
			viewstate = viewstate.replaceFirst("\" />", "");
			System.out.println("---> VIEWSTATE: " + viewstate);
			viewstate =  URLEncoder.encode(viewstate,"UTF-8");
			
			reader.close();
			reader = null;
			conn = null;
			
			String radAjaxControlId = URLEncoder.encode("ctl00_RadAjaxManager1", "UTF-8");
			String eventArgument = URLEncoder.encode("GETNEWDATA", "UTF-8"); 
			String eventTarget = URLEncoder.encode("ctl00:RadAjaxManager1", "UTF-8");
			
			//////////////////////////////////////////////////
			// Start reading with VIEWSTATE value.
			//////////////////////////////////////////////////
			conn = (new URL("http://www.kcscout.net/Default.aspx?" + 
				"RadAJAXControlID=" + radAjaxControlId + "&__EVENTARGUMENT=" + eventArgument + 
				"&__EVENTTARGET=" + eventTarget +
				"__VIEWSTATE=" + viewstate)).openConnection();
			reader = new BufferedReader(new InputStreamReader(conn.getInputStream()));
			while ( (line = reader.readLine()) != null )
			{
				System.out.println("line: " + line);
			} // end while
			
			reader.close();
			reader = null;
			conn = null;
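
A side note on the snippet above: the query string has no "&" between the __EVENTTARGET value and "__VIEWSTATE=", so the two parameters run together. Below is a minimal sketch of building the parameter string from a map, so that every name and value is encoded exactly once and the separators cannot be forgotten. The class name FormBody, the helper names, and the __VIEWSTATE placeholder are illustrative and not from the thread.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

class FormBody {

    /** URL-encodes one name or value as UTF-8. */
    static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to exist on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    /** Joins name=value pairs with '&', encoding both sides of each pair. */
    static String build(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> f : fields.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(enc(f.getKey())).append('=').append(enc(f.getValue()));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the fields in insertion order.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("RadAJAXControlID", "ctl00_RadAjaxManager1");
        fields.put("__EVENTARGUMENT", "GETNEWDATA");
        fields.put("__EVENTTARGET", "ctl00:RadAjaxManager1");
        fields.put("__VIEWSTATE", "/wEP+placeholder=="); // placeholder, not a real value
        System.out.println(build(fields));
    }
}
```

The same string can be used either as a query string or as a POST body; which one this site accepts is what the rest of the thread tries to find out.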

CEHJ

You need to call setDoOutput(true). Emulate the example at the link I posted.
rnicholus

ASKER

I tried the code below and got an HTTP 500 response.
String data = URLEncoder.encode("RadAJAXControlID", "UTF-8") + "=" + URLEncoder.encode("ctl00_RadAjaxManager1", "UTF-8");
data += "&" + URLEncoder.encode("__EVENTARGUMENT", "UTF-8") + "=" + URLEncoder.encode("GETNEWDATA", "UTF-8");
data += "&" + URLEncoder.encode("__EVENTTARGET", "UTF-8") + "=" + URLEncoder.encode("ctl00:RadAjaxManager1", "UTF-8");
data += "&" + URLEncoder.encode("__VIEWSTATE", "UTF-8") + "=" + URLEncoder.encode(viewstate, "UTF-8");
			
URL url = new URL("http://www.kcscout.net/Default.aspx");
conn = url.openConnection();
conn.setDoOutput(true);
OutputStreamWriter wr = new OutputStreamWriter(conn.getOutputStream());
wr.write(data);
wr.flush();
reader = new BufferedReader(new InputStreamReader(conn.getInputStream()));
while ( (line = reader.readLine()) != null )
{
System.out.println("line: " + line);
} // end while
			
reader.close();
reader = null;
conn = null;
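
One thing worth checking in the snippet above, offered as a guess rather than a diagnosis: if the viewstate variable still holds the result of the earlier extraction code, it has already been passed through URLEncoder.encode once, and encoding it again turns every "%" into "%25". The server would then decode it to something that is no longer the original Base64 text. The sketch below only demonstrates the effect on a made-up value; the class and method names are illustrative.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class DoubleEncodingDemo {

    /** URL-encodes a string as UTF-8. */
    static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String raw = "abc+def/ghi=="; // made-up stand-in for a Base64 viewstate
        String once = enc(raw);       // what the server expects to receive
        String twice = enc(once);     // what is sent if the value is encoded again
        System.out.println("once:  " + once);
        System.out.println("twice: " + twice);
    }
}
```

Keeping the raw value in one variable and encoding it only at the point where the request body is assembled avoids the problem.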

Mick Barry

URLConnection will use GET by default; you need to explicitly tell it to use POST.

There's a good discussion here:

http://java.sun.com/docs/books/tutorial/networking/urls/readingWriting.html

CEHJ

It will use POST when you setDoOutput(true)
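
The two comments above can be checked without touching the network, because openConnection() only creates the connection object. The sketch below shows that the method starts out as GET and that setRequestMethod("POST") changes it. As far as I know, the JDK's built-in HTTP handler also switches a GET to POST when getOutputStream() is called on a connection with doOutput set, so both comments are partly right; setting the method explicitly removes the doubt. The class and method names are illustrative.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

class RequestMethodDemo {

    /** Creates the connection object; nothing is sent until the streams are used. */
    static HttpURLConnection open(String url) {
        try {
            return (HttpURLConnection) URI.create(url).toURL().openConnection();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns the request method of a freshly opened connection. */
    static String defaultMethod(String url) {
        return open(url).getRequestMethod();
    }

    /** Returns the request method after asking for POST explicitly. */
    static String explicitPost(String url) {
        HttpURLConnection conn = open(url);
        try {
            conn.setRequestMethod("POST");
        } catch (IOException e) { // ProtocolException is an IOException
            throw new IllegalStateException(e);
        }
        conn.setDoOutput(true); // allows writing a request body later
        return conn.getRequestMethod();
    }

    public static void main(String[] args) {
        String url = "http://www.kcscout.net/Default.aspx";
        System.out.println("default:  " + defaultMethod(url));
        System.out.println("explicit: " + explicitPost(url));
    }
}
```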
rnicholus

ASKER

Still doesn't work. I'm confused. =(
String data = URLEncoder.encode("RadAJAXControlID", "UTF-8") + "=" + URLEncoder.encode("ctl00_RadAjaxManager1", "UTF-8");
data += "&" + URLEncoder.encode("__EVENTARGUMENT", "UTF-8") + "=" + URLEncoder.encode("GETNEWDATA", "UTF-8");
data += "&" + URLEncoder.encode("__EVENTTARGET", "UTF-8") + "=" + URLEncoder.encode("ctl00:RadAjaxManager1", "UTF-8");
data += "&" + URLEncoder.encode("__VIEWSTATE", "UTF-8") + "=" + URLEncoder.encode(viewstate, "UTF-8");

URL url = new URL("http://www.kcscout.net/Default.aspx");
HttpURLConnection httpConn = (HttpURLConnection) url.openConnection();
httpConn.setDoOutput(true);
httpConn.setRequestMethod("POST");
OutputStreamWriter wr = new OutputStreamWriter(httpConn.getOutputStream());
wr.write(data);
wr.flush();
reader = new BufferedReader(new InputStreamReader(httpConn.getInputStream()));
while ( (line = reader.readLine()) != null )
{
	System.out.println("line: " + line);
} // end while

reader.close();
reader = null;
httpConn = null;

Mick Barry

Does the site 'expect' POST?
It may be looking for a cookie, or it may check the user agent.
Have a look at the request sent by a browser (via a proxy or Firefox plugin) to see what's different from your request.
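
If a cookie does turn out to matter, one way to carry it over with plain java.net is to read the Set-Cookie headers from the first response and send their name=value parts back in a Cookie header. This is a minimal sketch under that assumption; the class and method names are illustrative, and the thread does not show whether this site's cookies come from the server at all.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CookieHeader {

    /** Builds a Cookie request header value from Set-Cookie response header values. */
    static String fromSetCookie(List<String> setCookieValues) {
        List<String> pairs = new ArrayList<>();
        for (String value : setCookieValues) {
            // Keep only the leading "name=value"; drop attributes such as path or expires.
            String pair = value.split(";", 2)[0].trim();
            if (!pair.isEmpty()) {
                pairs.add(pair);
            }
        }
        return String.join("; ", pairs);
    }

    public static void main(String[] args) {
        // Made-up header values for illustration only.
        List<String> received = Arrays.asList(
                "sessionStateMapView=City; path=/",
                "sessionStateZoomLevel=10; path=/");
        System.out.println(fromSetCookie(received));
    }
}
```

With HttpURLConnection, the input list would come from conn.getHeaderFields().get("Set-Cookie") on the first response (it can be null when no cookie is set), and the result would go into setRequestProperty("Cookie", ...) on the second request.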

CEHJ

That site is quite heavily reliant on JavaScript too, and you possibly need to take that into account
rnicholus

ASKER

>>>>>>>>>>>>>>>>
Does the site 'expect' POST?
It may be looking for a cookie, or it may check the user agent.
Have a look at the request sent by a browser (via a proxy or Firefox plugin) to see what's different from your request.
>>>>>>>>>>>>>>>>>
I use Firebug, and it gives me the request and response headers. Nothing special there.

Under POST tab on Firebug:
--------------------
City
RadAJAXControlID      ctl00_RadAjaxManager1
__EVENTARGUMENT      GETUPDATE
__EVENTTARGET      ctl00:RadAjaxManager1
__VIEWSTATE .......
httprequest      true
--------------------




Response Headers
----------------
Date	Fri, 18 Jul 2008 14:31:42 GMT
Server	Microsoft-IIS/6.0
X-Powered-By	ASP.NET
X-AspNet-Version	2.0.50727
Cache-Control	no-cache
Pragma	no-cache
Expires	-1
Content-Type	text/javascript; charset=utf-8
Content-Length	70445
 
Request Headers
----------------
Host	www.kcscout.net
User-Agent	Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9) Gecko/2008052906 Firefox/3.0
Accept	text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language	en-us,en;q=0.5
Accept-Encoding	gzip,deflate
Accept-Charset	ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive	300
Connection	keep-alive
Content-Type	application/x-www-form-urlencoded; charset=UTF-8
Referer	http://www.kcscout.net/
Content-Length	3008
Cookie	sessionStateMapView=City; sessionStateCCTV=off; sessionStateDMS=off; sessionStateDMSblank=off; sessionStateIncident=off; sessionStateSpecial=off; sessionStateEmergency=off; sessionStateScheduled=off; sessionStateLngCenter=-94.582587; sessionStateLatCenter=39.101749; sessionStateZoomLevel=10

CEHJ

You need to use HttpURLConnection and set all the request headers that your browser does
rnicholus

ASKER

The program hangs at:
System.out.println("B");
String radAjaxControlId = URLEncoder.encode("ctl00_RadAjaxManager1", "UTF-8");
			String eventArgument = URLEncoder.encode("GETNEWDATA", "UTF-8"); 
			String eventTarget = URLEncoder.encode("ctl00:RadAjaxManager1", "UTF-8");
			
			//////////////////////////////////////////////////
			// Start reading with VIEWSTATE value.
			//////////////////////////////////////////////////
			HttpURLConnection httpConn = null;
			
			httpConn = (HttpURLConnection)(new URL("http://www.kcscout.net/Default.aspx?" + 
				"RadAJAXControlID=" + radAjaxControlId + "&__EVENTARGUMENT=" + eventArgument + 
				"&__EVENTTARGET=" + eventTarget +
				"__VIEWSTATE=" + viewstate)).openConnection();
			httpConn.setDoOutput(true);
			System.out.println("A");
			httpConn.setRequestProperty("Host", "www.kcscout.net");
			httpConn.setRequestProperty("User-Agent", "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9) Gecko/2008052906 Firefox/3.0");
			httpConn.setRequestProperty("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8");
			httpConn.setRequestProperty("Accept-Language", "en-us,en;q=0.5");
			httpConn.setRequestProperty("Accept-Encoding", "gzip,deflate");
			httpConn.setRequestProperty("Accept-Charset", "ISO-8859-1,utf-8;q=0.7,*;q=0.7");
			httpConn.setRequestProperty("Keep-Alive", "300");
			httpConn.setRequestProperty("Connection", "keep-alive");
			httpConn.setRequestProperty("Content-Type", "application/x-www-form-urlencoded; charset=UTF-8");
			httpConn.setRequestProperty("Referer", "http://www.kcscout.net/");
			httpConn.setRequestProperty("Content-Length", "3008");
			httpConn.setRequestProperty("Cookie", "sessionStateMapView=City; sessionStateCCTV=off; sessionStateDMS=off; " + 
				"sessionStateDMSblank=off; sessionStateIncident=off; sessionStateSpecial=off; sessionStateEmergency=off; " + 
				"sessionStateScheduled=off; sessionStateLngCenter=-94.582587; sessionStateLatCenter=39.101749; sessionStateZoomLevel=10");
			System.out.println("B");
			reader = new BufferedReader(new InputStreamReader(httpConn.getInputStream()));
			System.out.println("C");
			while ( (line = reader.readLine()) != null )
			{
				System.out.println("line: " + line);
			} // end while
			
			reader.close();
			reader = null;
			httpConn = null;

rnicholus

ASKER

I think this is the cause of the hang:
httpConn.setRequestProperty("Content-Length", "3008");

I commented it out and ran it again, but I still don't get the data that I want.
What I get is the same as what I see when I click "View Source".

But the data that I want is available when I view the response using Firebug (code snippet).
I'm so confused. =(



_RadAjaxResponseScript_try{_ajaxManager =ctl00_RadAjaxManager1;_upDating=false;var bShSpd=false;TgSpd(0);TgDetSt();bShSpd=true;
var _mdaSpd=new MultiDimensionalArray(212,5);_aPolyLines=[];
_mdaSpd[0][0]='I70 E @ W OF WOODS CHAPEL';_mdaSpd[0][1]='39.034287,-94.30769 39.033824,-94.303797';_mdaSpd[0][2]='39.034287,-94.30769';_mdaSpd[0][3]=1;_mdaSpd[0][4]=62;
_mdaSpd[1][0]='I70 E @ NW 50TH ST';_mdaSpd[1][1]='39.035237,-94.315453 39.034287,-94.30769';_mdaSpd[1][2]='39.035237,-94.315453';_mdaSpd[1][3]=1;_mdaSpd[1][4]=55;
_mdaSpd[2][0]='I70 E @ NW SCRIMSHAW RD';
..............
..............

CEHJ

You're no longer sending the parameters in the POST body, as the link I posted does. You should be.
rnicholus

ASKER

I added this: httpConn.setRequestMethod("POST");
but then I got an HTTP 411 response, since I had commented out httpConn.setRequestProperty("Content-Length", "3008"); which was causing the program to hang.
CEHJ

Don't set the content length unless you've calculated it accurately. Please post your current code
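
To make "calculated accurately" concrete: Content-Length is the number of bytes in the body that is actually written, not a value copied from the browser's request. The 3008 seen in Firebug describes the browser's own body. If the header announces more bytes than are sent, the server keeps waiting for the rest, which would fit the hang described earlier; if a POST carries neither a body nor a length, a 411 is a typical answer. A minimal sketch, with an illustrative class name and a made-up body:

```java
import java.nio.charset.StandardCharsets;

class ContentLengthDemo {

    /** Returns the number of bytes the body occupies when sent as UTF-8. */
    static int contentLength(String body) {
        return body.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // Made-up body; a real one would include the encoded __VIEWSTATE.
        String body = "__EVENTARGUMENT=GETNEWDATA&__EVENTTARGET=ctl00%3ARadAjaxManager1";
        System.out.println("Content-Length would be " + contentLength(body));
    }
}
```

With HttpURLConnection there is normally no need to set the header by hand: writing the body to getOutputStream() lets the connection fill it in, and setFixedLengthStreamingMode(int) accepts the number computed above when streaming is wanted.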
rnicholus

ASKER

Here's my current code:
String radAjaxControlId = URLEncoder.encode("ctl00_RadAjaxManager1", "UTF-8");
			String eventArgument = URLEncoder.encode("GETNEWDATA", "UTF-8"); 
			String eventTarget = URLEncoder.encode("ctl00:RadAjaxManager1", "UTF-8");
			
			//////////////////////////////////////////////////
			// Start reading with VIEWSTATE value.
			//////////////////////////////////////////////////
			HttpURLConnection httpConn = null;
			
			httpConn = (HttpURLConnection)(new URL("http://www.kcscout.net/Default.aspx?" + 
				"RadAJAXControlID=" + radAjaxControlId + "&__EVENTARGUMENT=" + eventArgument + 
				"&__EVENTTARGET=" + eventTarget +
				"__VIEWSTATE=" + viewstate)).openConnection();
			httpConn.setDoOutput(true);
			httpConn.setRequestMethod("POST");
			System.out.println("A");
			httpConn.setRequestProperty("Host", "www.kcscout.net");
			httpConn.setRequestProperty("User-Agent", "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9) Gecko/2008052906 Firefox/3.0");
			httpConn.setRequestProperty("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8");
			httpConn.setRequestProperty("Accept-Language", "en-us,en;q=0.5");
			httpConn.setRequestProperty("Accept-Encoding", "gzip,deflate");
			httpConn.setRequestProperty("Accept-Charset", "ISO-8859-1,utf-8;q=0.7,*;q=0.7");
			httpConn.setRequestProperty("Keep-Alive", "300");
			httpConn.setRequestProperty("Connection", "keep-alive");
			httpConn.setRequestProperty("Content-Type", "application/x-www-form-urlencoded; charset=UTF-8");
			httpConn.setRequestProperty("Referer", "http://www.kcscout.net/");
			// httpConn.setRequestProperty("Content-Length", "3008");
			httpConn.setRequestProperty("Cookie", "sessionStateMapView=City; sessionStateCCTV=off; sessionStateDMS=off; " + 
				"sessionStateDMSblank=off; sessionStateIncident=off; sessionStateSpecial=off; sessionStateEmergency=off; " + 
				"sessionStateScheduled=off; sessionStateLngCenter=-94.582587; sessionStateLatCenter=39.101749; sessionStateZoomLevel=10");
			System.out.println("B");
			reader = new BufferedReader(new InputStreamReader(httpConn.getInputStream()));
			System.out.println("C");
			while ( (line = reader.readLine()) != null )
			{
				System.out.println("line: " + line);
			} // end while
			
			reader.close();
			reader = null;
			httpConn = null;

Mick Barry

> I think this is the cause of the hung:
> httpConn.setRequestProperty("Content-Length", "3008");

you should be using that :)
in fact you're setting lots of params you don't need to be


I'd try using HttpClient or HttpUnit; they handle things like cookies and pretending to be a browser for you.

If it's a form submission you're simulating, then first load the form, and then submit it (using HttpClient/HttpUnit).

rnicholus

ASKER

>> you should be using that :)
>> in fact you're setting lots of params you don't need to be
Why do we need the "Content-Length"?
objects, which ones are not important?

by httpclient you mean Apache httpclient, right?
CEHJ

What value do you have for viewstate?
rnicholus

ASKER

I'm using Apache HttpClient now:
- Setting "Content-Length" still causes the program to hang.
- With "Content-Length" commented out, the program throws "HTTP 500 Internal Server Error".

Attached is my code snippet:

HttpClient client = new HttpClient();
client.getParams().setSoTimeout(120000);
client.getParams().setConnectionManagerTimeout(120000);
PostMethod method = new PostMethod();
			
String radAjaxControlId = URLEncoder.encode("ctl00_RadAjaxManager1", "UTF-8");
String eventArgument = URLEncoder.encode("GETNEWDATA", "UTF-8"); 
String eventTarget = URLEncoder.encode("ctl00:RadAjaxManager1", "UTF-8");
method = new PostMethod("http://www.kcscout.net/Default.aspx?" + 
"RadAJAXControlID=" + radAjaxControlId + "&__EVENTARGUMENT=" + eventArgument + 
"&__EVENTTARGET=" + eventTarget +
"__VIEWSTATE=" + viewstate);
		
method.setRequestHeader("Host", "www.kcscout.net");
method.setRequestHeader("User-Agent", "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9) Gecko/2008052906 Firefox/3.0");
method.setRequestHeader("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8");
method.setRequestHeader("Accept-Language", "en-us,en;q=0.5");
method.setRequestHeader("Accept-Encoding", "gzip,deflate");
method.setRequestHeader("Accept-Charset", "ISO-8859-1,utf-8;q=0.7,*;q=0.7");
method.setRequestHeader("Keep-Alive", "300");
method.setRequestHeader("Connection", "keep-alive");
method.setRequestHeader("Content-Type", "application/x-www-form-urlencoded; charset=UTF-8");
method.setRequestHeader("Referer", "http://www.kcscout.net/");
//method.setRequestHeader("Content-Length", "3008");
method.setRequestHeader("Cookie", "sessionStateMapView=City; sessionStateCCTV=off; sessionStateDMS=off; " + 
"sessionStateDMSblank=off; sessionStateIncident=off; sessionStateSpecial=off; sessionStateEmergency=off; " + 
"sessionStateScheduled=off; sessionStateLngCenter=-94.582587; sessionStateLatCenter=39.101749; sessionStateZoomLevel=10");
 
System.out.println("Try to execute method.");
int statusCode = client.executeMethod(method);
System.out.println("Done executing method.");
		
if (statusCode != HttpStatus.SC_OK) 
{
System.out.println("Method failed: " + method.getStatusLine());
throw new IOException("Received a bad status code. ");
}
						
// Read the response body.
InputStream responseBody = method.getResponseBodyAsStream();
reader = new BufferedReader(new InputStreamReader(responseBody));	
line = "";
	
// Read each line
while (((line = reader.readLine()) != null) && reader.ready())
{
System.out.println("line: " + line);	
}
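
A note on the read loop above: reader.ready() only reports whether a read could proceed without blocking, not whether the stream has ended, so combining it with the null check can stop the loop before the whole response has arrived. A minimal sketch of a loop that relies on the null return alone is shown below; the class and method names are illustrative.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

class ReadAll {

    /** Reads every line until end of stream and returns them joined with '\n'. */
    static String readAll(Reader in) {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(in)) {
            String line;
            // readLine() returns null only at end of stream.
            while ((line = reader.readLine()) != null) {
                sb.append(line).append('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // A StringReader stands in for the response stream here.
        System.out.print(readAll(new StringReader("first\nsecond\n")));
    }
}
```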

CEHJ

What value do you have for viewstate?
rnicholus

ASKER

I have a chunk of code that I run to fetch the viewstate value each time before the code that connects. Below is an example of a viewstate value. I'm not sure whether it always has the same value or not.

----------

/wEPDwUJNjgxMDMxNTUwD2QWAmYPZBYCAgMPZBYGAgIPZBYCZg8WAh4EVGV4dAWEDzxkaXYgaWQ9IkR
pdlNjcm9sbGVyIj4KPHNjcmlwdCB0eXBlPSJ0ZXh0L2phdmFzY3JpcHQiPgovLzwhW0NEQVRBWwp2YXIgcGF1c2Vjb250ZW
50PW5ldyBBcnJheSgpOwpwYXVzZWNvbnRlbnRbMF09JyA8cCBzdHlsZT0iY29sb3I6I2ZmZmZmZiI+U3RyZWFtaW5nIFZpZ
GVvIE5vdyBBdmFpbGFibGUhPGJyLz4gJm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7
Jm5ic3A7Jm5ic3A7PGEgaHJlZj0iU3RyZWFtaW5nVmlkZW9MaXN0LmFzcHgiIHN0eWxlPSJjb2xvcjojODg4OEZGOyI+Q2x
pY2sgaGVyZTwvYT4gJwpwYXVzZWNvbnRlbnRbMV09JyA8cCBzdHlsZT0iY29sb3I6I2ZmZmZmZiI+PHNwYW4gc3R5bGU9Im
NvbG9yOiNGMjBEMEQiPjwvc3Bhbj4gU2NvdXQgQnJvY2h1cmUuPGJyLz4gJm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic
3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7PGEgaHJlZj0iZG93bmxvYWRzL0Fubm91bmNlbWVudHMvc2NvdXQu
cGRmIiBzdHlsZT0iY29sb3I6Izg4ODhGRjsiPkNsaWNrIGhlcmUgZm9yIGluZm9ybWF0aW9uIGluIHBkZjwvYT48YnIvPiA
nCnBhdXNlY29udGVudFsyXT0nIDxwIHN0eWxlPSJjb2xvcjojZmZmZmZmIj48c3BhbiBzdHlsZT0iY29sb3I6I0YyMEQwRC
I+PC9zcGFuPlRyYXZlbCBUaW1lcyB5b3VyICZxdW90O2hlYWRzIHVwJnF1b3Q7IG9uIHRoZSByb2FkPGJyLz4gJm5ic3A7J
m5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7PGEgaHJlZj0iZG93bmxvYWRz
L0Fubm91bmNlbWVudHMvdHJhdmVsdGltZS5wZGYiIHN0eWxlPSJjb2xvcjojODg4OEZGOyI+Q2xpY2sgaGVyZSBmb3IgaW5
mb3JtYXRpb24gaW4gcGRmPC9hPjxici8+ICcKcGF1c2Vjb250ZW50WzNdPScgPHAgc3R5bGU9ImNvbG9yOiNmZmZmZmYiPj
xzcGFuIHN0eWxlPSJjb2xvcjojRjIwRDBEIj48L3NwYW4+SVRTIFN5bXBvc2l1bSBGbHllcjxici8+ICZuYnNwOyZuYnNwO
yZuYnNwOyZuYnNwOyZuYnNwOyZuYnNwOyZuYnNwOyZuYnNwOyZuYnNwOyZuYnNwOzxhIGhyZWY9ImRvd25sb2Fkcy9Bbm5v
dW5jZW1lbnRzL0tDIElUUyBTeW1wb3NpdW0gTWFyayBZb3VyIENhbGVuZGFycy5wZGYiIHN0eWxlPSJjb2xvcjojODg4OEZ
GOyI+Q2xpY2sgaGVyZSBmb3IgaW5mb3JtYXRpb24gaW4gcGRmPC9hPjxici8+ICcKcGF1c2Vjb250ZW50WzRdPScgPHAgc3
R5bGU9ImNvbG9yOiNmZmZmZmYiPk1pc3NvdXJpIFJvYWR3b3JrIDxici8+ICZuYnNwOyZuYnNwOyZuYnNwOyZuYnNwOyZuY
nNwOyZuYnNwOyZuYnNwOyZuYnNwOyZuYnNwOyZuYnNwOzxhIGhyZWY9Imh0dHA6Ly93d3cubW9kb3Qub3JnL2thbnNhc2Np
dHkvcm9hZF9jb25zdHJ1Y3Rpb24vcm9hZHpvbmVzdG9kYXkuaHRtIiBzdHlsZT0iY29sb3I6Izg4ODhGRjsiPkNsaWNrIGh
lcmUgZm9yIGluZm9ybWF0aW9uPC9hPjwvcD4gJwpwYXVzZWNvbnRlbnRbNV09JyA8cCBzdHlsZT0iY29sb3I6I2ZmZmZmZi
I+SGl0dGluZyB5b3VyIGJyYWtlcyB0byByZWFkIFNjb3V0IHNpZ25zPGJyPmlzIG5vdCBvbmx5IHVubmVjZXNzYXJ5LCBpd
CBpcyB1bnNhZmUhPGJyLz4gJwpwYXVzZWNvbnRlbnRbNl09JyA8cCBzdHlsZT0iY29sb3I6I2ZmZmZmZiI+S2Fuc2FzIFJv
YWR3b3JrPGJyLz4gJm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A
7PGEgaHJlZj0iaHR0cDovL3d3dy5rc2RvdC5vcmcva2NNZXRyby9sYW5lY2xvc2UuYXNwIiBzdHlsZT0iY29sb3I6Izg4OD
hGRjsiPkNsaWNrIGhlcmUgZm9yIGluZm9ybWF0aW9uPC9hPiAnCgogbmV3IHBhdXNlc2Nyb2xsZXIocGF1c2Vjb250ZW50L
CAicHNjcm9sbGVyMSIsICJteVNjb2xsZXJDbGFzcyIsIDMwMDApCgovL11dPgo8L3NjcmlwdD48L2Rpdj5kAgMPZBYEAgEP
DxYCHwAFFEFmdGVybm9vbiBEcml2ZSBUaW1lZGQCAw8PFgIfAAUiMTowMyBQTSAsIE1vbiwgSnVsIDIxc3QsIDIwMDgsIEN
EVGRkAgwPZBYCAgEPDxYCHwAFBDIwMDhkZBgBBQtjdGwwMCRNZW51MQ8PZAUESG9tZWTZ7lg6KqTDE7NAtQp+zB7KdOfIhg
==



String viewstate = "";
			URLConnection conn = (new URL("http://www.kcscout.net/Default.aspx")).openConnection();
			BufferedReader reader = null;
			String line = "";
			
			reader = new BufferedReader(new InputStreamReader(conn.getInputStream()));
			while ( (line = reader.readLine()) != null )
			{
				if ( line.indexOf("VIEWSTATE") >= 0 )
				{
					viewstate = line;	
					break;
				}
			} // end while
			viewstate = viewstate.replaceFirst(".*?\\s+value=\"", "");
			viewstate = viewstate.replaceFirst("\" />", "");
			System.out.println("---> VIEWSTATE: " + viewstate);
			viewstate =  URLEncoder.encode(viewstate,"UTF-8");
			
			reader.close();
			reader = null;
			conn = null;

Mick Barry

> Why do we need the "Content-Length"?

sorry that was a typo, I meant you should *not* be setting it

and you don't need to be setting all those request headers, let the client look after that as it knows what is needed.

Is it a form submission you are doing?
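
One concrete reason not to copy every browser header: sending "Accept-Encoding: gzip,deflate" tells the server it may compress the response, and HttpURLConnection hands the compressed bytes back as they are, so reading them with a plain InputStreamReader gives unreadable output. Whether this server actually compresses is not shown in the thread. If the header is kept, the body has to be unwrapped, roughly as in this sketch (class and method names are illustrative):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class GzipAwareReader {

    /** Reads the body as UTF-8 text, unwrapping gzip when contentEncoding says so. */
    static String readBody(InputStream raw, String contentEncoding) {
        try (InputStream in = "gzip".equalsIgnoreCase(contentEncoding)
                ? new GZIPInputStream(raw) : raw) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Demo helper: compresses text in memory to stand in for a gzip response. */
    static byte[] gzip(String text) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bytes)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        byte[] compressed = gzip("_mdaSpd[0][4]=62;");
        System.out.println(readBody(new ByteArrayInputStream(compressed), "gzip"));
    }
}
```

With a real connection, the second argument would be conn.getContentEncoding(), which is null when the server sent no Content-Encoding header. Simply leaving Accept-Encoding out avoids the issue altogether.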
rnicholus

ASKER

>>> Is it a form submission you are doing?
I don't think it's a form submission.
When I open the website, I can already see the data (the freeway speeds) on the site's Google map without submitting any form.

http://www.kcscout.net/

As I mentioned, I can see the speed data (latitude, longitude, and the speed value) being returned when I use Firebug (a Firefox extension). I'm looking for a way to do the same thing from a Java program.
rnicholus

ASKER

Any new idea, guys?
CEHJ

If you can give me the shortest possible 'viewstate' variable, i'll give it a try
rnicholus

ASKER

CEHJ,

The VIEWSTATE value is always very long. See the code snippet. I think this is an ASP.NET thing. I'm not sure why this value is needed.

I think the value keeps changing. I will give you the chunk of code that gets the VIEWSTATE value before trying to connect with all the parameters.
/wEPDwUJNjgxMDMxNTUwD2QWAmYPZBYCAgMPZBYGAgIPZBYCZg8WAh4EVGV4dAWEDzxkaXYgaWQ9IkR
pdlNjcm9sbGVyIj4KPHNjcmlwdCB0eXBlPSJ0ZXh0L2phdmFzY3JpcHQiPgovLzwhW0NEQVRBWwp2YXIgcGF1c2Vjb250ZW
50PW5ldyBBcnJheSgpOwpwYXVzZWNvbnRlbnRbMF09JyA8cCBzdHlsZT0iY29sb3I6I2ZmZmZmZiI+U3RyZWFtaW5nIFZpZ
GVvIE5vdyBBdmFpbGFibGUhPGJyLz4gJm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7
Jm5ic3A7Jm5ic3A7PGEgaHJlZj0iU3RyZWFtaW5nVmlkZW9MaXN0LmFzcHgiIHN0eWxlPSJjb2xvcjojODg4OEZGOyI+Q2x
pY2sgaGVyZTwvYT4gJwpwYXVzZWNvbnRlbnRbMV09JyA8cCBzdHlsZT0iY29sb3I6I2ZmZmZmZiI+PHNwYW4gc3R5bGU9Im
NvbG9yOiNGMjBEMEQiPjwvc3Bhbj4gU2NvdXQgQnJvY2h1cmUuPGJyLz4gJm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic
3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7PGEgaHJlZj0iZG93bmxvYWRzL0Fubm91bmNlbWVudHMvc2NvdXQu
cGRmIiBzdHlsZT0iY29sb3I6Izg4ODhGRjsiPkNsaWNrIGhlcmUgZm9yIGluZm9ybWF0aW9uIGluIHBkZjwvYT48YnIvPiA
nCnBhdXNlY29udGVudFsyXT0nIDxwIHN0eWxlPSJjb2xvcjojZmZmZmZmIj48c3BhbiBzdHlsZT0iY29sb3I6I0YyMEQwRC
I+PC9zcGFuPlRyYXZlbCBUaW1lcyB5b3VyICZxdW90O2hlYWRzIHVwJnF1b3Q7IG9uIHRoZSByb2FkPGJyLz4gJm5ic3A7J
m5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7PGEgaHJlZj0iZG93bmxvYWRz
L0Fubm91bmNlbWVudHMvdHJhdmVsdGltZS5wZGYiIHN0eWxlPSJjb2xvcjojODg4OEZGOyI+Q2xpY2sgaGVyZSBmb3IgaW5
mb3JtYXRpb24gaW4gcGRmPC9hPjxici8+ICcKcGF1c2Vjb250ZW50WzNdPScgPHAgc3R5bGU9ImNvbG9yOiNmZmZmZmYiPj
xzcGFuIHN0eWxlPSJjb2xvcjojRjIwRDBEIj48L3NwYW4+SVRTIFN5bXBvc2l1bSBGbHllcjxici8+ICZuYnNwOyZuYnNwO
yZuYnNwOyZuYnNwOyZuYnNwOyZuYnNwOyZuYnNwOyZuYnNwOyZuYnNwOyZuYnNwOzxhIGhyZWY9ImRvd25sb2Fkcy9Bbm5v
dW5jZW1lbnRzL0tDIElUUyBTeW1wb3NpdW0gTWFyayBZb3VyIENhbGVuZGFycy5wZGYiIHN0eWxlPSJjb2xvcjojODg4OEZ
GOyI+Q2xpY2sgaGVyZSBmb3IgaW5mb3JtYXRpb24gaW4gcGRmPC9hPjxici8+ICcKcGF1c2Vjb250ZW50WzRdPScgPHAgc3
R5bGU9ImNvbG9yOiNmZmZmZmYiPk1pc3NvdXJpIFJvYWR3b3JrIDxici8+ICZuYnNwOyZuYnNwOyZuYnNwOyZuYnNwOyZuY
nNwOyZuYnNwOyZuYnNwOyZuYnNwOyZuYnNwOyZuYnNwOzxhIGhyZWY9Imh0dHA6Ly93d3cubW9kb3Qub3JnL2thbnNhc2Np
dHkvcm9hZF9jb25zdHJ1Y3Rpb24vcm9hZHpvbmVzdG9kYXkuaHRtIiBzdHlsZT0iY29sb3I6Izg4ODhGRjsiPkNsaWNrIGh
lcmUgZm9yIGluZm9ybWF0aW9uPC9hPjwvcD4gJwpwYXVzZWNvbnRlbnRbNV09JyA8cCBzdHlsZT0iY29sb3I6I2ZmZmZmZi
I+SGl0dGluZyB5b3VyIGJyYWtlcyB0byByZWFkIFNjb3V0IHNpZ25zPGJyPmlzIG5vdCBvbmx5IHVubmVjZXNzYXJ5LCBpd
CBpcyB1bnNhZmUhPGJyLz4gJwpwYXVzZWNvbnRlbnRbNl09JyA8cCBzdHlsZT0iY29sb3I6I2ZmZmZmZiI+S2Fuc2FzIFJv
YWR3b3JrPGJyLz4gJm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A
7PGEgaHJlZj0iaHR0cDovL3d3dy5rc2RvdC5vcmcva2NNZXRyby9sYW5lY2xvc2UuYXNwIiBzdHlsZT0iY29sb3I6Izg4OD
hGRjsiPkNsaWNrIGhlcmUgZm9yIGluZm9ybWF0aW9uPC9hPiAnCgogbmV3IHBhdXNlc2Nyb2xsZXIocGF1c2Vjb250ZW50L
CAicHNjcm9sbGVyMSIsICJteVNjb2xsZXJDbGFzcyIsIDMwMDApCgovL11dPgo8L3NjcmlwdD48L2Rpdj5kAgMPZBYEAgEP
DxYCHwAFFEFmdGVybm9vbiBEcml2ZSBUaW1lZGQCAw8PFgIfAAUiMTowMyBQTSAsIE1vbiwgSnVsIDIxc3QsIDIwMDgsIEN
EVGRkAgwPZBYCAgEPDxYCHwAFBDIwMDhkZBgBBQtjdGwwMCRNZW51MQ8PZAUESG9tZWTZ7lg6KqTDE7NAtQp+zB7KdOfIhg
==

rnicholus

ASKER

Here's the code to get the VIEWSTATE value:
String viewstate = "";
			URLConnection conn = (new URL("http://www.kcscout.net/Default.aspx")).openConnection();
			BufferedReader reader = null;
			String line = "";
			
			reader = new BufferedReader(new InputStreamReader(conn.getInputStream()));
			while ( (line = reader.readLine()) != null )
			{
				if ( line.indexOf("VIEWSTATE") >= 0 )
				{
					viewstate = line;	
					break;
				}
			} // end while
			viewstate = viewstate.replaceFirst(".*?\\s+value=\"", "");
			viewstate = viewstate.replaceFirst("\" />", "");
			System.out.println("---> VIEWSTATE: " + viewstate);
			viewstate =  URLEncoder.encode(viewstate,"UTF-8");
			
			reader.close();
			reader = null;
			conn = null;
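
The line-by-line approach above depends on the hidden field sitting on a single line and on the exact text that closes the tag. A slightly more tolerant sketch is to read the whole page into a string and match the input tag with a regular expression. It assumes the name attribute comes before the value attribute inside the tag, which is not confirmed anywhere in the thread; the class and method names are illustrative.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ViewStateExtractor {

    // Assumes name="__VIEWSTATE" appears before value="..." within the same tag.
    private static final Pattern VIEWSTATE = Pattern.compile(
            "<input[^>]*name=\"__VIEWSTATE\"[^>]*value=\"([^\"]*)\"");

    /** Returns the raw (not URL-encoded) viewstate, or null if the field is absent. */
    static String extract(String html) {
        Matcher m = VIEWSTATE.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Made-up page fragment for illustration.
        String html = "<form><input type=\"hidden\" name=\"__VIEWSTATE\" "
                + "id=\"__VIEWSTATE\" value=\"/wEP+placeholder==\" /></form>";
        System.out.println(extract(html));
    }
}
```

Returning the raw value and encoding it only when the request parameters are assembled keeps it from being encoded twice.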

CEHJ

I can only get the response to the POST to return up to the end of
Mick Barry

That page uses JavaScript to load the data; that's probably your problem.
It doesn't even work in all browsers, and it definitely does not look designed to be called from Java.
You could try using Rhino to execute the JavaScript.
And you shouldn't be setting viewstate and other vars, as that's handled by the page.


rnicholus

ASKER

object,

I'm sure that the VIEWSTATE and the other vars are needed.
Consider the same problem that I had before: https://www.experts-exchange.com/questions/23006895/How-to-query-an-aspx-page-using-JAVA-post-request.html

How is RHINO going to help?
rnicholus

ASKER

How does Firebug do it? Why is it able to view the data? It has the part that I want (code snippet).
_RadAjaxResponseScript_try{_ajaxManager =ctl00_RadAjaxManager1;_upDating=false;
var _mdaSpd=new MultiDimensionalArray(214,5);_aPolyLines=[];
_mdaSpd[0][0]='I70 E @ W OF WOODS CHAPEL';_mdaSpd[0][1]='39.030206,-94.308175 39.029743,-94.304282';_mdaSpd[0][2]='39.030206,-94.308175';_mdaSpd[0][3]=1
............................
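
Once a response like the one above has been obtained, the values can be pulled out with a regular expression instead of a JavaScript engine. This is a minimal sketch that assumes every cell arrives as _mdaSpd[row][col]=value; with string values in single quotes, as in the snippet, and that the line breaks Firebug shows in the middle of tokens are display wrapping only. What each column means is not stated in the thread, so the sketch keeps them as plain strings. The class and method names are illustrative.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SpeedScriptParser {

    // Matches _mdaSpd[row][col]='text'; or _mdaSpd[row][col]=number;
    private static final Pattern CELL = Pattern.compile(
            "_mdaSpd\\[(\\d+)\\]\\[(\\d+)\\]=(?:'([^']*)'|([^;]*));");

    /** Returns row index -> (column index -> value as text). */
    static Map<Integer, Map<Integer, String>> parse(String script) {
        Map<Integer, Map<Integer, String>> rows = new TreeMap<>();
        Matcher m = CELL.matcher(script);
        while (m.find()) {
            int row = Integer.parseInt(m.group(1));
            int col = Integer.parseInt(m.group(2));
            // Group 3 is set for quoted values, group 4 for unquoted ones.
            String value = m.group(3) != null ? m.group(3) : m.group(4);
            rows.computeIfAbsent(row, r -> new TreeMap<>()).put(col, value);
        }
        return rows;
    }

    public static void main(String[] args) {
        String script = "_mdaSpd[0][0]='I70 E @ W OF WOODS CHAPEL';"
                + "_mdaSpd[0][2]='39.034287,-94.30769';"
                + "_mdaSpd[0][3]=1;_mdaSpd[0][4]=62;";
        for (Map.Entry<Integer, Map<Integer, String>> row : parse(script).entrySet()) {
            System.out.println(row.getKey() + " -> " + row.getValue());
        }
    }
}
```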

Mick Barry

they are needed, but they are generated by the server. Let the server do it.

> How is RHINO going to help?

To execute the js.

> How does firebug does it?

it executes the js

CEHJ

As I said earlier, the page is heavily reliant on JavaScript. It's even possible that the problem I alluded to in my last comment is down to the fact that JS loads the whole of the body.

I would be inclined to try HttpUnit as it supports js. It will simplify coding considerably too
Mick Barry

> I would be inclined to try HttpUnit as it supports js. It will simplify coding considerably too

which has already been suggested :)

rnicholus

ASKER

I tried writing a small program using HttpUnit (code snippet) and got an "illegal character" exception.
Any idea how to get around this?
import com.meterware.httpunit.*;

public class TestSite2
{
	public static void main(String[] args)
	{
		try
		{
			System.out.println("-----> A");
			WebConversation wc = new WebConversation();
			System.out.println("-----> B");
			WebResponse resp = wc.getResponse( "http://www.kcscout.net/" );
			System.out.println("-----> C");
		}
		catch (Exception ex)
		{
			ex.printStackTrace();
		}
	} // end main(String[])
}
 
----------------------------------------------------------------------------
-----> A
-----> B
org.mozilla.javascript.EvaluatorException: illegal character
        at org.mozilla.javascript.DefaultErrorReporter.runtimeError(DefaultErrorReporter.java:98)
        at org.mozilla.javascript.DefaultErrorReporter.error(DefaultErrorReporter.java:85)
        at org.mozilla.javascript.Parser.addError(Parser.java:126)
        at org.mozilla.javascript.TokenStream.getToken(TokenStream.java:810)
        at org.mozilla.javascript.Parser.peekToken(Parser.java:144)
        at org.mozilla.javascript.Parser.primaryExpr(Parser.java:1953)
        at org.mozilla.javascript.Parser.memberExpr(Parser.java:1641)
        at org.mozilla.javascript.Parser.unaryExpr(Parser.java:1507)
        at org.mozilla.javascript.Parser.mulExpr(Parser.java:1436)
        at org.mozilla.javascript.Parser.addExpr(Parser.java:1417)
        at org.mozilla.javascript.Parser.shiftExpr(Parser.java:1397)
        at org.mozilla.javascript.Parser.relExpr(Parser.java:1371)
        at org.mozilla.javascript.Parser.eqExpr(Parser.java:1327)
        at org.mozilla.javascript.Parser.bitAndExpr(Parser.java:1316)
        at org.mozilla.javascript.Parser.bitXorExpr(Parser.java:1305)
        at org.mozilla.javascript.Parser.bitOrExpr(Parser.java:1294)
        at org.mozilla.javascript.Parser.andExpr(Parser.java:1282)
        at org.mozilla.javascript.Parser.orExpr(Parser.java:1270)
        at org.mozilla.javascript.Parser.condExpr(Parser.java:1253)
        at org.mozilla.javascript.Parser.assignExpr(Parser.java:1235)
        at org.mozilla.javascript.Parser.expr(Parser.java:1224)
        at org.mozilla.javascript.Parser.statementHelper(Parser.java:1111)
        at org.mozilla.javascript.Parser.statement(Parser.java:623)
        at org.mozilla.javascript.Parser.parse(Parser.java:355)
        at org.mozilla.javascript.Parser.parse(Parser.java:293)
        at org.mozilla.javascript.Context.compileImpl(Context.java:2238)
        at org.mozilla.javascript.Context.compileString(Context.java:1284)
        at org.mozilla.javascript.Context.compileString(Context.java:1273)
        at org.mozilla.javascript.Context.evaluateString(Context.java:1129)
        at com.meterware.httpunit.javascript.ScriptingEngineImpl.runScript(ScriptingEngineImpl.java:92)
        at com.meterware.httpunit.scripting.ScriptableDelegate.runScript(ScriptableDelegate.java:88)
        at com.meterware.httpunit.parsing.NekoDOMParser.runScript(NekoDOMParser.java:151)
        at com.meterware.httpunit.parsing.ScriptFilter.getTranslatedScript(ScriptFilter.java:150)
        at com.meterware.httpunit.parsing.ScriptFilter.endElement(ScriptFilter.java:131)
        at org.cyberneko.html.filters.DefaultFilter.endElement(DefaultFilter.java:249)
        at org.cyberneko.html.filters.NamespaceBinder.endElement(NamespaceBinder.java:367)
        at org.cyberneko.html.HTMLTagBalancer.callEndElement(HTMLTagBalancer.java:1015)
        at org.cyberneko.html.HTMLTagBalancer.endElement(HTMLTagBalancer.java:888)
        at org.cyberneko.html.HTMLScanner$SpecialScanner.scan(HTMLScanner.java:2831)
        at org.cyberneko.html.HTMLScanner.scanDocument(HTMLScanner.java:809)
        at org.cyberneko.html.HTMLConfiguration.parse(HTMLConfiguration.java:478)
        at org.cyberneko.html.HTMLConfiguration.parse(HTMLConfiguration.java:431)
        at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
        at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
        at com.meterware.httpunit.parsing.NekoHTMLParser.parse(NekoHTMLParser.java:48)
        at com.meterware.httpunit.HTMLPage.parse(HTMLPage.java:271)
        at com.meterware.httpunit.WebResponse.getReceivedPage(WebResponse.java:1301)
        at com.meterware.httpunit.WebResponse.getFrames(WebResponse.java:1285)
        at com.meterware.httpunit.WebResponse.getFrameRequests(WebResponse.java:1024)
        at com.meterware.httpunit.FrameHolder.updateFrames(FrameHolder.java:179)
        at com.meterware.httpunit.WebWindow.updateFrameContents(WebWindow.java:315)
        at com.meterware.httpunit.WebClient.updateFrameContents(WebClient.java:526)
        at com.meterware.httpunit.WebWindow.updateWindow(WebWindow.java:201)
        at com.meterware.httpunit.WebWindow.getSubframeResponse(WebWindow.java:183)
        at com.meterware.httpunit.WebWindow.getResponse(WebWindow.java:158)
        at com.meterware.httpunit.WebWindow.getResponse(WebWindow.java:125)
        at com.meterware.httpunit.WebClient.getResponse(WebClient.java:96)
        at TestSite2.main(TestSite2.java:13)
com.meterware.httpunit.ScriptException: Script ')W// JScript File
var testing=false;
if(!testing)
{
    var str=window.location.href;
   if(str.toLowerCase().indexOf("kcscout.org")>-1)
  {
        window.location.href="http://www.kcscout.net";
   }
}' failed: org.mozilla.javascript.EvaluatorException: illegal character
        at com.meterware.httpunit.javascript.ScriptingEngineImpl.handleScriptException(ScriptingEngineImpl.java:64)
        at com.meterware.httpunit.javascript.ScriptingEngineImpl.runScript(ScriptingEngineImpl.java:95)
        at com.meterware.httpunit.scripting.ScriptableDelegate.runScript(ScriptableDelegate.java:88)
        at com.meterware.httpunit.parsing.NekoDOMParser.runScript(NekoDOMParser.java:151)
        at com.meterware.httpunit.parsing.ScriptFilter.getTranslatedScript(ScriptFilter.java:150)
        at com.meterware.httpunit.parsing.ScriptFilter.endElement(ScriptFilter.java:131)
        at org.cyberneko.html.filters.DefaultFilter.endElement(DefaultFilter.java:249)
        at org.cyberneko.html.filters.NamespaceBinder.endElement(NamespaceBinder.java:367)
        at org.cyberneko.html.HTMLTagBalancer.callEndElement(HTMLTagBalancer.java:1015)
        at org.cyberneko.html.HTMLTagBalancer.endElement(HTMLTagBalancer.java:888)
        at org.cyberneko.html.HTMLScanner$SpecialScanner.scan(HTMLScanner.java:2831)
        at org.cyberneko.html.HTMLScanner.scanDocument(HTMLScanner.java:809)
        at org.cyberneko.html.HTMLConfiguration.parse(HTMLConfiguration.java:478)
        at org.cyberneko.html.HTMLConfiguration.parse(HTMLConfiguration.java:431)
        at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
        at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
        at com.meterware.httpunit.parsing.NekoHTMLParser.parse(NekoHTMLParser.java:48)
        at com.meterware.httpunit.HTMLPage.parse(HTMLPage.java:271)
        at com.meterware.httpunit.WebResponse.getReceivedPage(WebResponse.java:1301)
        at com.meterware.httpunit.WebResponse.getFrames(WebResponse.java:1285)
        at com.meterware.httpunit.WebResponse.getFrameRequests(WebResponse.java:1024)
        at com.meterware.httpunit.FrameHolder.updateFrames(FrameHolder.java:179)
        at com.meterware.httpunit.WebWindow.updateFrameContents(WebWindow.java:315)
        at com.meterware.httpunit.WebClient.updateFrameContents(WebClient.java:526)
        at com.meterware.httpunit.WebWindow.updateWindow(WebWindow.java:201)
        at com.meterware.httpunit.WebWindow.getSubframeResponse(WebWindow.java:183)
        at com.meterware.httpunit.WebWindow.getResponse(WebWindow.java:158)
        at com.meterware.httpunit.WebWindow.getResponse(WebWindow.java:125)
        at com.meterware.httpunit.WebClient.getResponse(WebClient.java:96)
        at TestSite2.main(TestSite2.java:13)
Press any key to continue...

rnicholus

ASKER

I set HttpUnitOptions.setExceptionsThrownOnScriptError(false);
and now the script exception I mentioned above is ignored. =)

In the System.out.println() line, I got:
-----------------------------------------------------------------------
-----> C: HttpWebResponse [url=http://www.kcscout.net; headers=
   X-ASPNET-VERSION: 2.0.50727
   CONTENT-TYPE: text/html; charset=utf-8
   CONTENT-LENGTH: 60806
   CACHE-CONTROL: private
   X-POWERED-BY: ASP.NET
   SERVER: Microsoft-IIS/6.0
   DATE: Wed, 23 Jul 2008 17:26:42 GMT ]
Press any key to continue...
-----------------------------------------------------------------------


I'm now moving on to how to get the RESPONSE.
WebConversation wc = new WebConversation();
HttpUnitOptions.setExceptionsThrownOnScriptError(false);    		
WebResponse   resp = wc.getResponse( "http://www.kcscout.net" );
System.out.println("-----> C: " + resp);

rnicholus

ASKER

I found another interesting thing when I looked closely using Firebug: every time the page loads, it makes several requests.
----------------------------------------------------------------------
1. (HTML): GET http://www.kcscout.com/
(JavaScript): GET http://maps.google.com/maps?file=api&v=3.00&key=ABQIAAAAZzE4U...
2. (HTML): GET http://voap.weather.com/weather/oap/64105?template=DRIVH&par=1004147026&unit=0&key=47147271e1edfcf22df384ad02cc886d
3. (JavaScript): GET http://maps.google.com/maps/vp?spn=0.45267,0.821228&z=10&key=ABQIAAAAZzE4UWC4HsX3ylXcfLLsQxRg70jkC_zdHGs1IP8c6ulQYkWZARS9Rbhapz7ShKEqpdnietxeUeCoUg&vp=38.975425,-94.601898
4. (XMLHttpRequest): POST http://www.kcscout.com/Default.aspx
----------------------------------------------------------------------

So, I think:
----------------------------------------------------------------------
- So far I only get as far as #1 (judging by the data my Java program reads).
- I don't think #2 - #3 are important, since they only connect to the GOOGLE.COM and WEATHER.COM websites.
----------------------------------------------------------------------

How do I get as far as #4, with or without HttpUnit?
I tried the code in the snippet, but it still gives me only the data from #1 (the same as what you get from "view source" in the browser).
WebConversation wc = new WebConversation();
HttpUnitOptions.setExceptionsThrownOnScriptError(false);    		
WebResponse   resp = wc.getResponse( "http://www.kcscout.com/Default.aspx" );
System.out.println("-----> C: " + resp);
System.out.println("D:" + resp.getText());

CEHJ

>>which has already been suggested :)

Where?

>>How do I execute through #4 with/ without using HttpUnit?

You mean how do you do the POST (the main objective, I think)?
rnicholus

ASKER

>> You mean how do you do the POST (the main objective i think)?
I think so.
I'm actually confused about why it still doesn't work even though I already tried accessing Default.aspx directly.
rnicholus

ASKER

I changed my code to use PostMethodWebRequest.
WebConversation wc = new WebConversation();
HttpUnitOptions.setExceptionsThrownOnScriptError(false);
WebRequest req = new PostMethodWebRequest( "http://www.kcscout.com/Default.aspx" );
WebResponse  resp = wc.getResponse( req );
System.out.println("----> C: " + resp);
System.out.println("D:" + resp.getText());
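
A POST with no parameters is unlikely to behave differently from a plain GET, because none of the fields Firebug showed (__EVENTTARGET, __EVENTARGUMENT, __VIEWSTATE and so on) are sent. The __VIEWSTATE value is generated by the server and embedded in the page, so it would have to be read from the HTML that is fetched first. A minimal sketch in plain Java, assuming the page carries it in the usual ASP.NET hidden input with name appearing before value; the class name HiddenFields, the method hiddenField and the sample markup are made up for illustration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class HiddenFields {

    /**
     * Returns the value attribute of the first input tag whose name matches,
     * or null if there is none. Assumes name="..." comes before value="...".
     */
    static String hiddenField(String html, String name) {
        Pattern p = Pattern.compile(
            "<input[^>]*\\bname=\"" + Pattern.quote(name) + "\"[^>]*\\bvalue=\"([^\"]*)\"",
            Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Sample markup only; the real text would come from resp.getText().
        String html = "<form><input type=\"hidden\" name=\"__VIEWSTATE\" "
            + "id=\"__VIEWSTATE\" value=\"dDwtMTIzNDU2Nzg5Ow==\" /></form>";
        System.out.println(hiddenField(html, "__VIEWSTATE"));
    }
}
```

The extracted value could then be sent back as the __VIEWSTATE field of the POST. A regular expression is fragile against unusual markup, so this is only a starting point.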

CEHJ

You certainly don't believe in choosing easy sites to point your bots at, do you? ;-)
rnicholus

ASKER

haha .. :D
I'm really confused now. Any new ideas?
Mick Barry

I mentioned that's what it was doing already :)
The page loads, then executes the JS, which posts that form and then loads the response into the page.
You could grab the response of the POST, but you'd need to know how to use it.
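
Grabbing the response of that POST directly means sending the same form fields the page's JavaScript sends, URL-encoded, in the request body rather than in the query string. A rough sketch using only java.net, not HttpUnit; the class and method names (FormPost, encodeForm, post) are hypothetical, the field names are the ones Firebug listed earlier in the thread, and whether the server also needs cookies or extra headers is not known here:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Map;

final class FormPost {

    /** URL-encodes one name or value as UTF-8. */
    private static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    /** Builds an application/x-www-form-urlencoded body in map iteration order. */
    static String encodeForm(Map<String, String> fields) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> e : fields.entrySet()) {
            if (body.length() > 0) body.append('&');
            body.append(enc(e.getKey())).append('=').append(enc(e.getValue()));
        }
        return body.toString();
    }

    /** Sends the fields as a POST body and returns the response decoded as UTF-8. */
    static String post(String url, Map<String, String> fields) throws IOException {
        byte[] body = encodeForm(fields).getBytes(StandardCharsets.UTF_8);
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
        try (OutputStream out = conn.getOutputStream()) {
            out.write(body);
        }
        try (InputStream in = conn.getInputStream()) {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int n;
            while ((n = in.read(chunk)) != -1) buf.write(chunk, 0, n);
            return new String(buf.toByteArray(), StandardCharsets.UTF_8);
        }
    }
}
```

Filling a LinkedHashMap with __EVENTTARGET, __EVENTARGUMENT, RadAJAXControlID, httprequest and a __VIEWSTATE value taken from the freshly loaded page, then calling FormPost.post with the Default.aspx URL, would return the response text if the server accepts the request. The encoding step matters because values such as ctl00:RadAjaxManager1 and the Base64 view state contain characters that are not safe to send raw.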

Alternatively, use HttpUnit to load the page and then query the DOM.

rnicholus

ASKER

>> You could grab the rsponse of the post but you'd need to know how to use it.
This is what I'm trying to do. I don't know how.

>> Alternativelu use httpunit to load the page and then query the dom.
I'm relatively new to HttpUnit (just learned it today :D). How exactly can I query the DOM using HttpUnit?
The code I have doesn't seem to return the response I want.

WebConversation wc = new WebConversation();
HttpUnitOptions.setExceptionsThrownOnScriptError(false);
WebRequest req = new PostMethodWebRequest( "http://www.kcscout.com/Default.aspx" );
WebResponse  resp = wc.getResponse( req );
System.out.println("----> C: " + resp);
System.out.println("D:" + resp.getText());
Mick Barry

> I'm relatively new at HttpUnit (just learned today :D), how exactly I can query the DOM using HttpUnit?

wc.getCurrentPage().getDOM();

rnicholus

ASKER

Thanks, I'll give it a try.

By the way, what is the difference between HtmlUnit and HttpUnit? I'm just curious.
rnicholus

ASKER

I have a small piece of code that iterates over the DOM, but after it reaches the DOCUMENT_NODE it can't go any further. I'm not sure why.
----------------------------------------------------------
Entering Iterate method --> n: com.meterware.httpunit.dom.HTMLDocumentImpl@4a9a7d
--> DOCUMENT_NODE <--
Entering Iterate method --> n: null
Press any key to continue...
----------------------------------------------------------

Instead of iterating over the DOM objects, I just realized that from the result of response.getText() I can already tell whether the data I want is there or not, right?
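
Checking the text directly can be done with a plain string search, and the same text can then be picked apart with a regular expression. A small sketch, assuming the response contains assignments in the `_mdaSpd[row][col]=value;` form seen in Firebug and that quoted values never contain an apostrophe; MdaSpdParser, hasData and parse are made-up names, and what each column means is not stated anywhere in the thread:

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MdaSpdParser {

    // Matches _mdaSpd[row][col]='text'; as well as _mdaSpd[row][col]=number;
    private static final Pattern CELL = Pattern.compile(
        "_mdaSpd\\s*\\[(\\d+)\\]\\s*\\[(\\d+)\\]\\s*=\\s*(?:'([^']*)'|([^;]+));");

    /** True if the text contains at least one indexed _mdaSpd assignment. */
    static boolean hasData(String text) {
        return CELL.matcher(text).find();
    }

    /** Returns row -> (column -> raw value as a string), sorted by index. */
    static Map<Integer, Map<Integer, String>> parse(String text) {
        Map<Integer, Map<Integer, String>> rows = new TreeMap<>();
        Matcher m = CELL.matcher(text);
        while (m.find()) {
            int row = Integer.parseInt(m.group(1));
            int col = Integer.parseInt(m.group(2));
            // Group 3 is the quoted form, group 4 the unquoted (numeric) form.
            String value = m.group(3) != null ? m.group(3) : m.group(4).trim();
            rows.computeIfAbsent(row, r -> new TreeMap<>()).put(col, value);
        }
        return rows;
    }
}
```

If hasData(resp.getText()) is false, the program is still looking at the plain page rather than the response of the POST.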



import com.meterware.httpunit.*;
import org.w3c.dom.*;
 
public class TestSite2
{
 
	public static void main(String[] args)
	{
		try
		{
    		WebConversation wc = new WebConversation();
    		WebRequest req = new PostMethodWebRequest( "http://www.kcscout.com/Default.aspx" );
    		HttpUnitOptions.setExceptionsThrownOnScriptError(false);
    		WebResponse  resp = wc.getResponse( req );
    		Document dom = resp.getDOM();
 
    		iterate(dom,1);
    		
		}
		catch (Exception ex)
		{
			ex.printStackTrace();	
		}
		
	} // end main(String[])
	
	/**
	 * Iterate DOM.
	 **/
	private static void iterate(Node n, int indent) 
	{	
		System.out.println("Entering Iterate method --> n: "+ n);
		
  		if (n == null)	return;
 
  		String ind = new String();
  		for (int i=0;i<indent; i++)
   		ind = ind + " ";
  
  		switch (n.getNodeType()) 
  		{
	   		case Node.ELEMENT_NODE:
	   			System.out.println("--> ELEMENT NODE <--");
	    		System.out.println(ind + n.getNodeName());
	    		NodeList kids = n.getChildNodes();
	    		if (kids != null) 
	    		{
	    		 	for (int i=0; i<kids.getLength(); i++)
	 	     			iterate(kids.item(i), indent + 1);
	    		}
	    	break;
    
	   		case Node.TEXT_NODE:
	   			System.out.println("--> TEXT_NODE <--");
	    		System.out.println(ind + n.getNodeValue());
	    	break;
	    
	   		case Node.DOCUMENT_NODE:
	   			System.out.println("--> DOCUMENT_NODE <--");
	    		iterate(((Document)n).getDocumentElement(),indent + 1);
	    	break;
   		} // end switch statement
  } // iterate()
	
} // end class
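
The output above suggests getDocumentElement() returned null for this document, which ends the recursion right after the DOCUMENT_NODE case. One possible workaround is to descend through getChildNodes() for every node type instead of special-casing the document. A sketch using the JDK's own XML parser on a tiny hard-coded page, since HttpUnit is not needed to show the idea; DomWalker, walk and outline are hypothetical names, and whether HttpUnit's document exposes its children this way has not been verified here:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

final class DomWalker {

    /** Adds one indented line per element or non-blank text node, depth first. */
    static void walk(Node n, int depth, List<String> out) {
        if (n == null) return;
        StringBuilder indent = new StringBuilder();
        for (int i = 0; i < depth; i++) indent.append(' ');

        switch (n.getNodeType()) {
            case Node.ELEMENT_NODE:
                out.add(indent + n.getNodeName());
                break;
            case Node.TEXT_NODE:
                String text = n.getNodeValue().trim();
                if (!text.isEmpty()) out.add(indent + text);
                break;
            default:
                break; // document, comment, etc.: nothing to print
        }

        // Descend for every node type, so the document node needs no special case.
        NodeList kids = n.getChildNodes();
        if (kids == null) return;
        for (int i = 0; i < kids.getLength(); i++) {
            walk(kids.item(i), depth + 1, out);
        }
    }

    /** Parses well-formed markup and returns its outline; sample input only. */
    static List<String> outline(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            List<String> out = new ArrayList<>();
            walk(doc, 0, out);
            return out;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse sample markup", e);
        }
    }

    public static void main(String[] args) {
        for (String line : outline("<html><body><p>hi</p></body></html>")) {
            System.out.println(line);
        }
    }
}
```

With HttpUnit, the same walk method could be handed the Document from resp.getDOM() in place of the parsed sample.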
