Purdue_Pete
Bypass Validation
Hi.
I am trying to crawl a page using a web crawler. That page sits behind a Struts validator, i.e. a button needs to be clicked in order to get to the page. Is there any way this can be bypassed so that the web crawler can reach the page without clicking this button?
Code:
<form name="loginForm" method="post" action="/check.do">
<input type="hidden" name="forward" value="target_page">
<input type="submit" name="org.apache.struts.taglib.html.CANCEL" value="Continue" onclick="bCancel=true;">
</form>
Any help is appreciated. Thanks.
This cannot be done. That is exactly why they put that input validation in: to stop you crawling their site. It is the same as the warped graphics on other sites such as Google, which require an input before you can get past that point, specifically to STOP mass crawling of their websites. This is the biggest problem on the web today: automatic site crawlers steal 100x to 200x more bandwidth than legitimate users of the website do.
Open the page in a browser and view its source; you can then copy and paste that into the W3C validator.
ASKER
Yes, simple JS seemed to get around Struts. I thought Struts would be more scrupulous with these kinds of issues.
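For a crawler that cannot run JS, a rough equivalent is to send the same POST the form in the question would send when "Continue" is clicked. The sketch below only builds that request. It assumes the submit button's name is org.apache.struts.taglib.html.CANCEL, that the server accepts the POST without any extra session or token fields, and it uses a made-up host (example.com) and a hypothetical helper name (build_continue_request). The onclick handler only sets a client-side variable, so it is not part of what gets sent.

```python
from urllib.parse import urlencode, urljoin
from urllib.request import Request

# Action and field names/values copied from the form in the question.
FORM_ACTION = "/check.do"
FORM_FIELDS = {
    "forward": "target_page",
    # A clicked submit button is sent as an ordinary name=value pair.
    "org.apache.struts.taglib.html.CANCEL": "Continue",
}


def build_continue_request(base_url):
    """Build the POST a browser would send when "Continue" is clicked.

    Hypothetical helper for illustration; it does not contact the server.
    """
    body = urlencode(FORM_FIELDS).encode("ascii")
    return Request(
        urljoin(base_url, FORM_ACTION),
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Assumed host; replace with the site actually being crawled.
req = build_continue_request("http://example.com/")
```

Passing req to urllib.request.urlopen would actually submit it. If the site tracks a session, an opener with a cookie handler would probably be needed as well.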