Error parsing HTML source in Java

Hi,
When I run my Java code, I get the following error message:

org.mozilla.javascript.EvaluatorException: illegally formed XML syntax (httpunit#6)
        at org.mozilla.javascript.DefaultErrorReporter.runtimeError(DefaultErrorReporter.java:98)
        at org.mozilla.javascript.DefaultErrorReporter.error(DefaultErrorReporter.java:85)
        at org.mozilla.javascript.Parser.addError(Parser.java:126)
        at org.mozilla.javascript.TokenStream.getNextXMLToken(TokenStream.java:1059)
        at org.mozilla.javascript.TokenStream.getFirstXMLToken(TokenStream.java:910)
        at org.mozilla.javascript.Parser.xmlInitializer(Parser.java:1524)
        at org.mozilla.javascript.Parser.unaryExpr(Parser.java:1501)
        at org.mozilla.javascript.Parser.mulExpr(Parser.java:1436)
        at org.mozilla.javascript.Parser.addExpr(Parser.java:1417)
        at org.mozilla.javascript.Parser.shiftExpr(Parser.java:1397)
        at org.mozilla.javascript.Parser.relExpr(Parser.java:1371)
        at org.mozilla.javascript.Parser.eqExpr(Parser.java:1327)
        at org.mozilla.javascript.Parser.bitAndExpr(Parser.java:1316)
        at org.mozilla.javascript.Parser.bitXorExpr(Parser.java:1305)
        at org.mozilla.javascript.Parser.bitOrExpr(Parser.java:1294)
        at org.mozilla.javascript.Parser.andExpr(Parser.java:1282)
        at org.mozilla.javascript.Parser.orExpr(Parser.java:1270)
        at org.mozilla.javascript.Parser.condExpr(Parser.java:1253)
        at org.mozilla.javascript.Parser.assignExpr(Parser.java:1235)
        at org.mozilla.javascript.Parser.expr(Parser.java:1224)
        at org.mozilla.javascript.Parser.statementHelper(Parser.java:1155)
        at org.mozilla.javascript.Parser.statement(Parser.java:623)
        at org.mozilla.javascript.Parser.parse(Parser.java:355)
        at org.mozilla.javascript.Parser.parse(Parser.java:293)
        at org.mozilla.javascript.Context.compileImpl(Context.java:2238)
        at org.mozilla.javascript.Context.compileString(Context.java:1284)
        at org.mozilla.javascript.Context.compileString(Context.java:1273)
        at org.mozilla.javascript.Context.evaluateString(Context.java:1129)
        at com.meterware.httpunit.javascript.ScriptingEngineImpl.runScript(ScriptingEngineImpl.java:92)
        at com.meterware.httpunit.scripting.ScriptableDelegate.runScript(ScriptableDelegate.java:88)
        at com.meterware.httpunit.ParsedHTML.interpretScriptElement(ParsedHTML.java:364)
        at com.meterware.httpunit.ParsedHTML$ScriptFactory.recordElement(ParsedHTML.java:533)
        at com.meterware.httpunit.ParsedHTML$2.processElement(ParsedHTML.java:744)
        at com.meterware.httpunit.NodeUtils$PreOrderTraversal.perform(NodeUtils.java:241)
        at com.meterware.httpunit.ParsedHTML.loadElements(ParsedHTML.java:760)
        at com.meterware.httpunit.ParsedHTML.getFrames(ParsedHTML.java:1101)
        at com.meterware.httpunit.WebResponse.getFrames(WebResponse.java:1285)
        at com.meterware.httpunit.WebResponse.getFrameRequests(WebResponse.java:1024)
        at com.meterware.httpunit.FrameHolder.updateFrames(FrameHolder.java:179)
        at com.meterware.httpunit.WebWindow.updateFrameContents(WebWindow.java:315)
        at com.meterware.httpunit.WebClient.updateFrameContents(WebClient.java:526)
        at com.meterware.httpunit.WebWindow.updateWindow(WebWindow.java:201)
        at com.meterware.httpunit.WebWindow.getSubframeResponse(WebWindow.java:183)
        at com.meterware.httpunit.WebWindow.getResponse(WebWindow.java:158)
        at com.meterware.httpunit.WebWindow.updateWindow(WebWindow.java:199)
        at com.meterware.httpunit.WebWindow.getSubframeResponse(WebWindow.java:183)
        at com.meterware.httpunit.WebWindow.getResponse(WebWindow.java:158)
        at com.meterware.httpunit.WebWindow.getResponse(WebWindow.java:125)
        at com.meterware.httpunit.WebClient.getResponse(WebClient.java:96)
        at com.myworks.tools.GPrequal.myLogData2g.main(myLogData2g.java:132)
com.meterware.httpunit.ScriptException: Script '<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<HTML><HEAD>
<TITLE>302 Found</TITLE>
</HEAD><BODY>
<H1>Found</H1>
The document has moved <A HREF="http://www-internal.mywork.com/hub/share/mytools/apps/javascript/standardista_table_sorting/standardista-common.js">here</A>.<P>
</BODY></HTML>' failed: org.mozilla.javascript.EvaluatorException: illegally formed XML syntax (httpunit#6)
        at com.meterware.httpunit.javascript.ScriptingEngineImpl.handleScriptException(ScriptingEngineImpl.java:64)
        at com.meterware.httpunit.javascript.ScriptingEngineImpl.runScript(ScriptingEngineImpl.java:95)
        at com.meterware.httpunit.scripting.ScriptableDelegate.runScript(ScriptableDelegate.java:88)
        at com.meterware.httpunit.ParsedHTML.interpretScriptElement(ParsedHTML.java:364)
        at com.meterware.httpunit.ParsedHTML$ScriptFactory.recordElement(ParsedHTML.java:533)
        at com.meterware.httpunit.ParsedHTML$2.processElement(ParsedHTML.java:744)
        at com.meterware.httpunit.NodeUtils$PreOrderTraversal.perform(NodeUtils.java:241)
        at com.meterware.httpunit.ParsedHTML.loadElements(ParsedHTML.java:760)
        at com.meterware.httpunit.ParsedHTML.getFrames(ParsedHTML.java:1101)
        at com.meterware.httpunit.WebResponse.getFrames(WebResponse.java:1285)
        at com.meterware.httpunit.WebResponse.getFrameRequests(WebResponse.java:1024)
        at com.meterware.httpunit.FrameHolder.updateFrames(FrameHolder.java:179)
        at com.meterware.httpunit.WebWindow.updateFrameContents(WebWindow.java:315)
        at com.meterware.httpunit.WebClient.updateFrameContents(WebClient.java:526)
        at com.meterware.httpunit.WebWindow.updateWindow(WebWindow.java:201)
        at com.meterware.httpunit.WebWindow.getSubframeResponse(WebWindow.java:183)
        at com.meterware.httpunit.WebWindow.getResponse(WebWindow.java:158)
        at com.meterware.httpunit.WebWindow.updateWindow(WebWindow.java:199)
        at com.meterware.httpunit.WebWindow.getSubframeResponse(WebWindow.java:183)
        at com.meterware.httpunit.WebWindow.getResponse(WebWindow.java:158)
        at com.meterware.httpunit.WebWindow.getResponse(WebWindow.java:125)
        at com.meterware.httpunit.WebClient.getResponse(WebClient.java:96)
        at com.mywork.tools.GPrequal.myLogData2g.main(myLogData2g.java:132)
Mar 23, 2011 10:36:27 AM org.apache.axis.utils.JavaUtils isAttachmentSupported
WARNING: Unable to find required classes (javax.activation.DataHandler and javax.mail.internet.MimeMultipart). Attachment support is disabled.

In my code, I try to parse the HTML source, find the third table, read its contents, and write them to a variable in the following form:

Column1  Column2

ABCDE    plus
FFG      minus

ABCDE, FFG, plus and minus are all in the cells of the table on this web page.

Here is part of my code that does this:

try {
    // Enable script execution while HttpUnit parses the page
    HttpUnitOptions.setScriptingEnabled(true);
    WebConversation wc = new WebConversation();
    WebResponse wr = wc.getResponse(siteAddress);

    // The third table on the page (index 2)
    WebTable table = wr.getTables()[2];
    String[][] cells = table.asText();

    // Append every cell, ending each table row with a line separator
    for (String[] row : cells) {
        for (String cell : row) {
            mycheckActivity += cell;
        }
        mycheckActivity += System.getProperty("line.separator");
    }
} catch (Exception e) {
    e.printStackTrace();
}
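
A side note on the loop above: it appends the cells of a row with nothing between them, so a row holding "ABCDE" and "plus" would come out as "ABCDEplus". The following is a minimal sketch of one way to keep the columns apart, assuming the cells arrive as a String[][] like the one returned by table.asText(). The class name TableText and the method join are hypothetical and not part of HttpUnit.

```java
// Hypothetical helper (not part of HttpUnit): turns a String[][] of table
// cells into text, one table row per line, with a separator between cells.
final class TableText {
    private TableText() {}

    static String join(String[][] cells, String cellSeparator, String lineSeparator) {
        StringBuilder out = new StringBuilder();
        for (String[] row : cells) {
            // String.join places the separator between cells, not after the last one
            out.append(String.join(cellSeparator, row)).append(lineSeparator);
        }
        return out.toString();
    }
}
```

In the question's code this could be called as mycheckActivity = TableText.join(cells, "\t", System.lineSeparator()); which also avoids repeated string concatenation inside the loop.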

Asked by: Tolgar

Bardobrave commented:
The HTML source you are parsing is probably not valid XHTML, so the parser breaks when it tries to process it.

Check the raw markup and modify it to comply with the XHTML specification.
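
It may also be worth looking at the text quoted in the ScriptException in the question: the "script" that failed is an HTML "302 Found" page pointing to standardista-common.js, not JavaScript. That suggests the script URL returned a redirect page, and the script engine was handed that HTML instead of the script. The sketch below is a rough heuristic for spotting such a body before treating it as a script. The class ScriptBodyCheck and the method looksLikeHtml are hypothetical names, not part of HttpUnit or Rhino.

```java
// Hypothetical helper: guesses whether text fetched for a <script> element
// is really an HTML page (for example a "302 Found" redirect page).
final class ScriptBodyCheck {
    private ScriptBodyCheck() {}

    static boolean looksLikeHtml(String body) {
        if (body == null) {
            return false;
        }
        // Compare case-insensitively, ignoring leading whitespace
        String head = body.trim().toLowerCase(java.util.Locale.ROOT);
        return head.startsWith("<!doctype") || head.startsWith("<html");
    }
}
```

Since the code in the question only reads the text of a table, calling HttpUnitOptions.setScriptingEnabled(false) instead of true might sidestep the error, assuming the table is not built by a script. If your HttpUnit version provides HttpUnitOptions.setExceptionsThrownOnScriptError(false), that is another option to consider.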