Solved

Error in parsing html source in Java

Posted on 2011-03-23
Last Modified: 2012-06-27
Hi,
When I run my Java code, I get the following error message:

org.mozilla.javascript.EvaluatorException: illegally formed XML syntax (httpunit#6)
        at org.mozilla.javascript.DefaultErrorReporter.runtimeError(DefaultErrorReporter.java:98)
        at org.mozilla.javascript.DefaultErrorReporter.error(DefaultErrorReporter.java:85)
        at org.mozilla.javascript.Parser.addError(Parser.java:126)
        at org.mozilla.javascript.TokenStream.getNextXMLToken(TokenStream.java:1059)
        at org.mozilla.javascript.TokenStream.getFirstXMLToken(TokenStream.java:910)
        at org.mozilla.javascript.Parser.xmlInitializer(Parser.java:1524)
        at org.mozilla.javascript.Parser.unaryExpr(Parser.java:1501)
        at org.mozilla.javascript.Parser.mulExpr(Parser.java:1436)
        at org.mozilla.javascript.Parser.addExpr(Parser.java:1417)
        at org.mozilla.javascript.Parser.shiftExpr(Parser.java:1397)
        at org.mozilla.javascript.Parser.relExpr(Parser.java:1371)
        at org.mozilla.javascript.Parser.eqExpr(Parser.java:1327)
        at org.mozilla.javascript.Parser.bitAndExpr(Parser.java:1316)
        at org.mozilla.javascript.Parser.bitXorExpr(Parser.java:1305)
        at org.mozilla.javascript.Parser.bitOrExpr(Parser.java:1294)
        at org.mozilla.javascript.Parser.andExpr(Parser.java:1282)
        at org.mozilla.javascript.Parser.orExpr(Parser.java:1270)
        at org.mozilla.javascript.Parser.condExpr(Parser.java:1253)
        at org.mozilla.javascript.Parser.assignExpr(Parser.java:1235)
        at org.mozilla.javascript.Parser.expr(Parser.java:1224)
        at org.mozilla.javascript.Parser.statementHelper(Parser.java:1155)
        at org.mozilla.javascript.Parser.statement(Parser.java:623)
        at org.mozilla.javascript.Parser.parse(Parser.java:355)
        at org.mozilla.javascript.Parser.parse(Parser.java:293)
        at org.mozilla.javascript.Context.compileImpl(Context.java:2238)
        at org.mozilla.javascript.Context.compileString(Context.java:1284)
        at org.mozilla.javascript.Context.compileString(Context.java:1273)
        at org.mozilla.javascript.Context.evaluateString(Context.java:1129)
        at com.meterware.httpunit.javascript.ScriptingEngineImpl.runScript(ScriptingEngineImpl.java:92)
        at com.meterware.httpunit.scripting.ScriptableDelegate.runScript(ScriptableDelegate.java:88)
        at com.meterware.httpunit.ParsedHTML.interpretScriptElement(ParsedHTML.java:364)
        at com.meterware.httpunit.ParsedHTML$ScriptFactory.recordElement(ParsedHTML.java:533)
        at com.meterware.httpunit.ParsedHTML$2.processElement(ParsedHTML.java:744)
        at com.meterware.httpunit.NodeUtils$PreOrderTraversal.perform(NodeUtils.java:241)
        at com.meterware.httpunit.ParsedHTML.loadElements(ParsedHTML.java:760)
        at com.meterware.httpunit.ParsedHTML.getFrames(ParsedHTML.java:1101)
        at com.meterware.httpunit.WebResponse.getFrames(WebResponse.java:1285)
        at com.meterware.httpunit.WebResponse.getFrameRequests(WebResponse.java:1024)
        at com.meterware.httpunit.FrameHolder.updateFrames(FrameHolder.java:179)
        at com.meterware.httpunit.WebWindow.updateFrameContents(WebWindow.java:315)
        at com.meterware.httpunit.WebClient.updateFrameContents(WebClient.java:526)
        at com.meterware.httpunit.WebWindow.updateWindow(WebWindow.java:201)
        at com.meterware.httpunit.WebWindow.getSubframeResponse(WebWindow.java:183)
        at com.meterware.httpunit.WebWindow.getResponse(WebWindow.java:158)
        at com.meterware.httpunit.WebWindow.updateWindow(WebWindow.java:199)
        at com.meterware.httpunit.WebWindow.getSubframeResponse(WebWindow.java:183)
        at com.meterware.httpunit.WebWindow.getResponse(WebWindow.java:158)
        at com.meterware.httpunit.WebWindow.getResponse(WebWindow.java:125)
        at com.meterware.httpunit.WebClient.getResponse(WebClient.java:96)
        at com.myworks.tools.GPrequal.myLogData2g.main(myLogData2g.java:132)
com.meterware.httpunit.ScriptException: Script '<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<HTML><HEAD>
<TITLE>302 Found</TITLE>
</HEAD><BODY>
<H1>Found</H1>
The document has moved <A HREF="http://www-internal.mywork.com/hub/share/mytools/apps/javascript/standardista_table_sorting/standardista-common.js">here</A>.<P>
</BODY></HTML>' failed: org.mozilla.javascript.EvaluatorException: illegally formed XML syntax (httpunit#6)
        at com.meterware.httpunit.javascript.ScriptingEngineImpl.handleScriptException(ScriptingEngineImpl.java:64)
        at com.meterware.httpunit.javascript.ScriptingEngineImpl.runScript(ScriptingEngineImpl.java:95)
        at com.meterware.httpunit.scripting.ScriptableDelegate.runScript(ScriptableDelegate.java:88)
        at com.meterware.httpunit.ParsedHTML.interpretScriptElement(ParsedHTML.java:364)
        at com.meterware.httpunit.ParsedHTML$ScriptFactory.recordElement(ParsedHTML.java:533)
        at com.meterware.httpunit.ParsedHTML$2.processElement(ParsedHTML.java:744)
        at com.meterware.httpunit.NodeUtils$PreOrderTraversal.perform(NodeUtils.java:241)
        at com.meterware.httpunit.ParsedHTML.loadElements(ParsedHTML.java:760)
        at com.meterware.httpunit.ParsedHTML.getFrames(ParsedHTML.java:1101)
        at com.meterware.httpunit.WebResponse.getFrames(WebResponse.java:1285)
        at com.meterware.httpunit.WebResponse.getFrameRequests(WebResponse.java:1024)
        at com.meterware.httpunit.FrameHolder.updateFrames(FrameHolder.java:179)
        at com.meterware.httpunit.WebWindow.updateFrameContents(WebWindow.java:315)
        at com.meterware.httpunit.WebClient.updateFrameContents(WebClient.java:526)
        at com.meterware.httpunit.WebWindow.updateWindow(WebWindow.java:201)
        at com.meterware.httpunit.WebWindow.getSubframeResponse(WebWindow.java:183)
        at com.meterware.httpunit.WebWindow.getResponse(WebWindow.java:158)
        at com.meterware.httpunit.WebWindow.updateWindow(WebWindow.java:199)
        at com.meterware.httpunit.WebWindow.getSubframeResponse(WebWindow.java:183)
        at com.meterware.httpunit.WebWindow.getResponse(WebWindow.java:158)
        at com.meterware.httpunit.WebWindow.getResponse(WebWindow.java:125)
        at com.meterware.httpunit.WebClient.getResponse(WebClient.java:96)
        at com.mywork.tools.GPrequal.myLogData2g.main(myLogData2g.java:132)
Mar 23, 2011 10:36:27 AM org.apache.axis.utils.JavaUtils isAttachmentSupported
WARNING: Unable to find required classes (javax.activation.DataHandler and javax.mail.internet.MimeMultipart). Attachment support is disabled.



In my code, I try to parse the HTML source, find the third table, read its contents, and write them to a variable in the following form:

Column1   Column2

ABCDE     plus
FFG       minus

ABCDE, FFG, plus and minus are all values in the cells of the table on this web page.

Here is part of my code that does this:

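// Requires com.meterware.httpunit.{HttpUnitOptions, WebConversation, WebResponse, WebTable}
try {
    // Scripting is enabled, so HttpUnit also runs the scripts referenced by the page
    HttpUnitOptions.setScriptingEnabled(true);
    WebConversation wc = new WebConversation();
    WebResponse wr = wc.getResponse(siteAddress);

    // The third table on the page (index 2) holds the data
    WebTable table = wr.getTables()[2];
    String[][] cells = table.asText();

    // Append every cell, ending each table row with a line break
    for (String[] row : cells) {
        for (String cell : row) {
            mycheckActivity += cell;
        }
        mycheckActivity += System.getProperty("line.separator");
    }
} catch (Exception e) {
    e.printStackTrace();
}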
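The nested loop above appends the cells with nothing between them, so a row would come out as "ABCDEplus" rather than the two-column layout shown earlier. As a minimal sketch of one way to get that layout, assuming a tab is an acceptable column separator: the class name TableText, the method render, and the sample data are hypothetical and stand in for the array returned by table.asText(). Unlike the original loop, this version does not leave a trailing line break after the last row.

```java
import java.util.ArrayList;
import java.util.List;

class TableText {
    // Joins the cells of each row with a tab and separates rows
    // with the platform line separator.
    static String render(String[][] cells) {
        List<String> lines = new ArrayList<>();
        for (String[] row : cells) {
            lines.add(String.join("\t", row));
        }
        return String.join(System.lineSeparator(), lines);
    }

    public static void main(String[] args) {
        // Hypothetical sample data standing in for table.asText()
        String[][] cells = {
            {"Column1", "Column2"},
            {"ABCDE", "plus"},
            {"FFG", "minus"}
        };
        System.out.println(render(cells));
    }
}
```

In the code from the question, the result of render(cells) could be assigned to mycheckActivity in place of the two loops.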

Question by: Tolgar
1 Comment
 
Accepted Solution

by: Bardobrave (earned 2000 total points)
ID: 35199271
The HTML source you are parsing is probably not valid XHTML, so the parser breaks when it tries to process it.

Check the raw markup and modify it to comply with the XHTML specification.
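One rough way to run that check from Java is sketched below. It uses only the JDK's built-in XML parser and tests for XML well-formedness, which is a necessary condition for valid XHTML but not a full validation against the XHTML DTD. The class name WellFormedCheck and the method isWellFormed are hypothetical, and the sketch assumes the markup is passed in as a string without a DOCTYPE, since a DOCTYPE may make the parser try to load an external DTD.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class WellFormedCheck {
    // Returns true if the markup parses as well-formed XML, false otherwise.
    static boolean isWellFormed(String markup) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            // The parser rejected the markup, e.g. because of an unclosed tag
            return false;
        } catch (Exception e) {
            // Parser configuration or I/O problems are not a verdict on the markup
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<p>closed paragraph</p>"));
        System.out.println(isWellFormed("<BODY><H1>Found</H1>moved<P></BODY>"));
    }
}
```

The second sample mirrors the unclosed <P> in the page body quoted in the error message, which is the kind of markup a check like this would flag.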