Solved

Problem with SAXParser's InputStream

Posted on 2002-05-30
Last Modified: 2008-03-03
Hi,

I'm having a problem getting the SAXParser to work the way I need it to. It appears that when parsing an InputStream rather than a file, the socket that's providing the InputStream has to close/disconnect before the SAXParser recognizes the end of the data.

The problem here is that I do not want to close the socket to initiate the parsing. Unfortunately, it just hangs until I do.

So, is there a way of having it parse the data w/o closing the socket?

I've tried adding a null character and newline character to the end of the stream w/o success. Here are the relevant parts of the code... I am using the JAXP 1.1 classes in a Windows 2000 environment.

Thanks - gbulla


--- Client code for writing the data to the class with the XML parser ---
.
.
Socket socket = new Socket( IP, port);
DataOutputStream oStream = new DataOutputStream( socket.getOutputStream());
oStream.writeChars( "This is a test\0\n");
oStream.flush();

socket.close(); //  This is what I DON'T want to do, but is the only thing that makes it work.

--- Server code with the XML parser ---

class XMLSAXParser extends DefaultHandler
{
.
SAXParserFactory factory;
SAXParser saxParser;
factory = SAXParserFactory.newInstance();
saxParser = factory.newSAXParser();
.
  void parseInputStream( InputStream inputStreamIn ) {
     saxParser.parse( inputStreamIn, this);  // this is as far as it goes unless the client closes the socket
  }
.
.
}

Question by:gbulla
18 Comments
 
LVL 92

Expert Comment

by:objects
ID: 7045664
I think the problem is that the parser has no way of knowing where EOF is. A solution would be to read the XML into a buffer first, and then pass this buffer to the parser. This would allow you to determine where EOF is.
 
LVL 7

Expert Comment

by:yoren
ID: 7045671
You could write code in startElement() and endElement() to keep track of the current depth. When you hit the end of the last element, throw an exception to force the end of the parsing.
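
A minimal sketch of the depth-tracking idea, assuming the parser delivers callbacks as data arrives. The class name DepthTrackingHandler, its `complete` flag, and the helper parseOneDocument are hypothetical names used only for illustration; DefaultHandler, SAXParser, and SAXException are the standard JAXP/SAX types. A parser that fills an internal buffer before invoking any callback would still block on a live socket, so this is not guaranteed to help in every setup.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: counts nesting depth and aborts once the root element closes.
class DepthTrackingHandler extends DefaultHandler {
    private int depth = 0;
    private boolean complete = false;
    private final StringBuilder text = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        depth++;
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        depth--;
        if (depth == 0) {
            // The root element just closed: record that and force the parser to stop.
            complete = true;
            throw new SAXException("root element closed; stopping parse");
        }
    }

    boolean isComplete() { return complete; }

    String getText() { return text.toString(); }

    // Returns true if parsing ended because the root element closed.
    static boolean parseOneDocument(InputStream in, DepthTrackingHandler handler) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        try {
            parser.parse(in, handler);
        } catch (SAXException e) {
            if (!handler.isComplete()) {
                throw e; // a genuine parse error, not our deliberate abort
            }
        }
        return handler.isComplete();
    }

    public static void main(String[] args) throws Exception {
        byte[] xml = "<doc>This is a test</doc>".getBytes(StandardCharsets.UTF_8);
        DepthTrackingHandler handler = new DepthTrackingHandler();
        boolean done = parseOneDocument(new ByteArrayInputStream(xml), handler);
        System.out.println(done + ": " + handler.getText());
    }
}
```

The flag distinguishes the deliberate abort from a real well-formedness error, so genuine errors are still rethrown.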
 

Author Comment

by:gbulla
ID: 7047199
Yes, appears to be a missing EOF issue.  Not a problem when you read from a file, but strange it doesn't have a recognized EOF for streams?

For efficiency, I'm trying to avoid writing the stream to a temp disk file and reading it from there.  I could be receiving up to 10-20 XML records a second, and performing an I/O operation each time would be too slow (not to mention the possible concurrency issues and residual files).

gbulla
 

Author Comment

by:gbulla
ID: 7047811
Yoren - 'twas a good idea, but unfortunately it doesn't appear to be doing *anything* (e.g., parsing) until it sees the EOF or a closed socket.  Arggg!

gbulla
 
LVL 7

Expert Comment

by:yoren
ID: 7047826
Have you tried a well-formed XML document? Try this:

oStream.writeChars( "<doc>This is a test</doc");
 
LVL 7

Expert Comment

by:yoren
ID: 7047850
Make that oStream.writeChars( "<doc>This is a test</doc>");
 
LVL 92

Expert Comment

by:objects
ID: 7047938
> Not a problem when you read from a file, but strange it
> doesn't have a recognized EOF for streams?

The only EOF for a stream is when the stream is closed. Having a general EOF marker for all streams would not make sense, as they are used to transmit arbitrary data.
For anything else, it is up to the application to implement its own policy.
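
One common application-level policy is length-prefixed framing: the sender writes the byte count before each message, so the receiver knows exactly how much to read without waiting for the socket to close. The sketch below is only an illustration; the class LengthPrefixedFraming and its method names are hypothetical, while DataOutputStream.writeInt and DataInputStream.readFully are standard java.io calls.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: frames each XML message as a 4-byte length followed by UTF-8 bytes.
class LengthPrefixedFraming {

    // Sender side: write the payload length, then the payload itself.
    static void writeMessage(OutputStream out, String xml) throws IOException {
        byte[] payload = xml.getBytes(StandardCharsets.UTF_8);
        DataOutputStream data = new DataOutputStream(out);
        data.writeInt(payload.length);
        data.write(payload);
        data.flush();
    }

    // Receiver side: read the length, then block only until that many bytes have arrived.
    static byte[] readMessage(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int length = data.readInt();
        if (length < 0) {
            throw new IOException("negative message length: " + length);
        }
        byte[] payload = new byte[length];
        data.readFully(payload);
        return payload;
    }

    public static void main(String[] args) throws IOException {
        // An in-memory stream stands in for the socket here.
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        writeMessage(wire, "<doc>This is a test</doc>");
        InputStream in = new ByteArrayInputStream(wire.toByteArray());
        System.out.println(new String(readMessage(in), StandardCharsets.UTF_8));
    }
}
```

The returned bytes can be wrapped in a ByteArrayInputStream and handed to the parser, which then sees a normal end of stream for each message while the socket stays open.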

 
LVL 92

Expert Comment

by:objects
ID: 7048002
Your problem could be related to this bug:

http://developer.java.sun.com/developer/bugParade/bugs/4484901.html

 
LVL 7

Expert Comment

by:yoren
ID: 7048042
Nice catch, objects! gbulla, try a different parser. Piccolo (http://piccolo.sourceforge.net) just does a read() without forcing the buffer to be filled, so it may solve your problem.

 

Author Comment

by:gbulla
ID: 7090724
Objects/Yoren,

Got diverted for a week.  The link to the bug sounds like what I'm experiencing.  I tried the Piccolo parser and didn't notice any difference, though I may not be using it correctly... still experimenting.  

My question about Piccolo is, how is it then determining what the EOF is?  If it does a read() w/o such a marker, when does it know when to start reading??  This sounds like a bigger problem, though doesn't seem to affect some other streaming apps, eg, browsers, which start to display the data before they're finished reading all the data.  

Also, I wonder how the fellow in the link ever resolved his problem?

GB
 
LVL 92

Expert Comment

by:objects
ID: 7091175
Can you wrap all your XML requests in a single XML document?  From my understanding of the bug, once you've read the initial 8K, things should flow OK.
 

Author Comment

by:gbulla
ID: 7108795
Objects,

No, I really can't wrap all potential XML requests in a single doc.  I can, however, make it a simple protocol such as

- client connects and sends XML data
- server receives and parses XML
- server sends ACK XML and the socket is disconnected

I'm finding that the XML parser is still hanging and waiting for an EOF before it releases, even with a stream larger than 8K.  Therefore, I never get to the next phase, where the server sends a response (unless I close the socket, which throws an error when the server tries to send a return ACK).

What a pain.  I'm wondering if the Java XML parser is suited for this type of client-server protocol?
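
For a one-request-per-connection protocol like the one listed above, one option worth considering is a half-close: Socket.shutdownOutput() ends the client's sending direction, which the server sees as end of stream, while the client can still read the reply. The sketch below is an assumption-laden illustration, not a tested fix for the JAXP 1.1 setup in this thread. HalfCloseExample, ElementCounter, NonClosingInputStream, and the `<ack>` reply format are hypothetical; the wrapper stream is a precaution in case the parser closes the stream it is given, which would also close the socket.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class HalfCloseExample {

    // Counts elements so the server has something to report in its ACK.
    static class ElementCounter extends DefaultHandler {
        int elements = 0;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes) {
            elements++;
        }
    }

    // Ignores close() so that a parser closing its input cannot close the socket.
    static class NonClosingInputStream extends FilterInputStream {
        NonClosingInputStream(InputStream in) { super(in); }

        @Override
        public void close() { /* intentionally empty: keep the socket open */ }
    }

    // Server: parse until the client half-closes, then reply on the same socket.
    static void serveOnce(ServerSocket server) throws Exception {
        try (Socket socket = server.accept()) {
            ElementCounter handler = new ElementCounter();
            InputStream in = new NonClosingInputStream(socket.getInputStream());
            SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
            OutputStream out = socket.getOutputStream();
            out.write(("<ack elements=\"" + handler.elements + "\"/>").getBytes(StandardCharsets.UTF_8));
            out.flush();
        }
    }

    // Client: send the XML, half-close, then read the reply until the server disconnects.
    static String sendAndAwaitAck(int port, String xml) throws IOException {
        try (Socket socket = new Socket(InetAddress.getLoopbackAddress(), port)) {
            OutputStream out = socket.getOutputStream();
            out.write(xml.getBytes(StandardCharsets.UTF_8));
            out.flush();
            socket.shutdownOutput(); // server now sees EOF; our input side stays open
            InputStream in = socket.getInputStream();
            ByteArrayOutputStream reply = new ByteArrayOutputStream();
            byte[] buffer = new byte[1024];
            int n;
            while ((n = in.read(buffer)) != -1) {
                reply.write(buffer, 0, n);
            }
            return new String(reply.toByteArray(), StandardCharsets.UTF_8);
        }
    }
}
```

To try it, run serveOnce on a background thread with a ServerSocket bound to port 0 and call sendAndAwaitAck with that socket's local port. The half-close only fits protocols where the client sends a single request per connection; a client that needs to send more afterwards would need a framing scheme instead.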
 
LVL 92

Expert Comment

by:objects
ID: 7109520
What about storing the incoming XML to a buffer, creating a string from the resulting data, and parsing that?
 

Author Comment

by:gbulla
ID: 7109645
Objects - How..? The xml parser takes two arguments: an inputStream or a string reference to a URL (which can also be a file on your drive). I don't see how to parse a String.  

Note that in the http://developer.java.sun.com/developer/bugParade/bugs/4484901.html article, someone at the end suggests using StringReader to do this, but I can't see how this can be used in the xml parser. StringReader does not 'stream' as far as my reading shows.  I know that I could get the data from the stream and write it to a file on the disk, and then immediately read the file into the parser.  But this would require an I/O op for every message.. Bad.
 
LVL 92

Accepted Solution

by:
objects earned 200 total points
ID: 7109665
new ByteArrayInputStream(string.getBytes());
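
A minimal sketch of how that one-liner might be used with the parser, assuming the complete XML message is already held in a String. The class StringParsing and the method parseText are hypothetical names. The sketch passes an explicit UTF-8 charset to getBytes(), since the no-argument form uses the platform default encoding, which may not match what the parser assumes for a document without an encoding declaration.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

class StringParsing {

    // Parses an in-memory XML string and returns the concatenated character data.
    static String parseText(String xml) throws Exception {
        final StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        };
        SAXParser saxParser = SAXParserFactory.newInstance().newSAXParser();
        // The byte array has a definite end, so the parser sees EOF immediately after the data.
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        saxParser.parse(in, handler);
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parseText("<doc>This is a test</doc>"));
    }
}
```

No disk I/O is involved: the data stays in memory from the socket read through to the parse.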
 

Author Comment

by:gbulla
ID: 7111289
Objects - The points are yours.  Too bad I didn't think of this earlier.  I'm using readLine() to get the data and it works like a charm. The only moderately negative implication is that this requires the user to 1) not use newlines in the body of the XML and 2) add a newline at the end.  But I might be able to think of a workaround on this..
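
The readLine() approach described above might look roughly like the sketch below, which treats each line as one complete XML message. LineDelimitedXmlReader and parseMessages are hypothetical names, and recording only the root element name is just a stand-in for real handling. Instead of converting the line back to bytes, this version hands the String to the parser through InputSource and StringReader, which SAXParser.parse also accepts.

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class LineDelimitedXmlReader {

    // Reads newline-terminated XML messages and returns the root element name of each.
    static List<String> parseMessages(BufferedReader reader) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        List<String> rootNames = new ArrayList<>();
        String line;
        // readLine() returns as soon as a full line arrives; no socket close is needed.
        while ((line = reader.readLine()) != null) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines between messages
            }
            final String[] root = new String[1];
            DefaultHandler handler = new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName, String qName, Attributes attributes) {
                    if (root[0] == null) {
                        root[0] = qName; // the first element seen is the root
                    }
                }
            };
            parser.parse(new InputSource(new StringReader(line)), handler);
            rootNames.add(root[0]);
        }
        return rootNames;
    }

    public static void main(String[] args) throws Exception {
        BufferedReader reader = new BufferedReader(new StringReader("<a>1</a>\n<b>2</b>\n"));
        System.out.println(parseMessages(reader));
    }
}
```

With a socket, the BufferedReader would wrap an InputStreamReader over socket.getInputStream(). The loop ends only when the peer closes the connection, so many messages can share one socket.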

Thanks
 
LVL 92

Expert Comment

by:objects
ID: 7112178
So effectively it sounds like you use EOL to flag the end of an XML message. To get around this, I guess you'll need to figure out another method to determine the end of an XML message.

Thanks for the points :)

http://www.objects.com.au/staff/mick
Brainbench MVP for Java 1
http://www.brainbench.com