Solved

BufferedReader with a restricted buffer size

Posted on 2006-05-12
Last Modified: 2008-01-09
I'm occasionally getting exceptionally long text files with no line terminators. This is causing problems in my application in the following code, which is designed to strip UUEncoded blocks from plain text messages:

--------8<--------
            File temp;                  // Temporary file for collecting stripped text
            boolean skip_uue = false;
            boolean doneFirst = false;

            try {


                  // Create temp file.
                  temp = File.createTempFile("PlainTextHandler",".txt");

                  // Delete temp file when program exits.
                  temp.deleteOnExit();

                  // Write to temp file
                  BufferedWriter writer = new BufferedWriter(new FileWriter(temp));

//System.out.println(new TimeStamp().toString()+getClass().getName()+": Opening BufferedReader");

                  // Use a BufferedReader for the input stream
                  BufferedReader reader = new BufferedReader(
                        new InputStreamReader(is)
                        );

                  String line = null;
                  int line_number = 0;
                  int uue_line_number = 0;
                  while ((line = reader.readLine()) != null) {
                        ++line_number;
                        if (skip_uue) {
                              ++uue_line_number;
                              if (line.startsWith("end")) {

                                    // Show how many UUEncoded lines we've skipped
System.out.println(new TimeStamp().toString()+getClass().getName()+": Skipped "+uue_line_number+" lines of UUEncoded text");
                                    skip_uue = false;

                              }
                              continue;
                        }
                        else if (line.startsWith("begin")) {

                              // Look for a UUEncoded block
                              if (line.matches("^begin\\s\\d{3}\\s.+$")) {
                                    skip_uue = true;
                                    uue_line_number = 1;
                                    continue;
                              }

                        }

                        // Subsequent lines need white space
                        if (doneFirst)
                              writer.newLine();      // Give Lucene some white space to separate the tokens
                        else
                              doneFirst = true;      // We have at least one line

                        writer.write(line);            // Write the non-UUE data to the temporary file
                  }

                  reader.close();
                  writer.close();

                  // Show how many UUEncoded lines we've skipped
                  if (skip_uue)
System.out.println(new TimeStamp().toString()+getClass().getName()+": Skipped "+uue_line_number+" lines of UUEncoded text");

            }
            catch (IOException e) {
//System.out.println(new TimeStamp().toString()+getClass().getName()+": IOException "+e.toString());
                  throw new StandardDocumentHandlerException("Cannot read the text document",e);
            }
            catch (Exception e) {
//System.out.println(new TimeStamp().toString()+getClass().getName()+": Exception "+e.toString());
                  throw new StandardDocumentHandlerException("Exception caught in PlainTextHandler",e);
            }
            // ... the plain text in the temp file is then passed to Lucene, before being deleted
--------8<--------

The trouble with the code above is that readLine() may load String line with an unacceptably large string. That makes this thread a bad citizen in my multithreaded application: it uses up too much of the heap and causes another thread to fail with an OutOfMemoryError when it temporarily needs heap space.

My question is this:

Can I use the BufferedReader constructor that specifies a buffer size to limit the maximum length of the string read from the reader, i.e. http://java.sun.com/j2se/1.5.0/docs/api/java/io/BufferedReader.html#BufferedReader%28java.io.Reader,%20int%29 ? If so, is the stream still readable after reading a partial line? [I can live with the long line being split in a way that breaks tokens, because it is a special case.]
Question by: rstaveley

Accepted Solution

by: CEHJ
ID: 16666412
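>>Can I use the BufferedReader constructor that specifies a buffer size to limit the maximum length of string read from the reader

No. That constructor only sets the size of the reader's internal buffer. readLine() still accumulates characters until it finds a line terminator or reaches the end of the stream.

>>If so, is the stream still readable after reading a partial line?

Yes. Even if you set the buffer size to 1, readLine() will still read and return the whole line.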
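
To illustrate the point, here is a minimal sketch. The class name BufferSizeDemo and the helper readFirstLine are made up for this example and are not part of the original code. It reads a 10,000-character line through a BufferedReader whose buffer holds a single character:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

// Hypothetical demo class: shows that the buffer size does not cap line length.
class BufferSizeDemo {

    // Reads the first line of text through a BufferedReader with the given buffer size.
    static String readFirstLine(String text, int bufferSize) throws IOException {
        try (BufferedReader reader = new BufferedReader(new StringReader(text), bufferSize)) {
            return reader.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        // Build one long line with no terminator.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 10000; i++) {
            sb.append('x');
        }
        // A one-character buffer still yields the complete line.
        String line = readFirstLine(sb.toString(), 1);
        System.out.println(line.length()); // prints 10000
    }
}
```

The small buffer only makes the reader refill more often; the returned String is as long as the line itself.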

Expert Comment

by:CEHJ
ID: 16666447
Just do

if (line.length() > MAX_LINE_LENGTH) {
    line = line.substring(0, MAX_LINE_LENGTH);
}

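Note that this truncates the line only after readLine() has already built the whole string in memory. To cap memory use while reading, you'd have to do your own line reading or override BufferedReader.readLine()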

Author Comment

by:rstaveley
ID: 16666798
I guess I need to implement my own line reader, then. Thanks for the quick response, CEHJ.
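
A minimal sketch of what such a line reader might look like, under stated assumptions. The class name BoundedLineReader and the maxChars parameter are invented for illustration. The sketch returns at most maxChars characters per call, so an over-long line comes back as several fragments, and it treats \n, \r and \r\n as terminators like BufferedReader does:

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical helper: reads lines, but never returns more than maxChars characters at once.
class BoundedLineReader {
    private final Reader in;           // wrap in a BufferedReader for efficiency
    private final int maxChars;        // upper bound on the length of a returned string
    private boolean skipLf = false;    // true right after '\r', so a following '\n' is swallowed
    private boolean pendingTerminator = false; // last return was a full fragment

    BoundedLineReader(Reader in, int maxChars) {
        if (maxChars < 1) {
            throw new IllegalArgumentException("maxChars must be positive");
        }
        this.in = in;
        this.maxChars = maxChars;
    }

    // Returns the next line or line fragment, or null at end of stream.
    String readLine() throws IOException {
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = in.read()) != -1) {
            if (skipLf) {
                skipLf = false;
                if (c == '\n') {
                    continue;          // second half of a "\r\n" pair
                }
            }
            if (c == '\n' || c == '\r') {
                skipLf = (c == '\r');
                if (pendingTerminator) {
                    // This terminator ends a fragment that was already returned.
                    pendingTerminator = false;
                    continue;
                }
                return sb.toString();
            }
            pendingTerminator = false;
            sb.append((char) c);
            if (sb.length() == maxChars) {
                pendingTerminator = true;
                return sb.toString();  // limit reached: hand back a fragment
            }
        }
        return sb.length() > 0 ? sb.toString() : null;
    }

    public static void main(String[] args) throws IOException {
        BoundedLineReader reader =
                new BoundedLineReader(new StringReader("abcdefghij\nend"), 4);
        String line;
        while ((line = reader.readLine()) != null) {
            System.out.println(line);  // prints abcd, efgh, ij, end on separate lines
        }
    }
}
```

Two caveats with this approach: a fragment boundary can split a token (or a surrogate pair), and a fragment from the middle of a long line could happen to start with "begin" or "end", so the UUE detection may want to know whether a returned string starts a real line.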

Expert Comment

by:CEHJ
ID: 16666802
:-)