Fastest IO: For mbormann

From: mbunkows
                                                          Title: "Fastest IO"
                                                                                                                           

                      Status: Waiting for answer.

                      Points: 400
                                                          Date: Friday, February 11 2000 - 07:51PM PST
                                                                                                                           

                      I've been messing around with file IO and was wondering what the fastest way is.  I have a small test application that simply
                      communicates with a servlet to read a file (binary or ascii) from the server and place it on the client's hard drive.
                      The files I used for the below test were:
                      test.txt -- ascii -- size: 19725 bytes
                      test.exe -- binary -- size: 19725 bytes

                      Here is the code:
                      //ServerReaderServlet.java
                      import javax.servlet.*;
                      import javax.servlet.http.*;
                      import java.io.*;

                      public class ServerReaderServlet extends GenericServlet implements SingleThreadModel  {

                          public void service(ServletRequest request, ServletResponse response) throws ServletException, IOException  {
                              try  {
                                  BufferedReader reader = request.getReader();
                                  String type = reader.readLine();
                                  String filename = reader.readLine();

                                  if ("ascii".equals(type))  {
                                      BufferedReader in = new BufferedReader(new FileReader(filename));
                                      PrintWriter out = new PrintWriter(response.getWriter());
                                      String line;
                                      while ((line = in.readLine()) != null)  {
                                          out.println(line);
                                      }
                                      in.close();
                                      out.close();
                                  }
                                  else  {
                                      BufferedInputStream in = new BufferedInputStream(new FileInputStream(filename));
                                      PrintStream out = new PrintStream(response.getOutputStream(), true);
                                      byte[] buf = new byte[1024];
                                      while (in.read(buf, 0, 1024) != -1)  {
                                          out.write(buf, 0, 1024);
                                      }
                                      in.close();
                                      out.close();
                                  }
                              }
                              catch (IOException e)  {
                                  e.printStackTrace();
                              }
                          }
                      }
                      //end Servlet code

                      //ServerReader.java
                      import java.io.*;
                      import java.net.*;

                      public class ServerReader  {

                          public static final String ASCII = "ascii";
                          public static final String BINARY = "binary";

                          public static int readFile(String source, String destination, String type)  {

                              URLConnection connection;
                              try  {
                                  URL url = new URL(host);
                                  connection = url.openConnection();
                                  connection.setDoOutput(true);
                              }
                              catch (IOException e)  {
                                  return -1;
                              }

                              try  {
                                  PrintWriter toServ = new PrintWriter(connection.getOutputStream());
                                  toServ.println(type);
                                  toServ.println(source);
                                  toServ.close();

                                  if (ASCII.equals(type))  {
                                      BufferedReader in = new BufferedReader(new InputStreamReader(connection.getInputStream()));
                                      PrintWriter out = new PrintWriter(new FileWriter(destination));
                                      String line = null;
                                      while ((line = in.readLine()) != null)  {
                                          out.println(line);
                                      }
                                      in.close();
                                      out.close();
                                  }
                                  else  {
                                      BufferedInputStream in = new BufferedInputStream(connection.getInputStream());
                                      BufferedOutputStream out = new BufferedOutputStream(new FileOutputStream(destination));
                                      byte[] buf = new byte[1024];
                                      while (in.read(buf, 0, 1024) != -1)
                                          out.write(buf, 0, 1024);
                                      in.close();
                                      out.close();
                                  }
                              }
                              catch (Exception e)  {
                                  return -2;
                              }
                              return 0;
                          }

                          public static void main(String args[])  {
                              System.out.println("Starting");
                              if (readFile(textSrc, textDest, ASCII) != 0) System.out.println("ERROR IN ASCII TRANSFER");
                              System.out.println("Done with ascii");
                              if (readFile(binSrc, binDest, BINARY) != 0) System.out.println("ERROR IN BINARY TRANSFER");
                              System.out.println("Done with binary");
                          }
                      }
                      //end Client code
                      I hope that formats ok... It looks awful from the paste.
                      Also if you are actually going to try the code you need to specify the following variables (all strings):
                      String textSrc; //text src filename
                      String textDest; //text destination filename
                      String binSrc; //bin src filename
                      String binDest; //bin destination filename
                      String host; //host where your servlet resides

                      Well there has to be a question in here somewhere!
                      My questions:
                      1) What is the fastest way possible (in Java obviously) to do this?
                      2) Why do I get more bytes coming back (29696 bytes) in the binary version?
                      3) Why is the binary version 2-3 times faster than the ascii in the above example? (even when it seems to be transferring more
                      bytes)
                      4) Please comment on any client/server oddities that you see in the code or tips how I can do things better.

                      Please give thorough comments (ie I'll grade the best answer, not the first)
                      It's worth quite a few points because I really want to understand the IO
                      package better (even though it's changing in JDK1.4 :))

                      Thanks


                      Question History

                      Comment
                                                                                                                           

                      From: mbormann
                                                          Date: Friday, February 11 2000 - 08:03PM PST
                                                                                                                           


                      (1)See this site and it will answer most of your questions.

                      http://www.sun.com/workshop/java/wp-javaio/;$sessionid$4KS1C4IAAV2JTAMUVFZE3NQ 

                      (2) To do a correct test, use DataInputStream's readFully() or something like that, which will read the whole file in one shot and
                      correctly.

                      (3) Because you are using a big buffer of 1024 bytes. Generally the readLine() buffer is around 128, as most lines end within
                      that character length.

                      (4)
                      (a) Never use code like this directly without checking:
                      String type= reader.readLine();
                      readLine() returns null if nothing is there to read, and using that null later may throw a NullPointerException.
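A minimal sketch of the readFully() suggestion in (2), assuming the length of the data is known in advance (the servlet in the question does not send it, so that part is an assumption). The class and method names are hypothetical. A plain read() may return fewer bytes than requested, while readFully() either fills the whole array or throws EOFException.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

final class ReadFullyDemo {

    /** Reads exactly length bytes, or throws EOFException if the stream ends early. */
    static byte[] readExactly(InputStream in, int length) throws IOException {
        byte[] data = new byte[length];
        new DataInputStream(in).readFully(data);
        return data;
    }

    public static void main(String[] args) throws IOException {
        // Same size as the test files in the question; the content is just zeros.
        byte[] source = new byte[19725];

        byte[] copy = readExactly(new ByteArrayInputStream(source), source.length);
        System.out.println("read " + copy.length + " bytes");

        try {
            // Asking for one byte more than the stream holds fails loudly.
            readExactly(new ByteArrayInputStream(source), source.length + 1);
        } catch (EOFException e) {
            System.out.println("stream ended before the requested length");
        }
    }
}
```

This holds the whole file in memory, so it only suits small files.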


                      Comment
                                                                                                                           

                      From: heyhey_
                                                          Date: Saturday, February 12 2000 - 03:56AM PST
                                                                                                                           


                      1. replace              
                      while (in.read(buf,0,1024) != -1) out.write(buf,0,1024);              

                      with

                      byte[] buf = new byte[1024];
                      int len;
                      while ((len = in.read(buf)) != -1) out.write(buf,0,len);


                      2. BufferedReader.readLine has to decode the bytes into characters and check each character for a line terminator.
                      That's why it's much slower than InputStream.read() into a byte array.

                      3. do you really need different code for binary and text files ?
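To see why the length returned by read() matters, here is a small sketch that runs both loops over an in-memory stream of 19725 bytes, the file size from the question. The class and method names are hypothetical. The original loop writes a full 1024 bytes after every read, so the output is always a multiple of 1024. The 29696 bytes seen in the question is 29 x 1024, which would fit 29 read() calls, many of them short reads from the network.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class CopyLoopDemo {

    /** Loop from the question: ignores how many bytes read() actually returned. */
    static long copyIgnoringLength(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[1024];
        long written = 0;
        while (in.read(buf, 0, 1024) != -1) {
            out.write(buf, 0, 1024); // writes stale bytes after a short read
            written += 1024;
        }
        return written;
    }

    /** Loop from the reply: writes only the bytes that were read. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[1024];
        long written = 0;
        int len;
        while ((len = in.read(buf)) != -1) {
            out.write(buf, 0, len);
            written += len;
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        byte[] source = new byte[19725];

        long padded = copyIgnoringLength(new ByteArrayInputStream(source), new ByteArrayOutputStream());
        long exact = copy(new ByteArrayInputStream(source), new ByteArrayOutputStream());

        System.out.println("ignoring length: " + padded + " bytes"); // 20480 (20 reads x 1024)
        System.out.println("using length:    " + exact + " bytes");  // 19725
    }
}
```

An in-memory stream only returns a short read at the very end, which is why this demo gives 20480 rather than 29696.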


                      Comment
                                                                                                                           

                      From: ravindra76
                                                          Date: Saturday, February 12 2000 - 05:00AM PST
                                                                                                                           


                      It's fine heyhey!!! :)


                      Comment
                                                                                                                           

                      From: vishone
                                                          Date: Sunday, February 13 2000 - 04:43AM PST
                                                                                                                           


                       Hi mbunkows,

                       This URL might be useful to you,

                       "How to improve Java's I/O performance" at
                              http://www.javaworld.com/javaworld/javatips/jw-javatip26.html 

                       - Vish


                      Comment
                                                                                                                           

                      From: vladi21
                                                          Date: Monday, February 14 2000 - 03:00AM PST
                                                                                                                           

                      also look:

                      Tuning JavaTM I/O Performance
                      http://developer.javasoft.com/developer/technicalArticles/Programming/PerfTuning/index.html 

                      Programming with JavaTM I/O Streams
                      http://developer.javasoft.com/developer/technicalArticles/Streams/ProgIOStreams/index.html 

                      Writing Your Own JavaTM I/O Stream Classes
                      http://developer.javasoft.com/developer/technicalArticles/Streams/WritingIOSC/index.html 



                      Comment

                      From: mbunkows
                                                          Date: Monday, February 14 2000 - 06:14AM PST
                                                                                                                           

                      Lots of good IO sites.  I don't know where you guys get all these.  I haven't got through all of them yet.  But I thought I'd let you
                      know I'm looking at them now since I was gone all weekend.
                      heyhey:
                      I thought I needed to use different code based on a file being binary or ascii.  (blush). I suppose a byte is a byte.  So why does
                      ftp require you to specify binary or ascii?  I know I've transferred .class files to the server and tried to run them and got an error
                      because I didn't transfer them via binary ftp transfer.  If I can drop either branch of the code that would speed up the process
                      and I would only need to transfer the filename to fetch from the server.



                      Comment
                                                                                                                           

                      From: heyhey_
                                                          Date: Monday, February 14 2000 - 06:24AM PST
                                                                                                                           

                      I don't know the FTP protocol in detail (I will take a look at it when I have some free time), but the ASCII / binary modes come
                      from the 'ancient' Unix world ... ASCII files are supposed to contain only text (ASCII ??) and end with <Ctrl+Z> (char with ASCII code 26),
                      and binary files contain 'random' data. I suppose that the FTP protocol optimizes transfer for ASCII files (encodes the ASCII
                      characters).

                      Transferring an ASCII file as binary is OK. The problem comes if you want to transfer a binary file as ASCII. In your case, you can always
                      send raw / binary data (that is byte[]).
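A small sketch of why a line-oriented transfer can damage binary data while a byte copy cannot. It is not an FTP implementation. It only mimics the readLine()/println() branch from the question, and the class and method names are hypothetical. ISO-8859-1 is chosen here so that every byte maps to exactly one char and the only change left is the line terminators. FTP's ASCII mode does a comparable line-ending conversion, which is the usual reason a .class file sent that way arrives broken.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class LineVersusByteCopy {

    /** Copies line by line, as the "ascii" branch in the question does. */
    static byte[] copyByLines(byte[] source) throws IOException {
        BufferedReader in = new BufferedReader(new InputStreamReader(
                new ByteArrayInputStream(source), StandardCharsets.ISO_8859_1));
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        PrintWriter out = new PrintWriter(new OutputStreamWriter(sink, StandardCharsets.ISO_8859_1));
        String line;
        while ((line = in.readLine()) != null) {
            out.println(line); // replaces CR, LF or CRLF with the platform line separator
        }
        in.close();
        out.close();
        return sink.toByteArray();
    }

    /** Copies raw bytes; works for text and binary alike. */
    static byte[] copyByBytes(byte[] source) throws IOException {
        InputStream in = new ByteArrayInputStream(source);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        int len;
        while ((len = in.read(buf)) != -1) {
            out.write(buf, 0, len);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // "Binary" data that happens to contain CR and LF bytes.
        byte[] source = {0x00, 0x0D, 0x0A, 0x41, 0x0A, (byte) 0xFF};

        System.out.println("byte copy identical: " + Arrays.equals(copyByBytes(source), source)); // true
        System.out.println("line copy identical: " + Arrays.equals(copyByLines(source), source)); // false
    }
}
```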


                      Comment
                                                                                                                           

                      From: mbormann
                                                          Date: Monday, February 14 2000 - 06:34AM PST
                                                                                                                           

                      The range of ASCII is limited to 128 chars, but generally binary can send more. I think the range is 256 chars, not sure though.


                      Comment

                      From: mbunkows
                                                          Date: Monday, February 14 2000 - 07:39AM PST
                                                                                                                           

                      What I've learned or understood more thoroughly:

                      1) Text is not necessarily ASCII (can be unicode) (I'll quit referring to it as such)
                      2) I can use the same code for both text and binary.  I believe I have to use the InputStream classes and not the Reader
                      classes if I do both. (Am I wrong?)  In my tests there aren't any of the garbage characters that I thought there would be in the
                      transferred text files.
                      3) Large buffer size is faster than small buffer size because of less access to the underlying system. This is true up to the point
                      of running out of memory on the system.  Calling the File class's length() method and setting a buffer size equal to length() would
                      be ok for small files, but not very smart for a general purpose class like I am writing.  Also the java.io.File classes are slow. I
                      never tested the difference between getting a dynamic buffer size from length() and having a generic buffer size because the length()
                      method wouldn't fit my application.

                      A couple things I would like to know before closing the question:
                      1) What should I use in place of:
                      String filename= reader.readLine();
                      or do I place a try/catch around it.
                      2) Is there a better/faster way of getting the filename from the client?
                      3) Is this the way everyone does client/server programs?



                      Comment

                      From: mbunkows
                                                          Date: Monday, February 14 2000 - 07:44AM PST
                                                                                                                           

                      My last post was before either heyhey or mbormann responded.. I was sitting on the page for a while.  Sorry.
                      I would also like to know about questions 1,2,3 above.  But I'm pretty clear on the text/binary stuff.

                      Then I'll close the question.  I was hoping I wasn't going to have to split this one up but there's a lot of good stuff from quite a
                      few people and I wouldn't feel right about giving it to one person.



                      Comment
                                                                                                                           

                      From: mbormann
                                                          Date: Monday, February 14 2000 - 10:27PM PST
                                                                                                                           

                      >>>I believe I have to use the InputStream classes and not the Reader classes if I do both. (Am I wrong?)

                      I think that Java by default picks up everything nicely and smoothly, so you don't have to worry, but a good rule of thumb is to use the Reader
                      classes if you have Unicode (they support ASCII too) and the Streams if you have ASCII only. Of course the Reader classes are paid
                      much more attention while tuning, whereas the Streams are getting deprecated in newer versions of the JDK.

                      The Readers strip off the high byte of the character and convert it to a byte in the case of ASCII. Have you read that article at sun.com? It
                      is very illuminating on Readers and Streams, particularly DataInputStream.

                      (1) Do a check like this maybe
                      String filename= reader.readLine();
                      if(filename != null)
                      {
                      //rest of code
                      }
                      else
                      {
                      //tell client that abrupt end and couldnt retrieve filename
                      }

                      no need to have a try/catch then in that case

                      (2) I don't think so, maybe others will shed some light on it
                      (3) I do it like this, maybe others will shed some light on it
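The null check in (1) could be sketched as a small helper that reads both request lines before any file is opened. This is a variation on the if/else above rather than the same code: it throws IOException, which the servlet's service() method already declares. RequestHeader and its fields are hypothetical names.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

final class RequestHeader {
    final String type;
    final String filename;

    private RequestHeader(String type, String filename) {
        this.type = type;
        this.filename = filename;
    }

    /** Reads the two header lines the client sends; fails if either one is missing. */
    static RequestHeader read(BufferedReader reader) throws IOException {
        String type = reader.readLine();
        String filename = reader.readLine();
        // readLine() returns null at end of stream instead of throwing.
        if (type == null || filename == null) {
            throw new IOException("request ended before type and filename were sent");
        }
        return new RequestHeader(type, filename);
    }

    public static void main(String[] args) throws IOException {
        RequestHeader ok = read(new BufferedReader(new StringReader("binary\ntest.exe\n")));
        System.out.println(ok.type + " " + ok.filename);

        try {
            read(new BufferedReader(new StringReader("binary\n")));
        } catch (IOException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```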

                      A final speedy tip which my colleagues told me
                      class SomeClass
                      {
                      int [] instanceVar=new int[10000];
                      void someMethod()
                      {
                      int [] localVar = instanceVar; // local vars are faster than instance vars
                      }
                      }

                      if you use this local variable it is a bit faster, anybody can guess why?
                      :)
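A sketch of what the tip looks like inside a loop, with hypothetical names. The assignment copies only the array reference, not the 10000 elements. At the bytecode level every use of the instance variable goes through a field load on this, while the local is a plain slot load. Whether that is measurably faster depends on the VM, since a JIT compiler may hoist the field load out of the loop itself, so it is worth timing on the target JVM before relying on it.

```java
final class LocalAliasDemo {
    private final int[] instanceVar = new int[10000];

    LocalAliasDemo() {
        for (int i = 0; i < instanceVar.length; i++) {
            instanceVar[i] = i;
        }
    }

    /** Reads the field on every iteration. */
    long sumViaField() {
        long sum = 0;
        for (int i = 0; i < instanceVar.length; i++) {
            sum += instanceVar[i];
        }
        return sum;
    }

    /** Copies the reference to a local once, then only touches the local. */
    long sumViaLocal() {
        int[] localVar = instanceVar; // one reference copy, no array copy
        long sum = 0;
        for (int i = 0; i < localVar.length; i++) {
            sum += localVar[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        LocalAliasDemo demo = new LocalAliasDemo();
        // Both methods compute the same result; only the access path differs.
        System.out.println(demo.sumViaField() == demo.sumViaLocal()); // true
    }
}
```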


                      Comment
                                                                                                                           

                      From: vladi21
                                                          Date: Tuesday, February 15 2000 - 03:04AM PST
                                                                                                                           

                      some optimization links:
                      Java Optimization
                      http://www.cs.cmu.edu/~jch/java/optimization.html 
                      http://www.cs.cmu.edu/~jch/java/resources.html 

                      Writing Advanced Applications Chapter 8: Performance Techniques
                      http://developer.java.sun.com/developer/onlineTraining/Programming/JDCBook/perfTech.html 

                      Memory inspection/leak detectors
                      http://developer.java.sun.com/developer/earlyAccess/hat/ 

                      The Memory Management Reference: Full Bibliography
                      http://www.harlequin.com/products/st/mm/bib/full.html 

                      Using Finally Versus Finalize to Guarantee Quick Resource Cleanup
                      http://developer.java.sun.com/developer/TechTips/2000/tt0124.html#tip1 





                      Comment

                      From: mbunkows
                                                          Date: Tuesday, February 15 2000 - 02:40PM PST
                                                                                                                           

                      I think I have enough info to close the question.
                      Unfortunately, there is no easy way to divide points in EE.  It would probably be a pretty major rewrite anyway. I'll delete this
                      question and post a LONG question for each of you listed below with all the comments in the question body.

                      mbormann:
                      I see what you mean by "checking" readLine().  
                      I could see why using local vars INSTEAD of instance vars could save some time, but I'm surprised copying the instance to local
                      and then accessing the local would be faster than just using the instance (even if it's by reference).  
                      Yes I did read the Workshop article.  I love seeing actual performance tests with code.  It helped my understanding a lot.
                      Points: 150 X 'A'

                      heyhey:
                      Using streams only instead of separate code for text and binary works well and it's faster.
                      Points: 150 X 'A'

                      vladi21:
                      Thanks for the links.  You always have excellent references.
                      Points: 100 X 'A'
ASKER CERTIFIED SOLUTION
mbormann
