Solved

ASP pages: java.io.IOException: Connection reset by peer with java.nio.channels

Posted on 2003-03-12
Last Modified: 2012-06-21
I am trying to find a fast and robust way to download pages from a website. I found some code on Sun's site for non-blocking socket connections. I adapted it into a JSP page because it's easier to change, but I had the same problem running it from the command line.

ASP pages seem to cause a "Connection reset by peer" error after partially downloading the page. Below is the code, with two sites that cause the error embedded in the code. It seems to fail either at the beginning or at a certain spot in the file.

Here is the result with the www.networkadvertising.org example: (Consistent and complete result for both pages gets the points)

524.
524.
24.
java.io.IOException: Connection reset by peer
HTTP/1.1 200 OK
Server: Microsoft-IIS/5.0
Date: Wed, 12 Mar 2003 16:02:46 GMT
cache-control: private
pragma: no-cache
Connection: Keep-Alive
Content-Length: 13499
Content-Type: text/html
Expires: Tue, 11 Mar 2003 16:02:46 GMT
Set-Cookie: ASPSESSIONIDCCAQBQTS=IEAJJAGCJANEOIEOPKCJPOHG; path=/
Cache-control: no-cache



<html>
      <head>
            <title> Welcome to Network Advertising Initiative </title>
            <style type=text/css>
                  A:link, A:visited { text-decoration: none; color: blue;}
            </style>
            
            
                        
.....................
            <script>
                  
                  arrowOn = new Image;
                  arrowOff = new Image;
                  blueArrowOn = new Image;
                  blueArrowOff = new Image;
                  arrowOn.src = "images/arrow_yellow_on.gif";
                  arrowOff.src = "images/arrow_yellow_off.gif";
                  blueArrowOn.src = "images/blue_arrow_on.gif";
                  blueArrowOff.src = "images/blue_arrow_off.gif";
                              
                  
                  function toggleImage( imageName, imgSrc ) {
                        document.images[imageName].src = imgSrc;
                  }
                  
                                    
                  
                  function positionMenus() {                        
                        docDone = true;
                        
                        .....................
                  }
                  
                  
                  fun.....................





Here is the code:
//****************************************************************
//****************************************************************
//****************************************************************
//****************************************************************
//****************************************************************
//****************************************************************
//****************************************************************
//****************************************************************

<%@ page import="java.io.*" %>
<%@ page import="java.net.*" %>
<%@ page import="java.nio.*" %>
<%@ page import="java.nio.channels.*" %>
<%@ page import="java.nio.charset.*" %>
<%@ page import="java.util.*" %>

<%
//< % @ page contentType="your type here" % >

//class NonBlockingReadURL {
            String data = "";
            int size =0;
            Selector selector;
      //public void NonBlockingReadURL(){
            
  //public static void main(String args[]) {
    //String host = "www.familysearch.org";
    //String file = "/Eng/Search/af/family_group_record.asp?familyid=124210";
    String host = "www.networkadvertising.org";
    String file = "/optout_nonppii.asp";

    SocketChannel channel = null;

    try {

      // Setup
      InetSocketAddress socketAddress = new InetSocketAddress(host, 80);
      Charset charset = Charset.forName("US-ASCII");
      //Charset charset = Charset.forName("UTF-8");
      //Charset charset = Charset.forName("ISO-8859-1");
      CharsetDecoder decoder = charset.newDecoder();
      CharsetEncoder encoder = charset.newEncoder();

      // Allocate buffers
      ByteBuffer buffer = ByteBuffer.allocateDirect(2048);
      CharBuffer charBuffer = CharBuffer.allocate(2048);

      // Connect
      channel = SocketChannel.open(socketAddress);
      channel.configureBlocking(false);

      // Open Selector
      selector = Selector.open();

      // Register interest in when connection
      channel.register(selector, SelectionKey.OP_CONNECT | SelectionKey.OP_READ);

                  boolean done = false;
      // Wait for something of interest to happen
      while (selector.select(500) > 0 && !done) {
        // Get set of ready objects
        Set readyKeys = selector.selectedKeys();
        Iterator readyItor = readyKeys.iterator();
        // Walk through set
        while (readyItor.hasNext()) {
           
          // Get key from set
          SelectionKey key = (SelectionKey)readyItor.next();

          // Remove current entry
          readyItor.remove();

          // Get channel
                              SocketChannel keyChannel = (SocketChannel)key.channel();

          if (key.isConnectable()) {
           
                                    // Finish connection
                                    if (keyChannel.isConnectionPending()) {
              keyChannel.finishConnect();
            }

            // Send request
            String url_request = "GET "+file+" HTTP/1.0\r\n";
                                    
                                    url_request+= "ACCEPT: */*\r\n";
                                    url_request+= "ACCEPT_LANGUAGE: en-us\r\n";
                                    url_request+= "CACHE_CONTROL: Max-age=259200\r\n";
                                    url_request+= "HOST: "+host+"\r\n";
                                    url_request+= "USER_AGENT: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)\r\n";
                                    url_request+= "\r\n";
                                    
                                    keyChannel.write(encoder.encode( CharBuffer.wrap(url_request) ));

          } else if (key.isReadable()) {
           
            // Read what's ready in response
                                    size = keyChannel.read(buffer);
                                    if (size==-1) done=true;
                                    out.println(size+".");
                                    buffer.flip();

            // Decode buffer
            decoder.decode(buffer, charBuffer, false);

            // Display
            charBuffer.flip();
            data += charBuffer+".....................";

            // Clear for next pass
            buffer.clear();
            charBuffer.clear();

          } else {
            out.print("Ooops");
          }
        }
      }
    } catch (UnknownHostException e) {
      out.println(e);
    } catch (IOException e) {
      out.println(e);
    } finally {
      if (channel != null) {
        try {
          channel.close();
        } catch (IOException ignored) {
        }
      }
    }
    out.print(data);
  //}
//}
//NonBlockingReadURL hi = new NonBlockingReadURL();
%>
Question by:cumom
38 Comments
 
LVL 92

Expert Comment

by:objects
ID: 8123187
Why are you using non-blocking IO?

And why not use one of the existing http implementation classes already available?

0
 

Author Comment

by:cumom
ID: 8125430
Well, I tried to use the native classes but they block for more than a minute sometimes and this is not acceptable. I guess I could try one of those classes in the morning, but they look like they have a lot of overhead that I don't need. I was hoping that getting the raw data would be faster and that non-blocking would give me some more flexibility as I am planning to use as many threads as possible.
0
 
LVL 92

Expert Comment

by:objects
ID: 8125468
Why is blocking an issue? Aren't you reading the data in a separate thread?
Yes, URLConnection does have its limitations, which the above packages aim to address.
I would not imagine the above packages would have any more overhead than implementing your own http implementation.
0
 

Author Comment

by:cumom
ID: 8128138
Well, blocking was annoying because I couldn't get rid of a thread that wasn't getting results and start over. I had to start another one and make sure they didn't conflict. But perhaps those new packages will help because they have a timeout. And I couldn't figure out if I was done because there were always threads lying around.

They are going to have more overhead because they parse the entire header. They handle cookies, etc., but I don't need anything in the header. I can just ignore it.
0
 
LVL 92

Expert Comment

by:objects
ID: 8132617
> But perhaps those new packages will help because they have a timeout.

Yes setting a timeout should resolve that problem.

> I can just ignore it.  

You still have to read it.
Which is probably all the above packages do unless you request one of the header values, except for the header details that are required to complete the reading of the http request, which you need to be reading anyway.


0
 

Author Comment

by:cumom
ID: 8132831
http://jakarta.apache.org/commons/httpclient/ helps quite a bit, but I still want to get java.nio working for the performance reasons stated here:
http://www.javaperformancetuning.com/tips/nio.shtml#REF1
0
 
LVL 92

Expert Comment

by:objects
ID: 8132856
Then you'll need to implement http.
Have a read of the rfc for details of protocol.

0
 

Author Comment

by:cumom
ID: 8133094
No, I don't need to implement http; I just need the raw data as a String. Please only comment if you know how to get my code to work with asp files. It works exactly how I want it to for other types of files; just asp files don't work.
0
 
LVL 92

Expert Comment

by:objects
ID: 8133262
But if you don't implement http at minimum how do you know how much data to read, or if there even is data to be read?

> just asp files don't work.

how it is generated on the server makes no difference. the client simply processes http responses.
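
To make the "how much data to read" point concrete, here is a minimal sketch of checking a raw response against its Content-Length header (the sample output in the question does carry "Content-Length: 13499"). The class and method names are hypothetical, and it assumes the response was decoded with a single-byte charset such as ISO-8859-1, so that one char corresponds to one byte.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the original code in the question.
class ResponseLength {

    /** Returns the Content-Length announced in the header block, or -1 if none is available yet. */
    static int contentLength(String rawResponse) {
        int headerEnd = rawResponse.indexOf("\r\n\r\n");
        if (headerEnd < 0) {
            return -1; // the header block has not been fully received
        }
        for (String line : rawResponse.substring(0, headerEnd).split("\r\n")) {
            int colon = line.indexOf(':');
            // Header field names are case-insensitive.
            if (colon > 0 && line.substring(0, colon).trim().equalsIgnoreCase("Content-Length")) {
                return Integer.parseInt(line.substring(colon + 1).trim());
            }
        }
        return -1;
    }

    /** True once the body holds at least as many bytes as Content-Length announced. */
    static boolean isComplete(String rawResponse) {
        int expected = contentLength(rawResponse);
        int headerEnd = rawResponse.indexOf("\r\n\r\n");
        if (expected < 0 || headerEnd < 0) {
            return false;
        }
        String body = rawResponse.substring(headerEnd + 4);
        return body.getBytes(StandardCharsets.ISO_8859_1).length >= expected;
    }
}
```

When no Content-Length is present this sketch simply reports "not complete", in which case reading until the server closes the connection (a read of -1) is the only signal left.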

0
 

Author Comment

by:cumom
ID: 8133449
Look at my code! If I read -1 bytes then I quit. The pages I need also always have a footer that I can recognize to verify.

Why don't you try the code before making statements about things making no difference? I haven't tried every page on the web yet, but every asp page I tried failed. And every other page succeeded. You do the math.
0
 
LVL 92

Expert Comment

by:objects
ID: 8133507
I have read your code, I'm just trying to point out that there's more to the http protocol than simply reading the bytes.
0
 
LVL 92

Expert Comment

by:objects
ID: 8133546
> Look at my code! If I read -1 bytes then I quit.

It's not failing on the read anyway.
The server is closing down the connection on the write.
0
 

Author Comment

by:cumom
ID: 8133616
>It's not failing on the read anyway.

Yeah, I know. You asked how I know when I am done reading, so I was just answering you. Try it on a non asp page now and it WILL use the -1 thing to quit properly.


>The server closing down the connection on the write.

I tried to mention in my original email that sometimes it dies before getting any data, sometimes while getting it. Try refreshing a couple times, it should stop at a certain point in the file like in the example I showed. If it doesn't try the other example page.


>I'm just trying to point out that there's more to the http protocol than simply reading the bytes.

Yeah, I know. But I don't need to implement the http protocol, just to get a very specific type of file that just happens to have been implemented with asp. I'm not sure that I trust the size in the header anyway; it is written in asp, after all. :)
0
 
LVL 92

Expert Comment

by:objects
ID: 8133620
If all you want to do is read the bytes returned until eof then why not just open a socket, set a timeout, and read the bytes directly?

> as I am planning to use as many threads as possible.

should that have said "as few threads"?
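
For reference, the "open a socket, set a timeout, and read the bytes directly" suggestion might look roughly like the sketch below. The class name, method name, and the single timeout parameter are assumptions for illustration; it uses try-with-resources, so it needs a newer Java than was current when this thread was written.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: blocking fetch that cannot hang longer than the timeout per operation.
class TimedFetch {

    static String fetch(String host, int port, String file, int timeoutMillis) throws IOException {
        try (Socket socket = new Socket()) {
            // Bound the connect itself, then bound every subsequent read.
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            socket.setSoTimeout(timeoutMillis);

            // Send the request exactly once.
            String request = "GET " + file + " HTTP/1.0\r\nHost: " + host + "\r\n\r\n";
            OutputStream out = socket.getOutputStream();
            out.write(request.getBytes(StandardCharsets.US_ASCII));
            out.flush();

            // Read raw bytes until the server closes the connection (read returns -1).
            ByteArrayOutputStream response = new ByteArrayOutputStream();
            InputStream in = socket.getInputStream();
            byte[] chunk = new byte[2048];
            int n;
            while ((n = in.read(chunk)) != -1) {
                response.write(chunk, 0, n);
            }
            return new String(response.toByteArray(), StandardCharsets.ISO_8859_1);
        }
    }
}
```

A read that stalls longer than timeoutMillis throws java.net.SocketTimeoutException (a subclass of IOException), so the calling thread gets control back instead of blocking for minutes.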
0
 

Author Comment

by:cumom
ID: 8133693
>why not just open a socket

I was frustrated because I couldn't figure out what was going on with my threads because they would block for 5+ minutes, so I wanted to do non-blocking, so I could figure out what was wrong. Having used http://jakarta.apache.org/commons/httpclient/ I now realize that java.net was one of my problems. But downloads are still blocking for 5+ minutes (it just doesn't suck up memory), and http://www.javaperformancetuning.com/tips/nio.shtml#REF1 seems to indicate that non-blocking is faster and more efficient. And other things are running on the server, so I don't want to use all the resources in threads if I don't have to.

>should that have said "as few threads"?

It should have said "I am planning to download as many pages concurrently as possible whether that means a lot or a few threads".
0
 
LVL 92

Expert Comment

by:objects
ID: 8133730
> because they would block for 5+ minutes

You can set the socket timeout to whatever you like.

> seems to indicate that non-blocking is faster and more efficient

Depends on how you are implementing it.
A lot of the performance issues are related to object creation, but if you are just reading the bytes directly then this won't be occurring. nbio would give you the option of handling multiple downloads from one thread, depends on how many concurrent downloads you would be doing whether this is significant.


Regards speed, you can't read the data any faster than it is being delivered to you :)

0
 

Author Comment

by:cumom
ID: 8133804
>depends on how many concurrent downloads you would be doing

At times I have over a thousand in the queue waiting to download with 500 threads trying to download. Each download results in 0-6 more in the queue.

Monitoring total throughput while adjusting the number of threads running at a time: the more threads I used the more throughput up until about 500 threads at which point my server chokes on the threads, top pretty much stops refreshing, and I have to kill -9 tomcat to make the server usable again. At that point I am only getting about 1/3 of what I have tested my connection to be capable of.
0
 
LVL 92

Expert Comment

by:objects
ID: 8133968
Yes, lots of threads will bring your server to its knees :)
In which case nbio may help by reducing the number of threads you require.
0
 
LVL 92

Expert Comment

by:objects
ID: 8134179
I think the problem is with the server(s) you are connecting to, and not your code. Sometimes the code works ok, sometimes it fails to connect at all, and others it loses the connection.
0
 

Author Comment

by:cumom
ID: 8134257
Try telnet. Works every time.
0
 
LVL 92

Accepted Solution

by:
objects earned 2000 total points
ID: 8134270
Though is there any reason you repeatedly send GET requests?
 
0
 

Author Comment

by:cumom
ID: 8134284
If I am, I am not meaning to.
0
 

Author Comment

by:cumom
ID: 8134291
If I am, I am not meaning to.
0
 

Author Comment

by:cumom
ID: 8134299
ooops sorry about that repeat. I refreshed the wrong page.
0
 
LVL 92

Expert Comment

by:objects
ID: 8134322
That could be what the server is taking offence to then.
0
 

Author Comment

by:cumom
ID: 8141087
Shouldn't I be able to make multiple requests if I want the connection to be persistent?
0
 
LVL 92

Expert Comment

by:objects
ID: 8141118
Would have to have a read of the protocol spec to give a definitive answer on that, though generally http is a request/response protocol where you send a request, you get a response and you close the connection.

Funny thing is if I run it through a proxy it works fine.

0
 

Author Comment

by:cumom
ID: 8141176
Spec says I should be able to, but since when did Microsoft follow the spec.
0
 
LVL 92

Expert Comment

by:objects
ID: 8141189
> Spec says I should be able to

Can you point me to where it says that?
0
 

Author Comment

by:cumom
ID: 8141241
http rfc
8.1.2.2 Pipelining
0
 
LVL 92

Expert Comment

by:objects
ID: 8141275
That's the 1.1 rfc, your code specifies 1.0.
And even still perhaps the server just gets the ****s with you repeatedly sending GET requests.
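
On the request itself: HTTP header field names are written with hyphens (Accept-Language, Cache-Control, User-Agent). The underscored forms in the posted code, such as ACCEPT_LANGUAGE, look like CGI-style variable names and are not the standard header names. A minimal sketch of building a plain HTTP/1.0 request follows; the class and method names are hypothetical, and the thread does not establish whether the header names had anything to do with the resets.

```java
// Hypothetical helper: builds a single HTTP/1.0 GET request as text.
class RequestBuilder {

    static String buildGet(String host, String file) {
        return "GET " + file + " HTTP/1.0\r\n"
             + "Host: " + host + "\r\n"
             + "Accept: */*\r\n"
             + "Accept-Language: en-us\r\n"
             + "User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)\r\n"
             // Ask the server to close the connection after this one response.
             + "Connection: close\r\n"
             // A blank line terminates the header block.
             + "\r\n";
    }
}
```

With "Connection: close" the end of the response is simply the point where the read returns -1, which matches how the posted code decides it is done.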
0
 
LVL 92

Expert Comment

by:objects
ID: 8141289
i might try running it thru two proxies to try and see why it works when I run it thru one (if that makes any sense).
0
 

Author Comment

by:cumom
ID: 8141312
The spec says that proxies should not allow persistent connection, I wonder if that has anything to do with it?
0
 
LVL 92

Expert Comment

by:objects
ID: 8141333
Also do you know if IIS/5.0 supports 1.1?
0
 

Author Comment

by:cumom
ID: 8142839
I don't know
0
 

Author Comment

by:cumom
ID: 8170582
So I coded the whole thing up in PHP to make sure I knew how the server was working. IIS/5.0 does support 1.1 (at least the persistent connection part of it). So then I went back to my Java code and split the Selector into two parts, connect and read, so that I could keep it from sending multiple requests (until I want it to). And that did the trick. I found in my particular case that having at least three requests sent to the server before each read gave the best performance, if anyone is interested (of course, that assumes you know where the end of each file is).

Thanks for all your help objects.
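
The final code was not posted, so the following is only a sketch of what "split into connect and read" could look like, not the author's actual fix. In the posted code the GET is written inside the isConnectable() branch while the key stays registered for OP_CONNECT, so the request goes out again every time that branch fires. Here the key's interest set is switched step by step (connect, then write, then read), so the request is written once. The class and method names are hypothetical, and it sends a single request rather than the three-at-a-time pipelining described above.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.SocketTimeoutException;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

// Hypothetical helper: non-blocking fetch that sends its request exactly once.
class SingleRequestFetch {

    static String fetch(String host, int port, String file, long timeoutMillis) throws IOException {
        ByteBuffer request = StandardCharsets.US_ASCII.encode(
                "GET " + file + " HTTP/1.0\r\nHost: " + host + "\r\n\r\n");
        ByteBuffer buffer = ByteBuffer.allocate(2048);
        StringBuilder data = new StringBuilder();

        try (Selector selector = Selector.open();
             SocketChannel channel = SocketChannel.open()) {
            channel.configureBlocking(false);
            // A non-blocking connect may complete immediately (for example on loopback).
            boolean connected = channel.connect(new InetSocketAddress(host, port));
            channel.register(selector, connected ? SelectionKey.OP_WRITE : SelectionKey.OP_CONNECT);

            while (true) {
                if (selector.select(timeoutMillis) == 0) {
                    throw new SocketTimeoutException("no progress within " + timeoutMillis + " ms");
                }
                Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                while (keys.hasNext()) {
                    SelectionKey key = keys.next();
                    keys.remove();
                    if (key.isConnectable()) {
                        // Phase 1: finish connecting, then wait until the socket is writable.
                        if (channel.finishConnect()) {
                            key.interestOps(SelectionKey.OP_WRITE);
                        }
                    } else if (key.isWritable()) {
                        // Phase 2: write the request; once it is fully sent, only reads remain.
                        channel.write(request);
                        if (!request.hasRemaining()) {
                            key.interestOps(SelectionKey.OP_READ);
                        }
                    } else if (key.isReadable()) {
                        // Phase 3: collect the response until the server closes the connection.
                        int size = channel.read(buffer);
                        if (size == -1) {
                            return data.toString();
                        }
                        buffer.flip();
                        data.append(StandardCharsets.ISO_8859_1.decode(buffer));
                        buffer.clear();
                    }
                }
            }
        }
    }
}
```

Decoding with ISO-8859-1 is a deliberate simplification: every byte maps to one char, so a chunk boundary can never split a character. One Selector can serve many channels this way, which is where the saving in threads would come from.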
0
 
LVL 92

Expert Comment

by:objects
ID: 8170692
Glad to hear you got it resolved :)
Thanks for the points.
0
