• Status: Solved

Need to get web content from a Java servlet

I have a Java servlet application running under Tomcat 5.x.  This web application essentially has to act like a browser, handling GET requests from a separate Java client.

1.  The client makes a series of GET requests to the server.
2.  The server goes out to the internet, gets the content, and converts it to a byte array.
3.  The server returns the content to the client for display.

I know this sounds bizarre, but it's legacy code that I have to get working; I have no choice.  I need to come up with the best and most efficient Java solution for step #2.

The legacy code opens a raw Socket and gets the content, but there are problems with this: it's very slow and sometimes returns only a small portion of the content.

I was thinking of replacing Socket with the HttpURLConnection class.  First, does anyone have any ideas, in general, why the Socket class could have issues?  Second, does anyone have working, robust code that does step #2?

Asked:
lcor
2 Solutions
 
CEHJ Commented:
>>First, does anyone have any ideas, in general, why the Socket class could have issues?

Using a plain socket is so far removed from using a browser that we have a chalk/cheese situation.
The socket will return (if you're lucky) one reply from a web server. When a browser makes a request, it goes on to make numerous further requests for the many pieces of content contained in a page and then integrates them all.
In short, a browser is a complex piece of software, even at its simplest.

The nearest you'll get to emulating a browser is to use a specialised API such as Jakarta HttpClient, which I'd recommend you get if you need to do this:

http://hc.apache.org/httpclient-3.x/
 
objects Commented:
Make sure you're using some buffering to pull the pages. Can you post your code? I'll see if I can work out why it's so slow.

 
lcor (Author) Commented:
objects,
Sorry for the delay, but the Presidents' Day holiday got in the way. I'll provide code snippets tomorrow.

 
objects Commented:
There's a simple example and discussion of reading a GET response here:

http://java.sun.com/docs/books/tutorial/networking/urls/readingWriting.html

If you just want the byte stream returned, use a BufferedInputStream instead of a BufferedReader.
A ByteArrayOutputStream can be used to write the response to a byte array.
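
A minimal sketch of that BufferedInputStream/ByteArrayOutputStream approach, assuming an http or https URL. The class name UrlFetcher, the method names, the buffer size, and the timeout values are illustrative choices, not something from the thread; setConnectTimeout and setReadTimeout need Java 5 or later.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical helper: reads a URL's response body into a byte array.
class UrlFetcher {

    /** Drains the stream into a byte array; the caller closes the stream. */
    public static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int n;
        // read() returns -1 only at end of stream, so this loop collects everything.
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }

    /** Issues a GET for reqUrl (assumed http/https) and returns the body bytes. */
    public static byte[] fetch(String reqUrl) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(reqUrl).openConnection();
        conn.setConnectTimeout(5000);   // illustrative value, in milliseconds
        conn.setReadTimeout(15000);     // illustrative value, in milliseconds
        InputStream in = new BufferedInputStream(conn.getInputStream());
        try {
            return readAll(in);
        } finally {
            in.close();
            conn.disconnect();
        }
    }
}
```

A call such as UrlFetcher.fetch(reqURL) would then stand in for step 2. The timeouts are there so that a stalled remote server fails with an exception instead of hanging the servlet thread indefinitely.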

 
lcor (Author) Commented:
Here's the raw socket attempt.  It tends to pause when trying to acquire web content.
 
lcor (Author) Commented:
// rHost, rPort, html, outs and content are not declared in this snippet.
Socket sock;
ByteArrayOutputStream bc = new ByteArrayOutputStream();

try {
    sock = new Socket(rHost.toString(), Integer.parseInt(rPort));
    outs = sock.getOutputStream();
    // Send the request bytes followed by a CRLF, then half-close the socket.
    outs.write(html);
    outs.write('\r');
    outs.write('\n');
    outs.flush();
    sock.shutdownOutput();

    // Read until the server closes the connection (read returns -1).
    byte[] b = new byte[1024];
    InputStream ins = sock.getInputStream();
    int i = ins.read(b);

    while (i != -1) {
        bc.write(b, 0, i);
        i = ins.read(b);
    }

    content = bc.toByteArray();

    bc.close();
    ins.close();
    outs.close();
} catch (MalformedURLException e) {
    e.printStackTrace();
} catch (IOException e) {
    e.printStackTrace();
}
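
One possible explanation for the pause, offered as an assumption because the request bytes in html are not shown: HTTP/1.1 connections are persistent by default, so a loop that reads until -1 only finishes once the server times out the idle connection. Asking the server to close the connection after its reply avoids that wait. A hedged sketch follows; RawHttpGet, buildRequest, and fetch are hypothetical names, and the 15-second read timeout is arbitrary.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: a raw-socket GET that asks the server to close the connection.
class RawHttpGet {

    /** Builds an HTTP/1.0 GET request with an explicit "Connection: close" header. */
    public static byte[] buildRequest(String host, String path) {
        String request = "GET " + path + " HTTP/1.0\r\n"
                + "Host: " + host + "\r\n"
                + "Connection: close\r\n"
                + "\r\n";                       // blank line terminates the headers
        return request.getBytes(StandardCharsets.US_ASCII);
    }

    /** Sends the request and returns everything the server sends back. */
    public static byte[] fetch(String host, int port, String path) throws IOException {
        Socket sock = new Socket(host, port);
        try {
            sock.setSoTimeout(15000);           // arbitrary read timeout, in milliseconds
            OutputStream out = sock.getOutputStream();
            out.write(buildRequest(host, path));
            out.flush();

            InputStream in = sock.getInputStream();
            ByteArrayOutputStream bc = new ByteArrayOutputStream();
            byte[] b = new byte[4096];
            int n;
            // The server closes the connection after replying, so read() reaches -1.
            while ((n = in.read(b)) != -1) {
                bc.write(b, 0, n);
            }
            return bc.toByteArray();
        } finally {
            sock.close();                       // also closes both socket streams
        }
    }
}
```

The returned bytes include the status line and response headers, which a raw socket does not strip; that is part of the work HttpURLConnection or HttpClient would otherwise do. Requesting HTTP/1.0 should also keep a compliant server from using chunked transfer encoding, which this code does not decode.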


 
lcor (Author) Commented:
Here's the version using HttpURLConnection.  It tends to return messed-up content, but it's faster than the Socket way.

// reqURL and content are not declared in this snippet.
ByteArrayOutputStream bc = new ByteArrayOutputStream();

try {
    URL url = new URL(reqURL);
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestMethod("GET");
    conn.connect();
    InputStream ins = conn.getInputStream();

    // Copy the response body into the byte array buffer.
    byte[] b = new byte[1024];
    int i = ins.read(b);

    while (i != -1) {
        bc.write(b, 0, i);
        i = ins.read(b);
    }

    ins.close();
    conn.disconnect();

    content = bc.toByteArray();
    bc.close();
} catch (MalformedURLException e) {
    e.printStackTrace();
} catch (IOException e) {
    e.printStackTrace();
}

 
objects Commented:
You can simplify and buffer the input using:

    URL url = new URL(reqURL);
    InputStream in = new BufferedInputStream(url.openStream());
    byte[] b = new byte[1024];
    int n = 0;
    while (-1 != (n = in.read(b))) {
      bc.write(b, 0, n);
    }
    in.close();
 
lcor (Author) Commented:
I'm steering towards the Jakarta solution, since my research and prototyping show that java.net has issues.  But I'm awarding points for the java.net solution, which was a good example of how it works.
 
CEHJ Commented:
:-)
 
objects Commented:
> shows that java.net has issues.

We use it in hundreds of production applications :)
