Solved

Io exception: Broken pipe

Posted on 2005-05-11
16
Medium Priority
11,465 Views
Last Modified: 2011-08-18
I'm not sure whether this is the correct forum to post this question, since the technologies involve Oracle, Java, as well as networking. Please let me know if you think I should post this question elsewhere.

Recently, we started receiving the following error in our log file:

java.sql.SQLException: Io exception: Connection reset by peer: Connection reset by peer
Io exception: Broken pipe

We have been running the application successfully for several weeks with no occurrences of this error. We also have several other applications, identical to the one in question except that each of the other instances connects to a different remote database. Because these other instances are running correctly and this one is not, I was under the impression that the problem was on the remote database server side. I phoned the administrator over there, and they said a firewall had to be replaced, so I gave them our IP so that we could get through their firewall. However, the error appeared again last night (the application in question runs once every night).

The initial part of the application appears to be working correctly. Essentially, what we have at the remote end is a small PL/SQL program which waits for a trigger to tell it that new data has been inserted. After the program is notified, the data is sent to an Oracle pipe. At the local end, we have a small Java program which checks the Oracle pipe to see if any data is available. If so, all the data is retrieved, and then the Java program enters a loop which checks the Oracle pipe for new data on each iteration and retrieves any available data. The first part of the Java program appears to retrieve the initial data just fine. However, once the loop starts, the above error is received and the loop exits, terminating the Java program.

I still think something is wrong at the remote database end, but cannot pin down the problem. Any assistance in figuring out what is causing the above error message is much appreciated! I searched online for an explanation, and was not able to determine the problem.

I am assigning a high point value to this question since this is an urgent matter.

Thanks.
Question by:Electrokardiogram
13 Comments
 
LVL 48

Expert Comment

by:schwertner
ID: 13977166
It depends on the Oracle version, machine brand, OS, and JDK version.
0
 
LVL 48

Expert Comment

by:schwertner
ID: 13977172
Try pinging the listener to see whether it is available (replace alias with your net service name; 7 is the number of attempts):

C:\>tnsping alias 7
0
 
LVL 12

Expert Comment

by:geotiger
ID: 13977183

Are you connecting to the Oracle database through pooling or dedicated connection? If it is pooling, you may lose the connection during your loop.

You can read more about it at http://forum.java.sun.com/thread.jspa?threadID=394408&messageID=1722730
0
LVL 78

Expert Comment

by:slightwv (䄆 Netminder)
ID: 13977184
Not sure if this is the answer or not, but since it's a new firewall, see if there is a timeout on that port. Although we don't use Java, I have a specific DB that I get to through a firewall with a 2 hour idle connection timeout. If I run any long transactions from SQL*Plus, the firewall sees the connection as idle and closes it.
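If an idle timeout does turn out to be the cause, one common workaround is to send a trivial statement before the connection has been quiet for too long. Below is a minimal Java sketch of just the timing logic. The class name IdleKeepAlive, its methods, and the threshold are hypothetical and not taken from the application discussed here; the caller passes in the current time so the logic does not depend on a live database.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: decides when an idle connection needs a heartbeat. */
final class IdleKeepAlive {
    private final long maxIdleNanos;   // longest quiet period allowed
    private long lastActivityNanos;    // time of the last real or heartbeat call

    IdleKeepAlive(long maxIdle, TimeUnit unit, long nowNanos) {
        this.maxIdleNanos = unit.toNanos(maxIdle);
        this.lastActivityNanos = nowNanos;
    }

    /** Call after every statement that actually used the connection. */
    void markActivity(long nowNanos) {
        lastActivityNanos = nowNanos;
    }

    /** True once the connection has been quiet for at least the allowed period. */
    boolean heartbeatDue(long nowNanos) {
        return nowNanos - lastActivityNanos >= maxIdleNanos;
    }
}
```

In a polling loop, the caller could check heartbeatDue(System.nanoTime()) on each pass and, when it returns true, run something cheap such as SELECT 1 FROM DUAL and then call markActivity. The threshold has to be shorter than whatever timeout the firewall really enforces, which is unknown in this thread.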
0
 
LVL 48

Expert Comment

by:schwertner
ID: 13977267
See also the profiles of the Oracle user you use for idle time restriction.
0
 
LVL 78

Expert Comment

by:slightwv (䄆 Netminder)
ID: 13977314
schwertner,
normally I'd agree with everything you've said if it wasn't for the fact that the app was working fine for weeks and the only thing that is known to have changed is the replacement of a firewall.  I'm betting that there was not a hard and fast configuration on this firewall and when the new one was set up, items were missed.
0
 

Author Comment

by:Electrokardiogram
ID: 13977822
I just contacted the network administrator at the remote end, and he says that they are still having connectivity issues at his location. I am not sure what he means by "connectivity issues" - probably some unknown network issues that are causing connection anomalies. He claims that there are no timeout settings enabled on their firewall.

Please remember - our application is able to connect through their network and process available data at the start of our application. After this initial processing, however, we get the errors. I just took a look at our log files again, and the time duration between the beginning of the loop to which I referred (which follows the initial processing) and the error is always under 10 minutes (around 5 minutes is the most common).
0
 
LVL 78

Expert Comment

by:slightwv (䄆 Netminder)
ID: 13977919
I would probably wait until they have resolved their "connectivity issues" before you troubleshoot much more.  Has the amount of data processed by this process increased recently?

I'm looking for the 'what changed' parameters.  One of these has to be the cause.  It might be the timeout settings mentioned by  schwertner if the amount of data has crossed over some threshold.
0
 

Author Comment

by:Electrokardiogram
ID: 13978341
No, the amount of data processed by this process has not increased recently. According to my log files, the amount of data has been consistently quite low since I added the job to our crontab schedule. The other instances of the application are processing a much higher amount of data.

Unfortunately, I do not have access to the remote network administrator's information - according to him, there are no timeout settings enabled on their firewall. It seems like there may be timeouts, based on my log files, but I can only rely on what the network administrator is telling me.

Besides timeout settings, of what other settings do you think I should be aware?

Thanks for all of your feedback so far.
0
 
LVL 78

Accepted Solution

by:
slightwv (䄆 Netminder) earned 2000 total points
ID: 13978525
There are SQLNet timeout settings but these should be server wide.  I'm not all that familiar with profiles so I'll have to refer to schwertner or the docs on the idle time restriction he mentioned.

Can you set up some type of test program to help troubleshoot? If this connection gets dropped too, then you've eliminated all of the app code as the cause.

Possibly a simple PL/SQL loop that displays sysdate. Connect to the remote DB from a client-side SQL*Plus session and try something like this (untested, I'm typing it in from here):

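set serveroutput on size 1000000
begin
    -- Holds the session open for about 28 hours (10000 iterations x 10 seconds).
    -- SQL*Plus prints the dbms_output lines only after the block finishes,
    -- and dbms_lock.sleep requires EXECUTE on DBMS_LOCK.
    for i in 1..10000 loop
        dbms_output.put_line(to_char(sysdate,'MM/DD/YYYY  HH24:MI:SS'));
        dbms_lock.sleep(10);
    end loop;
end;
/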
0
 
LVL 22

Expert Comment

by:earth man2
ID: 13978715
Of course you really need your app to be robust enough to recover from lost connections.  Don't forget pipes lose data when power is lost so you should consider using AQ messaging instead if this data is valuable.
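As a rough illustration of recovering from a lost connection, here is a minimal Java sketch of a retry wrapper. RetryingCall, SqlTask, and withRetry are hypothetical names and not part of JDBC or of the application in question. The sketch assumes the task passed in opens a fresh connection each time it runs, since a connection that has failed with "Broken pipe" cannot be reused.

```java
import java.sql.SQLException;

/** Hypothetical helper: retries a database task when it fails with SQLException. */
final class RetryingCall {

    /** A unit of database work that may fail with SQLException. */
    @FunctionalInterface
    interface SqlTask<T> {
        T run() throws SQLException;
    }

    /**
     * Runs the task up to maxAttempts times, pausing between attempts.
     * Rethrows the last SQLException if every attempt fails.
     */
    static <T> T withRetry(SqlTask<T> task, int maxAttempts, long pauseMillis)
            throws SQLException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        SQLException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.run();
            } catch (SQLException e) {
                last = e;                      // remember the failure
                if (attempt < maxAttempts) {
                    Thread.sleep(pauseMillis); // fixed pause before retrying
                }
            }
        }
        throw last;
    }
}
```

A caller would wrap one pass of the polling work (connect, read the pipe, close) in the task. The sketch retries on every SQLException for simplicity; a real application would probably want to retry only on connection-level failures and log each attempt.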
0
 
LVL 78

Expert Comment

by:slightwv (䄆 Netminder)
ID: 14526345
even split?
0
 
LVL 78

Expert Comment

by:slightwv (䄆 Netminder)
ID: 14587944
Still suggest even split:
schwertner, geotiger, earthman2 and myself
0


Question has a verified solution.