Solved

My Program sometimes freezes, the log file shows error 10054 and ADS Error 7020

Posted on 2008-02-08
Last Modified: 2012-06-21
A week ago I installed our application at a new customer's premises. It has been a nightmare. Every so often the app hangs, and the only solution is to kill the process from Task Manager. Yesterday I examined the Advantage error log and found a whole stack of 10054 socket errors followed by ADS 7020 (User logged out) errors. At one stage, when the workstation had locked up, I examined the active queries in Advantage Data Architect; it showed active queries running for the locked terminal, and their percent-complete status was in the range of -495000.00%. The problem seems to happen when a transaction is committed to the database. I put some diagnostic messages into the code as follows:

 DisplayMessage('Please Wait...., Saving Transaction');
 AdsConn.BeginTransaction;
 try
   // ....
   // UPDATE THE VARIOUS TABLES
   // ....
 except
   // Any failure while updating: undo the transaction, report it, and leave
   AdsConn.RollbackTransaction;
   StandardEurekaNotify(ExceptObject, ExceptAddr);
   TransactionSaved := False;
   Exit;
 end;

 // Reached only when every update succeeded
 AdsConn.CommitTransaction;
 TransactionSaved := True;
 HideMessage();

When the code executes, it displays the message, updates the database, hides the message, and then sometimes hangs.

We have other customers with a similar problem, but on their systems it only happens every couple of weeks.

Can anybody help?

Regards.
Robert.

Question by:robert_n_harris
8 Comments
 
LVL 24

Expert Comment

by:Joe Woodhouse
ID: 20856368
Has Sybase Tech Support had any suggestions?

I'm not familiar with the new Advantage suite, but if you're having socket errors and hangs, it's time to bring in the network people at this client (or your own network people). Do they have a healthy LAN? (What collision rate and retries are they seeing?) What is the TCP_KEEPALIVE setting on the server? Is the Windows Server healthy? (Anything showing up in its Event Viewer?)

(These are necessarily vague hints at this point; hopefully we can zero in on the issue.)
 

Author Comment

by:robert_n_harris
ID: 20856821
Hi Joe,
I raised an issue with Sybase Tech Support many weeks ago about a similar problem. It has not been resolved.

We brought in a network company a couple of days ago. They changed the ADS router and re-terminated network cables. We also reconfigured Norton Security Suite on the server to ignore our database files. Other than that, they did not seem to know what else to do.

I went to examine the 'System Event Log' yesterday and got the message 'Event Log Corrupt'.

Could you tell me where to get the collision rate and retries, and also the TCP_KEEPALIVE setting?
 
LVL 24

Expert Comment

by:Joe Woodhouse
ID: 20856840
I would've hoped your network people would have looked into more than just the physical wiring... oh well.

Open a command prompt, run "netstat -a". If you see a non-trivial percentage of the TCP connections (could be a lot of output) in a "TIME_WAIT" state then we suspect TCP_KEEPALIVE is hanging onto dead connections for too long.
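For anyone following along, here is a rough Python sketch of how the "netstat -a" output could be tallied by connection state rather than read by eye. The function name tally_tcp_states and the sample text are hypothetical, made up purely for illustration; the sketch assumes the state is the last column of each TCP row, which holds for plain "netstat -a" but not when extra columns such as the PID (-o) are requested.

```python
from collections import Counter


def tally_tcp_states(netstat_output: str) -> Counter:
    """Count TCP rows of `netstat -a` style output by their state column."""
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # A TCP row is: Proto, Local Address, Foreign Address, State
        # (Linux adds two queue columns, but the state is still last).
        if len(fields) >= 4 and fields[0].upper().startswith("TCP"):
            states[fields[-1]] += 1
    return states


if __name__ == "__main__":
    # Hypothetical sample, not taken from the customer's server.
    sample = """
      Proto  Local Address      Foreign Address    State
      TCP    0.0.0.0:135        0.0.0.0:0          LISTENING
      TCP    10.0.0.5:6262      10.0.0.21:1044     ESTABLISHED
      TCP    10.0.0.5:6262      10.0.0.22:1051     TIME_WAIT
    """
    counts = tally_tcp_states(sample)
    total = sum(counts.values())
    for state, n in counts.most_common():
        print(f"{state}: {n} of {total}")
```

To use it on a live machine, the captured output of "netstat -a" (for example from a redirected text file) would be passed in as the string.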

I'm not sure how to check for collisions or network retries under Windows, nor how to check the current setting for TCP_KEEPALIVE. (More of a UNIX person myself.) For what it's worth, your network people totally should know how to do this, and should at least have looked into collisions & retries.

If the Windows Server event log is corrupt then I think we have to suspect more things are wrong with this box. At a minimum I would be looking to restart it, and perhaps make sure your database server is being given a fixed IP rather than DHCP (and that the DHCP server reserves that IP so no-one else can use it).

Sorry I can't help you more with this. I think we have to suspect both Windows and the network setup at this point. I'm not claiming it couldn't be a Sybase problem, but "Event Log Corrupt" is not something a healthy Windows machine will ever say.
 

Author Comment

by:robert_n_harris
ID: 20864808
Hi Joe, I have examined the TCP connections using 'netstat -a'. There were only 2 connections in a TIME_WAIT state. I searched the registry for TCPKEEPALIVE, but it does not exist. Maybe I phrased the question wrongly. I need to find out more about 10054 errors. Specifically:

1. What is a 10054 error? I can see from searching the internet that it means 'Connection Reset by Peer', but what does that mean?

2. How can I reproduce one here on my development machine?

3. How can I correct it?

Sorry for the delay in getting back to you, but I have been doing a lot of thinking.
 
LVL 24

Accepted Solution

by:
Joe Woodhouse earned 1500 total points
ID: 20864920
We're beyond where I can help you with this, I'm afraid.

"Connection reset by peer" means a connection was broken and your Advantage server thinks it was from the other end. It may or may not be correct about that.
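As an illustration of what a reset looks like from the receiving end, here is a minimal Python sketch that provokes one on the loopback interface. It is not how the Advantage server or client works internally; it only uses the general socket trick of closing with SO_LINGER set to a zero timeout, which makes the closing side send a TCP RST instead of a normal shutdown. The function names are hypothetical. On Windows the resulting error is the Winsock code 10054 (WSAECONNRESET); other platforms report their own ECONNRESET number.

```python
import socket
import struct
import sys
import threading


def _accept_and_reset(listener: socket.socket) -> None:
    """Accept one connection, then close it abortively so the peer sees a reset."""
    conn, _ = listener.accept()
    # struct linger is two 16-bit fields on Windows and two ints elsewhere.
    fmt = "hh" if sys.platform == "win32" else "ii"
    # l_onoff=1, l_linger=0: close() discards the connection and sends RST.
    conn.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack(fmt, 1, 0))
    conn.close()


def provoke_reset():
    """Return the exception the client gets when its peer resets the connection."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    worker = threading.Thread(target=_accept_and_reset, args=(listener,))
    worker.start()
    client = socket.create_connection(("127.0.0.1", port), timeout=5)
    worker.join()  # the server side has closed abortively by now
    try:
        client.recv(1024)
        return None  # no reset observed
    except ConnectionResetError as exc:
        return exc
    finally:
        client.close()
        listener.close()


if __name__ == "__main__":
    print(repr(provoke_reset()))
```

This only demonstrates the symptom on one machine. It says nothing about which side, or which device in between, is resetting the connections at the customer's site.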

My (old) sources tell me TCP_KEEPALIVE is handled in Windows in the registry setting

HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Tcpip\Parameters\KeepAliveTime

and that the (decimal) value is in ms. A good number is probably around 15 minutes, which would be 900,000 (decimal).
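For reference, the arithmetic behind that figure can be checked with a few lines of Python. The helper names are hypothetical, and the hex form is shown only because registry DWORD values are often displayed in hexadecimal.

```python
def keepalive_ms(minutes: int) -> int:
    """Convert a keepalive interval in minutes to the millisecond value."""
    return minutes * 60 * 1000


def as_dword_hex(value: int) -> str:
    """Format a value the way a 32-bit DWORD is usually shown in hex."""
    return format(value, "08x")


if __name__ == "__main__":
    for minutes in (15, 120):  # 15 minutes as suggested; 120 = the 2-hour default
        ms = keepalive_ms(minutes)
        print(minutes, "min =", ms, "ms = 0x" + as_dword_hex(ms))
```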


The only way I can think to reproduce this on another machine would be to give the DEV machine the PROD machine's IP address (after changing PROD), and patching it into the same switch.

You need some network people to look at things like network addresses, settings and traffic. Checking physical cables etc was not a waste of time but is a bit strange as the first thing to test, let alone the only thing to test.
 
LVL 24

Expert Comment

by:Joe Woodhouse
ID: 20917126
Was it just a KEEPALIVE issue?
 

Author Comment

by:robert_n_harris
ID: 20917175
Hi Joe,
  No. I really don't know. There was no keepalive setting in the registry. My original problem was that my program was freezing when committing a transaction. I noticed from Advantage's error log file that a socket error 10054 followed by a 'user disconnect' was logged whilst this was happening. Unfortunately, things are never what they seem. On Wednesday and Thursday last week the program froze, but these error messages were not logged. This makes me think that something else is going on. If I could find a way of duplicating the 10054 error, then I might be getting somewhere.

On a separate note, they use Windows 2003 Server, and last week the system event log became corrupt on 2 separate occasions. This may also be related to the freezes they are having. On Thursday we changed all of the NICs to Half Duplex from the original setting of 'Auto'. We had no reported errors on Friday, although I worry that my customer is just getting used to rebooting rather than reporting the error.

Any further comments would be most useful.
 
LVL 24

Expert Comment

by:Joe Woodhouse
ID: 20917198
Autonegotiation can indeed cause some network mischief if the devices are not all playing together nicely.

If there was no KEEPALIVE setting then it means you're getting default behaviour of 2 hours... but yeah, from your description that probably isn't the root cause, just making things worse when the problem occurs.

I was asking because I wasn't confident I'd earned the points. Now I know I haven't. Will keep thinking about this for you.
