deelew41

asked on

Domino server crashing frequently

During the last two weeks, my Domino server (v7.0.2) has been crashing, sometimes daily, sometimes every other day. I receive the following message when restarting the Domino server:
"(servername) has faulted and is now back up and running".
I checked the NSD log, but I am not sure what to look for to find the cause of the crash. I ran nfixup.exe and it seemed to fix things for a couple of days, but then the server crashed again. I have several users with mail databases over 10GB. Could this be the cause of the crashes?

Any help will be appreciated!
Thanks!
Sjef Bosman

There are a few potential causes for this behaviour.

1/ Most likely it is one of the many databases on the system, with a corruption somewhere. The crash info can be very useful: it might supply the necessary clues as to which task was active and with which database.
2/ It might also be a system database, e.g. the router uses one that's called mail.box, or mail1.box. These databases are heavily used and are a known, albeit rare, source of trouble. You could stop the Domino server, rename all files mail*.box, and restart the server. The necessary databases will be re-created by the Router task.
3/ Last but not least: upgrade!
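Step 2/ above could be scripted roughly as follows. This is only a sketch in Python: the function name `rename_mailboxes`, the `.bad` suffix and the data directory you pass in are assumptions for illustration, not anything Domino provides. It must only be run while the Domino server is stopped, and it defaults to a dry run that just lists what it would rename.

```python
from fnmatch import fnmatch
from pathlib import Path


def rename_mailboxes(data_dir, suffix=".bad", dry_run=True):
    """Rename every mail*.box file in data_dir by appending a suffix.

    Returns the list of (old_path, new_path) pairs. With dry_run=True
    nothing on disk is changed.
    """
    planned = []
    for path in sorted(Path(data_dir).iterdir()):
        # Compare in lower case so MAIL.BOX is caught on any platform.
        if path.is_file() and fnmatch(path.name.lower(), "mail*.box"):
            target = path.with_name(path.name + suffix)
            if target.exists():
                # Never overwrite an earlier backup.
                raise FileExistsError(f"{target} already exists")
            planned.append((path, target))
    if not dry_run:
        for path, target in planned:
            path.rename(target)
    return planned
```

Call it once with the default `dry_run=True` to review the list, then again with `dry_run=False`. The renamed files stay in place as a backup in case any mail was still waiting in them.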
deelew41

ASKER

Thank you for your reply @sjef!
1. I will try and check the crash info and see if I can determine the cause of the crash.
2. If not, I will try and re-create the necessary mail.box databases.
3. I am in the process of getting information on how to complete the upgrade. I know it needs to be done as I cannot get support from IBM for v7.0.2!!! Are you familiar with the upgrade from 7.0.2 to 8.5.3, or could you provide instructions for it?
About #3: I just read your other question :-)
We had the same problem with full-text indexes on mail files larger than 10GB on 7.0.2.
Deleted the FT and the server stayed up. Of course the users are not happy.
Upgrading to 8.5.2 did not fix the problem, by the way; it still crashes.
So users now have a replica copy of their mail files locally, encrypted and full-text indexed.
My server is still crashing! I have renamed the mail.box file to bad and had the server re-create another mail.box, I renamed the log.nsf file to bad and had the server re-create the log.nsf file, as well as the ddm.nsf. It ran fine for a couple days and began to crash frequently again. I looked at the nsd log file but could not find anywhere that said the cause of the crash. I am at a loss and need to get this fixed ASAP. I have attached the nsd log file; could someone please take a look and see if you can let me know the cause of these crashes??? I know I need to upgrade the server but I need to get the clients upgraded first. If I can at least have time to upgrade the clients and still have the server run, that would be great. Thanks!!!
nsd-W32I-MLP-LN2-2013-02-08-07-0.log
Difficult to tell... As far as I can tell, the SMTP task crashes, while executing some external script:

 [ 1] 0x7c8285ec ntdll+165356 (1f4,927c0,0,4e0dbf4)
 [ 2] 0x77e61c8d KERNEL32+138381 (1f4,927c0,4e0de0c,3)
@[ 3] 0x6018fe17 nnotes._OSRunExternalScript@4+1111 (4e0de0c)
@[ 4] 0x601909c4 nnotes._FRTerminateWindowsResources+980 (1,0,0,4e0e904)
@[ 5] 0x60190d78 nnotes._OSFaultCleanupExt@20+872 (b74a34,0,0,0,0)
@[ 6] 0x60190dd8 nnotes._OSFaultCleanup@12+24 (0,0,0)
@[ 7] 0x6019c822 nnotes._OSNTUnhandledExceptionFilter@4+178 (4e0e904)
 [ 8] 0x77e761b7 KERNEL32+221623 (4e0e904,77e61ac1,4e0e90c,0)
 [ 9] 0x77e792a3 KERNEL32+234147 (0,0,0,0)

That's all I can see. What the external script is, I don't know.
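For anyone who wants to pull frames like the ones above out of a large NSD log, here is a rough Python sketch. The regular expression is an assumption based only on the nine lines quoted in this post, so other NSD versions or platforms may format frames differently; `parse_frames` is a made-up helper name. It records whether a line carries the leading `@` without attaching any meaning to it.

```python
import re

# Pattern derived from the frames quoted above, e.g.
# "@[ 3] 0x6018fe17 nnotes._OSRunExternalScript@4+1111 (4e0de0c)"
FRAME_RE = re.compile(
    r"^\s*(@?)\s*\[\s*(\d+)\]\s+(0x[0-9a-fA-F]+)\s+(\S+?)\+(\d+)\s+\((.*)\)\s*$"
)


def parse_frames(text):
    """Return one dict per stack-frame line found in text."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.match(line)
        if not m:
            continue  # skip anything that is not a frame line
        marked, index, address, symbol, offset, args = m.groups()
        # "nnotes._OSFaultCleanup@12" -> module "nnotes", function "_OSFaultCleanup@12"
        module, _, function = symbol.partition(".")
        frames.append({
            "marked": marked == "@",
            "index": int(index),
            "address": address,
            "module": module,
            "function": function or None,
            "offset": int(offset),
            "args": args.split(",") if args else [],
        })
    return frames


SAMPLE = """
 [ 1] 0x7c8285ec ntdll+165356 (1f4,927c0,0,4e0dbf4)
 [ 2] 0x77e61c8d KERNEL32+138381 (1f4,927c0,4e0de0c,3)
@[ 3] 0x6018fe17 nnotes._OSRunExternalScript@4+1111 (4e0de0c)
@[ 4] 0x601909c4 nnotes._FRTerminateWindowsResources+980 (1,0,0,4e0e904)
@[ 5] 0x60190d78 nnotes._OSFaultCleanupExt@20+872 (b74a34,0,0,0,0)
@[ 6] 0x60190dd8 nnotes._OSFaultCleanup@12+24 (0,0,0)
@[ 7] 0x6019c822 nnotes._OSNTUnhandledExceptionFilter@4+178 (4e0e904)
 [ 8] 0x77e761b7 KERNEL32+221623 (4e0e904,77e61ac1,4e0e90c,0)
 [ 9] 0x77e792a3 KERNEL32+234147 (0,0,0,0)
"""

if __name__ == "__main__":
    for frame in parse_frames(SAMPLE):
        if frame["function"]:
            print(frame["index"], frame["module"], frame["function"])
```

Printing only the frames that have a function name makes it quicker to see which nnotes routines were on the stack when the fault handler ran.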

IMHO it's a lot better to upgrade the server first, and the clients afterwards. Why do you prefer to do the clients first?
I thought I had read somewhere online that it was suggested to upgrade the clients before the server! I guess I may have misread the post!!! I am unsure what the external script would be either, as the server was running before I started working here. The weird thing is that it ran fine up until the last couple of weeks, and since then it has been crashing regularly. I was recently asked to run a command to allow winmail.dat files to open in Lotus on this server (and our other Lotus server). Could that maybe be the problem? Good starting point????
ASKER CERTIFIED SOLUTION
Sjef Bosman
I had to add the following commands:

Set config TNEFKeepAttachment=1
Set config TNEFEnableConversion=1
tell router update config

Would this config be in the notes.ini file? Otherwise, where and how do I go about undoing this modification? I did not install anything on the server otherwise. It is the last thing I remember changing/adding on the server before the crashes started happening....at least that is what I remember!!!
Ah... I checked, and found this document.

See the Restrictions...
I found the TNEF commands I added in the notes.ini file. I am going to remove them and see how long the server runs!!! Hopefully I can close this question!!! Thanks for your help!
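Removing those two lines by hand is easy enough, but for anyone who has to do it on several servers, a small Python sketch along these lines might help. The helper name `strip_settings` is made up for illustration; the two key names are the ones from the commands quoted earlier in this thread. It works on the text of notes.ini only, so reading the file, keeping a backup copy and writing the result back are left to you, and the server will most likely need a restart (or at least a router restart) before the change takes effect.

```python
# Settings added by the two "Set config" commands quoted in this thread.
TNEF_KEYS = {"tnefkeepattachment", "tnefenableconversion"}


def strip_settings(ini_text, keys=TNEF_KEYS):
    """Remove Key=Value lines whose key is in keys (case-insensitive).

    Returns (new_text, removed_lines) so the removed lines can be logged
    or restored later.
    """
    kept, removed = [], []
    for line in ini_text.splitlines(keepends=True):
        key = line.split("=", 1)[0].strip().lower()
        if "=" in line and key in keys:
            removed.append(line.rstrip("\r\n"))
        else:
            kept.append(line)  # everything else passes through unchanged
    return "".join(kept), removed
```

Keeping the removed lines in the return value means the change can be undone simply by appending them to notes.ini again.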
Or upgrade to 7.0.2FP9 ;-) ...  or maybe even newer...
Yes, I plan on upgrading to at least 8.5.x once I have this issue resolved and receive the hardware I need (it is on order). I need to upgrade since I have no IBM support for this version!!!
Really, check all the documentation you can find on the subject, and then upgrade the server first. Quote: "IBM Lotus recommends upgrading servers before clients".

Have a nice weekend!
Server first for sure.  
Never tried to run the Winmail.dat conversion on the server because it crashes often on the client... and I'd rather crash clients than servers.
Yes, I have unfortunately found that out!!! The article I found said to do it on the server but obviously not!!!

Have a great weekend!
SOLUTION
The server has run without crashing for the last three days!! I believe it had to do with the command I ran to allow winmail.dat files through the server. Once it was removed, the server has run fine. Lesson learned!!!
What lesson exactly? ;-))

Good news, by the way!