• Status: Solved

SFT III primary engine confusion

I am trying to reestablish mirroring between two SFT III servers. One server is up and running fine.
I duplicated the drive from the working server and installed it in the second server. I edited the IOSTART.NCF to change the IOengine name and internal network, then booted it with the MSL disconnected so I could load INSTALL and delete the SYS volume.
I shut down both servers, booted the working server into the IOengine, then booted the second. The IO engines establish a connection OK. When I type ACTIVATE SERVER, the servers abend with the following error:
"partitition object not found in PartitionMappingTable".
The messages that followed indicate that the system is seeing the second server as the primary engine, which implies to me that it's trying to use the non-existent SYS volume.
Where did I go off-track?
How does SFT III determine which engine is the primary engine?
500 points offered to speed response.
scarpenter104
Asked:
 
ShineOn Commented:
Did you go through the Install mirror/unmirror process?  It could be that since you duplicated the drive rather than using the mirror process, it has the identifiers for the primary on the secondary...

Just a thought.
 
ShineOn Commented:
Do that before "activate server" but after the IO engines establish the link, and do it from the primary.
 
umang505 Commented:
Hi
Please check the following link if you haven't already done so:
http://www.novell.com/documentation/nw42/index.html?superenu/data/hosohu3i.html

SFT III Error Log Files
When a failure occurs, SFT III updates three error log files in the SYS:SYSTEM directory:

IO$LOG.ERR records the activity of both IOEngines.
SYS$LOG.ERR records the MSEngine activity.
MSSTATUS.DMP records status dumps of engine states, synchronization and communications states, IOEngine to MSEngine requests, and other information following a failure or server switchover.

Use these error log files to track the events that occurred prior to a failure or following a switchover.

NOTE:  The IO$LOG.ERR file on the failed server is written to its boot partition until the servers come back up. Then, the IO$LOG.ERR file from the boot partition is appended to the IO$LOG.ERR file on volume SYS:.


Should This Machine Become the Primary Server? Message Appears
The message above appears on the secondary server's console preceded by

All communication channels with the primary server have failed. Since the IPX network communication channel failed before the mirrored server link failed, the secondary is unable to determine if the primary server is still active.
Verify that the primary server has failed, and then type Y. If the primary server is still active, type N.



Best of luck
 
PsiCop Commented:
Wow. This is an oldie. Someone still using stuff EOLed years ago, when better functionality is included in modern versions.

Setting the Way Back Machine to 1993, Novell TID #2013270 (http://support.novell.com/cgi-bin/search/searchtid.cgi?/2013270.htm) suggests this will happen if the partitions are not absolutely identical.

What broke the mirror originally? Has one server's hardware been replaced? Novell TID #1200754 (http://support.novell.com/cgi-bin/search/searchtid.cgi?/1200754.htm) indicates that RAM type is crucial, in terms of making sure the machine has the right type.

You're not specific on the VERSION of NetWare, so if you have NetWare 4.1x, TID #1004097 (http://support.novell.com/cgi-bin/search/searchtid.cgi?/1004097.htm) suggests tweaking the load ordering. Similarly, TID #2934550 (http://support.novell.com/cgi-bin/search/searchtid.cgi?/2934550.htm) suggests some troubleshooting ideas, if this is NetWare v4.1x.
 
umang505 Commented:
Hi PsiCop
"This is an oldie. Someone still using stuff EOLed years ago, when better functionality is included in modern versions."
If it ain't broke, don't fix it.
If your server can run for >500 days without any sort of problem, why upgrade it?
That's the point of view of many companies.
:-)


 
PsiCop Commented:
If it were just upgrading for the sake of upgrading, without any fundamental improvement in the environment, I'd agree with you, umang505.

The replacement for SFT, Novell Cluster Services, offers a number of advantages. No special hardware (like the MSL) is needed. The servers don't have to be identical. NCS can accept an iSCSI target for the shared disk. The process of installing and configuring is a lot closer to the installation/configuration of a non-clustered environment; there's no MSSTART.NCF, IOSTART.NCF, etc. to keep up with.

And if the purpose being served here is really so mission-critical that it needs SFT III, what is it doing running on a NOS that was EOLed 5 YEARS ago?

One more thing - when that MSL board dies, how easy do you think it's going to be to find a replacement?
 
umang505 Commented:
Hi
I hope this is not taking the discussion away from the question, but anyway: sometimes it is not in the hands of one person to make the decision about an upgrade. You have all the points in favor of upgrading, but try telling the higher-ups that we need an X amount to upgrade a server that has been sitting pretty for the last 7-8 years, along with all the hours spent reloading client software (not everybody uses ZEN as they should).
Yeah, but if the server is important enough to need SFT III, then you are absolutely RIGHT.
 
ShineOn Commented:
I still say that the servers weren't mirrored properly, and I don't think making a dup copy through whatever process was followed (possibly setting up a mirrored pair in the same server and breaking the mirror?) counts as mirroring them properly for SFTII. Redoing the mirror using INSTALL.NLM from the primary (IO1) server, mirroring to the secondary (IO2) server, might be the answer.

See this TID: http://support.novell.com/cgi-bin/search/searchtid.cgi?/2936924.htm

This is working under the assumption that these 2 servers are the exact same servers that used to be the mirrored pair and the mirror got broken somewhere along the line, and this is the attempt to remirror, with the hardware all still being identical.
 
ShineOn Commented:
I know I typed SFTII instead of SFTIII.  Fingers not worrrkinnnnggg. .  err..  kkk kkk... |-P
 
scarpenter104 (Author) Commented:
I'd appreciate if the discussion about SFT vs. clustering and how dumb it is to try to fix an SFT server could be taken elsewhere. It's making it more difficult to sort out the people really trying to help.

"this will happen if the partitions are not absolutely identical"
They are. The HD in server two is a bit-for-bit duplicate of the drive in server one. The only difference is the SYS volume has been deleted.

To answer a few more questions, the version is 4.11, the two servers are absolutely identical, and the missing mirror partition was deleted before the drive was duplicated. So server one contains a fully functioning server with an unmirrored drive, and server two contains the exact same drive with the SYS volume removed, and the name and internal net address changed in IOSTART.NCF.

If I bring server one up to the IOengine first, then bring server two up to the same point, I get the MSL connection established message. Is the first server I brought up considered primary or does that get determined after activation?

When I type ACTIVATE SERVER from server one, it starts to mount SYS and hangs. If I go back to the IOengine console I find the error message I referred to above. So it's not really possible to use INSTALL.NLM to set up mirroring, because SYS has to mount before I can get a system console or load INSTALL.

What if I moved the drive from the second server into the first and established the mirror there? Would this produce the same result as mirroring them on the second server? I'd like to solve the problem but I would settle for sidestepping it.

As far as load order goes, I'll need to go back out to the client site and do a little study on that.
 
ShineOn Commented:
I think the problem here is a misunderstanding of mirroring.  Mirroring with SFTIII is done from the primary server to an available, identical partition on the secondary server.  That is entirely different from mirroring the hard drive.

Do you have the manual for the version of SFTIII your client has?  I don't know if it is exactly the same process as with 4.2, but the documentation for how to mirror with SFTIII on 4.2 can be found on the Novell documentation page.

Check out the section titled "Install SFTIII servers"

http://www.novell.com/documentation/nw42/index.html
 
scarpenter104 (Author) Commented:
"Mirroring with SFTIII is done from the primary server to an available, identical partition on the secondary server.  That is entirely different from mirroring the hard drive."

What's the difference between what you describe and mirroring of two drives in the same server? Netware 4 mirroring is always mirroring to "an available, identical partition". The process is the same, the partition on the mirror drive is the same. Where am I missing the difference? It appears to me that the only difference is that in one case the mirroring is done on the SCSI bus, the other it's done over an MSL link.
 
ShineOn Commented:
The difference is SFTIII
 
ShineOn Commented:
The install that does the mirroring for SFTIII isn't the install.nlm that you use to load new programs or patches or whatever.  It's the install in the IO Engine.
 
ShineOn Commented:
If nothing else, use that install program to check the mirror status, before trying to do ACTIVATE SERVER.
 
ShineOn Commented:
Also, why would a NetWare server work without a SYS vol, even if it is a secondary server in an SFTIII pair?  For it to take over for the primary in a failover situation, it has to run NetWare, which resides on SYS vol.

I guess I'm a bit rusty on SFTIII - I haven't seen it in operation since, oh, 1995 or so, but it seems to me that when you ACTIVATE SERVER (on IO1) it does its thing with the IO engine mirroring in the background to the IO2 server's mirrored partition, so when you're doing ACTIVATE SERVER, and there's no SYS for it to mirror to in the background, it urps.

Start with a clean NetWare partition, with just the DOS partition and SFTIII files on IO2, and use the IO engine Install program to do the mirroring of the NetWare partition from IO1.  I think the initial mirror has to happen through install running on the IO2 server, telling it to mirror the IO1 partition.

Maybe you could mirror the partition directly, bitwise, like you did and then re-establish the mirror using the IO engine install program (again, before doing ACTIVATE SERVER) but you should mirror both SYS and VOL1, IMHO.  You say you're blowing away SYS vol before trying to reestablish the server pair, but I don't think you should.  Remember, they aren't really two separate NetWare servers in an SFTIII pair - just 2 separate physical boxes.  The only identity difference is in the IOSTART.NCF... the MSL engine makes it a single system image...
 
scarpenter104 (Author) Commented:
Actually the server should come up and run with a SYS volume on only one server, you just aren't fault tolerant at that point. Like any mirror, both servers see the mirror set as one volume so the secondary will start up from the primary SYS volume if mirroring hasn't been established. I think you're mistaken about running the INSTALL in the IO engine. INSTALL.NLM won't run in the IO engine. Even if it did the CDM won't load in the IOEngine so you wouldn't have any drives to set up. You have to set up the mirror from the MSengine, not the IOengine.

At any rate, I just returned from the client site with new info. I tried moving the MSL driver to the beginning of the IOSTART.NCF with no change. Then I booted both systems to the IOengine and typed ACTIVATE SERVER -NS to keep the MSSTART.NCF from mounting SYS and all the other system files from loading. Then I manually loaded CPQSHD.CDM and the IOengines abended. So it appears now that the problem may be an incompatibility between the SCSI driver and the MSL. Interestingly, if I boot one server with the other shut down and activate it comes up just fine. So it's not a compatibility issue with having the MSL driver loaded, just having an MSL link active.
 
ShineOn Commented:
I "misspoke" when I said "install.nlm" - the mirroring in SFTIII is done at the IO engine level with the install program that's on C:.

Even if you disagree, the documentation doesn't.
 
scarpenter104 (Author) Commented:
Got a pair of Smart Array 3200s off eBay and installed them in the servers. Still had partition problems, so this is the procedure I followed to resolve the problem:
- Booted into DOS and used FDISK to delete the NetWare partition.
- Attached a drive with a copy of the original NetWare partition on it.
- Booted MSERVER and typed ACTIVATE SERVER -ns -na.
- Loaded INSTALL.NLM and recreated the NetWare partition.
- Mirrored the original partition with the newly created partition and allowed synchronization to complete.
- Shut down, removed the extra drive and rebooted normally.
- Loaded INSTALL.NLM and deleted the missing mirror segment, leaving me with a clean, new NetWare partition with all the old data on it.
- Booted the second server to DOS and removed the NetWare partition.
- Ran MSERVER on both servers and allowed the MSL to connect.
- Activated, ran INSTALL and created the mirror volume on the second server (from the MSengine, not the IOengine).
After synchronization was complete, everything functioned properly.
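
For reference, the typed-command side of the first few steps above might look roughly like the following. This is only a sketch built from the commands named in this post. The lines starting with # are annotations, not console input. LOAD INSTALL is assumed to be how INSTALL.NLM was started. The partition and mirroring work itself happens in the INSTALL menus, not as typed commands.

```
# At the DOS prompt: remove the old NetWare partition
FDISK

# Start SFT III, then activate with the switches used above
MSERVER
ACTIVATE SERVER -ns -na

# From the MSEngine console: recreate the partition and set up the mirror
LOAD INSTALL
```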

Giving the points to PsiCop because it was the documents that he referenced that got me moving in the right direction. Oh and PsiCop, I did have to replace the MSL boards and they were fairly easy to find at $20 each. ebay is your friend;^)
 
ShineOn Commented:
Sounds to me like you did a lot more work than was necessary. It strikes me that had you followed the instructions (gasp) it might have gone smoother.  Just my $.02.

eBay *is* nice, but I wouldn't bet my business on always finding needed parts there.  Maybe, if you can find 'em, you should buy a couple of spare MSL's of the same make/model just to have them on the shelf.