Red Hat Cluster Fencing Method for SAN-Boot Nodes


We have a two-node Red Hat cluster: tc1 & tc2
(existing cluster.conf node names: tcbi1 & tcbi2)

Services: XYZ, ABC
Fencing method: Brocade
OS: installed on RAID 1 local disks on each server
15 LUNs connected
2 NICs per server: one public, one for heartbeat

Now we are planning to upgrade our hardware with the same configuration, with one difference: the two new nodes are going to boot from SAN. My question is: can we use the same fencing method?

If we use the same Brocade fencing method, the fenced server will lose storage access after fencing; since the OS boots from SAN, that would crash the system.

** What are my options?

Our new hardware:

Server: Fujitsu BX960, booting from SAN
Storage: EMC CX4-120
OS: Red Hat Enterprise Linux 5.7 x86_64
Multipathing: EMC PowerPath 5.6

Looking for expert help on this.

Thank You

Kerem ERSOY (President) commented:

You can move your cluster to the new hardware. Since the nodes will boot from the SAN, you'll need the SAN drivers in the boot initrd if they are not already there. Fencing is used to isolate and reboot a node when it becomes unresponsive, so it has nothing to do with where you boot your system. There is no limitation on where the system boots from.

AhmedQme (Author) commented:
Thank you for your quick reply.

The existing clustered servers boot from local disks, and we use Brocade switches for the fence/unfence method:

/sbin/fence_brocade -l myusername -p mypassword -o disable -n 1 -a
/sbin/fence_brocade -l myusername -p mypassword -o enable -n 1 -a

Now we have configured the new diskless servers, booting from SAN.
My question is: can we use Brocade fencing for SAN-boot servers?
Will the fenced server lose storage connectivity? If yes, I need your help finding an alternative solution.

Thank You
Kerem ERSOY (President) commented:
As I said, fencing and booting are not related; doing one has no effect on the other. You can boot from wherever you want (local disk, NAS, HBA, ...) and still have fencing work for you over the SAN switch.

Fencing is just a way for the cluster to isolate a node and reboot it. What I am describing is removing the node from cluster membership to prevent two systems from accessing the resources concurrently. It does not remove SAN connectivity. When fencing notices that a node is not responding, it drops it from the cluster but does not interfere with SAN connectivity, so you will still be able to access your system volumes. Otherwise it could easily disrupt the system, since cutting off storage without a proper unmount is not a good method.


Kerem ERSOY (President) commented:
In fact, I understand you are concerned that fencing will remove the SAN connection as part of the fencing process. But note that you will have two different LUNs over the SAN: one for booting and the other for the cluster. When fencing kicks off, it has two methods to isolate the node from the SAN (given your connection):

- disabling a Fibre Channel switch port,
- revoking the host's SCSI-3 reservations.

Obviously, if you attach your system to the SAN over a single port, you can't just disable that port. If you want to use switch-port disabling, use two ports: one for boot and one for the cluster storage. If all you have is a single port, then revoke the SCSI-3 reservation for the cluster volume, so that you disable only cluster access, not access to the system volume.
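To illustrate the second option: in Red Hat Cluster Suite the SCSI-3 reservation approach is handled by the fence_scsi agent in cluster.conf. A minimal sketch, assuming two nodes named node1 and node2 (the device and node names are placeholders, not from this thread; check the fence_scsi man page for your RHEL 5 release before using it):

```
<clusternodes>
  <clusternode name="node1" nodeid="1" votes="1">
    <fence>
      <method name="1">
        <device name="scsi_fence" node="node1"/>
      </method>
    </fence>
  </clusternode>
  <clusternode name="node2" nodeid="2" votes="1">
    <fence>
      <method name="1">
        <device name="scsi_fence" node="node2"/>
      </method>
    </fence>
  </clusternode>
</clusternodes>
<fencedevices>
  <!-- fence_scsi removes the failed node's registration key from the
       shared cluster LUNs only; the boot LUN is untouched -->
  <fencedevice agent="fence_scsi" name="scsi_fence"/>
</fencedevices>
```

Note that fence_scsi requires the cluster LUNs to support SCSI-3 persistent reservations, which the EMC CX4 series generally does.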

BTW, why do you boot your system over the SAN? This can create a single point of failure if your SAN connection is not redundant. I always prefer to have at least two drives and boot from a RAID-1 array.

jgiordano commented:
Fencing is the method of rebooting or totally isolating a server in order to prevent split brain (two nodes accessing the same storage).

Most fence methods that I am aware of reboot the server that can't be communicated with. This is done by turning off the power via an APC switch, or via remote management such as iLO, ILOM, RSA, etc.

The last option, if supported in your configuration, is to use SCSI reservations, which I have not done and don't know much about. This option may allow you to cut off devices at the LUN level.
Kerem ERSOY (President) commented:
As far as I understand, this is not the problem, jgiordano. Everybody is aware that fencing reboots the system by some method. The question is whether it shuts the system down properly, or suddenly severs the connections so that the system boot volumes suffer corruption.
jgiordano commented:
He was asking for options; I suggested looking into SCSI reservations.

As per the cluster documentation:

7.1.3. SCSI Fencing with Persistent Reservations
Red Hat Cluster Suite is able to perform fencing via SCSI persistent reservations by simply removing a node's registration key from all devices. When a node failure occurs, the fence_scsi agent will remove the failed node's key from all devices, thus preventing it from being able to write to those devices.

This might allow fencing without ripping the storage out from underneath a running OS and causing possible corruption.
Kerem ERSOY (President) commented:
I had already said this; please see my note, ID: 37795602.
AhmedQme (Author) commented:
Thank you for your answers; they really helped a lot.

I have already configured the cluster and it is working fine, but I still have a problem with fencing.

I also contacted Red Hat support; they suggested using fence_ipmilan:

<clusternodes>
  <clusternode name="niacnica1-priv" nodeid="1" votes="1">
    <fence>
      <method name="1">
        <device name="ilo3_1" action="reboot"/>
      </method>
    </fence>
  </clusternode>
  <clusternode name="niacnica2-priv" nodeid="2" votes="1">
    <fence>
      <method name="1">
        <device name="ilo3_2" action="reboot"/>
      </method>
    </fence>
  </clusternode>
</clusternodes>
<fencedevices>
  <fencedevice agent="fence_ipmilan" ipaddr="" login="Admin" name="ilo3_1" passwd="***" lanplus="1" method="cycle" power_wait="4"/>
  <fencedevice agent="fence_ipmilan" ipaddr="" login="Admin" name="ilo3_2" passwd="***" lanplus="1" method="cycle" power_wait="4"/>
</fencedevices>


Unfortunately, ipmilan is not working; it returns this error:
ipmilan: ipmitool not found! failed: failed to initialize
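For reference, that error usually just means the ipmitool binary that fence_ipmilan calls is not installed on the node. On RHEL 5 it is shipped in the OpenIPMI-tools package (package name from memory; verify with yum search). A sketch of the fix, with the BMC address and credentials as placeholders:

```
# check whether the binary fence_ipmilan needs is present
which ipmitool

# on RHEL 5, ipmitool is provided by the OpenIPMI-tools package
yum install OpenIPMI-tools

# then test the fence device by hand before relying on cluster.conf
# (-P enables lanplus, matching lanplus="1" in the config above)
fence_ipmilan -a <bmc-ip> -l Admin -p '***' -P -o status
```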

When I execute this command against the Fujitsu Remote Service Board, it works fine:

fence_rsb -l username -p password -o status -i

and it replies "Server is On".

But I am not able to use this in cluster.conf; it gives a parameter error.

I need your help with this problem.

Also, I need to know which cluster services should be enabled at startup.

Kerem ERSOY (President) commented:
There is a bugfix issued that fixes this problem with ipmilan. You need to update it.
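On the startup-services part of the question, which was not answered above: on a typical RHEL 5 Cluster Suite node the usual services, in start order, are cman, clvmd (only if clustered LVM is used), and rgmanager, plus qdiskd only if a quorum disk is configured. A sketch, to be checked against your own setup:

```
chkconfig cman on
chkconfig clvmd on       # only if clustered LVM volumes are in use
chkconfig rgmanager on   # manages the clustered services (XYZ, ABC)
chkconfig qdiskd on      # only if a quorum disk is configured
```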


AhmedQme (Author) commented:

Thanks for your help; it worked with ipmilan.

My last question:

As I mentioned, we have four Fujitsu BX960 servers; two of them run VMware ESX. All servers are connected through a blade switch, which is connected to a Cisco switch via a trunk port that allows all VLANs. The ESX servers use vSwitches, so I can assign any VLAN I want to them with no issue.

My question is about the Linux servers:
I have 4 NICs on each server, already connected via an EtherChannel trunk port. One NIC is already configured for the heartbeat network. I would like to team the three remaining NICs and put them on a specific VLAN; by default I am getting a VLAN 1 IP address. Is it recommended to configure this using VLAN X in a production environment?

It is also not possible to assign the internal server port to a specific VLAN, because of the limited number of uplink ports we have.
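For what it's worth, NIC teaming plus a tagged VLAN on RHEL 5 is normally done with the bonding driver and an 802.1q sub-interface. A minimal sketch, assuming bond0 over eth1-eth3 and VLAN 20 (all device names, the VLAN ID, and addresses are placeholders, not from this thread):

```
# /etc/sysconfig/network-scripts/ifcfg-bond0
DEVICE=bond0
BONDING_OPTS="mode=802.3ad miimon=100"   # LACP, to match an EtherChannel trunk
ONBOOT=yes
BOOTPROTO=none

# /etc/sysconfig/network-scripts/ifcfg-eth1  (repeat for eth2, eth3)
DEVICE=eth1
MASTER=bond0
SLAVE=yes
ONBOOT=yes
BOOTPROTO=none

# /etc/sysconfig/network-scripts/ifcfg-bond0.20  (tagged VLAN 20 on the bond)
DEVICE=bond0.20
VLAN=yes
ONBOOT=yes
BOOTPROTO=none
IPADDR=192.168.20.11
NETMASK=255.255.255.0
```

Whether to keep production traffic off VLAN 1 is a site policy question, but avoiding the default VLAN for production is common practice.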
Kerem ERSOY (President) commented:

It is better to create a separate question for that; it needs to be discussed on its own. If we discuss everything here, it becomes impossible for people searching for specific answers, and new questions added as extensions to the original one will be lost.
