Solved

Red Hat Cluster Fencing Method for SAN Boot Nodes

Posted on 2012-04-02
Last Modified: 2012-06-21
Hello,

We have a two-node Red Hat cluster:
tc1 & tc2
Existing cluster conf: tcbi1 & tcbi2

Services: XYZ, ABC
Fencing method: Brocade
OS: installed on RAID 1 on each server, using local disks.
15 LUNs connected.
2 NICs on each server: 1 public and 1 for heartbeat.

We are now planning to upgrade our hardware and keep the configuration described above, with one difference: the two new nodes are going to boot from SAN. My question is: can we use the same fencing method?

If we use the same Brocade fencing method, the fenced server will lose storage access, and because its OS boots from SAN, the system will crash.

** What are my options?

Our new hardware:

Server: Fujitsu BX960, booting from SAN.
Storage: EMC CX4-120
OS: Red Hat Linux 5.7, 64-bit.
Multipathing: EMC PowerPath 5.6.

Looking for expert help on this.

Thank you,
Ahmed
Question by:AhmedQme

Expert Comment

by:Kerem ERSOY
ID: 37795389
Hi,

You can move your cluster to the new hardware. Since the nodes will be booting from the SAN, you'll need the SAN drivers in the boot initrd if they are not already there. Fencing is used to isolate and reboot a node if it becomes unresponsive, so it has nothing to do with where you boot your system from. There's no limitation on where you boot your system from.

Cheers,
K.

Author Comment

by:AhmedQme
ID: 37795512
Thank you for your quick reply.

The existing clustered servers boot from local disks, and we use the Brocade switches as the fence/unfence method.

example:
#!/bin/bash
/sbin/fence_brocade -l myusername -p mypassword -o disable -n 1 -a 10.77.9.123
/sbin/fence_brocade -l myusername -p mypassword -o disable -n 1 -a 10.77.9.124
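
The same two calls can be wrapped so that one helper both fences (`-o disable`) and unfences (`-o enable`) the node's port on both switches. This is only a minimal sketch, assuming the switch addresses and port number from the example above; `brocade_port`, `FENCE_USER`, `FENCE_PASS` and `DRY_RUN` are hypothetical names, and in dry-run mode (the default here) the commands are printed rather than executed.

```shell
#!/bin/sh
# Sketch: toggle one node's port on both Brocade switches.
SWITCHES="10.77.9.123 10.77.9.124"   # switch addresses from the example above
PORT=1                                # switch port of the node to fence

brocade_port() {
    # $1: fence_brocade action, "disable" to fence or "enable" to unfence.
    case "$1" in
        disable|enable) ;;
        *) echo "usage: brocade_port disable|enable" >&2; return 2 ;;
    esac
    for switch in $SWITCHES; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            # Dry run: show the command with the password masked.
            echo "/sbin/fence_brocade -l ${FENCE_USER:-myusername} -p **** -o $1 -n $PORT -a $switch"
        else
            # FENCE_USER and FENCE_PASS are hypothetical environment variables.
            /sbin/fence_brocade -l "$FENCE_USER" -p "$FENCE_PASS" -o "$1" -n "$PORT" -a "$switch" || return 1
        fi
    done
}

brocade_port disable   # in dry-run mode this only prints the two fence commands
```

Running it with `DRY_RUN=0 brocade_port enable` would re-enable the port on both switches. On a SAN-boot node, the port passed here would also be the one carrying the boot LUN, which is exactly the concern raised in this question.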


We have now configured new diskless servers that boot from SAN.
My question is: can we use Brocade fencing with SAN-boot servers?
Is the fenced server going to lose storage connectivity? If yes, I need your help to guide us to an alternative solution.


Thank You

Assisted Solution

by:Kerem ERSOY
ID: 37795576
As I told you, fencing and booting are not related; doing one does not have any effect on the other. So you can boot from wherever you want, including local disk, NAS, HBA ..., and still have fencing work for you over the SAN switch.

Fencing is just a way to isolate a server from the cluster and reboot it. As I see it, fencing means removing the node from cluster membership so that two systems do not access the same resources concurrently. It does not remove SAN connectivity. Even if fencing notices that the node is not responding, it will drop the node from the cluster but not interfere with SAN connectivity, so the node will still be able to access its system volumes. Otherwise fencing could easily disrupt the system, since removing storage without a proper unmount is not a good method.

Cheers,
K.

Assisted Solution

by:Kerem ERSOY
ID: 37795602
In fact, I understand you are concerned that fencing will remove the SAN connection as part of the fencing process. But you should understand that you would have two different LUNs over the SAN: one for booting and the other for the cluster. When fencing kicks in, it has two methods to isolate the node from the SAN (considering your connection):

- Disabling a Fibre Channel switch port,
- Revoking a host's SCSI-3 reservations.

It is obvious that if your system attaches to the SAN over a single port, you can't simply disable that port. If you want to use switch-port disabling, use two ports: one for boot and one for the cluster storage. If all you have is a single port, then revoke the SCSI-3 reservation for the cluster volume, so that you disable only cluster access and not system volume access.

BTW, why do you boot your system over the SAN? This might create a single point of failure if your SAN connection is not redundant. I always prefer to have at least two drives and boot from a RAID-1 array.

Cheers,
K.

Expert Comment

by:jgiordano
ID: 37795619
Fencing is the method of rebooting or totally isolating a server in order to prevent split brain (two nodes accessing the same storage).

Most fence methods that I am aware of reboot the server that can't be communicated with. This is done by turning off the power via an APC power switch or via remote management such as iLO, ILOM, RSA, etc.

The last option, if supported in your config, is to use SCSI reservations, which I have not done and don't know much about. This option may allow you to cut off devices at the LUN level.

Expert Comment

by:Kerem ERSOY
ID: 37795682
As far as I understand, this is not the problem, jgiordano. Everybody is aware that fencing reboots the system by some method. The question is whether it reboots the system properly or severs the connections suddenly, so that the system boot volumes suffer corruption.

Expert Comment

by:jgiordano
ID: 37796279
He was asking for options; I suggested looking into SCSI reservations.

As per the cluster documentation:

7.1.3. SCSI Fencing with Persistent Reservations
Red Hat Cluster Suite is able to perform fencing via SCSI persistent reservations by simply removing a node's registration key from all devices. When a node failure occurs, the fence_scsi agent will remove the failed node's key from all devices, thus preventing it from being able to write to those devices.

This might allow fencing without ripping the storage out from underneath a running OS and causing possible corruption.
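
To see what such a fence agent would act on, the registration keys on a shared LUN can be listed with sg_persist from sg3_utils. This is only a minimal sketch, assuming sg3_utils is installed; the device path `/dev/emcpowera` is a hypothetical PowerPath pseudo-device name, and `CLUSTER_LUNS`, `run`, `show_pr_keys` and `DRY_RUN` are hypothetical names. In dry-run mode (the default here) the commands are only printed.

```shell
#!/bin/sh
# Sketch: list SCSI-3 persistent reservation keys on the shared cluster LUNs.
# CLUSTER_LUNS is hypothetical; list only shared cluster devices, not the boot LUN.
CLUSTER_LUNS="${CLUSTER_LUNS:-/dev/emcpowera}"

run() {
    # With DRY_RUN=1 (the default here) print the command instead of executing it.
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

show_pr_keys() {
    for dev in $CLUSTER_LUNS; do
        # --in selects the "persistent reserve in" commands; --read-keys lists registered keys.
        run sg_persist --in --read-keys --device="$dev" || return 1
        # --read-reservation shows which key, if any, currently holds the reservation.
        run sg_persist --in --read-reservation --device="$dev" || return 1
    done
}

show_pr_keys
```

With `DRY_RUN=0` on a cluster node, each node that is still a member should show its own key on every shared device; a fenced node's key should be gone.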

Expert Comment

by:Kerem ERSOY
ID: 37796356
I already said this; please see my note, ID: 37795602.

Author Comment

by:AhmedQme
ID: 37817638
Thank you for your answers; they really helped a lot.

I have already configured the cluster and it is working fine, but I still have a problem with fencing.

I also contacted Red Hat support, and they suggested using fence_ipmilan:

<clusternodes>
  <clusternode name="niacnica1-priv" nodeid="1" votes="1">
    <fence>
      <method name="1">
        <device name="ilo3_1" action="reboot"/>
      </method>
    </fence>
  </clusternode>
  <clusternode name="niacnica2-priv" nodeid="2" votes="1">
    <fence>
      <method name="1">
        <device name="ilo3_2" action="reboot"/>
      </method>
    </fence>
  </clusternode>
</clusternodes>
<fencedevices>
  <fencedevice agent="fence_ipmilan" ipaddr="10.194.195.191" login="Admin" name="ilo3_1" passwd=*** lanplus="1" method="cycle" power_wait="4"/>
  <fencedevice agent="fence_ipmilan" ipaddr="10.194.195.192" login="Admin" name="ilo3_2" passwd=*** lanplus="1" method="cycle" power_wait="4"/>
</fencedevices>

Unfortunately ipmilan is not working; it returns this error:
ipmilan: ipmitool not found! failed: failed to initialize


When I execute this command against the Fujitsu Remote Service Board, it works fine:

fence_rsb -l username -p password -o status -i 10.10.10.5
It replies "Server is On".

However, I am not able to use this agent in cluster.conf; it gives a parameter error.


I need your help with this problem.

Also, I need to know which cluster services should be enabled at startup.


Regards
AhmedQme

Accepted Solution

by:Kerem ERSOY
ID: 37818067
There's a bugfix issued for the problem with ipmilan. You need to update it:

https://bugzilla.redhat.com/show_bug.cgi?id=164627
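
Since the error reported earlier in the thread is "ipmitool not found", it may be worth checking, before and after the update, that the binaries the agent relies on can actually be found on the PATH. This is only a minimal sketch; `have_tool` and `check_fence_tools` are hypothetical helper names, and the thread does not say which package provides ipmitool on this release.

```shell
#!/bin/sh
# Sketch: verify that the binaries a fence agent depends on are on the PATH.
have_tool() {
    # Succeeds when "$1" resolves to a command in the current PATH.
    command -v "$1" >/dev/null 2>&1
}

check_fence_tools() {
    # Reports each tool and returns non-zero if any of them is missing.
    missing=0
    for tool in "$@"; do
        if have_tool "$tool"; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

check_fence_tools ipmitool fence_ipmilan || echo "install or update the missing tools before retesting fencing"
```

Once both tools are reported as found, running the agent by hand with a status action, in the same way as the fence_rsb test above, is a reasonable next check before relying on it in cluster.conf.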

Author Comment

by:AhmedQme
ID: 37821030
Hi

Thanks for your help. It worked with ipmilan.

My last question:

As I mentioned, we have four Fujitsu BX960 servers; two of them run VMware ESX. All servers are connected through a blade switch, which is connected to a Cisco switch over a trunk port that allows all VLANs. The ESX servers use a vSwitch, so I can assign any VLAN I want to the ESX servers with no issue.

My question is about the Linux servers:
Each server has four NICs, already connected via an EtherChannel trunk port. One NIC is already configured for the heartbeat network. I would like to team the three remaining NICs and configure the team with a specific VLAN. By default I am getting a VLAN 1 IP address. Is it recommended to configure this using vlanX for a production environment?

http://www.mysidenotes.com/2008/01/30/vlan-configuration-on-fedora-core-red-hat-centos/

Also, it is not possible to configure this by assigning the internal server port to a specific VLAN, because of the limitation of the uplink ports we have.

Expert Comment

by:Kerem ERSOY
ID: 37822844
Hi,

It is better to create another question for that; it needs to be discussed separately. If we discuss everything here, it becomes impossible for people to search for specific answers, and new questions added as extensions to the original one will be lost forever.

Cheers,
K.