
Active/Active Clustering using GFS2

Asked by Michael McGovern
Linux, Hardware, Storage Hardware, Server Hardware, CentOS
We're trying to set up a two-node active/active cluster on CentOS 7 using Pacemaker, Corosync, DLM, CLVMD, and fencing, with a shared LUN over multipath iSCSI connections. The trouble I'm having is that when I add the GFS2 filesystem as a resource to the cluster, the resource fails to start with an error. However, when I run pcs resource debug-start <resource-id> on both nodes, the filesystem mounts. I was then able to write a file from one node and see it on the other, and vice versa. Here's the pcs status. I have a deadline to get this project working by early next week!

[root@ddc-testwp1 /]# pcs status --full
Cluster name: WPCluster

Following stonith devices have the 'action' option set, it is recommended to set 'pcmk_off_action', 'pcmk_reboot_action' instead: vmfence1

Stack: corosync
Current DC: ddc-testwp2 (2) (version 1.1.19-8.el7_6.4-c3c624ea3d) - partition with quorum
Last updated: Thu Feb  7 19:58:37 2019
Last change: Thu Feb  7 18:30:25 2019 by root via cibadmin on ddc-testwp1

2 nodes configured
8 resources configured

Online: [ ddc-testwp1 (1) ddc-testwp2 (2) ]
Full list of resources:

vmfence1       (stonith:fence_vmware_soap):    Started ddc-testwp2
vmfence2       (stonith:fence_vmware_soap):    Started ddc-testwp1
Clone Set: dlm-clone [dlm]
     dlm        (ocf::pacemaker:controld):      Started ddc-testwp1
     dlm        (ocf::pacemaker:controld):      Started ddc-testwp2
     Started: [ ddc-testwp1 ddc-testwp2 ]
Clone Set: clvmd-clone [clvmd]
     clvmd      (ocf::heartbeat:clvm):  Started ddc-testwp1
     clvmd      (ocf::heartbeat:clvm):  Started ddc-testwp2
     Started: [ ddc-testwp1 ddc-testwp2 ]
Clone Set: wpshared_rsc-clone [wpshared_rsc]
     wpshared_rsc       (ocf::heartbeat:Filesystem):    Stopped
     wpshared_rsc       (ocf::heartbeat:Filesystem):    Stopped
     Stopped: [ ddc-testwp1 ddc-testwp2 ]

Node Attributes:
* Node ddc-testwp1 (1):
* Node ddc-testwp2 (2):

Migration Summary:
* Node ddc-testwp1 (1):
   wpshared_rsc: migration-threshold=1000000 fail-count=1000000 last-failure='Thu Feb  7 19:53:55 2019'
* Node ddc-testwp2 (2):
   wpshared_rsc: migration-threshold=1000000 fail-count=1000000 last-failure='Thu Feb  7 18:06:51 2019'

Failed Actions:
* wpshared_rsc_start_0 on ddc-testwp1 'not installed' (5): call=27, status=complete, exitreason='Couldn't find device [/dev/wpsharedvg/wpsharedlv]. Expected /dev/??? to exist',
    last-rc-change='Thu Feb  7 19:53:55 2019', queued=0ms, exec=96ms
* wpshared_rsc_start_0 on ddc-testwp2 'unknown error' (1): call=30, status=complete, exitreason='Couldn't mount device [/dev/wpsharedvg/wpsharedlv] as /wpshared',
    last-rc-change='Thu Feb  7 18:06:50 2019', queued=0ms, exec=168ms

PCSD Status:
  ddc-testwp1: Online
  ddc-testwp2: Online

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

Please help!
Michael McGovern
VP of Information Technology
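
The status output above shows the Filesystem clone failing to find /dev/wpsharedvg/wpsharedlv on ddc-testwp1 and failing to mount it on ddc-testwp2, even though a manual debug-start works. One possible explanation, offered as an assumption because the output does not list the cluster's constraints, is that wpshared_rsc-clone is started before clvmd-clone has activated the volume group. The sketch below prints (and, with APPLY=1, runs) the checks and constraints one might try. The resource and volume group names are taken from the output above; the run helper, the APPLY variable, and the gfs2_ordering_sketch function are hypothetical names introduced only for this illustration.

```shell
#!/bin/sh
# Dry run by default: each command is printed with a "+ " prefix, not executed.
# Set APPLY=1 to actually run the commands as root on one cluster node.
# "run", "APPLY", and "gfs2_ordering_sketch" are hypothetical helper names.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

gfs2_ordering_sketch() {
    # Check whether the clustered logical volume is visible on this node.
    run lvs wpsharedvg

    # Show the constraints that already exist before adding any.
    run pcs constraint show --full

    # Assumed missing: start order dlm -> clvmd -> GFS2 filesystem,
    # with each layer colocated on the node running the layer below it.
    run pcs constraint order start dlm-clone then clvmd-clone
    run pcs constraint colocation add clvmd-clone with dlm-clone
    run pcs constraint order start clvmd-clone then wpshared_rsc-clone
    run pcs constraint colocation add wpshared_rsc-clone with clvmd-clone

    # Clear the recorded failures so Pacemaker retries the start.
    run pcs resource cleanup wpshared_rsc
}

gfs2_ordering_sketch
```

Running it without APPLY set only lists the commands, so they can be compared against the existing constraints before anything on the cluster is changed. If the constraints are already present, the cause lies elsewhere and this sketch does not apply.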

Andrew Hancock - VMware vExpert