Solved

Cluster Services will not start

Posted on 2009-07-02
Last Modified: 2013-12-16
Specifically, we are trying to set up a two-node cluster to provide a highly available Apache server. From the documentation, it appears that shared storage may not be necessary, though we would eventually like the document root to be on shared storage.


We have followed the steps laid out in the howto:

http://www.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/5.2/html/Cluster_Administration/ap-httpd-service-CA.html


However, when we try to start the service in Luci, the httpd service fails to start. We get the following errors in /var/log/messages:


Jun 29 16:22:36 habox1 clurgmgrd: [10855]: <err> Stopping Service apache:Apache_Test_Srvr > Failed
Jun 29 16:22:36 habox1 clurgmgrd[10855]: <notice> stop on apache "Apache_Test_Srvr" returned 1 (generic error)
Jun 29 16:22:36 habox1 clurgmgrd[10855]: <crit> #13: Service service:Web_Server failed to stop cleanly
Jun 29 16:26:29 habox1 clurgmgrd[10855]: <notice> Starting disabled service service:Web_Server
Jun 29 16:26:29 habox1 clurgmgrd: [10855]: <err> Looking For IP Addresses [apache:Apache_Test_Srvr] > Failed - No IP Addresses Found
Jun 29 16:26:29 habox1 clurgmgrd[10855]: <notice> start on apache "Apache_Test_Srvr" returned 1 (generic error)
Jun 29 16:26:29 habox1 clurgmgrd[10855]: <warning> #68: Failed to start service:Web_Server; return value: 1
Jun 29 16:26:29 habox1 clurgmgrd[10855]: <notice> Stopping service service:Web_Server
Jun 29 16:26:35 habox1 clurgmgrd: [10855]: <err> Checking Existence Of File /var/run/cluster/apache/apache:Apache_Test_Srvr.pid [apache:Apache_Test_Srvr] > Failed - File Doesn't Exist
Jun 29 16:26:35 habox1 clurgmgrd: [10855]: <err> Stopping Service apache:Apache_Test_Srvr > Failed
Jun 29 16:26:35 habox1 clurgmgrd[10855]: <notice> stop on apache "Apache_Test_Srvr" returned 1 (generic error)
Jun 29 16:26:35 habox1 clurgmgrd[10855]: <crit> #12: RG service:Web_Server failed to stop; intervention required
Jun 29 16:26:35 habox1 clurgmgrd[10855]: <notice> Service service:Web_Server is failed
Jun 29 16:26:35 habox1 clurgmgrd[10855]: <crit> #13: Service service:Web_Server failed to stop cleanly


Can you advise us as to what the problem may be? Let us know if you need
more information.

My cluster.conf file, created in the web GUI (Luci):
 

<?xml version="1.0"?>
<cluster alias="app_server" config_version="16" name="app_server">
        <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
        <clusternodes>
                <clusternode name="habox2.nimh.nih.gov" nodeid="1" votes="1">
                        <fence>
                                <method name="1"/>
                        </fence>
                </clusternode>
                <clusternode name="habox1.nimh.nih.gov" nodeid="2" votes="1">
                        <fence>
                                <method name="1"/>
                        </fence>
                </clusternode>
        </clusternodes>
        <cman expected_votes="1" two_node="1"/>
        <fencedevices/>
        <rm>
                <failoverdomains/>
                <resources>
                        <apache config_file="conf/httpd.conf" name="Apache_Test_Srvr" server_root="/etc/httpd" shutdown_wait="0"/>
                        <ip address="172.16.52.151" monitor_link="1"/>
                        <script file="/etc/rc.d/init.d/httpd" name="script_test"/>
                        <fs device="/dev/hda2" force_fsck="0" force_unmount="0" fsid="36806" fstype="ext3" mountpoint="/HA" name="httpd_content" self_fence="1"/>
                </resources>
                <service autostart="1" exclusive="0" max_restarts="0" name="Web_Server" recovery="restart" restart_expire_time="0">
                        <apache ref="Apache_Test_Srvr">
                                <ip ref="172.16.52.151"/>
                                <script ref="script_test"/>
                                <fs ref="httpd_content"/>
                        </apache>
                </service>
        </rm>
</cluster>


Document1.pdf
Question by:Justin_Edmands

Author Comment

by:Justin_Edmands
ID: 24768145
Need some help!
LVL 2

Expert Comment

by:JabbaDow
ID: 24770784
I think it might be easier to use Heartbeat with DRBD: www.linux-ha.org and www.drbd.org. They work very nicely together. Basically, you put some information in each node's config files about the other node, the IP address they will share, and so on. At the end, you start the Heartbeat process, which brings up the DRBD shared storage and places symlinks on the system that point to it. Heartbeat then handles starting and stopping Apache. It is pretty easy to do, both projects are very well documented, and there are many how-to articles on the web for doing exactly what you want.
LVL 77

Expert Comment

by:arnold
ID: 24774609
IMHO, it is better to load balance a web server than to set it up in a failover cluster.
You could use rsync to synchronize the document root data.

You could set up a cluster resource dealing with a specific IP.
This will take care of making the IP "available all the time".

One error I see is that you are not assigning an IP that will move with the web server.

You have to set up an IP that will move between/among the nodes.


Author Comment

by:Justin_Edmands
ID: 24789982
We already got DRBD working. We need to do this with Red Hat Cluster Suite.
LVL 2

Accepted Solution

by:
JabbaDow earned 500 total points
ID: 24791698
I have no experience with Red Hat Cluster Suite, but I guess the first thing would be to make sure that you have a virtual IP address (i.e. an address bound to a virtual interface like eth0:0) and that the address is working on the active node of the cluster. When you fail over to the other node, that address needs to follow the active node. Then set Apache to listen on that address: instead of "Listen *:80", use a "Listen x.x.x.x:80" Apache directive. Make sure that networking comes up before Apache does.
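
For reference, a minimal sketch of how that ordering might be expressed in the poster's cluster.conf. It assumes rgmanager's documented behavior of starting a parent resource before its children and stopping it after them. The posted configuration nests the ip under the apache resource; here the nesting is reversed so the floating address 172.16.52.151 is up before httpd starts. The script and fs references are left out for brevity, and whether this alone clears the "No IP Addresses Found" error is an assumption that would need to be verified on the cluster.

```xml
<!-- Sketch only: the <service> element from the posted cluster.conf, -->
<!-- with the ip resource as the parent of apache (assumed fix, untested). -->
<service autostart="1" exclusive="0" max_restarts="0" name="Web_Server" recovery="restart" restart_expire_time="0">
        <!-- Parent: the floating IP is brought up first and taken down last -->
        <ip ref="172.16.52.151">
                <!-- Child: Apache starts only after the address is available -->
                <apache ref="Apache_Test_Srvr"/>
        </ip>
</service>
```

With this layout, the matching directive in httpd.conf would be "Listen 172.16.52.151:80". If cluster.conf is edited by hand rather than through Luci, config_version needs to be incremented before the change is propagated.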
