Solved

Bash Shell Script to Echo and Write to a Text File

Posted on 2013-12-17
Last Modified: 2013-12-19
Hi, I am working with a ROCKS cluster on RHEL 5.9. It is running PBS as the grid. I want to make sure that all of the nodes of the cluster are responsive, so I need a script that will grab the name and a date and time stamp from each node and write that info back to a text file on the front-end node.

Can someone provide a quick example? Google is not much help on the subject!
Question by:capperdog13
9 Comments
 
LVL 40

Assisted Solution

by:omarfarid
omarfarid earned 25 total points
ID: 39725921
You can run a crontab job on each node, say every 5 min, that will run

# Record this node's name and the current time in /tmp/<hostname>
myhost=$(hostname)
hostname > "/tmp/$myhost"
date >> "/tmp/$myhost"

You could then schedule an ftp or sftp job to copy the files to the front-end node.
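As a rough sketch of how that cron job could be packaged, the snippet below wraps the two writes in one function. The function name write_heartbeat, the script path in the crontab comment, and the 5-minute schedule line are assumptions for illustration only. It uses uname -n, which prints the same node name as hostname, so it does not depend on where the hostname binary lives.

```bash
#!/bin/bash
# Sketch of the per-node heartbeat job described above.
# Hypothetical crontab entry (the script path is an assumption):
#   */5 * * * * /usr/local/bin/heartbeat.sh

write_heartbeat() {
    # Target directory defaults to /tmp, as in the commands above.
    local dir=${1:-/tmp} myhost
    myhost=$(uname -n)
    # One file per node, overwritten on every run:
    # node name on the first line, timestamp on the second.
    { echo "$myhost"; date; } > "$dir/$myhost"
}
```

The script run by cron would end with a call such as write_heartbeat /tmp. Copying the file to the front-end node is left out because it depends on how logins between the nodes are set up.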
 
LVL 19

Assisted Solution

by:simon3270
simon3270 earned 25 total points
ID: 39725985
Or, if you have password-less logins to the nodes in the cluster (with ssh keys), you could run this as root on your central server.  Have a file with just a list of node names in it called hosts.lst, then:
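# Poll each node listed in hosts.lst (one name per line) and log the result.
while read -r host; do
    # BatchMode makes ssh fail instead of prompting for a password;
    # </dev/null keeps ssh from swallowing the rest of hosts.lst.
    rsp=$(ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" 'echo "$(hostname): $(date)"' 2>/dev/null </dev/null)
    if [ -n "$rsp" ]; then
        echo "$rsp"
    else
        echo "Could not connect to $host at $(date)"
    fi
done < hosts.lst > hosts.log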

 

Author Comment

by:capperdog13
ID: 39726795
Great! Let me work with this and I will respond later today. The nodes do require a password, so I don't think the ssh script will apply.

Many thanks! Will get back with you.
 
LVL 28

Expert Comment

by:serialband
ID: 39728348
Do you have the ganglia roll installed and enabled?  You can just use that to track all the compute nodes.

Rocks also includes the tentakel command to query all the hosts more quickly, since it forks all the calls at once.  It should be set up if you loaded all the compute nodes with the rocks installer.  If you want the results to come back in order, you can sort the results afterwards.  The while loop could take quite a while if you have a lot of compute nodes.

It's much simpler to run this line to query all the hosts simultaneously.  Your results will likely come back out of order, but it'll be much faster than running the while loop and waiting for each node's network to respond.

tentakel "hostname; date" >> compute_nodes.txt

If I remember correctly, I think you actually just need

tentakel date >> compute_nodes.txt

since tentakel already outputs the hostname of the system with the command.
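If tentakel turns out not to be available, the same fork-everything-at-once idea can be sketched in plain bash. This is only a rough sketch, assuming password-less ssh from the head node and a hosts.lst file with one node name per line. The function name poll_hosts and the PROBE variable are made up here for illustration.

```bash
#!/bin/bash
# Sketch: poll every host in a list concurrently instead of one at a time.
# PROBE is the command used to reach a node. It defaults to ssh, but can be
# overridden (hypothetical knob, added here so the loop can be dry-run).
PROBE=${PROBE:-ssh -o BatchMode=yes -o ConnectTimeout=5}

poll_hosts() {
    local list=$1 host
    while read -r host; do
        [ -z "$host" ] && continue    # skip blank lines in the list
        (
            # Each probe runs in its own background subshell.
            if rsp=$($PROBE "$host" 'echo "$(hostname): $(date)"' 2>/dev/null </dev/null) \
                && [ -n "$rsp" ]; then
                echo "$rsp"
            else
                echo "$host: NO RESPONSE at $(date)"
            fi
        ) &
    done < "$list"
    wait    # block until every background probe has finished
}
```

Replies arrive in whatever order the nodes answer, so sort them afterwards, for example: poll_hosts hosts.lst | sort > hosts.log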

The head node should have an ssh key automatically installed on each of the compute nodes already.  You shouldn't need a password when you run tentakel or ssh to the compute nodes, unless the installer messed up somehow or the system has been corrupted by users' code crashing.  That does happen frequently enough when you have hundreds of systems, but the compute nodes should be easy and quick to reinstall.

http://www.rocksclusters.org/roll-documentation/base/5.5/index.html  You can install other Linux distros with Rocks.  Rocks 6 is out, and it supports Red Hat 6.

Author Comment

by:capperdog13
ID: 39729691
Hi, yes, we do have Ganglia installed, and from the web front end all looks fine. Thanks for the tentakel date >> compute_nodes.txt command. It says all is fine as well.

I was just handed this old POS, so is it safe to say, from a high level, that this cluster is functioning as it should, relying on the tentakel command and the Ganglia front end?
 

Author Comment

by:capperdog13
ID: 39729706
Also, I do notice one problem you may be able to help with. The nodes are not reloading when I tell them to on a hard reboot. PXE is enabled on the nodes and they do make contact with the front end, but the front end never sends a packet for the reload, a timeout occurs, and the node boots back up to the old image.

Any suggestions here?
 
LVL 28

Accepted Solution

by:
serialband earned 200 total points
ID: 39730033
From a high level, if Ganglia shows the system as functional and tentakel returns ok, then you should be good to go.

Your system is set to boot instead of install.  You need to change the setting on the head node with the rocks command:

rocks set host boot compute-0-0 action=install

Once the system is up, the action should revert.  If not, you can set the action back to boot.  You can list the settings with:

rocks list host boot
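To flag several nodes in one go, the two commands above could be wrapped in a small function. This is only a sketch: the function name flag_for_install and the ROCKS_CMD variable are hypothetical, and node names other than compute-0-0 are placeholders. It uses nothing beyond the set host boot and list host boot commands already shown.

```bash
#!/bin/bash
# Sketch: flag compute nodes for reinstall, then show the boot settings.
# ROCKS_CMD is a hypothetical override so the commands can be previewed
# with "echo rocks" before they are run for real on the head node.
ROCKS_CMD=${ROCKS_CMD:-rocks}

flag_for_install() {
    local host
    for host in "$@"; do
        # Same command as above, applied to each named node.
        $ROCKS_CMD set host boot "$host" action=install
    done
    # List the settings so the change can be confirmed before rebooting.
    $ROCKS_CMD list host boot
}
```

Setting ROCKS_CMD="echo rocks" first and then calling flag_for_install compute-0-0 compute-0-1 prints the commands without running them, which is a cheap way to check the node names.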

--

Some helpful hints:

The best place to ask rocks questions is through the rocks mailing list.  They have more experienced users as well as the developers checking the list.  You can sign up here.  https://lists.sdsc.edu/mailman/listinfo.cgi/npaci-rocks-discussion

It's been a year since I touched a Rocks cluster.  It depends on the error message.  Problems happen frequently with rocks when users run their computations on the head node.  Keep your users from running their processes on the head node.  

The other thing to check is that the rocks kickstart directories are working.  They sometimes get corrupted and don't show up properly.  You need to check that Apache is started correctly on the head node and that it's sharing the rocks directories for the compute nodes to connect to.  They need them to install.

Sometimes the compute nodes get corrupted and you just need to boot a live distro on them to completely wipe the partition.  Unfortunately, the old kickstart on Red Hat 5.x doesn't work with a GUID partition table (GPT), so any drive 2 TB or larger needs to be tweaked.  It's simplest, and quickest, to put a smaller drive in the system as the primary boot drive and configure kickstart to mount the secondary drive for processing space.

If all else fails, sometimes you just have to redo the head node installation.  This will take some time, but once set up, the compute nodes are quick to install.  They will install very quickly, but out of order on the rack if you turn them on all at once.  If you want them installed in order, you'll have to turn them on one at a time starting with the first one.  You'll need to watch until DHCP accepts them.  They'll automatically be numbered starting with rack 0, computer 0.
 

Author Comment

by:capperdog13
ID: 39730287
Hey thanks a bunch for all the info! I come from a Windows background and was literally tossed into the sea of Linux and told to fix that cluster...

I ran the commands on the head node and forced an install on one of the nodes. I checked it with rocks list host boot before I hard rebooted the node, but it still did not reload. The nodes are not getting the info back from the server to reload, as I mentioned earlier.

Anyway I am going to post this to the ROCKS site you gave me. You've been a big help!
Many thanks and have a happy holiday!!
 

Author Closing Comment

by:capperdog13
ID: 39730300
The original question was about a script to help me check a ROCKS cluster. Simon supplied me with a couple of great examples. Thanks, Simon! I did get the most help from serial, who has ROCKS experience and went above and beyond with tips and links to help out.