Pre-testing ssh connection

Posted on 2007-10-05
Last Modified: 2013-12-16
I administer many RHEL4 Linux nodes from my desktop by passing commands to the remote nodes through a trusted ssh channel. This works fine except when a remote node is in a semi-dead state: the network is alive and the remote ssh server accepts the connection, but it does nothing beyond that. As a result the ssh connection neither fails nor completes, i.e. my command hangs. When I run the script interactively I can interrupt it with Ctrl-C and move on to the next node. How can I skip such a node in a cron job?

I tried pre-testing the ssh connection with "ssh -o BatchMode=yes -o ConnectTimeout=2 nodexx /bin/true", but this does not time out after 2 seconds.
Question by: vinod

    Expert Comment

    There are perhaps a few ways to get around it.

    In your ssh_config file (/etc/ssh/ssh_config in Slackware), there are a couple of options that might help.

         ConnectionAttempts
                 Specifies the number of tries (one per second) to make before exiting.  The argument must be an integer.  This may be useful
                 in scripts if the connection sometimes fails.  The default is 1.

         ConnectTimeout
                 Specifies the timeout (in seconds) used when connecting to the SSH server, instead of using the default system TCP timeout.
                 This value is used only when the target is down or really unreachable, not when it refuses the connection.

    The 'ConnectTimeout' option might help you out here - if ssh doesn't get a full connection within X seconds, it should drop the session and return to the shell.  In Slackware, the default ConnectTimeout is 0, i.e. disabled.  (Actually, the 0 is even commented out, so it's probably 0 by default.)  I haven't logged into one of my RH boxes to check this.

    Another workaround would be to have a process 'watch' the SSH stream - if it doesn't see the shell prompt within X seconds, it terminates the session and the script goes on to the next node.

    Hopefully the ConnectTimeout fixes the problem.

    Author Comment

    As I said in my original posting:

    I tried pre-testing the ssh connection with "ssh -o BatchMode=yes -o ConnectTimeout=2 nodexx /bin/true", but this does not time out after 2 seconds.

    ConnectTimeout given on the command line should override the default, but it does not work :(
    'watch' also only works interactively. I need something that works in batch mode.
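
    A likely explanation, going by the man page text quoted above, is that ConnectTimeout in this OpenSSH generation only bounds the TCP connect, and a server that accepts the connection and then stalls is already past that point. One way to bound the whole attempt is an external timer. The sketch below assumes the coreutils `timeout` command is available, which it is on newer systems but probably not on a stock RHEL4 box. `ssh_pretest` and the `SSH_BIN` override are hypothetical names used only for illustration.

```shell
# Sketch: bound the entire ssh pre-test with coreutils `timeout`.
# SSH_BIN is a hypothetical override hook (default: ssh), not an ssh feature.
ssh_pretest() {
    host=$1
    limit=${2:-5}    # hard limit in seconds for the whole attempt
    # timeout exits with status 124 if it had to kill the command;
    # otherwise the exit status of ssh itself is passed through.
    timeout "$limit" "${SSH_BIN:-ssh}" -o BatchMode=yes -o ConnectTimeout=2 \
        "$host" /bin/true
}
```

    In a cron script this could be used as `ssh_pretest nodexx 5 || continue` inside the loop over hosts.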

    Expert Comment

    If you need it working in batch mode, I'd try setting ConnectTimeout in the main configuration file rather than on the command line.  It may be that it doesn't work correctly from the command line.
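
    If editing the system-wide file is not desirable, the same experiment can be run with a private configuration file passed via `ssh -F`. This is only a sketch: `write_pretest_config` and the file location are hypothetical, and whether the file-based setting behaves any differently from `-o` on RHEL4 is untested.

```shell
# Sketch: write a throwaway client config and point ssh at it with -F.
# write_pretest_config and the file location are hypothetical names.
write_pretest_config() {
    cat > "$1" <<'EOF'
Host *
    BatchMode yes
    ConnectTimeout 2
    ConnectionAttempts 1
EOF
}

cfg=${TMPDIR:-/tmp}/ssh_pretest_config.$$
write_pretest_config "$cfg"
# Example use (commented out so the sketch itself makes no connection):
# ssh -F "$cfg" nodexx /bin/true
```

    Note that when `-F` is given, ssh ignores the system-wide /etc/ssh/ssh_config, so any settings the test depends on have to be in this file.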

    You _could_ run a loop that first attempts a telnet session to the ssh port - if sshd isn't responding correctly, it won't give the right answer.

    You might test it on the next 'hung' server - do a telnet to the SSH port.

    It should give you something like the following:
    Connected to localhost.
    Escape character is '^]'.

    Run the telnet and break the connection (start telnet with its output going to a file, capture the PID, wait five seconds, kill the PID), then parse the telnet output and use the result as a boolean to decide whether to run SSH or skip to the next machine.
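
    The same port probe can be sketched without telnet by reading the server's identification line directly. This assumes bash built with /dev/tcp support and a `read` that accepts `-t`; the function names are hypothetical. It only catches a server that stalls before sending its banner, and it does not bound the TCP connect itself, so it is best run after a successful ping.

```shell
# Sketch (bash only): probe the sshd banner instead of scripting telnet.
# Assumes bash was built with /dev/tcp support; function names are hypothetical.

# True if the line looks like an SSH identification string ("SSH-2.0-...").
looks_like_ssh_banner() {
    case $1 in
        SSH-*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Connect to host:port and wait up to wait_secs for the first line sent.
ssh_banner_ok() {
    host=$1 port=${2:-22} wait_secs=${3:-5}
    banner=$(
        # Open a TCP connection on fd 3; silence errors inside this subshell.
        exec 2>/dev/null 3<>"/dev/tcp/$host/$port" || exit 1
        # Give the server wait_secs seconds to send its first line.
        IFS= read -r -t "$wait_secs" line <&3 || exit 1
        printf '%s\n' "$line"
    ) || return 1
    looks_like_ssh_banner "$banner"
}
```

    A call such as `ssh_banner_ok nodexx 22 5 && ssh nodexx my-command` would then skip a node whose sshd accepts the connection but never answers.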

    Additional options, that may or may not help


    Also, I don't know if it helps, but here's a link to a suggestion to another person with the same issue.

    Expert Comment

    As I don't know if my suggestion could help, I have no objections to either having it finalized, or simply removed.   I could see the information assisting someone else, but as it's incomplete, the assistance would be minor.

    Author Comment

    I got around this problem by adding a host_alive shell function that tests the ssh connection in the background. It returns 0 if the test succeeds; otherwise it cleans up the hung ssh and returns 1. The main loop passes the command via ssh only if host_alive tests OK.

    Since this solution was inspired by suggestions from Bibliophage, moderator may award the points to him/her.


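    # Run a command on a remote host via ssh only if the remote sshd
    # is actually responding to ssh connections. ssh keys are assumed
    # to be already set up.

    host_alive ()
    {
        host=$1
        # Step 1: the host must answer a single ping within 5 seconds.
        if ! ping -c 1 -q -w 5 "$host" >/dev/null 2>&1; then
            echo "$host does not ping"
            return 1
        fi
        # Step 2: run the ssh test in the background with a timeout of 5 secs
        # (50 polls, 0.1 sec apart). -n keeps ssh away from the caller's stdin.
        ssh -n root@"$host" /bin/true >/dev/null 2>&1 &
        pid=$!
        timeo=50
        while [ $timeo -gt 0 ]; do
            if ! kill -0 $pid 2>/dev/null; then
                wait $pid       # test ssh has exited: return its exit status
                return $?
            fi
            usleep 100000       # 0.1 sec; usleep is a Red Hat utility
            timeo=$((timeo - 1))
        done
        # Still running after 5 secs: clean up the hung ssh.
        kill -9 $pid >/dev/null 2>&1
        wait $pid 2>/dev/null
        echo "$host pings but does not ssh"
        return 1
    }

    # The main loop

    while read h; do
        # ssh -n: otherwise ssh would swallow the rest of hosts.lis from stdin
        host_alive "$h" && ssh -n "$h" my-command
    done < hosts.lis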

    Expert Comment

    No real objection.  I may have pointed him the right way, but he came up with his own solution.

    Accepted Solution

    PAQed with points refunded (125)

    EE Admin
