lennagy
asked on
Undesirable early termination - Bash shell scripting while loop
I've created a bash shell script that loops through a command file. This command file performs a range of tasks across a range of servers.
WORKS: If I place 'echo' in front of the commands within the loop, it shows a simulated execution of all tasks on all servers in the list.
DOESN'T WORK: After I remove the 'echo', it successfully performs all the tasks for the first server, then never does anything for the subsequent servers!
A version of the code, heavily edited for clarity and security, follows (these are the major utilities used in the actual script):
while read recLabel recSourcePath recDestinationServer recDestinationPath recDestinationFile
do
ssh ${recDestinationServer} mkdir ${recDestinationPath}
ssh ${recDestinationServer} rm ${recDestinationPath}/*.tar.gz
rsync -baz -e ssh ${recSourcePath}/${recDestinationFile} username@${recDestinationServer}:${recDestinationPath}
done < ${parmFileList}
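One well-known cause of exactly this symptom, for the record: 'ssh' reads its standard input, which inside a 'while read' loop is the parmFileList itself, so the first ssh call swallows every remaining line and the loop ends after one server. That would also explain why the 'echo' version works. The effect can be reproduced locally with any command that drains stdin ('cat' stands in for ssh here; the usual fix is 'ssh -n' or redirecting the inner command from /dev/null):

```shell
#!/bin/sh
# Build a three-line stand-in for parmFileList.
printf 'server1\nserver2\nserver3\n' > /tmp/parmlist.$$

# Broken: 'cat' (like ssh) drains the loop's stdin, so the next read gets EOF.
broken=0
while read server
do
  cat > /dev/null                 # stand-in for: ssh $server ...
  broken=`expr $broken + 1`
done < /tmp/parmlist.$$
echo "without redirect: $broken iteration(s)"

# Fixed: give the inner command its own stdin (with real ssh, use 'ssh -n').
fixed=0
while read server
do
  cat < /dev/null > /dev/null     # stand-in for: ssh -n $server ...
  fixed=`expr $fixed + 1`
done < /tmp/parmlist.$$
echo "with redirect: $fixed iteration(s)"

rm -f /tmp/parmlist.$$
```

With the real script, adding '-n' to each ssh invocation (or '< /dev/null' to each remote command) should let the loop reach every server.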
... sample parmFileList ...
app1 /opt/app1/data webserver1 /tmp/update 1.tar.gz
app1 /opt/app1/data webserver2 /tmp/update 1.tar.gz
app1 /opt/app2/data appserver1 /tmp/update 3.tar.gz
app1 /opt/app2/data appserver2 /tmp/update 3.tar.gz
Any thoughts?
I've put in place a temporary solution: I cut all the loop code out into another script, which I call from the loop context as a separate PID using '&'.
This of course works, in parallel; however, I'm still curious why the loop as posted won't run through the whole parmFileList.
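For reference, the workaround described above can be sketched roughly like this (the child script name and its stub body are hypothetical; note that backgrounded jobs in a script get their stdin from /dev/null, so they cannot consume the loop's input):

```shell
#!/bin/sh
# Hypothetical child script standing in for the real per-server worker.
cat > /tmp/do_one.$$ <<'EOF'
#!/bin/sh
# $1 = destination server; the real script would run its ssh/rsync here.
echo "handled $1"
EOF

printf 'app1 /opt/app1/data webserver1 /tmp/update 1.tar.gz
app1 /opt/app1/data webserver2 /tmp/update 1.tar.gz
' > /tmp/parm.$$

count=0
while read recLabel recSourcePath recDestinationServer recDestinationPath recDestinationFile
do
  # Backgrounded children get stdin from /dev/null, so the list survives.
  sh /tmp/do_one.$$ "$recDestinationServer" &
  count=`expr $count + 1`
done < /tmp/parm.$$
wait                              # block until all child PIDs finish
echo "dispatched $count servers"

rm -f /tmp/do_one.$$ /tmp/parm.$$
```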
Just a thought, have you got another read command inside the loop?
ASKER
No, sorry, no 'read' statements in the whole file. It's a parameter and configfile-based script.
I am thinking that perhaps the ssh commands for 'mkdir' or 'rm' are somehow jinxing the loop, since they can fail if the directory is already created or there are no files to wipe out. Of course those commands can fail; however, trapping the error from the remote system is more complex than $?, which only tests for a successful connection to the remote server.
I should try this again without those two statements.
replace
ssh ${recDestinationServer} rm ${recDestinationPath}/*.tar.gz
by
ssh ${recDestinationServer} rm ${recDestinationPath}/\*.tar.gz
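For context on why that escape matters: an unescaped '*' may be expanded by the local shell before ssh ever runs, while an escaped one reaches the remote shell as a literal pattern and is expanded there, against the remote filesystem. A local sketch of the difference, with 'sh -c' standing in for the shell that ssh starts on the remote host:

```shell
#!/bin/sh
# Demonstrates where the glob gets expanded.
d=/tmp/globdemo.$$
mkdir -p "$d"; touch "$d/a.tar.gz" "$d/b.tar.gz"

local_exp=`echo $d/*.tar.gz`        # unescaped: the local shell expands it
literal=`echo $d/\*.tar.gz`         # escaped: a literal pattern survives...
remote_exp=`sh -c "echo $literal"`  # ...and the "remote" shell expands it

echo "local:   $local_exp"
echo "literal: $literal"
echo "remote:  $remote_exp"
rm -rf "$d"
```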
ASKER
Thanks, ahoffman. I didn't need to escape that, because it actually did perform the 'rm' when files existed. Unless you think that is what causes the 'while' loop to terminate after the first iteration?
ASKER CERTIFIED SOLUTION
> .. did perform the 'rm' if files existed. ..
That could be the problem, for example if your shell runs with -x.
I'd always quote *all* variables.
ahoffman, any errors inside the loop are ignored, even with the -x flag.
Such a glob expansion will be treated as illegal 'ssh' usage and return a non-zero exit code. The shell then continues, and the loop goes on to the next line read.
So that script should either work or not work on each loop iteration. Probably something is wrong either with the ssh identities or with the destination paths.
Which shell with -x continues (in a loop) if any command used does not return 0?
ASKER
The private key and ssh are working fine otherwise. Besides, with ssh improperly configured it wouldn't work at all, and it is now working, as I said, when run on separate PIDs with '&'. I've taken the exact same code, put it into another script, called it from the loop, and forced the parent to 'wait' for the child PIDs. All goes smoothly.
ahoffman: Tried your suggestion, unfortunately, the same results
Nopius: I will try the rsync approach today...
ahoffman: standard Bourne Shell
$ uname -a
SunOS sf250 5.10 Generic sun4u sparc SUNW,Sun-Fire-V250
$ which sh
/usr/bin/sh
$ cat ../bad_loop.sh
for i in *
do
cp $i
done
$ sh -x ../bad_loop.sh
+ cp a
cp: Insufficient arguments (1)
Usage: cp [-f] [-i] [-p] [-@] f1 f2
cp [-f] [-i] [-p] [-@] f1 ... fn d1
cp -r|-R [-H|-L|-P] [-f] [-i] [-p] [-@] d1 ... dn-1 dn
+ cp b
cp: Insufficient arguments (1)
Usage: cp [-f] [-i] [-p] [-@] f1 f2
cp [-f] [-i] [-p] [-@] f1 ... fn d1
cp -r|-R [-H|-L|-P] [-f] [-i] [-p] [-@] d1 ... dn-1 dn
+ cp c
cp: Insufficient arguments (1)
Usage: cp [-f] [-i] [-p] [-@] f1 f2
cp [-f] [-i] [-p] [-@] f1 ... fn d1
cp -r|-R [-H|-L|-P] [-f] [-i] [-p] [-@] d1 ... dn-1 dn
+ cp d
cp: Insufficient arguments (1)
Usage: cp [-f] [-i] [-p] [-@] f1 f2
cp [-f] [-i] [-p] [-@] f1 ... fn d1
cp -r|-R [-H|-L|-P] [-f] [-i] [-p] [-@] d1 ... dn-1 dn
ASKER
Thanks for your help. Still uncertain as to the direct cause, but using rsync to manage the remote directories works nicely.
Thanks again!