farzanj
asked on
Explain bash command
Please explain in detail what this line is doing.
xargs -a $tmpdir/missing -P 20 -L 1 -I '{}' /bin/bash -c 'do_distcp "$@"' _ {}
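I hadn't seen this use of the underscore before, but it is not specific to xargs. With bash -c, the first argument after the command string is assigned to $0 and the remaining arguments to $1, $2, and so on. The _ is therefore just a placeholder for $0, so that the line xargs substitutes for {} arrives as $1 and is passed on to do_distcp by "$@".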
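The role of the underscore can be checked directly in a shell. The sketch below is only an illustration (the echo command and the argument hello are made up for the demo): bash -c assigns the first word after the command string to $0 and the following words to $1, $2, and so on, and "$@" expands to $1 onward only.

```shell
# With a placeholder: _ lands in $0, hello lands in $1 and in "$@".
bash -c 'echo "zero=$0 first=$1 all=$@"' _ hello
# prints: zero=_ first=hello all=hello

# Without a placeholder: hello is consumed as $0 and "$@" is empty.
bash -c 'echo "zero=$0 first=$1 all=$@"' hello
# prints: zero=hello first= all=
```

Any word would work in place of _; it is simply a conventional throwaway name for $0.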
The - on the other hand is used to specify stdin as the filename.
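As a small illustration of the - convention (the file contents here are invented for the demo), many utilities such as cat and diff read standard input when - is given where a filename is expected. This is a convention that each program implements itself, not something the shell does.

```shell
# cat treats - as "read from standard input".
printf 'from stdin\n' | cat -

# diff compares standard input (-) against a real file.
tmp=$(mktemp)
printf 'a\nb\n' > "$tmp"
printf 'a\nb\n' | diff - "$tmp" && echo "identical"
rm -f "$tmp"
```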
Since the _ is one of the arguments on the command line that invokes do_distcp, you should ask the author of do_distcp what it does with it, consult the documentation for do_distcp, or show us the code of do_distcp.
ASKER
Increasing points :)
Ok, here's the script. It is copying files from one Hadoop cluster to another using one of Hadoop's library functions.
#!/bin/bash
LOGPATH='/user/relay_rpt/BBDS-DB/*/logdb/*/p_dc=6/p_date=*/p_hour=*/_SUCCESS'
ARGS=""
ARGS="$ARGS -Dmapred.job.queue.name=bdslogging"
ARGS="$ARGS -Ddfs.nameservices=nameservice1,nameservice2"
ARGS="$ARGS -Ddfs.ha.namenodes.nameservice2=nn1,nn2"
ARGS="$ARGS -Ddfs.namenode.rpc-address.nameservice2.nn1=r3m1.hadoop.log5.blackberry:8020"
ARGS="$ARGS -Ddfs.namenode.rpc-address.nameservice2.nn2=r7m1.hadoop.log5.blackberry:8020"
ARGS="$ARGS -Ddfs.client.failover.proxy.provider.nameservice2=org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider"

do_distcp() {
    dir=$1
    echo $dir/_SUCCESS does not exist remotely. Copying.
    hdfs dfs $ARGS -rm -R "hdfs://nameservice2/$dir/*"
    hdfs dfs $ARGS -mkdir "hdfs://nameservice2/$dir/_tmp" &&
    mapred distcp $ARGS -m 10 -overwrite hdfs://nameservice1/$dir hdfs://nameservice2/$dir/_tmp &&
    hdfs dfs $ARGS -mv "hdfs://nameservice2/$dir/_tmp/part*" "hdfs://nameservice2/$dir" &&
    hdfs dfs $ARGS -mv "hdfs://nameservice2/$dir/_tmp/_SUCCESS" "hdfs://nameservice2/$dir" &&
    hdfs dfs $ARGS -rm -R "hdfs://nameservice2/$dir/_tmp"
}

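(
    # Take an exclusive lock on file descriptor 200; exit if another run already holds it.
    flock --exclusive --nonblock 200 || exit 1

    # Export ARGS and do_distcp so the bash processes started by xargs can use them.
    export ARGS
    export -f do_distcp

    tmpdir=`mktemp -d`
    touch $tmpdir/missing

    # List the directories that contain a _SUCCESS marker on each cluster.
    hdfs dfs $ARGS -ls "hdfs://nameservice1/$LOGPATH" | grep _SUCCESS | grep -v 'tmp' | perl -pe 's{.*hdfs://nameservice1(.*)/_SUCCESS}{$1}' > $tmpdir/local
    hdfs dfs $ARGS -ls "hdfs://nameservice2/$LOGPATH" | grep _SUCCESS | grep -v 'tmp' | perl -pe 's{.*hdfs://nameservice2(.*)/_SUCCESS}{$1}' > $tmpdir/remote

    # Record every directory from the first list that has no match in the second.
    for file in `cat $tmpdir/local`
    do
        grep "$file" $tmpdir/remote >/dev/null || echo $file >> $tmpdir/missing
    done

    # Run do_distcp once per missing directory, up to 20 at a time.
    xargs -a $tmpdir/missing -P 20 -L 1 -I '{}' /bin/bash -c 'do_distcp "$@"' _ {}
    rm -rf $tmpdir
) 200> /tmp/bbds.distcp.lock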
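
To see the xargs line in isolation, here is a minimal sketch that can be run without Hadoop. It assumes GNU xargs (for -a) and bash; show_arg and the /data/... paths are hypothetical stand-ins for do_distcp and the entries in $tmpdir/missing. The -L 1 option is left out because GNU xargs documents -I as already implying one input line per command.

```shell
#!/bin/bash
# Hypothetical stand-in for do_distcp: it only reports the argument it receives.
show_arg() {
    echo "got: $1"
}
# xargs cannot call a shell function directly, so the function is exported
# and invoked through a child bash process.
export -f show_arg

list=$(mktemp)
printf '%s\n' /data/a /data/b /data/c > "$list"

# -a FILE : read the items from FILE instead of standard input
# -P 4    : run up to 4 commands in parallel
# -I '{}' : run one command per input line, replacing {} with that line
# _       : placeholder that becomes $0 of the child bash; {} becomes $1
xargs -a "$list" -P 4 -I '{}' /bin/bash -c 'show_arg "$@"' _ {} | sort

rm -f "$list"
```

Because the commands run in parallel, their output order is not fixed; the sort at the end is only there to make the demo output stable.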
ASKER
Thank you.
ASKER
What is the meaning of the underscore in 'do_distcp "$@"' _ {}?
What is generally the meaning of - in commands like
tar xf -
Are there any other examples of cases like these that you can think of?
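One common case worth sketching is a tar pipeline, where - stands for standard output on the writing side and standard input on the reading side. The directories and the file name below are temporary and invented for the demo.

```shell
src=$(mktemp -d)
dst=$(mktemp -d)
echo "hello" > "$src/file.txt"

# First tar: "f -" writes the archive to standard output.
# Second tar: "f -" reads the archive from standard input.
tar cf - -C "$src" . | tar xf - -C "$dst"

cat "$dst/file.txt"
rm -rf "$src" "$dst"
```

The same idea copies a directory tree between machines when the pipe runs through ssh.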