terencepires

Tar script from file list

Hello,

I wrote a shell script that automatically tars the files named in a given file list (.txt); the script is below.
My problem is that the tar command does not seem to recognise the name of the file I want to put in the archive.
For a given filename, e.g. bijou65_t.jpg, I get these errors...

bijou65_t.jpg

tar: Removing leading `/' from member names
tar: bijou65_t.jpg\r: Cannot stat: No such file or directory
tar: Error exit delayed from previous errors

Can anyone tell me what's wrong?


#!/bin/sh
i=0
workdir=`pwd`
basedir=/Users/terencepires/Documents/Skydog/Images/for_web/thumb
csvdir=/Users/terencepires/Documents/Skydog
 
cd $basedir
for file in $(cat $csvdir/ftpskydog.csv); do
curfile=`find . | grep  $file`
curfile=`echo "$file" | sed 's://:/:'`
echo $curfile
 
tar -cvfA /Users/terencepires/Documents/Skydog/skydogimg.tar $file > $csvdir/tar.log
done


MushyPea

Your source file has carriage returns on the end.

To delete them, do:

sed -i 's/\r$//' filename.txt

I'm a bit confused as to what you're trying to achieve and think there might be an easier way... if you re-explain, I'll gladly help.
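
A note on portability, since the paths in the question point to a Mac: `sed -i` without a backup suffix and `\r` inside a sed pattern are GNU sed behaviour, and the BSD sed shipped with Mac OS X may reject or misread them. A minimal sketch of the same clean-up with `tr`, which accepts `\r` on both systems; the demo file and its contents are made up for illustration:

```shell
#!/bin/sh
# Build a small demo list with DOS (CRLF) line endings; the names are made up.
list=$(mktemp /tmp/list.XXXXXX) || exit 1
printf 'bijou65_t.jpg\r\nother_t.gif\r\n' > "$list"

# Delete every carriage return and write the result to a new file.
clean=$(mktemp /tmp/clean.XXXXXX) || exit 1
tr -d '\r' < "$list" > "$clean"

# od -c makes invisible characters visible, so the result can be checked by eye.
od -c "$clean"
```

Writing to a second file instead of editing in place also keeps the original list untouched in case the conversion goes wrong.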
terencepires

ASKER

Yes, it might be a little twisted, I admit :)

What I'm trying to do is create a tar file from a file list I have.
The files are spread across several directories, and each directory contains multiple files. I only want one specific file from each directory.

So here is what I did:

#!/bin/sh
i=0
workdir=`pwd`
basedir=/Users/terencepires/Documents/Skydog/Images/for_web/thumb
csvdir=/Users/terencepires/Documents/Skydog
 
cd $basedir   # work dir where the subdirectories are
for file in $(cat $csvdir/ftpskydog.csv); do   # read the file list
curfile=`find . | grep  $file`   # use find to locate each file
curfile=`echo "$file" | sed 's://:/:'`   # store it in a variable and fix a display problem (two // instead of one)
echo $curfile   # just to be sure which file is being processed
 
tar -cvfA /Users/terencepires/Documents/Skydog/skydogimg.tar $file > $csvdir/tar.log   # add the file to the archive
done

I'm new to shell scripting, so if you know a better way to do it, please show me!
tar -cv -T filelist -f files.tar

Still need to watch out for those carriage returns, mind.
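
The two points above can be combined by filtering the list on the fly and handing it to tar on standard input. This is only a sketch: the working directory and file names are invented for the demo, and it assumes the list holds paths that tar can open as written.

```shell
#!/bin/sh
# Throwaway directory with two demo files and a CRLF-terminated list.
work=$(mktemp -d /tmp/tarlist.XXXXXX) || exit 1
cd "$work" || exit 1
touch a_t.jpg b_t.gif
printf 'a_t.jpg\r\nb_t.gif\r\n' > filelist

# Strip carriage returns, then let tar read the cleaned list from stdin ("-T -").
tr -d '\r' < filelist | tar -cf files.tar -T -

# List the archive to confirm both files went in.
tar -tf files.tar
```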
Well, my problem is that I have to search the subdirectories of the main work dir recursively, find the needed files according to the file list, and then put them into the archive.
Tintin
I'd do it as
#!/bin/bash
basedir=/Users/terencepires/Documents/Skydog/Images/for_web/thumb
csvdir=/Users/terencepires/Documents/Skydog
 
cd $basedir
for file in $(cat $csvdir/ftpskydog.csv)
do
  find . -name $file >>/tmp/$$
done
 
tar -T /tmp/$$ -cvf /Users/terencepires/Documents/Skydog/skydogimg.tar $file > $csvdir/tar.log


Thanks, but it only works for the last file in the CSV...
I also attached the CSV (renamed as .txt) and the generated tar (renamed as .zip).

Also, what does ">>/tmp/$$" mean? I couldn't figure it out myself...
ftpskydog.txt
skydogimg.zip
Ah, the filelist doesn't contain the path?

You have a stray $file on line 11, and you can incorporate the carriage-return filter too.

Oh, and we can get rid of the need for a temp file by using stdin.

Try this:
#!/bin/bash
basedir=/Users/terencepires/Documents/Skydog/Images/for_web/thumb
csvdir=/Users/terencepires/Documents/Skydog
 
cd $basedir
for file in $(tr -d '\r' <$csvdir/ftpskydog.csv)
do
  find . -name $file
done | tar -cv -T - -f /Users/terencepires/Documents/Skydog/skydogimg.tar >$csvdir/tar.log


$$ is the pid of the shell running the script.  

By the way, using /tmp/$$ is horribly insecure, but that's fairly paranoid unless there's other users on your machine with unfriendly tendencies...
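
For anyone who does want a temp file, `mktemp` is the usual safer alternative to a predictable name such as /tmp/$$. A minimal sketch, with a made-up template name and placeholder loop values; it also shows why `>>` (append) rather than `>` (truncate) is needed inside a loop:

```shell
#!/bin/sh
# mktemp creates a new, uniquely named file readable only by the current user.
tmplist=$(mktemp /tmp/filelist.XXXXXX) || exit 1
# Remove the temporary file when the script exits.
trap 'rm -f "$tmplist"' EXIT

# '>>' appends, so the output of every iteration accumulates;
# a single '>' would empty the file each time round the loop.
for name in first second third; do
  echo "$name" >> "$tmplist"
done

cat "$tmplist"
```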
Thanks for the info!
It still doesn't work, though: the last attempt creates "mirror copies" of the archive inside the archive.
I attached the file for you to see...
with the file...
skydogimg.zip
Okay, I'll rewrite my own instead of adapting someone else's... try the code below.

Note that it only calls 'find' once, so should be much faster.
#!/bin/bash
 
BASEDIR=/Users/terencepires/Documents/Skydog
SEARCHDIR=$BASEDIR/Images/for_web/thumb
FILELIST=$BASEDIR/ftpskydog.csv
ARCHIVE=$BASEDIR/skydogimg.tar
LOGFILE=$BASEDIR/tar.log
 
cd $SEARCHDIR
FILES=($(find . -type f))
NUMFILES=$((${#FILES[@]} - 1))
 
cat $FILELIST | tr -d '\r' | while read FILE; do
  for ELEMENT in $(seq 0 $NUMFILES); do
    if [[ "${FILES[$ELEMENT]}" =~ "$FILE" ]] ; then
      echo ${FILES[$ELEMENT]}
      break 1
    fi
  done
done | tar -cv -T - -f $ARCHIVE >$LOGFILE


I get an error message:
tr: Illegal byte sequence

Actually, the problem seems to be the CSV file...
Is there a way to create a CSV without \r?
Besides, I don't have a seq command installed on my Mac...
Anyway, thanks for helping!
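
"Illegal byte sequence" from tr on a Mac usually means the input contains bytes that are not valid in the current locale's encoding. One common workaround, sketched here on made-up data, is to run tr in the C locale so that it treats the input as raw bytes. Whether this is enough for the real CSV depends on how that file is actually encoded, which the thread does not establish.

```shell
#!/bin/sh
list=$(mktemp /tmp/list.XXXXXX) || exit 1
# Demo data: one byte (octal 351) that is not valid UTF-8, plus CRLF line endings.
printf 'caf\351_t.gif\r\nplain_t.gif\r\n' > "$list"

# In the C locale tr works byte by byte and does not try to decode the input.
LC_ALL=C tr -d '\r' < "$list" | od -c
```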
Okay, let's use 'sed' instead of 'tr', and not use 'seq'...
#!/bin/bash
 
BASEDIR=/Users/terencepires/Documents/Skydog
SEARCHDIR=$BASEDIR/Images/for_web/thumb
FILELIST=$BASEDIR/ftpskydog.csv
ARCHIVE=$BASEDIR/skydogimg.tar
LOGFILE=$BASEDIR/tar.log
 
cd $SEARCHDIR
FILES=($(find . -type f))
NUMFILES=$((${#FILES[@]} - 1))
 
cat $FILELIST | sed -e 's/\r//g' | while read FILE; do
  ELEMENT=0
  while (($ELEMENT <= $NUMFILES)); do
    if [[ "${FILES[$ELEMENT]}" =~ "$FILE" ]] ; then
      echo ${FILES[$ELEMENT]}
      break 1
    fi
    let "ELEMENT++"
  done
done | tar -cv -T - -f $ARCHIVE >$LOGFILE


Thanks, Mushy.
It takes time to process, but it still makes mirror copies of itself like last time.
Besides, the log it creates is empty...
Do you need more info on my system, or anything else that could help?
Here are the files it created:
skydogimg.zip
tar.zip
I have no idea why the log file would appear inside the tar file.

I have noticed, however, that the ftpskydog.txt you attached has arrived to me in Unicode format, rather than plain ASCII text.  Is the same true at your end, or is that EE?  I assume this list is exported from somewhere - can you check to see if you can export it as ASCII rather than Unicode, and see if that makes a difference?  I suspect that will be why it's not finding the files to archive for you...
In fact, I created the archive containing the log file myself, just to send it to you via EE so you could check it. Sorry for the confusion. Using ASCII encoding doesn't help any further...
However, I noticed that the cat command seems to malfunction: on this particular file it only displays the last line, whereas the file contains over 100 entries!
Bit of a head-scratcher, this one!

Can you provide the output of the following commands, and we'll see where it's going wrong - I can only assume this is a Mac-vs-Linux inconsistency.


find /Users/terencepires/Documents/Skydog/Images/for_web/thumb -type f
cat /Users/terencepires/Documents/Skydog/ftpskydog.csv | sed -e 's/\r//g'

Might be worth checking that your 'tar' accepts   -T -  ... try:

touch testfile       (no output)
echo testfile | tar -cv -T - -f testfile.tar     (should output "testfile")
tar -tf testfile.tar      (should output "testfile")
here are the results :

find command : see find.txt
cat command : see cat.txt

tar testings :
macbook-de-terence-pires:Skydog terencepires$ touch testfile
macbook-de-terence-pires:Skydog terencepires$ echo testfile | tar -cv -T - -f testfile.tar
testfile
macbook-de-terence-pires:Skydog terencepires$ tar -tf testfile.tar
testfile
macbook-de-terence-pires:Skydog terencepires$

find.txt
skydog.txt
Another thing: when I run
cat -e ftpskydog.csv I get
101ers109_t.gif^M54_nude_honeys01_t.gif^M ...
What does ^M mean?
Well, the find worked.  Tar works as expected.  However, catting that file through sed failed; skydog.txt is 0 bytes.  I have no idea what would cause that, unless there's something wrong with the ftpskydog.csv file.

That's the \r I'm trying to strip out with the sed -e 's/\r//g'; in UNIX, an end-of-line in a text file is simply a \n (line feed), whereas in DOS it's \r\n (carriage return + line feed).  ^M means ctrl-M, which is the same as a carriage return.
^M (Control-M) is a character that some operating systems add to mark the end of a line. The convention differs from OS to OS, e.g. between Unix and Windows.

You may use text/ASCII mode when transferring files between systems, or tools like unix2dos or dos2unix, to get rid of this problem.
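
One thing worth checking: the cat -e output quoted earlier shows ^M between entries with no end-of-line mark after it, which suggests the list may use a bare carriage return as its only line terminator. If so, deleting \r would glue every name into one long line, and the carriage returns need to be turned into line feeds instead. A sketch under that assumption, using a made-up two-entry file shaped like the quoted output:

```shell
#!/bin/sh
list=$(mktemp /tmp/list.XXXXXX) || exit 1
# Demo data with CR-only line endings, shaped like the cat -e output in the thread.
printf '101ers109_t.gif\r54_nude_honeys01_t.gif\r' > "$list"

# Count both kinds of terminator; zero line feeds means the file is one single "line".
cr=$(LC_ALL=C tr -cd '\r' < "$list" | wc -c | tr -d ' ')
lf=$(LC_ALL=C tr -cd '\n' < "$list" | wc -c | tr -d ' ')
echo "CR: $cr  LF: $lf"

# Turn each carriage return into a line feed so that 'read' sees one name per line.
LC_ALL=C tr '\r' '\n' < "$list" | while read -r name; do
  echo "entry: $name"
done
```

If the counts show as many line feeds as carriage returns, the file is CRLF instead, and deleting the carriage returns with `tr -d '\r'` is the right fix, since translating them would produce blank lines.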
I'm clueless as to what's wrong here: I deleted the file, tried to re-encode it and recreate it, and nothing gets better...
However, the first script I wrote could retrieve info from the CSV file, so maybe we should try to merge the two methods to get this working?
This file was created on the same system where the script runs (Mac OS 10.5.4)...
I do not know what tools you use to create the file, but did you try to use the tools mentioned?
Just remove the sed completely, so the line looks like:

cat $FILELIST | while read FILE; do


That help?
okay, i replaced the following line in your script

done | tar -cv -T - -f $ARCHIVE >$LOGFILE

by this one :
done | tar --append --file /Users/terencepires/Documents/Skydog/skydogimg.tar $ARCHIVE >$LOGFILE

and i have this :
tar: /Users/terencepires/Documents/Skydog/skydogimg.tar: file is the archive; not dumped

Since I don't understand the syntax you used, I can't figure out what represents the current image being processed. Do you think it would work if the image variable were used instead?

omar, I'll try dos2unix, but I doubt its effectiveness, since it's made to port Mac/DOS files to Unix systems and I'm on a Mac.
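
On the "file is the archive; not dumped" message: in the replaced line the only member named on the tar command line is $ARCHIVE, i.e. the archive itself, and tar refuses to add the file it is writing to. The names printed by the loop are also ignored, because nothing tells tar to read a list from stdin once "-T -" is gone. Append mode works when it is given the file to add, as in this sketch (demo file names are made up):

```shell
#!/bin/sh
work=$(mktemp -d /tmp/append.XXXXXX) || exit 1
cd "$work" || exit 1
touch one_t.gif two_t.gif

# Create the archive with the first file, then append the second with -r.
tar -cf img.tar one_t.gif
tar -rf img.tar two_t.gif

# Both members should now be listed.
tar -tf img.tar
```

Appending one file per tar call is slower than passing the whole list once with "-T -", so it is mainly useful for adding to an archive that already exists.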
You may try it. The commands are named that way for historical reasons, and Mac OS is a Unix-like OS.
DarwinPorts doesn't seem to install correctly on my computer (Murphy's law indeed)...
Anyway, I'll try another method: copying the files into a buffer directory and then adding them to a tar archive.
Let's see how it works.
OK, almost working now!
The file list had a problem of some sort, but I couldn't say what it was... Anyway, I re-entered the EOL markers in my editor and it did the trick this time (I had already tried that before), using your script, Mushy!

So thanks again!

Now just one last thing: the files in the .tar archive are in subdirectories, but I'd like to get rid of those. How could I do that?
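
The accepted answer to this last question is not visible in the thread, so what follows is not that answer. It is one possible sketch, reusing the buffer-directory idea mentioned earlier: copy each listed file into a single flat staging directory and archive that directory's contents from inside it. All directory and file names are made up for the demo.

```shell
#!/bin/sh
# Demo tree standing in for the thumbnail directories.
work=$(mktemp -d /tmp/flatten.XXXXXX) || exit 1
mkdir -p "$work/thumb/bandA" "$work/thumb/bandB" "$work/stage"
touch "$work/thumb/bandA/a_t.gif" "$work/thumb/bandB/b_t.gif"
printf 'a_t.gif\nb_t.gif\n' > "$work/filelist"

# Copy each listed file into one flat staging directory.
while read -r name; do
  find "$work/thumb" -type f -name "$name" -exec cp {} "$work/stage/" \;
done < "$work/filelist"

# Archive from inside the staging directory so no directory prefix is stored.
(cd "$work/stage" && tar -cf "$work/flat.tar" *)

tar -tf "$work/flat.tar"
```

One caveat: if two directories contain a file with the same name, the later copy overwrites the earlier one in the staging directory.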
ASKER CERTIFIED SOLUTION
MushyPea

Thanks so much! Feels good to finally beat the system :)

Cheers