Solved

Zip all files into separate archives and then move them based on date

Posted on 2009-07-06
17
219 Views
Last Modified: 2013-12-26
I have hundreds of thousands of files that all follow the same naming convention (e.g. DENI_acn_69083_2009093121200.csv). I need to gzip each one and move it to a folder based on its date. This one would be moved to ../2009/09/31
How could I do this? Right now I'm just using bash to zip them all.
Question by:THEROMPSTER2000
17 Comments
 
LVL 12

Expert Comment

by:kevin_u
ID: 24787784
Here's a bash version that does what you want.
for f in *.csv
do
  echo $f
  a=`expr length "$f" - 16`
  d=`expr substr "$f" $a 8`
  mm=`expr substr $d 5 2`
  yyyy=`expr substr $d 1 4`
  dd=`expr substr $d 7 2`
  dir="../$yyyy/$mm/$dd"
  mkdir -p $dir
  gzip $f
  mv $f.gz $dir
done


0
 

Author Comment

by:THEROMPSTER2000
ID: 24787810
Could you please explain the lines? Bash would work for me, but could you show me what they mean? What do lines 4, 5, 6, 7 and 8 do?
0
 

Author Comment

by:THEROMPSTER2000
ID: 24787818
*.csv
expr: syntax error
expr: syntax error
expr: syntax error
gzip: *.csv: No such file or directory
./newscript.sh: line 13: /bin/mv: Argument list too long
0

 

Author Comment

by:THEROMPSTER2000
ID: 24787841
Never mind telling me what it does, I figured that out. But it's giving me a syntax error, as if I can't use *.csv.
0
 
LVL 12

Expert Comment

by:kevin_u
ID: 24787922
Those errors mean no CSV files were found, so the loop ran once with the literal pattern *.csv.
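
When a glob matches nothing, bash by default leaves the pattern as a literal string, which is why the loop ran once with f set to *.csv. A minimal sketch of that behaviour, assuming bash; the scratch directory and the counter variables are hypothetical and exist only for the demonstration:

```shell
#!/bin/bash
# Work in an empty scratch directory created just for this example.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# Default behaviour: an unmatched glob is passed through literally.
literal_count=0
for f in *.csv; do
  literal_count=$((literal_count + 1))   # runs once, with f='*.csv'
  literal_value=$f
done

# With nullglob set, an unmatched glob expands to nothing,
# so the loop body never runs.
shopt -s nullglob
null_count=0
for f in *.csv; do
  null_count=$((null_count + 1))
done
shopt -u nullglob

echo "without nullglob: $literal_count iteration(s), f=$literal_value"
echo "with nullglob:    $null_count iteration(s)"
```

Setting nullglob is one alternative to testing for the literal pattern inside the loop; either approach avoids running expr, gzip and mv on a file that does not exist.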
0
 

Author Comment

by:THEROMPSTER2000
ID: 24787933
OK, but what about "Argument list too long"? Are there too many files for mv to handle?
0
 
LVL 12

Expert Comment

by:kevin_u
ID: 24788055
This will fix the problem.

The mv normally handles one file at a time, but when the literal *.csv came through, $f.gz became *.csv.gz and the shell expanded it to all of your existing .csv.gz files at once.
for f in *.csv
do
  if [ "$f" = "*.csv" ]
  then
    exit
  fi
  echo $f
  a=`expr length "$f" - 16`
  d=`expr substr "$f" $a 8`
  mm=`expr substr $d 5 2`
  yyyy=`expr substr $d 1 4`
  dd=`expr substr $d 7 2`
  dir="../$yyyy/$mm/$dd"
  mkdir -p $dir
  gzip $f
  mv $f.gz $dir
done


0
 

Author Comment

by:THEROMPSTER2000
ID: 24788791
But then it wouldn't do anything if there were any CSV files.
0
 
LVL 12

Expert Comment

by:kevin_u
ID: 24788884
Before I added the check that stops the script when there are no CSV files, the mv would have matched *.csv.gz. There must already have been a large number of those in the folder, hence the "Argument list too long" message.

The new script will have no such errors, and it will work on any new .csv files in the folder.

If you want it to move the .csv.gz files you already have, we'll need to make a slightly different script.
0
 
LVL 48

Expert Comment

by:Tintin
ID: 24789067
Are the CSV files all in the same folder?
Do the files all have the same length filename?

Note that a for loop using shell globbing will fail on large numbers of files.

Also note that calling external (non-builtin) commands such as expr will significantly slow down the script when you are dealing with large numbers of files.

I'm making the assumption the files are all in one dir.


#!/bin/bash
CSVDIR=/path/to/cvsfiles
NEWDIR=/path/to/newdir
 
cd $CSVDIR
 
find . --maxdepth 1 -name "*.csv" | while read file
do
  d=${file##*_}
  dir=${d:0:4}/${d:4:2}/${d:6:2}
  mkdir -p $NEWDIR/$dir 2>/dev/null
  mv $file $NEWDIR/$dir
  gzip $NEWDIR/$dir/$file
done


0
 

Author Comment

by:THEROMPSTER2000
ID: 24789255
But then wouldn't the above script get the wrong dates, because it is reading three more characters than the original (since the files were gzipped)?
0
 

Author Comment

by:THEROMPSTER2000
ID: 24789527
OK guys, so what you gave me messed up all my folders. What can I do now? The files are in directories like this now:
/c02/qualution/Telcordia/_200/90/11/GENI_dci_41285_200901170430.csv.gz
/c02/qualution/Telcordia/_200/90/11/GENI_dci_41286_200901170445.csv.gz
/c02/qualution/Telcordia/_200/90/11/GENI_dci_41287_200901170500.csv.gz
They are all off by one character.
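
One possible way to recover, sketched under assumptions: the date is still intact in each file name, so the misplaced archives can be re-filed by walking the tree and recomputing the directory from the text after the last underscore. The function name refile_archives and its argument are hypothetical, the -print0, -mindepth, -empty and -delete options assume GNU or BSD find, and it would be wise to try this on a copy of the data first.

```shell
#!/bin/bash
# refile_archives BASE: move every *.csv.gz under BASE into BASE/YYYY/MM/DD,
# where the date is taken from the file name (text after the last underscore).
refile_archives() {
  local base=${1%/} path name stamp dest   # drop one trailing slash from BASE
  find "$base" -type f -name '*.csv.gz' -print0 |
  while IFS= read -r -d '' path; do
    name=${path##*/}                 # strip the directory part
    stamp=${name##*_}                # e.g. 200901170430.csv.gz
    dest="$base/${stamp:0:4}/${stamp:4:2}/${stamp:6:2}"
    [ "${path%/*}" = "$dest" ] && continue   # already in the right place
    mkdir -p "$dest" && mv -- "$path" "$dest/"
  done
  # Remove directories left empty by the move (e.g. _200/90/11).
  find "$base" -mindepth 1 -depth -type d -empty -delete
}
```

With the paths shown above, this might be called as refile_archives /c02/qualution/Telcordia. Note that the final find removes every empty directory below the base, not only the misnamed ones, so drop that line if empty directories need to be kept.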
0
 
LVL 48

Expert Comment

by:Tintin
ID: 24789625
Which scripts did you run and in which order?

I did make one small typo in my script, where --maxdepth should be -maxdepth.

Once that mistake is corrected, it all works fine.  See the following output of my tests.

tintin$ ls -1
DENI_acn_69083_200903291200.csv.gz
DENI_acn_69083_200909301230.csv
DENI_acn_69083_2009093121200.csv
moveit.sh

tintin$ cat moveit.sh
#!/bin/bash
CSVDIR=$(pwd)
NEWDIR=newdir

cd $CSVDIR

find . -maxdepth 1 -name "*.csv" | while read file
do
  d=${file##*_}
  dir=${d:0:4}/${d:4:2}/${d:6:2}
  mkdir -p $NEWDIR/$dir 2>/dev/null
  mv $file $NEWDIR/$dir
  gzip $NEWDIR/$dir/$file
done

tintin$ ./moveit.sh

tintin$ ls -1
DENI_acn_69083_200903291200.csv.gz
moveit.sh
newdir

tintin$ ls -1R newdir/
newdir/:
2009

newdir/2009:
09

newdir/2009/09:
30
31

newdir/2009/09/30:
DENI_acn_69083_200909301230.csv.gz

newdir/2009/09/31:
DENI_acn_69083_2009093121200.csv.gz


0
 

Author Comment

by:THEROMPSTER2000
ID: 24789643
I ran this one, not yours:

for f in *.csv
do
  if [ "$f" = "*.csv" ]
  then
    exit
  fi
  echo $f
  a=`expr length "$f" - 16`
  d=`expr substr "$f" $a 8`
  mm=`expr substr $d 5 2`
  yyyy=`expr substr $d 1 4`
  dd=`expr substr $d 7 2`
  dir="../$yyyy/$mm/$dd"
  mkdir -p $dir
  gzip $f
  mv $f.gz $dir
done

I'll try yours next.
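
The _200/90/11 directories reported earlier in the thread are consistent with the start position in this script being one too small. expr substr counts positions from 1, so for a name ending in a 12-digit timestamp plus .csv (16 characters) the timestamp starts at position length - 15, not length - 16. The sketch below reproduces both results using bash substring expansion instead of expr (a substitution made here so the example does not depend on a particular expr implementation); bash offsets count from 0, so each offset is one less than the matching expr position.

```shell
#!/bin/bash
f="GENI_dci_41285_200901170430.csv"   # file name taken from the listing in the thread
len=${#f}

# Script as posted: expr position (len - 16), i.e. 0-based offset (len - 17).
bad=${f:len-17:8}
bad_dir="${bad:0:4}/${bad:4:2}/${bad:6:2}"

# Corrected start: expr position (len - 15), i.e. 0-based offset (len - 16).
good=${f:len-16:8}
good_dir="${good:0:4}/${good:4:2}/${good:6:2}"

echo "as posted: $bad_dir"
echo "corrected: $good_dir"
```

A fixed offset from the end of the name only works when every name has a timestamp of the same length. The example name in the question carries 13 digits rather than 12, so cutting at the last underscore is less fragile than counting back from the end.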
0
 

Author Comment

by:THEROMPSTER2000
ID: 24795907
OK, so your code worked, Tintin. Can you explain to me what each line means and what it is doing? Thank you so much.
#!/bin/bash
CSVDIR=/path/to/cvsfiles
NEWDIR=/path/to/newdir
 
cd $CSVDIR
 
find . -maxdepth 1 -name "*.csv" | while read file
do
  d=${file##*_}
  dir=${d:0:4}/${d:4:2}/${d:6:2}
  mkdir -p $NEWDIR/$dir 2>/dev/null
  mv $file $NEWDIR/$dir
  gzip $NEWDIR/$dir/$file
done


0
 
LVL 48

Accepted Solution

by:
Tintin earned 500 total points
ID: 24797651
Hopefully the first 6 lines are obvious (let me know if they aren't)

Line 7 uses the find command to match all .csv files in the current directory.  The -maxdepth 1 option prevents find from looking in subdirectories (I didn't know if you had subdirectories or not).

The output of the find command is piped to a while read loop to read each file.  When you are dealing with large numbers of files, you can't just do

for file in *.csv

as that will break when you exceed the file globbing (pattern matching) limit.
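
For reference, "Argument list too long" is reported when an external command is started with more arguments than the system allows, as happened with mv and the expanded *.csv.gz earlier in the thread. Piping find into a read loop hands over one name at a time instead. A slightly more defensive variant of that loop, as a sketch: -print0 together with read -d '' (available in GNU and BSD find and in bash) keeps names containing spaces or newlines intact. The function name count_csv is hypothetical, and it only counts files so the example is harmless to run.

```shell
#!/bin/bash
# count_csv DIR: count the *.csv files directly inside DIR, one name at a time.
count_csv() {
  local dir=$1 file n=0
  # Process substitution keeps the loop in the current shell, so n survives it.
  while IFS= read -r -d '' file; do
    n=$((n + 1))
  done < <(find "$dir" -maxdepth 1 -type f -name '*.csv' -print0)
  echo "$n"
}
```

The loop body is where the mkdir, mv and gzip steps would go; feeding the loop with < <(...) rather than a pipe is a design choice that lets variables set inside the loop remain visible after it ends.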

Line 9 uses a bash builtin parameter expansion, ${file##*_}, to delete everything in $file up to and including the last _

so if $file was DENI_acn_69083_2009093121200.csv
then $d will be set to 2009093121200.csv

Line 10 uses bash substring expansion to extract portions of the string (like substr). In ${d:offset:length}, the first number is the offset, counted from 0, and the second is the number of characters to extract.
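
Both expansions can be tried directly in a bash session. A small sketch using the file name from the question; the variable names year, month and day are introduced here only for readability:

```shell
#!/bin/bash
file="DENI_acn_69083_2009093121200.csv"

d=${file##*_}        # delete the longest prefix matching '*_'
year=${d:0:4}        # offset 0, length 4
month=${d:4:2}       # offset 4, length 2
day=${d:6:2}         # offset 6, length 2
dir="$year/$month/$day"

echo "$d"            # 2009093121200.csv
echo "$dir"          # 2009/09/31
```

Because only the first eight characters after the last underscore are used, the extra digit in this particular timestamp does not affect the resulting directory.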

Line 11 creates the new directory, along with any missing parent directories (the -p option). The 2>/dev/null suppresses any error messages from mkdir; with -p, an existing directory is not treated as an error.

0
 

Author Closing Comment

by:THEROMPSTER2000
ID: 31600277
thank you.
0
