  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 1354

Linux Shell script or Perl script to gunzip & rezip using zip with highest compression

I'm running out of disk space on an archiving filesystem and noticed that
many of the *.gz files there were not created with "gzip -9", so they are
not at the highest compression level.

I need a shell or Perl script that takes a directory as its first parameter,
finds every *.gz file in it, and processes the files one at a time: gunzip
the file, zip the result with the highest compression, then move on to the
next *.gz file. The files must be handled one at a time because gunzipping
them all at once would fill up the disk.

I'm on RHES 4.6 and the zip version is:

# zip -v
Copyright (C) 1990-1999 Info-ZIP
Type 'zip "-L"' for software license.
This is Zip 2.3 (November 29th 1999), by Info-ZIP.


The zip help output is below:

# zip
Copyright (C) 1990-1999 Info-ZIP
Type 'zip "-L"' for software license.
Zip 2.3 (November 29th 1999). Usage:
zip [-options] [-b path] [-t mmddyyyy] [-n suffixes] [zipfile list] [-xi list]
  The default action is to add or replace zipfile entries from list, which
  can include the special name - to compress standard input.
  If zipfile and list are omitted, zip compresses stdin to stdout.
  -f   freshen: only changed files  -u   update: only changed or new files
  -d   delete entries in zipfile    -m   move into zipfile (delete files)
  -r   recurse into directories     -j   junk (don't record) directory names
  -0   store only                   -l   convert LF to CR LF (-ll CR LF to LF)
  -1   compress faster              -9   compress better
  -q   quiet operation              -v   verbose operation/print version info
  -c   add one-line comments        -z   add zipfile comment
  -@   read names from stdin        -o   make zipfile as old as latest entry
  -x   exclude the following names  -i   include only the following names
  -F   fix zipfile (-FF try harder) -D   do not add directory entries
  -A   adjust self-extracting exe   -J   junk zipfile prefix (unzipsfx)
  -T   test zipfile integrity       -X   eXclude eXtra file attributes
  -y   store symbolic links as the link instead of the referenced file
  -R   PKZIP recursion (see manual)
  -e   encrypt                      -n   don't compress these suffixes
Asked by: sunhux
2 Solutions
 
sweetfa2Commented:
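#!/bin/bash
# Usage: <script> <directory>
if [ $# -ne 1 ]
then
    echo "Invalid arguments" >&2
    exit 1
fi
# Process the *.gz files directly inside the directory, one at a time.
find "$1" -maxdepth 1 -type f -name "*.gz" -print |
while IFS= read -r file
do
  echo "$file"
  gunzip "$file" || continue   # skip files that fail to decompress
  # ${file%.gz} strips only the trailing .gz. zip is given a single
  # argument here, so it treats that argument as the archive name.
  zip -9 "${file%.gz}"
done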

 
sunhuxAuthor Commented:


I think "gzip -9 filename" is supported but "zip -9 filename" is not.
Please help me with the right syntax; I've been googling with no luck.


 # zip -9 ftp_get.log.20090413
        zip warning: missing end signature--probably not a zip file (did you
        zip warning: remember to use binary mode when you transferred it?)

zip error: Zip file structure invalid (ftp_get.log.20090413)
 
sunhuxAuthor Commented:

Running "man zip" on my platform gives the following:


# man zip
Formatting page, please wait...
ZIP(1L)                                                               ZIP(1L)

NAME
       zip,  zipcloak,  zipnote,  zipsplit  -  package and compress (archive)
       files

SYNOPSIS
       zip   [-aABcdDeEfFghjklLmoqrRSTuvVwXyz!@$]   [-b path]   [-n suffixes]
       [-t mmddyyyy] [-tt mmddyyyy] [ zipfile [ file1 file2 ...]] [-xi list]

       zipcloak [-dhL] [-b path] zipfile

       zipnote [-hwL] [-b path] zipfile

       zipsplit [-hiLpst] [-n size] [-b path] zipfile

DESCRIPTION
       zip  is a compression and file packaging utility for Unix, VMS, MSDOS,
       OS/2, Windows NT, Minix, Atari and Macintosh, Amiga and Acorn RISC OS.

       It  is analogous to a combination of the UNIX commands tar(1) and
       compress(1) and is compatible with PKZIP (Phil Katz's ZIP for MSDOS
       systems).
 
wilcoxonCommented:
Very odd.  Your original post indicates that zip accepts -# to indicate compression level.  The man page is also for info-zip which supports -# (-9 for best).
 
sunhuxAuthor Commented:

Assuming I want to zip xx.log, the syntax is
  zip -9 xx.log.zip xx.log
  rm -f xx.log

The rm is needed because zip, unlike gzip, does not remove the original file.

Can you rewrite the script to use the correct syntax and also remove the
gunzipped log file?
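
The help output quoted in the question lists -m ("move into zipfile (delete files)"), so the separate rm step may not be needed. A minimal sketch, assuming -m behaves on Zip 2.3 as that help text describes; rezip_log is a hypothetical helper name, not something from this thread:

```shell
#!/bin/bash
# Hypothetical helper: zip one file at maximum compression and let zip
# delete the original after storing it (-m), instead of a separate rm.
rezip_log() {
    local src="$1"
    # -9: compress better, -m: move into zipfile, -q: quiet operation
    zip -9 -m -q "$src.zip" "$src"
}
```

Calling `rezip_log xx.log` should leave only xx.log.zip behind if zip succeeds.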
 
sunhuxAuthor Commented:

Sorry to add more requirements. In the statement
   `find $1 -maxdepth 1 -type f -name "*.gz" -print`
can you match only *.gz files that are larger than 1 MB? I've found there's
not much saving for small files, and some of the *.gz files were already
compressed with "gzip -9", which already gives a high compression ratio.
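
The size filter can be expressed with find's -size test. A sketch, assuming GNU find as shipped on Red Hat systems; list_large_gz is a hypothetical name. The c suffix counts bytes, so +1048576c matches files strictly larger than 1 MiB:

```shell
#!/bin/bash
# Hypothetical helper: print the *.gz files directly inside a directory
# (no recursion) that are larger than 1 MiB (1048576 bytes).
list_large_gz() {
    local dir="$1"
    find "$dir" -maxdepth 1 -type f -name '*.gz' -size +1048576c -print
}
```

Running `list_large_gz /path/to/archive` first is a cheap way to preview which files a rezip loop would touch.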
 
sweetfa2Commented:
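#!/bin/bash
# Usage: <script> <directory>
if [ $# -ne 1 ]
then
    echo "Invalid arguments" >&2
    exit 1
fi
# Only *.gz files larger than 1 MB (1048576 bytes), directly inside the directory.
find "$1" -maxdepth 1 -type f -name "*.gz" -size +1048576c -print |
while IFS= read -r file
do
  base="${file%.gz}"            # strip only the trailing .gz
  echo "$file"
  gunzip "$file" || continue    # skip files that fail to decompress
  # zip takes the archive name first, then the file to add;
  # remove the uncompressed file only if zip succeeded.
  zip -9 "$base.zip" "$base" && rm -f "$base"
done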
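
Since zip and gzip both use the deflate algorithm, another option is to keep the .gz format and recompress through a pipe, so the uncompressed data never lands on disk. A sketch under that assumption (the thread itself asks for zip); recompress_gz is a hypothetical name, and the temporary file is written next to the original:

```shell
#!/bin/bash
# Hypothetical helper: recompress one .gz file at level 9 without writing
# the uncompressed data to disk. The original is replaced only if the new
# file passes gzip -t and is actually smaller.
recompress_gz() {
    local src="$1" tmp="$1.tmp.$$" status old new
    gunzip -c "$src" | gzip -9 > "$tmp"
    status=${PIPESTATUS[0]}          # exit status of gunzip, not of gzip
    if [ "$status" -ne 0 ] || ! gzip -t "$tmp"; then
        rm -f "$tmp"
        return 1
    fi
    old=$(( $(wc -c < "$src") ))
    new=$(( $(wc -c < "$tmp") ))
    if [ "$new" -lt "$old" ]; then
        touch -r "$src" "$tmp"       # keep the original modification time
        mv "$tmp" "$src"
    else
        rm -f "$tmp"                 # no gain, keep the original file
    fi
}
```

Files that were already written with "gzip -9" are left untouched, because the recompressed copy will not come out smaller.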

 
sunhuxAuthor Commented:

Is there any way to check whether an existing *.gz file was already compressed
with "gzip -9"? If that can be checked before doing the gunzip, that would be perfect.
If it's not possible, that's OK; just let me know and I'll close this thread and award points.
 
sweetfa2Commented:
Not that I am aware of
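
One possible heuristic, not raised in this thread: RFC 1952 defines an XFL byte at offset 8 of the gzip header, where 2 means the compressor used its maximum-compression setting and 4 means its fastest. Not every tool that writes .gz files fills this byte in, so treat the result as a hint only. gz_xfl is a hypothetical name:

```shell
#!/bin/bash
# Hypothetical helper: print the XFL byte (header offset 8) of a gzip file.
# Per RFC 1952: 2 = maximum compression, 4 = fastest algorithm.
gz_xfl() {
    # -An: no address column, -tu1: unsigned one-byte decimal,
    # -j8: skip the first 8 bytes, -N1: read a single byte
    od -An -tu1 -j8 -N1 "$1" | tr -d ' \n'
}
```

A loop could then skip a file with something like `[ "$(gz_xfl "$file")" = 2 ] && continue` before calling gunzip.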
 
sunhuxAuthor Commented:
Ok