buttonMASTER asked on

Unix commands to unzip a bunch of zip files scattered in folders (and other requirements)?

I need to extract a bunch of zip files, but I have requirements.

The zip files are scattered in various folders, like this:

  • base_folder/
  • base_folder/batch_1/batch_1_1.zip
  • base_folder/batch_1/batch_1_2.zip
  • base_folder/batch_2/batch_2_1.zip
  • base_folder/batch_2/batch_2_2.zip
  • base_folder/batch_2/batch_2_1.zip
  • base_folder/big_batch/batch_a/batch_a_1.zip
  • base_folder/big_batch/batch_a/batch_a_2.zip
  • base_folder/big_batch/batch_b/batch_b.zip

I want to extract the files to another folder and keep the same folder structure

  • base_folder/extracted/
  • base_folder/extracted/batch_1/file_1
  • base_folder/extracted/batch_1/file_2
  • base_folder/extracted/batch_2/file_2_1
  • base_folder/extracted/batch_2/file_2_2
  • base_folder/extracted/batch_2/file_2_1
  • base_folder/extracted/big_batch/batch_a/file_a_1
  • base_folder/extracted/big_batch/batch_a/file_a_2
  • base_folder/extracted/big_batch/batch_b/file_b

I want to extract everything in the zip file except for files that have certain extensions

  • file - OK
  • file.exe - OK
  • file.xml - OK
  • file.txt - Not OK
  • file.xls - Not OK
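
To make the rule above concrete, here is a small sketch of the same name filter in plain shell. The helper name `wanted` is made up for illustration and is not part of any tool; it only mirrors the OK / Not OK list and does not touch any archives.

```shell
# wanted NAME: succeeds when NAME should be extracted, fails when it has
# one of the unwanted extensions. Hypothetical helper for illustration.
wanted() {
    case $1 in
        *.txt|*.xls) return 1 ;;   # Not OK: skip these
        *)           return 0 ;;   # OK: everything else, with or without an extension
    esac
}

for name in file file.exe file.xml file.txt file.xls; do
    if wanted "$name"; then
        echo "$name - OK"
    else
        echo "$name - Not OK"
    fi
done
```

The patterns are case-sensitive, so a name such as `FILE.TXT` would count as OK here.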

Is this possible with a few Unix commands? If it isn't entirely possible, what is the closest I can do?
Unix OS

8/22/2022 - Mon
noci

This should do it....

for i in $( find  /base_folder ! -path '*/extracted*' -a -name '*.zip'   -print ) ; do
   out=$( dirname $i | sed 's/base_name/base_name\/extracted/' ) 
   [ -d $out ] || mkdir $out
   (cd $o ; unzip $i '*' -x '*.txt' *.xls'
done 

You may need more -x patterns if you want to exclude more file types.
ASKER
buttonMASTER

Hi noci. Thank you for your quick reply.

I added what you said in a script file and cleaned it up a little because it looked like it had some syntax errors.

for i in $( find base_folder ! -path '*/extracted*' -a -name '*.zip'   -print ) ; do
   out=$( dirname $i | sed 's/base_name/base_name\/extracted/' )
   [ -d $out ] || sudo mkdir -p $out
   (cd $o ; sudo unzip $i '*' -x '*.txt' '*.xls')
done

It looks like it almost works, but I get an error like the one below for every zip file.

unzip:  cannot find or open base_folder/batch_1/batch_1_1.zip, base_folder/batch_1/batch_1_1.zip.zip or base_folder/batch_1/batch_1_1.zip.ZIP.

Do you know how I can fix that?
noci

That requires an absolute path in the find command.

So use find /where/ever/is/the/base_folder ...
instead of find base_folder ...
or:
for i in $( find $PWD/base_folder ! -path '*/extracted*' -a -name '*.zip'   -print ) ; do
   out=$( dirname $i | sed 's/base_name/base_name\/extracted/' )
   [ -d $out ] || sudo mkdir -p $out
   (cd $o ; sudo unzip $i '*' -x '*.txt' '*.xls')
done
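
The reason, as far as I can tell: the subshell changes directory before unzip runs, so a relative path like base_folder/batch_1/batch_1_1.zip no longer resolves from the new working directory. A minimal sketch of the effect in a throwaway directory (the names `elsewhere` and `a.zip` are made up for illustration, and the file is an empty placeholder rather than a real archive):

```shell
# Illustration only: a relative path stops resolving after cd.
# The directory and file names below are hypothetical.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
mkdir -p base_folder/batch_1 elsewhere
: > base_folder/batch_1/a.zip          # empty placeholder, not a real archive

rel=base_folder/batch_1/a.zip
abs=$PWD/base_folder/batch_1/a.zip

# From the starting directory the relative path resolves.
[ -e "$rel" ] && echo "before cd: relative path found"

# Inside the subshell the working directory changes, so only the
# absolute path still points at the file.
rel_after=$(cd elsewhere && { [ -e "$rel" ] && echo found || echo missing; })
abs_after=$(cd elsewhere && { [ -e "$abs" ] && echo found || echo missing; })
echo "after cd: relative=$rel_after absolute=$abs_after"
```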

ASKER
buttonMASTER

OK, almost there. It was able to extract everything, but I don't know where it extracted to. It seems to have extracted the contents of every zip into the same folder. I know this because, as a test, I put copies of the same zip file in the nested folders and was prompted to replace all the files.

Do you know what's going on?
noci

Bummer... there is a typo: $o is used as the output directory instead of $out.

This should show the location:
for i in $( find $PWD/base_folder ! -path '*/extracted*' -a -name '*.zip'   -print ) ; do
   out=$( dirname $i | sed 's/base_name/base_name\/extracted/' )
   [ -d $out ] || sudo mkdir -p $out
   (cd $o ; echo $o)
done

If this only shows blank lines, then the unzip is probably being done in the original directory (from where the commands were started).
Sorry about that.


And this should be the right code:

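for i in $( find "$PWD/base_folder" ! -path '*/extracted*' -a -name '*.zip' -print ) ; do
   # map .../base_folder/sub to .../base_folder/extracted/sub
   out=$( dirname "$i" | sed 's/base_folder/base_folder\/extracted/' )
   [ -d "$out" ] || sudo mkdir -p "$out"
   # extract everything except the excluded patterns; skip this zip if cd fails
   (cd "$out" && sudo unzip "$i" '*' -x '*.txt' '*.xls')
done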
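
One caveat with looping over $( find ... ): the shell splits the output on whitespace, so a path containing a space arrives as several separate words. A small sketch of the difference, using a made-up file name with a space in it; letting find hand the paths over with -exec ... {} + is one common way around it:

```shell
# Illustration: why looping over $(find ...) breaks on names with spaces.
tmp=$(mktemp -d)
: > "$tmp/my batch.zip"                # hypothetical file name containing a space

# Word splitting turns the single path into separate words.
split_count=0
for i in $( find "$tmp" -name '*.zip' -print ); do
    split_count=$((split_count + 1))
done

# Letting find pass each path as its own argument keeps it intact.
exec_count=$(find "$tmp" -name '*.zip' -exec sh -c 'echo "$#"' sh {} +)

echo "for-loop words: $split_count, find -exec arguments: $exec_count"
```

If none of your folder or zip names contain spaces, the for loop is fine as it is.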

ASKER
buttonMASTER

OK, it worked! But one more thing, I'm sorry: I didn't realize my example was incorrect.

How would I modify the code to have the contents of the zip extract to a folder named the same thing as the zip file (minus the extension)?

so:

base_folder/batch_1/batch_1_1.zip -> base_folder/extracted/batch_1/batch_1_1/file_1
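
The mapping asked for here can be sketched with dirname, basename, and parameter expansion. This is only an illustration of the path arithmetic, not the accepted answer; `target_dir` is a made-up helper name, and it assumes the base folder is a prefix of the zip path.

```shell
# target_dir BASE ZIP
# Prints the directory a zip should be extracted into:
#   BASE/<sub/dirs>/<name>.zip  ->  BASE/extracted/<sub/dirs>/<name>
# Hypothetical helper for illustration; BASE must be a prefix of ZIP.
target_dir() {
    base=${1%/}                       # drop one trailing slash, if any
    zip=$2
    rel=${zip#"$base"/}               # path relative to BASE
    name=$(basename "$rel" .zip)      # archive name without the extension
    sub=$(dirname "$rel")             # sub-folders between BASE and the zip
    if [ "$sub" = "." ]; then
        printf '%s\n' "$base/extracted/$name"
    else
        printf '%s\n' "$base/extracted/$sub/$name"
    fi
}

target_dir base_folder base_folder/batch_1/batch_1_1.zip
target_dir base_folder base_folder/big_batch/batch_a/batch_a_2.zip
```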
ASKER CERTIFIED SOLUTION
noci

ASKER
buttonMASTER

It looks like it extracted the files to the folder the zip files were in and not the 'extracted' directory.
noci

That was because $o was empty. cd without any path takes you to the login directory ($HOME) of the account, or nowhere (you stay where you are) if $HOME is empty as well.
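
A quick way to see this for yourself, plus one possible guard against it. This is a sketch under assumptions: `demo_home` is a throwaway directory standing in for the real login directory, and the `${out:?}` check is just one option, not something the earlier scripts used.

```shell
# Illustration: an unquoted empty variable makes `cd $o` act like a bare `cd`.
demo_home=$(mktemp -d)                 # throwaway stand-in for the login directory
o=""                                   # the misspelled name was never assigned

# The empty, unquoted $o vanishes from the command line, so cd goes to $HOME.
landed=$(HOME=$demo_home; cd $o && pwd)
echo "cd \$o landed in: $landed"

# One way to guard against this: ${out:?} makes the subshell stop with an
# error instead of silently running in the wrong directory.
out=""
if (cd "${out:?target directory is not set}" && pwd) >/dev/null 2>&1; then
    guarded="ran anyway"
else
    guarded="stopped"
fi
echo "guarded cd with empty \$out: $guarded"
```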
ASKER
buttonMASTER

Thank you noci. I was able to fix it by changing:

out=$( dirname "$i" )/$( basename "$i" .zip | sed 's/base_folder/base_folder\/extracted/' )

to:

out=$( dirname "$i" | sed 's/base_folder/base_folder\/extracted/' )/$( basename "$i" .zip )
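
For anyone landing here later, this is a sketch that pulls the pieces of this thread together in one function. It is not the accepted answer: `extract_all` is a made-up name, it assumes Info-ZIP unzip is installed, it drops sudo, and it adds -o so a re-run overwrites without prompting (remove -o if you would rather be asked).

```shell
# extract_all BASE
# Extracts every zip under BASE into BASE/extracted/<sub/dirs>/<zip name>/,
# skipping members that match the excluded patterns.
# Sketch only: extract_all is a hypothetical name, and Info-ZIP unzip is assumed.
extract_all() {
    base=$(cd "$1" && pwd) || return 1     # absolute path of the base folder
    # Skip the extracted/ tree, then hand every zip to a small inline script.
    find "$base" -path "$base/extracted" -prune -o -type f -name '*.zip' \
        -exec sh -c '
            base=$1; shift
            for zip in "$@"; do
                rel=${zip#"$base"/}              # path relative to the base folder
                out=$base/extracted/${rel%.zip}  # one folder per zip, named after it
                mkdir -p "$out" || exit 1
                # -x lists the patterns to leave out; add more as needed
                unzip -q -o "$zip" -x "*.txt" "*.xls" -d "$out" ||
                    echo "unzip reported a problem with $zip" >&2
            done
        ' sh "$base" {} +
}
```

Usage would be `extract_all base_folder`, run from the directory that contains base_folder. Passing the paths through find -exec keeps names with spaces intact.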