grep .tar.Z file

OK, I have a .tar.Z file and I want, without untarring it, to grep its contents and find out which file within the tar each match came from, if possible.

I know I can do

zcat x.tar.Z | grep xyz

However, that does not tell you which file a match is from. Essentially I want output similar to what you would get if you were grepping a lot of files,

i.e. filename: line

all help appreciated

Gns Commented:
Sure mmajere, but that would be even worse than the perl thing I'm talking about;-).
First you'd need a pass to get the files to loop over, then a pass _per file_ for the grep.... On a large tar archive this would turn horrendous --- pretty fast:-).
At least for a generic solution...
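For reference, a minimal sketch of the per-file approach described above, mostly to make the cost visible: one pass to list the members, then a full decompress-and-scan pass for every member. The function name targrep_members is made up for illustration. It assumes gzip (which also reads compress'ed .Z files), a tar that supports -O (extract to stdout), and a grep that supports --label (GNU grep does).

```shell
# targrep_members PATTERN ARCHIVE   (hypothetical helper, not a standard tool)
# Prints "member:line" for every matching line in a compressed tar archive.
targrep_members() {
    # Pass 1: list the member names.
    gzip -dc "$2" | tar -tf - | while IFS= read -r member; do
        case $member in
            */) continue ;;   # skip directory entries
        esac
        # Pass 2, repeated per member: decompress the whole archive again,
        # send just this member to stdout, and let grep label the output.
        # A member with no match is not an error, hence "|| true".
        gzip -dc "$2" | tar -xOf - "$member" |
            grep -H --label="$member" -- "$1" || true
    done
}
```

Usage would be something like: targrep_members xyz x.tar.Z. Every member costs another full read of the archive, which is exactly why this gets slow on large tarballs.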

-- Glenn
gingermeatboy (Author) Commented:
or is this even possible?
AFAICS the answer is "no" using standard tools.
To do it you'd need to implement a tar/grep hybrid... which would probably be more work than is reasonable.
Are we to assume the tar file(s) are huge, so that unpacking it/them to a "scratchpad area" would be less than feasible?
Or are they created with an absolute path to the files, so that unpacking 'em would risk overwriting the (more modern) file?
If the latter, you could create a "scratchpad" somewhere (just a directory really), copy (or possibly link) the file there, and then chroot to that place and unpack... The downside is that you'd need to set up the chroot jail...
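A rough sketch of the scratchpad idea, without the chroot step. The function name targrep_scratch is made up for illustration. It assumes there is enough free space for the unpacked files, that gzip is available (it also reads .Z files), and that the tar in use is GNU tar, which strips a leading "/" from member names on extraction unless -P is given, so absolute paths end up under the scratch directory rather than on top of the live files. With a tar that does not do this, the chroot is still needed.

```shell
# targrep_scratch PATTERN ARCHIVE   (hypothetical helper, not a standard tool)
# Unpacks the archive into a throwaway directory, greps it, then cleans up.
targrep_scratch() {
    scratch=$(mktemp -d) || return 1
    # Decompress to stdout and unpack under the scratch directory.
    gzip -dc "$2" | tar -xf - -C "$scratch"
    # Grep from inside the scratch directory so the reported names are
    # relative to the archive root (they come out prefixed with "./").
    (cd "$scratch" && grep -rH -- "$1" .)
    status=$?
    rm -rf "$scratch"
    return $status
}
```

Usage would be something like: targrep_scratch xyz x.tar.Z, which gives the usual filename:line output.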

-- Glenn

Mind you, setting up a chroot jail is most often rather easy... Just a question of copying files in this particular case, since it'd be to "overcome" tar shortcomings, not for security.

-- Glenn
Which Unix?  And do you know what the contents of the tarfile are?

To display a list of the files plus the matching string, try:
zcat x.tar.Z | strings | egrep "/path/to/files|string_you_want"  

It's not foolproof: if the path appears as a string inside one of the files, you'll get a false positive. The alternative would be to write a utility that understands tar format and can send output to the screen instead of extracting it to disk.

... which is no better than Tim's suggestion, Achim. Probably worse, since you lack the "strings";)

-- Glenn
> ..  and I want, without untarring it ..
that's in the question.
That's not the problem, though; the problem is grepping inside tarballs, and I see now that zgrep can't do it either, sigh.
Actually, playing about with this, the "strings" is useless as it can hide the filename; I say Glenn is right - It can't be done without something that understands tar format.
Yep. The sorry state of things:-).

I might add that this is not the first time this type of discussion has been on my plate... Finding the space to untar to was the solution back in '90 too:-).

The only "targrep" one can find is this very tantalizing mail (I'm quoting since the link found in google doesn't work... This is the cached result:)

DATE: 10/16/1997 23:40:10
SUBJECT:  targrep



i recently needed to find a utility to run grep on a tar file,
i.e. report which files in a tar archive contain a regexp.

a web search revealed nothing; and the current gnu tar and
grep did not have such an option as i was hoping.  so i
decided to compile together gnu grep and gnu tar into a
single binary, hacking as necessary, to do what i needed.

it's actually not too bad.  i now have a util `targrep`
with the syntax

  targrep -q [-Q grep_options] [tar_options] <pattern> [file(s)]

since tar -g was already taken, i used -q for `query`.
the -Q flag will pass single-char argumentless options
to the grep option parser, e.g.,

  targrep -q -Qliw -zf file.tar.gz notdef '*.[ch]'

would search the *.[ch] files in a gzip'd tar file for
case-insensitive instances of the word `notdef`, and list
those files that had at least one match.

i was wondering if this might be useful to include in the
next release of gnu tar.

CLIFF MILLER (<EMAIL: PROTECTED>)  908-582-6450.  Office MH 2C-415.

      Dreams are just compressed tar files of daily life.

From bug-gnu-utils-request  Fri Oct 17 01:43:14 1997


A look at the man page for GNU tar shows that the answer was "no".
GNU tar has one nice feature though, the "extract to stdout" flag (-O). It makes it possible to search (though not really grep, since you can't expect grep to report the filename... perhaps a perl hack:-) like this:
tar xvzOf filename.tgz 2>&1 |less -e
Then just use the pager's / search function... When you find what you need, back up to the nearest preceding filename line and there you are:-).
As said, probably a good candidate for a perlhack:-).

-- Glenn
I'd vote for perl too. Would have tried it, but I'm lacking tar knowledge :-|
targrep sounds good, I need it too (as well as ztargrep:)
Unfortunately I'm a bit short of time, but one could well imagine a "two-pass" approach with perl and GNU tar (the -O option), which would cost a little run time but need next to no knowledge of the tar file format. On the first pass you read the possible filenames into an array (and discard stdout); on the second pass you 2>&1 and /regex/ through the combined output, noting where you "switch files". Should be fairly easy:-):-) Not exactly foolproof though;-)
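The two-pass idea depends on spotting filenames in tar's verbose output, which is the part that isn't foolproof. Where the GNU tar at hand is new enough to have the --to-command option, a single pass may be enough: tar pipes each regular member to a command and sets TAR_FILENAME in that command's environment. The sketch below rests on that assumption, plus a grep with --label (GNU grep has it). The function name targrep_stream and the TARGREP_PATTERN variable are made up for illustration.

```shell
# targrep_stream PATTERN ARCHIVE   (hypothetical helper, not a standard tool)
# Single pass: tar hands each regular member to grep on stdin.
targrep_stream() {
    # The pattern travels to the child command through the environment,
    # which avoids quoting it a second time inside --to-command.
    # grep exits non-zero for members without a match; "|| true" keeps
    # tar from reporting those as child failures.
    gzip -dc "$2" |
        TARGREP_PATTERN=$1 tar -xf - \
            --to-command='grep -H --label="$TAR_FILENAME" -- "$TARGREP_PATTERN" || true'
}
```

Usage would be something like: targrep_stream xyz x.tar.Z. Nothing is written to disk, and the archive is decompressed only once.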

-- Glenn
I may be way off base here, but I use GNU tar version 1.13.25 and I can grep a tar.gz file by simply doing this:

tar tzvf {tar file name} | grep {string}  

Would this not also work for you if you have a tar that accepted the z parameter for gzip'ed files?

From there, if you were looking simply for the filename, you could parse it out using cut or awk.
Expanding on the aforementioned answer, you could wrap it in a for loop coupled with an if/then statement that would record the tar.gz file in a flat file if grep returned positive results.
Hi gingermeatboy

I see you haven't gotten your answer yet so try this on for size.

First your x.tar.Z file is a COMPRESSED file that was compressed using the UNIX 'compress' command. The 'compress' command automatically attaches a .Z (upper case Z) to the end of the compressed file.

Soooo... the first thing that you need to do is to UN-COMPRESS your file. This is done with the corresponding 'uncompress' UNIX command.

Example: # uncompress x.tar.Z

This will uncompress the file and it will be saved as 'x.tar'

Now you can dump the 'Table Of Contents' of the x.tar file without actually UN-Tarring the file. You can also PIPE (|) the output to grep and look for your 'XYZ' file.

Example: tar -tvf x.tar | grep -i xyz

where -t = dump table of contents
      -v = verbose
      -f = the tar file

Also, the -i in grep tells grep to ignore case.

Be aware that this will dump the Table of Contents only, it will not give you access to the file 'internals'
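If searching the member names is all that is needed, the uncompress step can probably be skipped by streaming the archive instead, which leaves x.tar.Z untouched on disk. A small sketch, assuming gzip is installed (gzip -dc also reads compress'ed .Z files); the function name tarlist_grep is made up for illustration, and it uses -t without -v so that only the names are printed.

```shell
# tarlist_grep PATTERN ARCHIVE   (hypothetical helper, not a standard tool)
# Searches the table of contents only, never the file contents.
tarlist_grep() {
    # Decompress to stdout, list member names, filter case-insensitively.
    gzip -dc "$2" | tar -tf - | grep -i -- "$1"
}
```

Usage would be something like: tarlist_grep xyz x.tar.Z.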

Maybe this will help

gingermeatboy (Author) Commented:
Thanks for all your help, everyone. Unfortunately there was no "easy" way round it as far as I could see either. The way Gns described was how I ended up implementing it, so I gave them the points, even though everyone made valid, helpful points.

Thanks again

Question has a verified solution.
