Searching within files and compressed archives for a text string

Posted on 2006-04-07
Last Modified: 2010-04-25
I would like to search all the files stored on a Mac (from a given base folder and all its subfolders) for a specific text string.  I think the information that I am looking for was stored in a compressed archive, such as one created by StuffIt.

I am not sure of what programs created the files stored within the archive.  At this point, what I would like to do is have the program treat the files as a binary object and then just search through them for the target string.

I looked at the manual for StuffIt (now owned by Smith Micro) and did not see an option to look within the body of a file for specific text.  I only saw an option to search within file names, and that will not solve my problem.

What I would like as an answer would be:
1.  Name of the program and a link to the website where I can obtain the program.
2.  Link to review(s) or article discussing the program's text search capabilities.
3.  Any comments as to your actual experience with the program.

I am also interested in PCs, but the focus of this problem is the Mac.

Thanks for your help.
Question by:flindgren

    Assisted Solution

    You can do this on a Mac running OS X with a built-in system command called grep, from the terminal. Do the following:

    1) Open Terminal, found in the Utilities directory (inside Applications).
    2) In Terminal, change into the directory from which you want to do the search: cd directoryname
    3) Execute the following command: grep -R -e "textToSearchFor" *
    4) Let the computer do its thing. If you have a lot of large or compressed files this will be a very processor-intensive activity, as it reads every single file. It will most likely bog down your computer while it runs, but it will free up once the search finishes. grep is the standard tool for this kind of search.
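
As a rough sketch of the steps above, the search can be wrapped in a small shell function that prints only the names of matching files, which is usually what you want when some of the hits are in binary files. The function name find_string is made up for illustration; the grep options (-r, -l, -F, -e) are standard.

```shell
#!/bin/sh
# find_string DIR TEXT: list every file under DIR whose raw bytes contain TEXT.
#   -r  recurse into subfolders
#   -l  print only the names of matching files (works for binary files too)
#   -F  treat TEXT as a literal string, not a regular expression
# find_string is a hypothetical helper name, not a system command.
find_string() {
    grep -r -l -F -e "$2" "$1"
}
```

For example, `find_string ~/Documents "textToSearchFor"` would list the matching files under Documents. Note that this only sees the bytes as stored on disk, so text inside a compressed archive will normally not match.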

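    Several tips: If the search path has any large binary files, such as .dmg or .img disk images or .mp3 audio files, I recommend temporarily moving them or executing the command as follows (each pattern needs its own --exclude option, quoted so the shell does not expand it):

    grep -R --exclude='*.dmg' --exclude='*.img' --exclude='*.mp3' -e "textToSearchFor" *

    This will exclude files with those endings from the search.

    Alternatively, you could look only into .hqx, .sitx or .sit files, in the current directory and all its subfolders:

    grep -R --include='*.hqx' --include='*.sitx' --include='*.sit' -e "textToSearchFor" .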

    For information on this command, from the terminal issue the command: man grep

    This will give you the man page, which is the most thorough and authoritative information on the command. As for reviews, it's a Unix tool used every single day by Unix/Mac professionals.

    Let me know if you have any more questions.

    Expert Comment

    If you would like a GUI, there are several apps that claim to provide one.
    I haven't tried them myself, so I cannot compare them or give you a recommendation,
    but you might look at:
    Good luck!

    Author Comment

    This is a request for clarification. I will try your suggestion as soon as I can get access to the computer.
    I understand the reference to man grep, and will take a look at the manual pages at that time.

    However, in the meantime, if you could respond to the following I would appreciate it.

    Does the Mac version of grep "understand" the StuffIt file format?
    By this I am asking whether this version of grep will just look for the text string in the binary StuffIt file, or whether it will,
    in effect, "unstuff" the archive and then search through the files in the archive as though they were not
    in an archive.

    If grep is just looking within the binary of the archive without being aware of the storage format,
    then it may not find the string since the archive is going to compress its contents and what was "plain text"
    will be transformed by the compression process.
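
The concern above can be checked with a small experiment. This is only a sketch, and it uses gzip as a stand-in for StuffIt (an assumption, since the thread does not say which compressor was used). The helper name compare_hits is made up for illustration.

```shell
#!/bin/sh
# compare_hits TEXT FILE.gz prints two counts of matching lines:
#   raw          = matches in the compressed bytes as stored on disk
#   decompressed = matches after the data is decompressed
# gzip stands in for StuffIt here (assumption); compare_hits is a
# hypothetical helper name.
compare_hits() {
    # -a makes grep treat the binary archive as text, -c counts matching lines
    raw=$(grep -a -c -F -e "$1" "$2")
    plain=$(gzip -dc "$2" | grep -c -F -e "$1")
    echo "raw=$raw decompressed=$plain"
}
```

For a gzip file made from text that contains the string, I would expect the raw count to be 0 and the decompressed count to be non-zero, which is the behaviour the question anticipates. A file name stored in an archive header is not compressed, so a search term that happens to appear in a file name can still produce a raw hit.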



    Accepted Solution

    You'll have to poke around if you don't know how the text is compressed. There are
    versions of grep for compressed files, such as zgrep, bzgrep and zipgrep; they are listed in the man pages on the Mac.

    Another option is to use pipes: uncompress the files and automatically feed the output into grep.
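
A minimal sketch of the pipe approach, assuming the files are gzip, bzip2 or zip compressed (StuffIt archives would need a different decompressor, which is not covered here). The function name grep_compressed is made up for illustration; gzip -dc, bzip2 -dc and unzip -p all write the decompressed data to standard output and leave the original file untouched.

```shell
#!/bin/sh
# grep_compressed TEXT FILE: decompress FILE to a pipe based on its
# extension and search the resulting stream for the literal string TEXT.
# grep_compressed is a hypothetical helper name.
grep_compressed() {
    case "$2" in
        *.gz)  gzip -dc "$2" ;;    # gzip: decompress to stdout
        *.bz2) bzip2 -dc "$2" ;;   # bzip2: decompress to stdout
        *.zip) unzip -p "$2" ;;    # zip: extract all entries to stdout
        *)     cat "$2" ;;         # anything else is searched as-is
    esac | grep -F -e "$1"
}
```

This is roughly what zgrep, bzgrep and zipgrep do for you; writing it out just makes the pipe visible.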

    Hope this helps!

    Expert Comment

    StuffIt's compression algorithm was written by Raymond Lau back in the 1980s, and I haven't found any public information about it. I suspect that there aren't any "free" tools that understand it explicitly, given that it is commercial software and has been since its inception. grep can sometimes report a hit in a .zip file (for instance when an entry is stored without compression), so it may be able to find the string in the StuffIt archive.

    I can't create a .sit or .sitx archive with a known test string to find out whether grep can understand it. I would try it and see what happens. At best it will tell you it has a hit inside the binary file. At worst it won't see it. If the latter is the case, you could write a Perl script or an AppleScript to expand every archive into a directory and then run grep on that, or, as in the previous suggestion, pipe things from the command line. However, having the files in a directory will give you ready access when it does find a match. The pipe solution may require a little extra scripting magic to make sure you know which archive the matching file came from.
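
The "extra scripting magic" for the pipe solution could look something like the sketch below. It assumes gzip files rather than StuffIt archives, and the function name list_matching_archives is made up for illustration. The idea is simply to loop over the archives one at a time, so the script always knows which archive it is currently reading.

```shell
#!/bin/sh
# list_matching_archives DIR TEXT: print the path of every .gz file under
# DIR whose decompressed contents contain the literal string TEXT.
# list_matching_archives is a hypothetical helper name; gzip is an
# assumption standing in for whatever compressor was actually used.
list_matching_archives() {
    find "$1" -type f -name '*.gz' | while IFS= read -r archive; do
        # grep -q prints nothing and only reports success or failure,
        # so the archive name is printed by the loop instead.
        if gzip -dc "$archive" 2>/dev/null | grep -q -F -e "$2"; then
            printf '%s\n' "$archive"
        fi
    done
}
```

Something like `list_matching_archives . "textToSearchFor"` would then print one archive path per line.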



    Author Comment

    Thank you both for your advice.  I wish there were a solution or script that took care of looking in the archives.  The files are not on my computer and I don't have a Mac, so I can't test and develop a script that could safely uncompress a file (that might contain subfolders), scan the contents for my string, and recurse through all the files and subfolders that exist, while doing the necessary housekeeping along the way in managing its temp folder.
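
For what it is worth, the kind of script described above might be sketched as follows. It is written for gzip-compressed tar archives, not StuffIt; for .sit or .sitx files the tar line would have to be replaced by a StuffIt-capable command-line extractor, if one is available (an assumption, nothing in this thread confirms one). The function name search_tarballs is made up for illustration.

```shell
#!/bin/sh
# search_tarballs DIR TEXT: for every .tar.gz or .tgz file under DIR,
# unpack it into a private scratch folder, search the unpacked files
# (including subfolders) for the literal string TEXT, then delete the
# scratch folder. Output lines look like "<archive>: <path inside archive>".
# search_tarballs is a hypothetical helper name.
search_tarballs() {
    find "$1" -type f \( -name '*.tar.gz' -o -name '*.tgz' \) |
    while IFS= read -r archive; do
        # One fresh scratch folder per archive; skip the archive if
        # the folder cannot be created.
        scratch=$(mktemp -d) || continue
        if tar -xzf "$archive" -C "$scratch" 2>/dev/null; then
            # Run grep from inside the scratch folder so the reported
            # paths are relative to the root of the archive.
            (cd "$scratch" && grep -r -l -F -e "$2" .) |
            while IFS= read -r inner; do
                printf '%s: %s\n' "$archive" "${inner#./}"
            done
        fi
        # Housekeeping: remove the unpacked copy before moving on.
        rm -rf "$scratch"
    done
}
```

The originals are never modified, since everything is unpacked into a temporary folder that is removed after each archive. Archives nested inside other archives are not expanded by this sketch.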

    As a separate question would either of you be interested in writing such a script ?  If so, I will post such a question in this same area.

    Thanks for your help.

    Author Comment

    I intended the Accepted answer to be by baus and the assistance by mrocek.

    Sorry, I made some sort of error in closing the question.


    Expert Comment

    Is there a way to transfer some points to baus? I would be happy to transfer 300
    points to make it the way flindgren intended.

