Search for files by Content

Posted on 2004-11-09
Last Modified: 2010-04-11
Hi all,

I work for a University, and we have frequent problems with students downloading illegal music files.  When we get a DMCA complaint, we slap the student on the wrist and revoke their internet access until we can verify that all illegal content is removed from their machine.  It's kind of hopeless, given the infinite ways to save their content before we come to inspect their system, but hey, what can you do.

So here's my question:

I noticed that MP3 files begin with the two characters ÿû (bytes 0xFF 0xFB; at least those that I've inspected) when viewed in Notepad.  Is there a way to search all files on the system to see if any begin with those 2 characters?  The idea is to find ILLEGAL_SONG.MP3 that has been renamed to HARMLESS_PHOTO.JPG.

Any raw ASCII/binary search utility that can do this for me....  or suggestions on some code I could compile (I imagine it would be a lot)?

Question by:mistagitar
    LVL 87

    Expert Comment

    I'd set up the firewall so that P2P software can't get data from the internet. Also make sure plain users can't install software (use Group Policy if you're using a Win2k server environment); that will prevent them from installing P2P software.
    LVL 51

    Expert Comment

    Best you go to any Unix (better, Linux) system and look at the /etc/magic file;
    there are dozens of MPx file formats defined there.
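
    As a rough illustration of the idea behind a magic file: it maps byte signatures at known offsets to file types. The Python sketch below uses a tiny hand-written table of well-known signatures; the table and the `identify` helper are hypothetical stand-ins for illustration, not entries copied from /etc/magic.

```python
# (offset, signature bytes, description)
# A tiny, hand-written stand-in for magic entries; not taken from /etc/magic.
SIGNATURES = [
    (0, b"\xff\xd8\xff", "JPEG image"),
    (0, b"\x89PNG\r\n\x1a\n", "PNG image"),
    (0, b"BM", "BMP image"),
    (0, b"ID3", "MP3 audio with ID3v2 tag"),
    (0, b"\xff\xfb", "MPEG-1 Layer III audio frame"),
]


def identify(data):
    """Return the description of the first signature matching the given bytes."""
    for offset, signature, description in SIGNATURES:
        if data[offset:offset + len(signature)] == signature:
            return description
    return "unknown"
```

    Feeding `identify` the first few bytes of a file gives a guess at its real type, whatever its extension says. A real magic file holds far more entries and richer rules than this.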
    LVL 18

    Expert Comment

    Well, in a University environment, in most cases, you can't keep users (the students) from installing software or being admins on their machines because the students own the machines. At least this is the case in the University I work for :)

    I would research the firewall option, as you have more control over your network and what goes through your network than you have over the students' actual computers.
    LVL 51

    Expert Comment

    a (traditional) firewall does not stop any downloads
    LVL 87

    Expert Comment

    If the PCs are the personal property of those students, I'd say it is their responsibility what they do with them and whether they download illegal content or not, not yours.
    LVL 87

    Expert Comment

    ahoffmann, true, but you can prevent P2P software from getting active, and that is where most "illegal" mp3 files are getting downloaded from (websites that provide illegal mp3s won't last long as they can be traced easily, but on a true p2p network most users aren't even aware they are providing files for download, it is also more difficult to get hold of the providers as many users share the same file).
    LVL 51

    Expert Comment

    .. and how many do not know how to tunnel p2p over http or https?

    Author Comment

    Thanks for all the replies, but prevention is not my worry.

    We don't like to close ports (and that's not my department anyway)...  we know students download from P2P.  BitTorrent accounts for over 60% of our inbound pipe traffic!  We know about it but we let the students do it at their own risk.  We then get about 30-40 DMCA (Digital Millennium Copyright Act) violation complaints per year.  Fortunately, no students have been prosecuted.

    I just have to go in and make sure they're all clean.  Like I said, 95% of them probably burn their music before deleting it, but we have to at least look like we're making the effort.

    I'm interested in ahoffmann's comment about "the /etc/magic file."  Is this a map of common file types or something?  We use CD-bootable Knoppix for data retrieval on systems with a botched OS (Knoppix can read but not write to NTFS partitions), so that could be a solution somehow....

    LVL 51

    Accepted Solution

    Boot Knoppix, then mount your Windows partition(s) and run something like:

        find /mounted_windoze_partition -type f -exec file {} \; |grep -i mp

    be prepared for a huge amount of data when waiting long enough ;-)
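
    If running `file` over every file is too slow or too noisy, the same check can be narrowed to the leading bytes the question mentions. Here is a minimal Python sketch, assuming an MP3 starts either with an ID3v2 tag or with an MPEG Layer III frame sync; the prefix list is an assumption and not exhaustive, and the function names are made up for illustration.

```python
import os

# Assumed leading bytes of MP3 files (not exhaustive): an ID3v2 tag, or an
# MPEG Layer III frame sync. b"\xff\xfb" is the "ÿû" seen in Notepad.
MP3_PREFIXES = (b"ID3", b"\xff\xfb", b"\xff\xfa", b"\xff\xf3", b"\xff\xf2")
AUDIO_EXTENSIONS = {".mp3"}


def looks_like_mp3(path):
    """Return True if the file's first bytes match one of MP3_PREFIXES."""
    try:
        with open(path, "rb") as handle:
            head = handle.read(3)
    except OSError:
        return False  # unreadable file (locked, permission denied): skip it
    return head.startswith(MP3_PREFIXES)


def find_disguised_mp3s(root):
    """Yield paths under root that look like MP3 data but lack an .mp3 extension."""
    for folder, _subdirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            extension = os.path.splitext(name)[1].lower()
            if extension not in AUDIO_EXTENSIONS and looks_like_mp3(path):
                yield path
```

    Looping over `find_disguised_mp3s(...)` with the mounted partition as the root and printing each path gives a list of candidates. A two- or three-byte match is only a hint, so expect some false positives and inspect each hit by hand.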
    LVL 3

    Assisted Solution

    I don't know how to search for ASCII codes in files that have been renamed or might have been. But there is a faster way to do things if you suspect this is going on and there is no other solution for you.

    The main files one would rename an MP3 to would be doc, bmp, avi, asf, mpg, mpeg: anything whose usual size makes the renamed file look legit. You would be stupid to rename it as a jpg, seeing as most MP3s are much bigger than a typical jpg. If you suspect this is happening, then he/she is probably not stupid enough to rename it to a jpg file.


    Click Start > Search > For Files or Folders

    Click pictures, music or video

    Tick the box for pictures and photos

    Click advanced search options

    Scroll down to find "what size is it"

    Click specify size (in kB)

    Select "at least" from the drop down menu

    Enter this amount 2,700

    Don't enter a name to search for, leave it blank, and click search

    Once you have finished searching, right click in that window and click View > Thumbnails. If any file is not showing a picture, then check its file size by hovering your mouse over it, if that option is enabled. If too many files come up, do it this way:

    Right click in that window again and select View > Details

    Then right click again and select Arrange Icons By > Size

    Click the Size column header at the top to order everything by size, then work from the biggest files to the smallest. Songs are usually under around 20 megs; if anything is over this and is a jpg, bmp or png, then it's suspect video or something else, so investigate it.

    If any files come up as jpg then you can pretty much suspect that the file is not a jpg and has been renamed, is a corrupted file, or is even an unfinished download.

    Find all the files that are over the size limit you specified, open a music player that plays MP3s, drop all those files into it and play each one. You don't need to rename a file; it will play if it's a song. If it doesn't play, then it's not a song, or you may need the right codec or extension for that matter; it might be a wav file or something like this.

    BMPs are harder to detect, since BMPs are normally over 2 megs and an MP3 can be as small as this too. But do the same thing for everything you find.

    Do the same thing for the music and video options in the search, doing each one separately, not all at once.

    I'm sure you know what I mean now: we're searching for files whose size is suspect compared to what is normal for their file type, so we don't have to search all files and open each one to find the ASCII code that verifies it as an MP3.
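
    The size check described above could also be scripted instead of clicked through. A rough Python sketch, assuming the same 2,700 kB threshold and a short list of image extensions; the extension list and the `oversized_images` name are illustrative choices, not part of the Windows search dialog.

```python
import os

# Extensions treated as "pictures" for this check (an assumed, short list).
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".bmp", ".png"}
# The 2,700 kB threshold from the search steps, expressed in bytes.
MIN_SIZE_BYTES = 2700 * 1024


def oversized_images(root, min_size=MIN_SIZE_BYTES):
    """Return (size, path) pairs for image-named files of at least min_size bytes.

    The list is sorted largest first, matching the "order by size" step.
    """
    hits = []
    for folder, _subdirs, names in os.walk(root):
        for name in names:
            if os.path.splitext(name)[1].lower() not in IMAGE_EXTENSIONS:
                continue
            path = os.path.join(folder, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # file vanished or is unreadable: skip it
            if size >= min_size:
                hits.append((size, path))
    hits.sort(reverse=True)
    return hits
```

    Each hit still has to be checked by hand, for example by dropping it into a music player as described above.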

    Good Luck
