pimpp1184

Please Help!! Reading a directory and then finding duplicate files in the directory

Hi. I have to make a program that reads a directory on the computer and then displays all the files in that directory and all its subdirectories. I got that part to work fine, I think. The part I can't get to work is the default directory, such as C:\robocode. For example, when the program is compiled from the command prompt, I have to compile it like this: java ListAll C:\mysharedfolder. If the user doesn't specify any directory, the program should automatically use the default and show the files in that directory. Thanks for your help.

Code ... which I have so far:
import java.io.*;

class ListAll
{
      private void listDirectory(String dir)
      {
            File currentDirectory = new File(dir);
            BufferedReader input = new BufferedReader
                                             (new InputStreamReader(System.in));
            String[] listing = currentDirectory.list();
            for ( int k = 0; k < listing.length; k++ )
            {
                  if (new File(dir+File.separator+listing[k]).isDirectory())
                  {
                        listDirectory(dir+File.separator+listing[k]);
                  }
                  System.out.println( dir+File.separator+listing[k] );
            }
      }

      public static void main(String args[]) throws IOException
      {
            new ListAll().listDirectory("c:/robocode");
      }
}

jimmack

If I've understood you correctly, the parameter (String[] args) in your main method is the one you need to look at.  It contains an array of all the strings that follow the text "java ListAll" on your command line.

(I think you mean "execute" when you say "compile").

In your main() method, you can do this:


public static void main(String[] args)
{
    String filename;
    if (args.length() == 1)
    {
        filename = args[0];
    }
    else
    {
        filename = "c:/robocode";
    }

    new ListAll().listDirectory(filename);
}
Sorry the test for args.length shouldn't have any parenthesis:

ie.

    args.length

not
    args.length()
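
For illustration, here is a minimal sketch of the corrected check pulled out into a helper. The class name ArgsDefault and the method pickDirectory are hypothetical (they are not part of the assignment), and the default path is simply the one from the question. Arrays expose length as a field, while String has a length() method, which is an easy pair to mix up.

```java
// Hypothetical helper class, not part of the asker's assignment.
class ArgsDefault
{
    // Returns the first command-line argument, or the fallback when none is given.
    static String pickDirectory(String[] args, String fallback)
    {
        // Arrays have a 'length' field (no parentheses); String has a length() method.
        if (args.length >= 1 && args[0].length() > 0)
        {
            return args[0];
        }
        return fallback;
    }

    public static void main(String[] args)
    {
        System.out.println(pickDirectory(args, "c:/robocode"));
    }
}
```

Note that this sketch accepts one or more arguments and ignores the extras, whereas the version above falls back to the default unless exactly one argument is given. Either choice is reasonable.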

ASKER

Thanks for the quick answer. Now another question (or two). For this program, I have to write two classes. The first one is TestDups.java; this one controls the flow of processing. It gets the files from the directory and controls all the printing on the screen. The other class is called FindDups.java. This class recursively gathers files, creating info objects, adds the info objects to a collection, contains a method to calculate the CRC of the current file object, and then returns the collection of info to TestDups.java. In this class, we are also supposed to have an info method to hold information such as the file objects, and the last method in this class is a compare method which will compare the files in the directory for any duplicate files. The main purpose of this program is to get the files, compare them, and print the names of the duplicate files back to the user. The part I can't get now is how to pass the directory information into the FindDups class, store those files in info objects, and add them to a collection. After that I am sure I can do everything else. Thanks again for your help.

TestDups.java
import java.io.*;

class TestDups
{
      private void listDirectory(String dir)
      {
            File currentDirectory = new File(dir);

            String[] listing = currentDirectory.list();
            for ( int k = 0; k < listing.length; k++ )
            {
                  if (new File(dir+File.separator+listing[k]).isDirectory())
                  {
                        listDirectory(dir+File.separator+listing[k]);
                  }
                  System.out.println( dir+File.separator+listing[k] );
            }
      }

      public static void main(String args[]) throws IOException
      {
                String filename;
                if (args.length == 1)
                {
                    filename = args[0];
                }
                else
                {
                    filename = "c:/robocode";
                }

                new TestDups().listDirectory(filename);
      }
}

FindDups.java
import java.io.*;

public class FindDups
{
      public String processInput()
      {
            String inputLine, outputLine;
            FindDups kkp = new FindDups();
            outputLine = kkp.listDirectory(null);
            out.println(outputLine);
      }
}

Thanks again for all your help
Can anyone help me up on this? Thanks
Well, it looks like you're going to need another class:

import java.util.*;

public class DirInfo
{
    private String path;
    private ArrayList fileList;

    public DirInfo(String dirPath, ArrayList files)
    {
        path = dirPath;
        fileList = files;
    }

    public String getPath()
    {
        return (path);
    }
 
    public ArrayList getFileList()
    {
        return (fileList);
    }
}

You need to create another ArrayList (dirListings) in the TestDups class to store these DirInfo objects and create them within the listDirectory() method:

     private void listDirectory(String dir)
     {
          File currentDirectory = new File(dir);

          String[] listing = currentDirectory.list();
          dirListings.add(new DirInfo(dir, Arrays.asList(listing)));

At the end of this method create a new FindDups object and call the processInput method including the dirListings as a parameter:

          new FindDups().processInput(dirListings);

You'll need to update FindDups to handle this, but then inside the processInput method, you'll have an ArrayList that contains DirInfo objects.  Each DirInfo object contains the path to the directory and the list of files in it.

Hope that helps.
Thanks for the quick reply. I can't get the object part to work, where I create a new FindDups object and call processInput. I am not exactly sure how to do this. Thanks
This is what I have:
TestDups:
import java.io.*;

class TestDups
{
      private void listDirectory(String dir)
      {
            File currentDirectory = new File(dir);

            String[] listing = currentDirectory.list();
            dirListings.add(new Info(dir, Arrays.asList(listing)));

            for ( int k = 0; k < listing.length; k++ )
            {
                  if (new File(dir+File.separator+listing[k]).isDirectory())
                  {
                        listDirectory(dir+File.separator+listing[k]);
                  }
                  System.out.println( dir+File.separator+listing[k] );
            }

                  String outputLine;
                  FindDups fDups = new FindDups();
                  outputLine = fDups.processInput(drListings);
                  out.println(outputLine);
      }


      public static void main(String args[]) throws IOException
      {
                String filename;
                if (args.length == 1)
                {
                    filename = args[0];
                }
                else
                {
                    filename = "c:/public";
                }

                new TestDups().listDirectory(filename);
      }


}

OK.  So you've avoided the need for the DirInfo object by storing the path with each file.  That's fine ;-)

If you only want to compare files in a single directory (and not files in different directories), then you just need to pass the current "listing" array into processInput (not drListings, which could be removed).

On the other hand, if you *do* want to compare files in different directories, then you need all the strings.  In this case, all you are missing is the definition of drListings.  Add this to the top of the file:

class TestDups
{
     private ArrayList drListings;

     private void listDirectory(String dir)
     {
          File currentDirectory = new File(dir);


You also need to move the following line so that it is *after* the for loop (otherwise you are just adding an empty array):

          drListings.add(new Info(dir, Arrays.asList(listing)));

Inside the FindDups.processInput() method, you need to extract the data like this (replace drListings with whatever you've named your parameter):

    Iterator it = drListings.iterator();
    List fileList = null;
    String[] files = null;
    while (it.hasNext())
    {
        fileList = (List)it.next();
        files = (String[])fileList.toArray(new String[0]);
    }

I've put both the List and String[] options in here, but you need to decide which one you want to use.
Alright this is what I got:

TestDups:
import java.io.*;
import java.util.*;

class TestDups
{
      private ArrayList drListings;

      private void listDirectory(String dir)
      {
            File currentDirectory = new File(dir);

            String[] listing = currentDirectory.list();

            for ( int k = 0; k < listing.length; k++ )
            {
                  if (new File(dir+File.separator+listing[k]).isDirectory())
                  {
                        listDirectory(dir+File.separator+listing[k]);
                  }
                  System.out.println( dir+File.separator+listing[k] );
            }

            drListings.add(new Info(dir, Arrays.asList(listing)));

            String outputLine;
            FindDups fDups = new FindDups();
            outputLine = fDups.processInput(drListings);
            out.println(outputLine);
      }


      public static void main(String args[]) throws IOException
      {
                String filename;
                if (args.length == 1)
                {
                    filename = args[0];
                }
                else
                {
                    filename = "c:/public";
                }

                new TestDups().listDirectory(filename);
      }


}

FindDups:

import java.io.*;
import java.util.*;

public class FindDups
{
      public String processInput()
      {
            Iterator it = drListings.iterator();
            List fileList = null;
            String[] files = null;
            while (it.hasNext())
            {
                  fileList = (List)it.next();
                  files = (String[])fileList.toArray(new String[0]);
            }
      }
}

Info (same as yours):

import java.util.*;

public class Info
{
      private String path;
      private ArrayList fileList;

      public Info(String dirPath, ArrayList files)
      {
            path = dirPath;
            fileList = files;
      }

      public String getpath()
      {
            return (path);
      }

      public ArrayList getFileList()
      {
            return (fileList);
      }
}

and I get the followings errors:
TestDups error:
C:\Param's Stuff\Second Year at R.I.T\Computer Programming 3\Find Duplicate Files (HW #5)\TestDups.java:23: cannot resolve symbol
symbol  : constructor Info (java.lang.String,java.util.List)
location: class Info
            drListings.add(new Info(dir, Arrays.asList(listing)));
                                                     ^
C:\Param's Stuff\Second Year at R.I.T\Computer Programming 3\Find Duplicate Files (HW #5)\TestDups.java:27: processInput() in FindDups cannot be applied to (java.util.ArrayList)
            outputLine = fDups.processInput(drListings);
                                                          ^

FindDups error:
C:\Param's Stuff\Second Year at R.I.T\Computer Programming 3\Find Duplicate Files (HW #5)\FindDups.java:8: cannot resolve symbol
symbol  : variable drListings
location: class FindDups
            Iterator it = drListings.iterator();
                                                 ^

I don't know what is causing this problem. Thanks for your help

In FindDups, you haven't defined the parameter in the method:

     public String processInput(ArrayList drListings)

In Info, the constructor should receive a "List" as a parameter, not an "ArrayList".  Then change the line fileList = files; to:

    fileList = new ArrayList(files);
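
A small sketch of why the copy is needed, assuming a modern JDK with generics (the thread itself uses raw types). Arrays.asList is declared to return a List, and what it returns is a fixed-size view of the array rather than a java.util.ArrayList, so it cannot be passed where an ArrayList parameter is expected. The class name AsListDemo and the method toArrayList are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class AsListDemo
{
    // Copies the fixed-size view returned by Arrays.asList into a real java.util.ArrayList.
    static ArrayList<String> toArrayList(String[] names)
    {
        List<String> view = Arrays.asList(names); // declared type is List, not ArrayList
        return new ArrayList<String>(view);       // independent, resizable copy
    }

    public static void main(String[] args)
    {
        ArrayList<String> files = toArrayList(new String[] {"a.txt", "b.txt"});
        files.add("c.txt"); // fine on the copy; the asList view would throw UnsupportedOperationException
        System.out.println(files); // prints [a.txt, b.txt, c.txt]
    }
}
```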

In TestDups, on the last line of listDirectory, you've missed "System" off of the System.out.println ;-)

Since you have decided to use the Info class, you need to change the processInput method to (note that I've also added a "return"):

     public String processInput(ArrayList drListings)
     {
          Iterator it = drListings.iterator();
          Info fileList = null;
          String[] files = null;
          while (it.hasNext())
          {
              fileList = (Info)it.next();
              files = (String[])fileList.getFileList().toArray(new String[0]);
          }
         
          return ("Not finished");
     }
Alright, I did what you told me and it compiles and everything, but it is giving me an error in the command prompt.

TestDups:
Same as before with the System.out.println problem fixed

FindDups:
import java.io.*;
import java.util.*;

public class FindDups
{
      public String processInput(ArrayList drListings)
      {
            Iterator it = drListings.iterator();
            List fileList = null;
            String[] files = null;
            while (it.hasNext())
            {
                  fileList = (List)it.next();
                  files = (String[])fileList.toArray(new String[0]);
            }
            return ("Not Finished");
      }
}

Info:
import java.util.*;

public class Info
{
      private String path;
      private ArrayList fileList;

      public Info(String dirPath, List files)
      {
            path = dirPath;
            //fileList = files;
            fileList = new ArrayList(files);

      }

      public String getpath()
      {
            return (path);
      }

      public ArrayList getFileList()
      {
            return (fileList);
      }
}

This is the error on the command prompt:

Exception in thread "main" java.lang.NullPointerException
        at TestDups.listDirectory(TestDups.java:14)
        at TestDups.main(TestDups.java:44)
Press any key to continue . . .
Sorry, I deleted a line from my last comment :-(

The line:

    private ArrayList drListings;

should be

    private ArrayList drListings = new ArrayList();
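
To show the difference, here is a minimal sketch (the class FieldInitDemo and its members are made up for this example). An object field that is only declared starts out as null, so calling a method on it throws NullPointerException; initialising it at the declaration avoids that.

```java
import java.util.ArrayList;

class FieldInitDemo
{
    private ArrayList<String> declaredOnly;                          // object fields default to null
    private ArrayList<String> initialised = new ArrayList<String>(); // ready to use

    // Returns true if using the declared-only field throws NullPointerException.
    boolean declaredOnlyFails()
    {
        try
        {
            declaredOnly.add("x");
            return false;
        }
        catch (NullPointerException e)
        {
            return true;
        }
    }

    // Adds to the initialised list and returns its new size.
    int addToInitialised(String name)
    {
        initialised.add(name);
        return initialised.size();
    }

    public static void main(String[] args)
    {
        FieldInitDemo demo = new FieldInitDemo();
        System.out.println(demo.declaredOnlyFails());       // prints true
        System.out.println(demo.addToInitialised("a.txt")); // prints 1
    }
}
```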
sorry for being such an annoyance .. it keeps givin me an error in the command prompt when i try to run it:

TestDups code:
import java.io.*;
import java.util.*;

class TestDups
{
      private ArrayList drListings = new ArrayList();

      private void listDirectory(String dir)
      {
            File currentDirectory = new File(dir);

            String[] listing = currentDirectory.list();

            for ( int k = 0; k < listing.length; k++ )
            {
                  if (new File(dir+File.separator+listing[k]).isDirectory())
                  {
                        listDirectory(dir+File.separator+listing[k]);
                  }
                  System.out.println( dir+File.separator+listing[k] );
            }

            drListings.add(new Info(dir, Arrays.asList(listing)));

            String outputLine;
            FindDups fDups = new FindDups();
            outputLine = fDups.processInput(drListings);
            System.out.println(outputLine);
      }


      public static void main(String args[]) throws IOException
      {
                String filename;
                if (args.length == 1)
                {
                    filename = args[0];
                }
                else
                {
                    filename = "c:/public";
                }

                new TestDups().listDirectory(filename);
      }


}

Error:
Exception in thread "main" java.lang.NullPointerException
        at TestDups.listDirectory(TestDups.java:14)
        at TestDups.main(TestDups.java:44)
Press any key to continue . . .

Thanks again
Oh.

It looks like "listing" might be null.  There are a few reasons why this might happen.  My guess is that the initial file/path you provide is invalid on the command line (or "C:/public" doesn't exist).

If the problem persists, try putting a System.out.println(currentDirectory) just before you call list() on it.
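
As a sketch of how to guard against this: File.list() is documented to return null when the path does not denote a directory or when an I/O error occurs, so the result can be checked before it is used. The class name SafeList and the method listOrEmpty are hypothetical, and "c:/public" is just the default from the code above.

```java
import java.io.File;

class SafeList
{
    // Returns the entries of dir, or an empty array when dir cannot be listed.
    static String[] listOrEmpty(File dir)
    {
        String[] listing = dir.list(); // null if dir is not a directory or an I/O error occurs
        return (listing == null) ? new String[0] : listing;
    }

    public static void main(String[] args)
    {
        File dir = new File(args.length > 0 ? args[0] : "c:/public");
        if (!dir.isDirectory())
        {
            System.out.println("Not a directory: " + dir);
        }
        System.out.println(listOrEmpty(dir).length + " entries");
    }
}
```

With a guard like this, a missing folder produces a message and an empty listing instead of a NullPointerException inside the for loop.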
Damn, I am so stupid ... yeah, the public folder didn't exist. I was working on this program at school, changed the path there, and never thought of changing it back. Well, I tried it and it sort of worked. It shows some files but not all of them. I have no idea what could have caused that ... it worked fine before, when I first asked for your help. And after it shows some of the files in the folder, it gives that same error. No idea what can cause that. Thanks again for your help. I really appreciate it
No problem.  Glad I could help.

If it's still giving you NullPointerException, have a look at the javadocs for list() to see if it gives you any clues ;-)

(Is there a question in your last post, or are you just making an observation? ;-))
yeah the question on the last post is that the program won't display all the files in the folder. It only displays some files. I do not know why it does that. When I first posted with the first code ... it would display all the files. I don't know what is wrong with it now. Thanks
Can you identify which file it is having problems with?

Is it breaking down within a directory (eg. between processing file1.txt and file2.txt) or between directories (eg. when it calls listDirectory())?

How quickly is the problem showing up?  I've just run the last code you posted on my system and it quite happily displayed several thousand files from my home directory.

Does it work if you run it on a directory with no subdirectories?  If it does, keep moving up the directory tree until you get the problem again.

Keep going.  I'll get this fixed ;-)
Alright I put in a directory with many subdirectories and it has many many files in this folder. And all that prints out in the command prompt is this:

c:/my shared folder\AlbumArtSmall.jpg
c:/my shared folder\AlbumArt_{1A2C8636-A33D-4A69-991E-F16904E85191}_Large.jpg
c:/my shared folder\AlbumArt_{1A2C8636-A33D-4A69-991E-F16904E85191}_Small.jpg
c:/my shared folder\AlbumArt_{6E45119B-A10A-41B3-93D8-D48CCCE3BBCF}_Large.jpg
c:/my shared folder\AlbumArt_{6E45119B-A10A-41B3-93D8-D48CCCE3BBCF}_Small.jpg
c:/my shared folder\AlbumArt_{80C0457C-661C-4B64-88A4-75EB21212072}_Large.jpg
c:/my shared folder\AlbumArt_{80C0457C-661C-4B64-88A4-75EB21212072}_Small.jpg
c:/my shared folder\AlbumArt_{8792034B-4220-4919-9A4E-0449829823FE}_Large.jpg
c:/my shared folder\AlbumArt_{8792034B-4220-4919-9A4E-0449829823FE}_Small.jpg
c:/my shared folder\AlbumArt_{ED7F557F-014B-42B5-9388-88431E46B139}_Large.jpg
c:/my shared folder\AlbumArt_{ED7F557F-014B-42B5-9388-88431E46B139}_Small.jpg
c:/my shared folder\AlbumArt_{F489121E-E03A-4DDF-AE92-739A56491391}_Large.jpg
c:/my shared folder\AlbumArt_{F489121E-E03A-4DDF-AE92-739A56491391}_Small.jpg
c:/my shared folder\AlbumArt_{FD96394F-1103-4ABA-834B-9843DB7CF956}_Large.jpg
c:/my shared folder\AlbumArt_{FD96394F-1103-4ABA-834B-9843DB7CF956}_Small.jpg
c:/my shared folder\Applications\DeadAim 4.0 KeyGen.exe
c:/my shared folder\Applications\DeadAIM_4.0.exe
Exception in thread "main" java.lang.ClassCastException
        at FindDups.processInput(FindDups.java:13)
        at TestDups.listDirectory(TestDups.java:27)
        at TestDups.listDirectory(TestDups.java:18)
        at TestDups.main(TestDups.java:43)
Press any key to continue . . .

I hope this is what you were looking for
Ah.  That's a ClassCastException, not a NullPointerException.  Let's see where that leads ;-)

Right.  The problem is that on line 13 of FindDups, a cast is failing.  In your version, that line is presumably:

               fileList = (List)it.next();

This means that the drListings object contains something other than Lists.

Aha!  That's because your FindDups.java isn't up-to-date.  The last posting you have didn't include the change I described ;-)  Not to worry.  Here's what you should have in FindDups.java

import java.io.*;
import java.util.*;

public class FindDups
{
     public String processInput(ArrayList drListings)
     {
          Iterator it = drListings.iterator();
          Info fileList = null;
          String[] files = null;
          while (it.hasNext())
          {
              fileList = (Info)it.next();
              files = (String[])fileList.getFileList().toArray(new String[0]);
          }
         
          return ("Not finished");
     }
}

Note the cast to (Info), not (List) ;-)
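
A rough sketch of what goes wrong, using a cut-down stand-in for Info (the classes CastDemo and the nested Info here are hypothetical, and generics are assumed). A cast only succeeds when the object really is of the target type; checking with instanceof first avoids the exception.

```java
import java.util.ArrayList;
import java.util.List;

class CastDemo
{
    // Stand-in for the thread's Info class, reduced to a single field.
    static class Info
    {
        final String path;
        Info(String path) { this.path = path; }
    }

    // Collects the paths of all Info elements, skipping anything of another type.
    static List<String> pathsOf(List<?> items)
    {
        List<String> paths = new ArrayList<String>();
        for (Object item : items)
        {
            if (item instanceof Info)
            {
                paths.add(((Info) item).path); // safe: the type was checked first
            }
        }
        return paths;
    }

    public static void main(String[] args)
    {
        List<Object> mixed = new ArrayList<Object>();
        mixed.add(new Info("c:/public/a.txt"));
        mixed.add("not an Info");
        System.out.println(pathsOf(mixed)); // prints [c:/public/a.txt]

        Object element = mixed.get(0);
        try
        {
            List<?> wrong = (List<?>) element; // same mistake as casting an Info to List
            System.out.println(wrong);
        }
        catch (ClassCastException e)
        {
            System.out.println("ClassCastException: element is an Info, not a List");
        }
    }
}
```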
Oh damn, that was so stupid of me. Well hey, I won't be in for the majority of the day. I've got a 6 hour course to go to, on my birthday of all days, and then a party .. hehe. But I do have a question. I was wondering if you can give me some tips on how I can first sort the files using the CRC method and then find the duplicate files and show them on the screen instead of everything in the folder. Thanks again for your help.
;-)

Happy Birthday ;-)  Hope you had a good time :-)

I believe (^) is the correct emoticon :-)

>> can give me some tips

Probably the easiest place to calculate the CRC is in the Info class.  Add another attribute to it to store the CRC and calculate the CRC for each file provided in the constructor.  At least that way, you've found a place to link the filename and CRC value.

If you then want to go the "whole hog", you can have the Info class implement the java.lang.Comparable interface which will allow you to call Collections.sort() to sort them into order (on the CRC value).

Once they're sorted, you just need to iterate through the list.  If the current item is the same as the previous item, display it with your "duplicate file" message.
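
The approach described above could be sketched roughly like this. It is only an illustration under a few assumptions: FileCrc is a hypothetical stand-in for Info that holds one path and one CRC, the CRC values are made up, and generics and Long.compare (Java 7+) are used even though the rest of the thread predates them.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SortedDups
{
    // Hypothetical stand-in for Info: one file path plus its CRC.
    static class FileCrc implements Comparable<FileCrc>
    {
        final String path;
        final long crc;

        FileCrc(String path, long crc)
        {
            this.path = path;
            this.crc = crc;
        }

        // Orders entries by CRC so that equal CRCs end up next to each other.
        public int compareTo(FileCrc other)
        {
            return Long.compare(crc, other.crc);
        }
    }

    // Sorts a copy by CRC, then reports each entry whose CRC equals that of its predecessor.
    static List<String> findDuplicates(List<FileCrc> files)
    {
        List<FileCrc> sorted = new ArrayList<FileCrc>(files);
        Collections.sort(sorted);

        List<String> report = new ArrayList<String>();
        for (int i = 1; i < sorted.size(); i++)
        {
            FileCrc prev = sorted.get(i - 1);
            FileCrc cur = sorted.get(i);
            if (prev.crc == cur.crc)
            {
                report.add(prev.path + " has the same CRC as " + cur.path);
            }
        }
        return report;
    }

    public static void main(String[] args)
    {
        List<FileCrc> files = new ArrayList<FileCrc>();
        files.add(new FileCrc("a.txt", 42L));
        files.add(new FileCrc("b.txt", 7L));
        files.add(new FileCrc("c.txt", 42L));
        for (String line : findDuplicates(files))
        {
            System.out.println(line); // prints: a.txt has the same CRC as c.txt
        }
    }
}
```

Starting the loop at index 1 means the first entry is never compared against a missing predecessor.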
Hi. Well, I was playing with the code to get the files sorted and then show which ones have the same name and print them out.
I was playing with the FindDups part of the code because I was wondering if I can do the CRC method in it. I tried using the code that I had to do in class and it didn't work so well. I am going to post the FindDups code and the code I had to do in class, which is supposed to help me with this program.

FindDups:
import java.io.*;
import java.util.*;

public class FindDups
{
     public String processInput(ArrayList drListings)
     {
          Iterator it = drListings.iterator();

          Info fileList = null;
          String[] files = null;
          while (it.hasNext())
          {
              fileList = (Info)it.next();
              files = (String[])fileList.getFileList().toArray(new String[0]);
          }

          Comparator comp = new IdComparator();
          Collections.sort(drListings, comp);

          System.out.println("\n\n");

          it = drListings.iterator();
          while (it.hasNext())
              {
                  fileList = (Info)it.next();
                  files = (String[])fileList.getFileList().toArray(new String[0]);
          }
          //return ("Not finished");
     }

}

class IdComparator implements Comparator
{
      public int compare(Object firstObj, Object secondObj)
      {
            long first = ((Info)firstObj).getId();
            long second = ((Info)secondObj).getId();

            if(first < second) return -1;
            if(first == second) return 0;

            return 1;
      }
}

SortIt18:

import java.util.*;
import java.io.*;
import java.text.*;

public class SortIt18
{
   public static void main(String [] args)
   {
      Iterator it;
      ArrayList dataAry = new ArrayList();

      dataAry.add( new Info17( 888, "banana", "I go.") );
      dataAry.add( new Info17( 222, "(bug) ", "never does what I want") );
      dataAry.add( new Info17( 666, "candle", "it's off") );
      dataAry.add( new Info17( 111, "_vase ", "wish they would sell it") );
      dataAry.add( new Info17( 999, "1 Lomb", "- - the end - - ") );
      dataAry.add( new Info17( 000, "glass ", "Damm computer") );
      dataAry.add( new Info17( 333, "light ", "only what I tell it.") );
      dataAry.add( new Info17( 777, "mouse ", "to disk") );
      dataAry.add( new Info17( 444, "Tecra ", "- - and another - -") );
      dataAry.add( new Info17( 555, "candle", "IO IO") );

        it = dataAry.iterator();
        while( it.hasNext() )
        {
              System.out.println( it.next() );
        }

        Comparator comp = new IdComparator();
        Collections.sort(dataAry, comp);

        System.out.println("\n\n");

        it = dataAry.iterator();
        while(it.hasNext() )
        {
              System.out.println( it.next() );
         }

   }// end main
}// end sorting

class IdComparator implements Comparator
{
      public int compare(Object firstObj, Object secondObj)
      {
            long first = ((Info17)firstObj).getId();
            long second = ((Info17)secondObj).getId();

            if (first < second) return -1;
            if (first == second) return 0;
            return 1;
      }
}

class Info17
{
      private int id;
      private String name;
      private String otherStr;

      public Info17( int inId, String inName, String inOther)
      {
            id = inId;
            name = inName;
            otherStr = inOther;
      }

      public int getId() { return id; }

      public String toString()
      {
            return Integer.toHexString(id).toUpperCase()+" "+name+" "+otherStr;
      }


}

CalcCRC:

import java.io.*;
import java.util.zip.*;
import java.text.*;
import java.util.*;

public class CalcCrc
{
      private final int BF_SIZE = 8192;

      public static void main(String[] args)
      {
            long crc;
            CalcCrc cc = new CalcCrc();
            String filename = args.length > 0 ? args[0] : "CalcCrc.java";

            File fil = new File( filename );
            crc = cc.getCrc( fil );

            System.out.println("Information for: " +filename);
            System.out.println("CRC            : " + Long.toHexString(crc).toUpperCase());
            System.out.println("Name           : " + fil.getName());
            System.out.println("Path Name      : " + fil.getAbsolutePath());
            System.out.println("File Size      : " + fil.length());
            System.out.println("Directory      : " + fil.isDirectory());
            System.out.println("File?          : " + fil.isFile());
            System.out.println("Hidden?        : " + fil.isHidden());
            System.out.println("Exists?        : " + fil.exists());

            long ts = fil.lastModified();
            Format sdf = new SimpleDateFormat("MM/dd/yy hh:mm:ss aa");
            System.out.println("Last Modified  : " + sdf.format(new Date(ts)));
            System.out.println("Last Modified  : " + (new Date(ts)));


      }

      public long getCrc (File aFile)
      {
            byte[] buffer = new byte[BF_SIZE];
            int len = 0;
            long crcValue = -1;
            CRC32 crc = new CRC32();
            crc.reset();

            try
            {
                  BufferedInputStream bis = new BufferedInputStream(new FileInputStream(aFile));

                  while (( len = bis.read(buffer)) > -1)
                  {
                        crc.update(buffer,0,len);
                  }
                  crcValue = crc.getValue();
                  bis.close();
            }
            catch(Exception e)
            {
                  e.printStackTrace();
            }

            return crcValue;
      }

}


I was wondering if you can help me with what I should do, because as always I am so lost. I want to figure out how to properly implement the CRC in the code and have it take the files and sort them.  I had a really bad weekend ... the party that my friends planned for me, they just backed out of it, saying that no one was coming and stuff like that. And to top it all, my computer messed up badly. It wouldn't start into Windows, so I had to reformat everything and put Windows back on. I lost practically everything. It sucked, but anyway, sorry for boring you with stupid problems. I know I am annoying. I just want to know how this stuff works. Thanks again for your help.
It looks like you've got everything you need, it's just a case of putting it all together.

The main problem you (we) have is that the Info class stores a complete list of files inside an ArrayList.  We might have to change this, depending on exactly what you want to do.

Do you want to:

a) Compare files within a directory
or
b) Compare all the files that you have stored?

For a), you should probably create a new class to store the individual file & CRC details

For b), you need to store only 1 file in the Info class at any time and the drListing should then store all the files that are read in (in all the directories).

How do you want to play it ;-) ?
haha .. well, not exactly PLAY with it ... fool around with it so I can get it to work, since I am not too good in Java. This class was just hell. I've got 2 more weeks of this including the final. Thank god. haha. But anyway. I want to compare all the files with the directory and the sub-directory. Can you start me off on how I can do that using the code(s) I had to do in class? I understood what I did in class but I can't understand this. I think I've got something wrong with myself :P. Thanks again for your help
>> I want to compare all the files with the directory and the sub-directory

Good answer.  I've gone all this way on the fact that you have actually done a lot of the work yourself.  Here is a fully working solution built from all your bits and re-arranged slightly.  I've sorted the formatting out, so it should be quite easy to read.

When you're happy it works, please read and understand the code.  I'll be happy to explain anything that you don't understand.  You've come so far with this, it would be a shame to miss out on a full understanding ;-)

BTW: I expect an A grade for this one ;-)

CalcCrc.java
----------------

Just what you posted.


FindDups.java
-------------------

Pretty much what you posted, but with the StringBuffer that collects the text of the result:

import java.io.*;
import java.util.*;

public class FindDups
{
     public String processInput(ArrayList drListings)
     {
         StringBuffer result = new StringBuffer();
         
          Iterator it = drListings.iterator();

          Info fileInfo = null;

          while (it.hasNext())
          {
              fileInfo = (Info)it.next();
//              System.out.println(fileInfo.getpath());
          }

          Comparator comp = new IdComparator();
          Collections.sort(drListings, comp);

//          System.out.println("\n\n");

          it = drListings.iterator();
          String prevFile = null;
          long previousCRC = 0;
         
          while (it.hasNext())
          {
              fileInfo = (Info)it.next();
              // prevFile is null on the first pass; without this check a first
              // file with CRC 0 (e.g. an empty file) would be reported as a
              // duplicate of "null".
              if (prevFile != null && fileInfo.getId() == previousCRC)
              {
                  result.append(prevFile);
                  result.append(" is the same as ");
                  result.append(fileInfo.getpath());
                  result.append(" CRC = " + previousCRC);
                  result.append("\n");
              }
              prevFile = fileInfo.getpath();
              previousCRC = fileInfo.getId();
              System.out.println(fileInfo.getpath());
          }
         
          return (result.toString());
     }
}

IdComparator.java
-------------------------

Just what you posted (the one that uses Info, not Info17)!


Info.java
------------

Removed the array list.  Each Info object now stores a filename (including path) and the CRC for it.

import java.util.*;

public class Info
{
     private String path;
     private long id;

     public Info(String dirPath, long crc)
     {
          path = dirPath;
          id = crc;          
     }

     public String getpath()
     {
          return (path);
     }

     public long getId()
     {
         return (id);
     }
}


TestDups.java
-------------------

Cleaned up the main method to actually create a proper object for the class.  Added another method buildAndSort() to perform the two operations.  Modified listDirectory() to create Info objects for each file and store *all* files from *all* directories in the single drListing ArrayList.  Also created an instance of CalcCrc.

import java.io.*;
import java.util.*;

class TestDups
{
    private ArrayList drListings = new ArrayList();
    private CalcCrc calcCrc = new CalcCrc();


    private void listDirectory(String dir)
    {
        File currentDirectory = new File(dir);

        String[] listing = currentDirectory.list();
        File curFile = null;

        for (int k = 0; k < listing.length; k++)
        {
            curFile = new File(dir + File.separator + listing[k]);
            if (curFile.isDirectory())
            {
                listDirectory(dir + File.separator + listing[k]);
            }
            else
            {
                drListings.add(new Info(curFile.getPath(), calcCrc.getCrc(curFile)));
            }
        }
    }

    private void buildAndSort(String filename)
    {
        listDirectory(filename);
       
        String outputLine;
        FindDups fDups = new FindDups();
        outputLine = fDups.processInput(drListings);
        System.out.println(outputLine);
    }

    public static void main(String args[]) throws IOException
    {
        String filename;
        if (args.length == 1)
        {
            filename = args[0];
        }
        else
        {
            filename = "c:/public";
        }

        TestDups td = new TestDups();
        td.buildAndSort(filename);
    }

}


That should be it.  It seems to recurse nicely through my directories and successfully picks up the duplicates ;-)
Alright, I did what you told me to do. I put IdComparator and CalcCrc in FindDups. Everything compiles, but nothing shows up, unless I am doing something wrong.

Here is FindDups:
import java.io.*;
import java.util.zip.*;
import java.text.*;
import java.util.*;

public class FindDups
{
     public String processInput(ArrayList drListings)
     {
         StringBuffer result = new StringBuffer();

          Iterator it = drListings.iterator();

          Info fileInfo = null;

          while (it.hasNext())
          {
              fileInfo = (Info)it.next();
//              System.out.println(fileInfo.getpath());
          }

          Comparator comp = new IdComparator();
          Collections.sort(drListings, comp);

//          System.out.println("\n\n");

          it = drListings.iterator();
          String prevFile = null;
          long previousCRC = 0;

          while (it.hasNext())
          {
              fileInfo = (Info)it.next();
              if (fileInfo.getId() == previousCRC)
              {
                  result.append(prevFile);
                  result.append(" is the same as ");
                  result.append(fileInfo.getpath());
                  result.append(" CRC = " + previousCRC);
                  result.append("\n");
              }
              prevFile = fileInfo.getpath();
              previousCRC = fileInfo.getId();
              System.out.println(fileInfo.getpath());
          }

          return (result.toString());
     }
}

class IdComparator implements Comparator
{
     public int compare(Object firstObj, Object secondObj)
     {
          long first = ((Info)firstObj).getId();
          long second = ((Info)secondObj).getId();

          if (first < second) return -1;
          if (first == second) return 0;
          return 1;
     }
}

class CalcCrc
{
      private final int BF_SIZE = 8192;

      public static void main(String[] args)
      {
            long crc;
            CalcCrc cc = new CalcCrc();
            String filename = args.length > 0 ? args[0] : "CalcCrc.java";

            File fil = new File( filename );
            crc = cc.getCrc( fil );

            System.out.println("Information for: " +filename);
            System.out.println("CRC            : " + Long.toHexString(crc).toUpperCase());
            System.out.println("Name           : " + fil.getName());
            System.out.println("Path Name      : " + fil.getAbsolutePath());
            System.out.println("File Size      : " + fil.length());
            System.out.println("Directory      : " + fil.isDirectory());
            System.out.println("File?          : " + fil.isFile());
            System.out.println("Hidden?        : " + fil.isHidden());
            System.out.println("Exists?        : " + fil.exists());

            long ts = fil.lastModified();
            Format sdf = new SimpleDateFormat("MM/dd/yy hh:mm:ss aa");
            System.out.println("Last Modified  : " + sdf.format(new Date(ts)));
            System.out.println("Last Modified  : " + (new Date(ts)));


      }

      public long getCrc (File aFile)
      {
            byte[] buffer = new byte[BF_SIZE];
            int len = 0;
            long crcValue = -1;
            CRC32 crc = new CRC32();
            crc.reset();

            try
            {
                  BufferedInputStream bis = new BufferedInputStream(new FileInputStream(aFile));

                  while (( len = bis.read(buffer)) > -1)
                  {
                        crc.update(buffer,0,len);
                  }
                  crcValue = crc.getValue();
                  bis.close();
            }
            catch(Exception e)
            {
                  e.printStackTrace();
            }

            return crcValue;
      }

}
Have you built it with the last TestDups.java that I posted?

Make sure you've got some duplicate files in a directory ;-)

The easiest way to test that is to copy one of these files (e.g. copy TestDups.java to copy.txt), then run it with:

java TestDups .

If I do that, I get:

./IdComparator.class
./TestDups.java
./copy.txt
./Info.java
./CalcCrc.class
./FindDups.class
./IdComparator.java
./FindDups.java
./CalcCrc.java
./Info.class
./TestDups.class
./TestDups.java is the same as ./copy.txt CRC = 280334153


(BTW: I had each of the classes in separate files)
I had the classes in separate files, but TestDups would give me an error saying that "private CalcCrc calcCrc ..." is incorrect. If you can show me, I can try to see what I did wrong. Thanks again for your help. In the meantime I will try what you said.
Providing the file CalcCrc.java is in the same directory as the rest of your Java files, it should be found OK.  If you're still having trouble, please post the complete error you're getting ;-)
Hi. This is the error I get.
Exception in thread "main" java.lang.NullPointerException
        at TestDups.listDirectory(TestDups.java:17)
        at TestDups.buildAndSort(TestDups.java:33)
        at TestDups.main(TestDups.java:54)
Press any key to continue . . .

and I have everything exactly the way you wanted.

Thank you
;-)

Whenever you post an exception like this, please quote the line where the error is occurring too.  It usually makes things much quicker ;-)

I'm guessing that line 17 in TestDups.java is referring to calcCrc and this is the object that is null.  Make sure that at the top of the file you declare it *and* define it in the one statement:

    private CalcCrc calcCrc = new CalcCrc();

(By declare and define I mean "private CalcCrc calcCrc;" would be a declaration, adding the " = new CalcCrc();" is the definition)

If this isn't the case, just post line 17 ;-)
Hi. Sorry about that. I was in a hurry to get to class and completely forgot about the error line. I am at school now. I was looking at the code and tried different folders. It sort of worked, but not quite.
The line with the error is this one: for (int k = 0; k < listing.length; k++)

The program output is supposed to look like this:
         38afafd            1,450  04/17/03 07:17:42 PM  C:\RITCourses\219-Fall\HW-masters\219_023HWFromKevinB\HW4\Client.class
         38afafd            1,450  04/14/03 12:48:28 AM  C:\RITCourses\219-Fall\HW-masters\219_023HWFromKevinB\HW4\nimservice1\Client.class
         38afafd            1,450  10/17/03 08:23:34 AM  C:\RITCourses\219-Fall\HW-masters\219-031 HW04 Nim\Client.class

         42197cd            1,963  10/07/02 12:51:58 AM  C:\RITCourses\219-Fall\HW-masters\Practicum1-KB\Practicum\Fall2002\VowelServer.java
         42197cd            1,963  10/07/02 01:51:58 AM  C:\RITCourses\219-Fall\HW-masters\Practicum1-JK\219-023 Practical\VowelServer.java

        d69ccd6a              599  04/08/02 06:34:30 PM  C:\RITCourses\219-Fall\HW-masters\Practicum1-KB\Practicum\Spring2002\Child.class
        d69ccd6a              599  10/10/02 10:44:42 PM  C:\RITCourses\219-Fall\HW-masters\Practicum1-JK\219-023 Practical\Child.class

        ed24f7ba              881  04/08/02 06:34:30 PM  C:\RITCourses\219-Fall\HW-masters\Practicum1-KB\Practicum\Spring2002\Pract1$1.class
        ed24f7ba              881  10/09/02 01:35:58 PM  C:\RITCourses\219-Fall\HW-masters\Practicum1-KB\PracticumReview\Pract1$1.class

Sorry and thanks
Oh.

That means that the "listing" array is null.

This should only happen if there is an I/O error or the file (currentDirectory) is not a directory.

The first thing I can think of is that you've changed from home to school again.  Is this happening because you are specifying a directory that doesn't exist?

If it does exist, add the following to the top of the listDirectory method:

 System.out.println("dir = " + dir);

This should give you an idea about which directory is causing a problem.
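
A defensive variant of the directory walk might look like the sketch below. It is not the code from this thread: the class name SafeLister and the method collectFiles are made up for illustration, and it only collects paths rather than building Info objects. The point is the null check on the result of File.list(), which returns null when the path is not a directory or an I/O error occurs.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

class SafeLister
{
    // Recursively collects the paths of all regular files under dir.
    // Directories that cannot be listed are reported and skipped.
    static void collectFiles(File dir, List<String> out)
    {
        String[] listing = dir.list();
        if (listing == null)
        {
            // list() returns null when dir is not a directory or an I/O error occurs
            System.err.println("Cannot list: " + dir.getPath());
            return;
        }
        for (int k = 0; k < listing.length; k++)
        {
            File curFile = new File(dir, listing[k]);
            if (curFile.isDirectory())
            {
                collectFiles(curFile, out);
            }
            else
            {
                out.add(curFile.getPath());
            }
        }
    }

    public static void main(String[] args)
    {
        List<String> files = new ArrayList<String>();
        collectFiles(new File(args.length > 0 ? args[0] : "."), files);
        System.out.println(files.size() + " files found");
    }
}
```

With a guard like this, a missing or unreadable directory produces a message naming the directory instead of a NullPointerException in the for loop.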

As far as the formatting goes, you'll need to recreate the File objects for each entry so that you can access the length() method and the lastModified() method.

To display the CRC in hex, look at the method Long.toHexString() (the CRC is stored as a long, so Integer.toHexString() won't accept it).

Also, inserting a tab into the string is done like this:  "here is a string with a tab at the end \t"
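
Putting those pieces together, one output line could be built roughly as follows. This is only a sketch: the class RowFormat and the method formatRow are invented names, the date pattern is borrowed from the CalcCrc class posted earlier, and the CRC passed in main is a placeholder rather than a real checksum.

```java
import java.io.File;
import java.text.SimpleDateFormat;
import java.util.Date;

class RowFormat
{
    // Builds one tab-separated line: CRC in hex, size in bytes, last-modified date, path.
    static String formatRow(long crc, long size, long lastModified, String path)
    {
        SimpleDateFormat sdf = new SimpleDateFormat("MM/dd/yy hh:mm:ss aa");
        StringBuffer row = new StringBuffer();
        row.append(Long.toHexString(crc));   // the CRC is a long, so use Long, not Integer
        row.append("\t");
        row.append(size);
        row.append("\t");
        row.append(sdf.format(new Date(lastModified)));
        row.append("\t");
        row.append(path);
        return row.toString();
    }

    public static void main(String[] args)
    {
        File f = new File(args.length > 0 ? args[0] : ".");
        // Placeholder CRC; in the thread's code it would come from CalcCrc.getCrc()
        System.out.println(formatRow(0x38afafdL, f.length(), f.lastModified(), f.getPath()));
    }
}
```

The size and date come from File.length() and File.lastModified(), which is why the File object has to be recreated from the stored path.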
Alright, I did what you said about putting "System.out.println("dir = " + dir);" into the listDirectory method and tried it. It works for certain folders, but then I tried it on the Program Files folder. I realized that it takes so long to run that the program never gets to the duplicate files, which are supposed to be shown at the end. Is there a way for the program to show ONLY the duplicates? Thank you
The reason for the System.out.println() was to find out which directory was causing the application to crash.  If it's working, you can delete this line again.

Also, remove the last System.out.println() from the FindDups.java file (at the end of the while loop).  The app should then just display the duplicates.
Hi. Alright, I am trying to do something else now. Earlier I showed you an example of how the output is supposed to look. I am trying to get the date into the output. I tried to use CalcCrc to do it but can't get it to work properly. This is the FindDups code:

import java.io.*;
import java.util.zip.*;
import java.text.*;
import java.util.*;

public class FindDups
{
       private  CalcCrc cc = new CalcCrc();
     public String processInput(ArrayList drListings)
     {
          long crc;

              String filename = drListings(0).length > 0 ? drListings[0] : "FindDups.java";

              File fil = new File( it );
              crc = cc.getCrc( fil );
              long ts = fil.lastModified();


          StringBuffer result = new StringBuffer();

          Iterator it = drListings.iterator();

          Info fileInfo = null;

          while (it.hasNext())
          {
              fileInfo = (Info)it.next();
          }

          Comparator comp = new IdComparator();
          Collections.sort(drListings, comp);


          it = drListings.iterator();
          String prevFile = null;
          long previousCRC = 0;

          while (it.hasNext())
          {
              fileInfo = (Info)it.next();
              if (fileInfo.getId() == previousCRC)
              {
                  /*result.append(prevFile);
                  result.append(" is the same as ");
                  result.append(fileInfo.getpath());
                  result.append(" CRC = " + previousCRC);
                  result.append("\n");*/

                  System.out.println((new Date(ts)) + prevFile );
                  System.out.println(fileInfo.getpath());
                  System.out.println(" CRC = " + previousCRC);
                  System.out.println("\n");
               }
              prevFile = fileInfo.getpath();
              previousCRC = fileInfo.getId();
             // System.out.println(fileInfo.getpath());
          }

          return (result.toString());
     }
}


class IdComparator implements Comparator
{
     public int compare(Object firstObj, Object secondObj)
     {
          long first = ((Info)firstObj).getId();
          long second = ((Info)secondObj).getId();

          if (first < second) return -1;
          if (first == second) return 0;
          return 1;
   
Alright, mostly everything works now except, I believe, the important part. I realized that the program only works when it reads the current directory, such as when 'filename = ".";', but it doesn't work when 'filename = "f:\Program files\";'. It just reads the files in the directory the program itself is in. I don't think you tried that; see if you get the same problem. And thanks for the answer on the date problem.

I think there is something wrong in this part of code in the TestDups.java:
        for (int k = 0; k < listing.length; k++)
        {
            curFile = new File(dir + File.separator + listing[k]);
            if (curFile.isDirectory())
            {
                listDirectory(dir + File.separator + listing[k]);
            }
            else
            {
                drListings.add(new Info(curFile.getPath(), calcCrc.getCrc(curFile)));
            }
        }
    }

I am not exactly sure. Thanks again for your help
Nope.  Works perfectly OK on my system.  I can provide any directory and it checks for duplicates correctly.

I see you've shown them in your comment, so I assume you are using the inverted commas around directories that include spaces:

java TestDups "f:\Program files"

If you don't do this, TestDups.main() will assume 2 args, "f:\Program" and "files", so will use the default directory instead of the one you've provided.

There's nothing wrong with the code you've posted.
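
If you want the program to be more forgiving about unquoted paths, one option is to join all the arguments back together. The sketch below is an alternative to the main() in TestDups, not what it currently does; DirArg and chooseDirectory are made-up names, and the default directory is the one used earlier in this thread.

```java
class DirArg
{
    static final String DEFAULT_DIR = "c:/public";

    // Returns the directory to scan. No arguments means the default.
    // Several arguments are assumed to be one unquoted path containing spaces,
    // so they are joined back together with single spaces.
    static String chooseDirectory(String[] args)
    {
        if (args.length == 0)
        {
            return DEFAULT_DIR;
        }
        StringBuffer joined = new StringBuffer(args[0]);
        for (int k = 1; k < args.length; k++)
        {
            joined.append(' ').append(args[k]);
        }
        return joined.toString();
    }

    public static void main(String[] args)
    {
        System.out.println("Scanning: " + chooseDirectory(args));
    }
}
```

Joining cannot recover a path that contained two or more consecutive spaces, so quoting the directory on the command line is still the safer habit.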
Alright, it works. God, I am so stupid, haha. It takes a while to go through the files, but it works fine. Thanks a lot. I do have two final questions, real quick. How can I save the duplicates output to a file such as FindDups.txt from the program? And how can I show the number of files that are duplicates, as in the following:
         38afafd            1,450  04/17/03 07:17:42 PM  C:\RITCourses\219-Fall\HW-masters\219_023HWFromKevinB\HW4\Client.class
         38afafd            1,450  04/14/03 12:48:28 AM  C:\RITCourses\219-Fall\HW-masters\219_023HWFromKevinB\HW4\nimservice1\Client.class
         38afafd            1,450  10/17/03 08:23:34 AM  C:\RITCourses\219-Fall\HW-masters\219-031 HW04 Nim\Client.class

         42197cd            1,963  10/07/02 12:51:58 AM  C:\RITCourses\219-Fall\HW-masters\Practicum1-KB\Practicum\Fall2002\VowelServer.java
         42197cd            1,963  10/07/02 01:51:58 AM  C:\RITCourses\219-Fall\HW-masters\Practicum1-JK\219-023 Practical\VowelServer.java

        d69ccd6a              599  04/08/02 06:34:30 PM  C:\RITCourses\219-Fall\HW-masters\Practicum1-KB\Practicum\Spring2002\Child.class
        d69ccd6a              599  10/10/02 10:44:42 PM  C:\RITCourses\219-Fall\HW-masters\Practicum1-JK\219-023 Practical\Child.class

Notice the numbers such as 599, 1,963, etc. How can I get those? I think they are the number of files that are duplicates?
Thanks once again, you have been a great help
Those numbers are the file sizes ;-)

java TestDups "f:\Program files" > temp.txt

will redirect the output to the file temp.txt.

To count the total number of duplicates, add an int before the while loop in FindDups and increment it when you do the result.append()s.  If you want, you can append it to the result StringBuffer just before you return it.
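
As a rough illustration of the counting idea, here is a standalone sketch that works on a plain array of CRC values instead of the Info list. DupCounter and countDuplicates are invented names, and the values in main are made up for the example.

```java
import java.util.Arrays;

class DupCounter
{
    // Counts entries whose CRC equals that of the entry just before them
    // in sorted order, so a group of three identical files counts as 2.
    static int countDuplicates(long[] crcs)
    {
        long[] sorted = crcs.clone();
        Arrays.sort(sorted);
        int count = 0;
        // Start at index 1 so every comparison is against a real previous entry
        for (int k = 1; k < sorted.length; k++)
        {
            if (sorted[k] == sorted[k - 1])
            {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args)
    {
        long[] crcs = { 0x38afafdL, 0x42197cdL, 0x38afafdL, 0x38afafdL };
        System.out.println("Duplicates: " + countDuplicates(crcs));
    }
}
```

Comparing each entry with the one before it, rather than with an initial previousCRC of 0, also avoids a false match on an empty file, whose CRC-32 is 0.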
Hope I'm getting plenty of points for this ;-)

If you're quick, you could be the one to give me my "Master Level Certification" ;-)
Haha, I will give you all the points I have left. I believe I had 125 points assigned to this problem, and I have 95 left if you want those too. Thanks again. I am going to try this and see if it works.
Oh yeah, I forgot to ask: if those numbers are file sizes, how can I get them to show up? Thanks
Add the two commented lines below (the file size and the extra tab) to the result string building section of FindDups.java:

                  curFile = new File(fileInfo.getpath());
                  result.append(Long.toHexString(previousCRC));
                  result.append("\t");

                  result.append(curFile.length());  // Display the file size
                  result.append("\t");                  // Add another tab

                  result.append(new Date(curFile.lastModified()));
Alright, FINALLY done. Thanks for all your help, man. I really appreciate it. I am going to increase the points by 95 more. Thanks again.
Phew!

;-)

I'm glad it finally works OK.  I'll be happier if you believe that you have learned something along the way ;-)

Thanx.

Jim.
Yes, I did, a lot. I appreciate it a lot. This was the last homework for the course I am taking (Java 3). Hell. But I have a final project to do: I have to make a Robocode robot. Fun stuff. Alright, well, thanks again. Maybe I will post a question sometime, haha. Later
;-)  Best of luck.