ryno71

Best way to see if a file is compressed

Hi

If I want to see whether a file is compressed, what is the best way? Checking whether the ZipEntry is null might work, but if the file is corrupted, that wouldn't be accurate.


Thanks
ryno71
sciuriware

Test for an extension .zip or .ZIP

If a zip-file is named another way somebody is cheating.

;JOOP!

ASKER

That's just it: if I don't know the name, say the file was sent to me in a byte array, I wouldn't know.
File f;   // e.g. obtained from a directory listing, so you don't know its name in advance.

     if(f.getAbsolutePath().toUpperCase().endsWith(".ZIP"))
     {
         // compressed
     }

;JOOP!

ASKER

If I am the one naming the new file I receive, and I don't know what the original name was, this won't work.
OK then, if the first 8 bytes are (octal):

 0120, 0113, 003, 004, 024, 000, 002, 000

then you may assume it's a ZIP compressed file.

;JOOP!
You can read those 8 bytes from a raw InputStream from the file.

;JOOP!
Correction: I checked some other zip files, and only the first 4 bytes are needed:

0120, 0113, 003, 004

;JOOP!
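
For a file that arrives as a byte array or a raw stream, the four-byte check described above might look roughly like the sketch below. The class and method names (ZipSignature, looksLikeZip) are made up for illustration and are not from any library. The constant is the same four bytes as the octal values 0120, 0113, 003, 004.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper for illustration only.
class ZipSignature {
    // "PK\003\004": the octal values 0120, 0113, 003, 004 written in hex.
    private static final byte[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};

    /** Returns true if the data starts with the four ZIP signature bytes. */
    public static boolean looksLikeZip(byte[] data) {
        if (data == null || data.length < ZIP_MAGIC.length) {
            return false;
        }
        for (int i = 0; i < ZIP_MAGIC.length; i++) {
            if (data[i] != ZIP_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    /** Reads the first four bytes from the stream and checks them. */
    public static boolean looksLikeZip(InputStream in) throws IOException {
        byte[] header = new byte[ZIP_MAGIC.length];
        int offset = 0;
        // A single read() may return fewer bytes than requested, so loop.
        while (offset < header.length) {
            int n = in.read(header, offset, header.length - offset);
            if (n < 0) {
                return false; // stream ended before four bytes were available
            }
            offset += n;
        }
        return looksLikeZip(header);
    }

    public static void main(String[] args) throws IOException {
        byte[] sample = {0120, 0113, 003, 004, 024, 000};
        System.out.println(looksLikeZip(sample));
        System.out.println(looksLikeZip(new ByteArrayInputStream(new byte[] {1, 2, 3, 4})));
    }
}
```

A match only means the data starts like a ZIP archive; it does not prove the rest of the archive is intact. Note also that an empty ZIP archive starts with a different signature (PK\005\006), which this sketch does not accept.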

ASKER

Something like this?

I keep getting false

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

public class BytesFromFile
{

    // Reads the first 4 bytes of the file and compares them with the ZIP signature.
    public static byte[] getBytesFromFile(File file) throws IOException {
        // Create the byte array to hold the signature
        byte[] bytes = new byte[4];

        // Read in the bytes; a single read() may return fewer than requested
        try (InputStream is = new FileInputStream(file)) {
            int offset = 0;
            int numRead = 0;
            while (offset < bytes.length
                   && (numRead = is.read(bytes, offset, bytes.length - offset)) >= 0) {
                offset += numRead;
            }

            // Ensure all the bytes have been read in
            if (offset < bytes.length) {
                throw new IOException("Could not completely read file " + file.getName());
            }
        }

        byte[] bytes1 = new byte[] {0120, 0113, 003, 004};

        boolean bo = Arrays.equals(bytes, bytes1);   // compares contents
        boolean bo1 = bytes.equals(bytes1);          // compares references: false for two distinct arrays

        System.out.println("bo is " + bo);
        System.out.println("bo1 is " + bo1);
        return bytes;
    }

    public static void main(String args[]) {
        try
        {
            System.out.println("Zip1 ");
            File temp = new File(args[0]);
            byte[] bytes = getBytesFromFile(temp);
            // Arrays.toString prints the values; concatenating the array itself prints only its identity
            System.out.println("bytes are " + Arrays.toString(bytes));
        }
        catch (Exception e1)
        {
            e1.printStackTrace();
        }
    }
}
ASKER CERTIFIED SOLUTION
sciuriware

This solution is only available to members.

ASKER

Thanks. That will work better!

ASKER

sciuriware

Where did you find that you need to look at four bytes for a compressed file (PKZIP or GZIP)? Couldn't it be two?

ryno71
From the SUN JAVA sources:

ZIP      "PK\003\004"   (4 bytes)
GZIP    0x8B1F           (a 16-bit value: the bytes 0x1F, 0x8B in file order)

So it is 4 bytes for ZIP; for GZIP the first 2 bytes are enough.

;JOOP!
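
Using the two constants quoted above, a check for both formats might be sketched as follows. The class, enum and method names (CompressionSniffer, Kind, sniff) are hypothetical; GZIPInputStream.GZIP_MAGIC is the public JDK constant holding 0x8B1F. Since that constant is a 16-bit value stored low byte first, only the first two bytes are compared for GZIP, while ZIP uses all four.

```java
import java.util.zip.GZIPInputStream;

// Hypothetical helper for illustration only.
class CompressionSniffer {
    enum Kind { ZIP, GZIP, UNKNOWN }

    /** Classifies data by its leading signature bytes. */
    static Kind sniff(byte[] head) {
        if (head == null) {
            return Kind.UNKNOWN;
        }
        // ZIP: "PK\003\004"
        if (head.length >= 4
                && head[0] == 'P' && head[1] == 'K'
                && head[2] == 3 && head[3] == 4) {
            return Kind.ZIP;
        }
        // GZIP: bytes 0x1F, 0x8B, combined low byte first to match GZIP_MAGIC
        if (head.length >= 2) {
            int magic = (head[0] & 0xFF) | ((head[1] & 0xFF) << 8);
            if (magic == GZIPInputStream.GZIP_MAGIC) {
                return Kind.GZIP;
            }
        }
        return Kind.UNKNOWN;
    }

    public static void main(String[] args) {
        System.out.println(sniff(new byte[] {'P', 'K', 3, 4}));
        System.out.println(sniff(new byte[] {0x1F, (byte) 0x8B, 8, 0}));
        System.out.println(sniff(new byte[] {1, 2, 3, 4}));
    }
}
```

The & 0xFF masks matter because Java bytes are signed: without them, 0x8B would be sign-extended and the comparison would fail.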

ASKER

Where did you find this? I've been looking and can't seem to find it!

Thanks a lot!
The JDK is accompanied by "src.zip"

;JOOP!