Solved

Best way to see if a file is compressed

Posted on 2006-11-06
Last Modified: 2010-08-05
Hi

If I want to check whether a file is compressed, what is the best way? Checking whether the ZipEntry is null might work, but if the file is corrupted, that wouldn't be accurate.


Thanks
ryno71
Question by:ryno71
 
LVL 24

Expert Comment

by:sciuriware
Comment Utility
Test for an extension .zip or .ZIP

If a zip-file is named another way somebody is cheating.

;JOOP!
0
 

Author Comment

by:ryno71
Comment Utility
That's just it: if I don't know the name, say because the file was sent to me as a byte array, I can't check the extension.
0
 
LVL 24

Expert Comment

by:sciuriware
Comment Utility
File f;   // Got it from a directory listing, so you don't know its name beforehand.

     if(f.getAbsolutePath().toUpperCase().endsWith(".ZIP"))
     {
 //    compressed
     }


;JOOP!
0
 

Author Comment

by:ryno71
Comment Utility
If I am the one naming the file I receive, and I don't know what its original name was, this won't work.
0
 
LVL 24

Expert Comment

by:sciuriware
Comment Utility
OK then, if the first 8 bytes are (octal):

 0120, 0113, 003, 004, 024, 000, 002, 000

then you may assume it's a ZIP compressed file.

;JOOP!
0
 
LVL 24

Expert Comment

by:sciuriware
Comment Utility
You can read those 8 bytes from a raw InputStream from the file.

;JOOP!
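
Reading the leading bytes from a stream can be sketched as below. This is a minimal illustration rather than the poster's own code: `HeaderReader` and `readHeader` are hypothetical names, and the loop is there because a single `read` call is allowed to return fewer bytes than requested.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Hypothetical helper, not part of the JDK.
final class HeaderReader {
    /**
     * Reads up to n bytes from the current position of the stream.
     * The result is shorter than n only if the stream ends early.
     */
    static byte[] readHeader(InputStream in, int n) throws IOException {
        byte[] buf = new byte[n];
        int offset = 0;
        // Keep reading until n bytes have arrived or the stream is exhausted.
        while (offset < n) {
            int count = in.read(buf, offset, n - offset);
            if (count < 0) {
                break; // end of stream
            }
            offset += count;
        }
        // Trim the buffer to the number of bytes actually read.
        return Arrays.copyOf(buf, offset);
    }
}
```

With a file, this would be called as `HeaderReader.readHeader(new FileInputStream(file), 8)`, and with data that arrived as a byte array, a `ByteArrayInputStream` works the same way.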
0
 
LVL 24

Expert Comment

by:sciuriware
Comment Utility
Correction: I did some research on other zip files:

only 4 bytes are enough:

0120, 0113, 003, 004

;JOOP!
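
For data that is already in memory as a byte array, the four-byte test could look roughly like this. It is only a sketch: `ZipSignature` and `looksLikeZip` are hypothetical names, and the check assumes the archive begins with a local file header. An empty ZIP archive starts with a different record, and a matching signature says nothing about whether the rest of the file is intact.

```java
// Hypothetical helper, not part of the JDK.
final class ZipSignature {
    // "PK\003\004" written in hex; in octal this is 0120, 0113, 003, 004.
    private static final byte[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};

    /** Returns true if the data starts with the ZIP local file header signature. */
    static boolean looksLikeZip(byte[] data) {
        if (data == null || data.length < ZIP_MAGIC.length) {
            return false; // too short to hold the signature
        }
        for (int i = 0; i < ZIP_MAGIC.length; i++) {
            if (data[i] != ZIP_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }
}
```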
0

 

Author Comment

by:ryno71
Comment Utility
Something like this?

I keep getting false

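import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

public class BytesFromFile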
{

public static byte[] getBytesFromFile(File file) throws IOException {
        InputStream is = new FileInputStream(file);
   
        // Get the size of the file
        long length = file.length();
   
        // You cannot create an array using a long type.
        // It needs to be an int type.
        // Before converting to an int type, check
        // to ensure that file is not larger than Integer.MAX_VALUE.
        if (length > Integer.MAX_VALUE) {
            // File is too large
        }
   
        // Create the byte array to hold the data
        byte[] bytes = new byte[4];
            
            is.read(bytes);
            
      
            
            byte[] bytes1= new byte[] {0120,0113,003,004};

            boolean bo = Arrays.equals(bytes, bytes1);   // compares the array contents
            boolean bo1 = bytes.equals(bytes1);          // compares references only, so always false here
            
            System.out.println("bo is "+ bo);
            System.out.println("bo1 is "+ bo1);
        // Read in the bytes
        int offset = 0;
        int numRead = 0;
        while (offset < bytes.length
               && (numRead=is.read(bytes, offset, bytes.length-offset)) >= 0) {
            offset += numRead;
        }
   
        // Ensure all the bytes have been read in
        if (offset < bytes.length) {
            throw new IOException("Could not completely read file "+file.getName());
        }
   
        // Close the input stream and return bytes
        is.close();
        return bytes;
    }

      
      
      
      
      public static void main(String args[]) {
               
                    BytesFromFile da = new BytesFromFile();
               try
               {
                    System.out.println(" ");
                              System.out.println("Zip1 ");
                              System.out.println(" ");
                              String fileName=args[0];
                              byte[] bytes =null;
                              File temp=null;
                              temp = new File(fileName);
                    bytes=da.getBytesFromFile(temp);
                        
                              System.out.println("bytes are "+java.util.Arrays.toString(bytes));
                              
                              
               }
               catch (Exception e1)
               {
                    e1.printStackTrace();
               }
          }
      }
0
 
LVL 24

Accepted Solution

by:
sciuriware earned 250 total points
Comment Utility
better:

import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

public class TestForCompression
{
   /**
    * Program entry point.
    *
    * @param commandLine program command line vector.
    * @throws IOException
    */
   public static void main(String[] commandLine) throws IOException
   {
      byte[] bytes = new byte[] {0120,0113,003,004};
      byte[] data = new byte[4];

      if(commandLine.length > 0)
      {
         // try-with-resources (Java 7+) closes the file on every exit path
         try(RandomAccessFile ra = new RandomAccessFile(new File(commandLine[0]), "r"))
         {
            if(ra.read(data) != data.length)
            {
               System.out.println("Short file, not compressed.");
               return;
            }

            for(int i = 0;  i < data.length;  ++i)
            {
               if(data[i] != bytes[i])
               {
                  System.out.println("Not compressed.");
                  return;
               }
            }
            System.out.println("Compressed.");
         }
      }
   }
}

Always test your code!

;JOOP!
0
 

Author Comment

by:ryno71
Comment Utility
Thanks. That will work better!
0
 
LVL 24

Expert Comment

by:sciuriware
Comment Utility
:)
0
 

Author Comment

by:ryno71
Comment Utility
sciuriware

Where did you find that you need to look at four bytes for a compressed file (PKZIP or GZIP)? Couldn't it be two?

ryno71
0
 
LVL 24

Expert Comment

by:sciuriware
Comment Utility
From the SUN JAVA sources:

ZIP      "PK\003\004"   (4 bytes)
GZIP    0x8B1F   (2 bytes, stored in the file as 0x1F followed by 0x8B)

So it is 4 bytes for ZIP, while two bytes are enough for GZIP.

;JOOP!
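
Both signatures can be checked in one place. The following is a minimal sketch, assuming the caller has already read the first few bytes of the file; `CompressionSniffer`, `Format`, and `sniff` are hypothetical names. The GZIP constant 0x8B1F is a 16-bit value stored low byte first, so the file itself begins with 0x1F, 0x8B.

```java
// Hypothetical helper, not part of the JDK.
final class CompressionSniffer {
    enum Format { ZIP, GZIP, UNKNOWN }

    /** Guesses the container format from the leading bytes of a file. */
    static Format sniff(byte[] header) {
        if (header == null) {
            return Format.UNKNOWN;
        }
        // ZIP local file header: 'P', 'K', 3, 4
        if (header.length >= 4
                && header[0] == 'P' && header[1] == 'K'
                && header[2] == 3 && header[3] == 4) {
            return Format.ZIP;
        }
        // GZIP: 0x1F, 0x8B (mask with 0xFF because Java bytes are signed)
        if (header.length >= 2
                && (header[0] & 0xFF) == 0x1F
                && (header[1] & 0xFF) == 0x8B) {
            return Format.GZIP;
        }
        return Format.UNKNOWN;
    }
}
```

A result of `UNKNOWN` only means that neither signature matched; other compression formats are not covered by this sketch.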
0
 

Author Comment

by:ryno71
Comment Utility
Where did you find this? I've been looking and can't seem to find it!

Thanks a lot!
0
 
LVL 24

Expert Comment

by:sciuriware
Comment Utility
The JDK is accompanied by "src.zip"

;JOOP!
0
