Solved

Best way to see if a file is compressed

Posted on 2006-11-06
Last Modified: 2010-08-05
Hi

If I want to check whether a file is compressed, what is the best way? Checking whether the ZipEntry is null might work, but if the file is corrupted, that wouldn't be accurate.


Thanks
ryno71
LVL 24

Expert Comment

by:sciuriware
ID: 17883143
Test for an extension .zip or .ZIP

If a zip-file is named another way somebody is cheating.

;JOOP!
0
 

Author Comment

by:ryno71
ID: 17883150
That's just it: what if I don't know the name? Say it was sent to me in a byte array; then I wouldn't know.
0
 
LVL 24

Expert Comment

by:sciuriware
ID: 17883158
File f;   // Got it from a directory listing, then you don't know its name.

     if(f.getAbsolutePath().toUpperCase().endsWith(".ZIP"))
     {
        // compressed
     }


;JOOP!
0
 

Author Comment

by:ryno71
ID: 17883184
If I am the one naming the new file I receive, and I don't know what the original name was, this won't work.
0
 
LVL 24

Expert Comment

by:sciuriware
ID: 17883212
OK then, if the first 8 bytes are (octal):

 0120, 0113, 003, 004, 024, 000, 002, 000

then you may assume it's a ZIP compressed file.

;JOOP!
0
 
LVL 24

Expert Comment

by:sciuriware
ID: 17883226
You can read those 8 bytes from a raw InputStream from the file.
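
A minimal sketch of reading the leading bytes from a raw InputStream. The class and method names (HeaderReader, readHeader) are hypothetical and chosen only for illustration. A single read() call may return fewer bytes than requested, so the sketch loops until the buffer is full or the stream ends.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

public class HeaderReader
{
   // Hypothetical helper: returns up to 'count' leading bytes of the stream.
   public static byte[] readHeader(InputStream in, int count) throws IOException
   {
      byte[] header = new byte[count];
      int offset = 0;
      while(offset < count)
      {
         int numRead = in.read(header, offset, count - offset);
         if(numRead < 0)
         {
            break;   // end of stream before 'count' bytes arrived
         }
         offset += numRead;
      }
      // Trim the result when the stream was shorter than 'count'.
      return Arrays.copyOf(header, offset);
   }

   public static void main(String[] args) throws IOException
   {
      // Stand-in for a FileInputStream: the start of a ZIP file, in octal.
      InputStream in = new ByteArrayInputStream(
            new byte[] {0120, 0113, 003, 004, 024, 000});
      System.out.println(Arrays.toString(readHeader(in, 4)));
   }
}
```

With a real file, a FileInputStream would take the place of the ByteArrayInputStream and should be closed after the header has been read.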

;JOOP!
0
 
LVL 24

Expert Comment

by:sciuriware
ID: 17883356
Correction: I did some research on other zip files:

only 4 bytes are enough:

0120, 0113, 003, 004
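
For the byte-array case raised earlier in the thread, the four-byte test could be sketched as below. The names ZipSignature and looksLikeZip are hypothetical. The check only shows that the data starts with a ZIP local file header; it does not prove the archive is intact, and an empty ZIP archive (which starts with "PK\005\006") would not match.

```java
public class ZipSignature
{
   // ZIP local file header signature: 'P', 'K', 3, 4 (octal 0120, 0113, 003, 004).
   private static final byte[] ZIP_MAGIC = {0120, 0113, 003, 004};

   // Hypothetical helper: true if 'data' begins with the ZIP signature.
   public static boolean looksLikeZip(byte[] data)
   {
      if(data == null || data.length < ZIP_MAGIC.length)
      {
         return false;   // too short to hold the signature
      }
      for(int i = 0;  i < ZIP_MAGIC.length;  ++i)
      {
         if(data[i] != ZIP_MAGIC[i])
         {
            return false;
         }
      }
      return true;
   }

   public static void main(String[] args)
   {
      byte[] zipStart = {0120, 0113, 003, 004, 024, 000};
      byte[] plainText = {'h', 'e', 'l', 'l', 'o'};
      System.out.println(looksLikeZip(zipStart));    // prints "true"
      System.out.println(looksLikeZip(plainText));   // prints "false"
   }
}
```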

;JOOP!
0

Author Comment

by:ryno71
ID: 17885670
Something like this?

I keep getting false

public class BytesFromFile
{

public static byte[] getBytesFromFile(File file) throws IOException {
        InputStream is = new FileInputStream(file);
   
        // Get the size of the file
        long length = file.length();
   
        // You cannot create an array using a long type.
        // It needs to be an int type.
        // Before converting to an int type, check
        // to ensure that file is not larger than Integer.MAX_VALUE.
        if (length > Integer.MAX_VALUE) {
            // File is too large
        }
   
        // Create the byte array to hold the data
        byte[] bytes = new byte[4];
            
            is.read(bytes);
            
      
            
            byte[] bytes1= new byte[] {0120,0113,003,004};

            boolean bo = Arrays.equals(bytes, bytes1);
            boolean bo1 = bytes.equals(bytes1);
            
            System.out.println("result is "+ result);
            System.out.println("bo is "+ bo);
            System.out.println("bo1 is "+ bo1);
        // Read in the bytes
        int offset = 0;
        int numRead = 0;
        while (offset < bytes.length
               && (numRead=is.read(bytes, offset, bytes.length-offset)) >= 0) {
            offset += numRead;
        }
   
        // Ensure all the bytes have been read in
        if (offset < bytes.length) {
            throw new IOException("Could not completely read file "+file.getName());
        }
   
        // Close the input stream and return bytes
        is.close();
        return bytes;
    }

      
      
      
      
      public static void main(String args[]) {
               
                    BytesFromFile da = new BytesFromFile();
               try
               {
                    System.out.println(" ");
                              System.out.println("Zip1 ");
                              System.out.println(" ");
                              String fileName=args[0];
                              byte[] bytes =null;
                              File temp=null;
                              temp = new File(fileName);
                    bytes=da.getBytesFromFile(temp);
                        
                              System.out.println("bytes are "+bytes);
                              
                              
               }
               catch (Exception e1)
               {
                    e1.printStackTrace();
               }
          }
      }
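
A likely reason for the "false" in the code above: bytes.equals(bytes1) compares array references, not contents, so bo1 is false for any two distinct arrays, while Arrays.equals compares element by element. The short sketch below illustrates the difference; the class and method names are hypothetical. Two further things stand out in the posted code: "result" is never declared, and the later read loop starts again at offset 0, so it overwrites the four header bytes with the next four bytes of the file before they are returned.

```java
import java.util.Arrays;

public class ArrayCompare
{
   // Reference comparison: true only if both variables point at the same array.
   public static boolean byIdentity(byte[] a, byte[] b)
   {
      return a.equals(b);
   }

   // Content comparison: true if both arrays hold the same bytes in the same order.
   public static boolean byContents(byte[] a, byte[] b)
   {
      return Arrays.equals(a, b);
   }

   public static void main(String[] args)
   {
      byte[] a = {0120, 0113, 003, 004};
      byte[] b = {0120, 0113, 003, 004};
      System.out.println(byIdentity(a, b));   // prints "false"
      System.out.println(byContents(a, b));   // prints "true"
   }
}
```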
0
 
LVL 24

Accepted Solution

by:
sciuriware earned 250 total points
ID: 17887234
better:

import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

public class TestForCompression
{
   /**
    * Program entry point.
    *
    * @param commandLine program command line vector.
    * @throws IOException if the file cannot be opened or read.
    */
   public static void main(String[] commandLine) throws IOException
   {
      // ZIP local file header signature "PK\003\004", in octal.
      byte[] bytes = new byte[] {0120,0113,003,004};
      byte[] data = new byte[4];

      if(commandLine.length > 0)
      {
         // try-with-resources closes the file on every return path.
         try(RandomAccessFile ra = new RandomAccessFile(new File(commandLine[0]), "r"))
         {
            if(ra.read(data) != data.length)
            {
               System.out.println("Short file, not compressed.");
               return;
            }
         }

         for(int i = 0;  i < data.length;  ++i)
         {
            if(data[i] != bytes[i])
            {
               System.out.println("Not compressed.");
               return;
            }
         }
         System.out.println("Compressed.");
      }
   }
}

Always test your code!

;JOOP!
0
 

Author Comment

by:ryno71
ID: 17894340
Thanks. That will work better!
0
 
LVL 24

Expert Comment

by:sciuriware
ID: 17895959
:)
0
 

Author Comment

by:ryno71
ID: 17899117
sciuriware

Where did you find that you need to look at four bytes for a compressed file (PKZIP or GZIP)? Couldn't it be two?

ryno71
0
 
LVL 24

Expert Comment

by:sciuriware
ID: 17899244
From the Sun Java sources:

ZIP      "PK\003\004"   (4 bytes)
GZIP    0x8B1F   (2 bytes, stored in the file as 0x1F 0x8B)

So ZIP needs 4 bytes; for GZIP, 2 are enough.
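
A hedged sketch that checks for both signatures listed above. The names CompressionSniffer and detect are hypothetical. Note that the GZIP magic number 0x8B1F is a 16-bit value, stored low byte first, so only the two bytes 0x1F 0x8B are compared for GZIP, while ZIP uses all four.

```java
public class CompressionSniffer
{
   // Hypothetical helper: classifies data by its leading magic bytes.
   public static String detect(byte[] head)
   {
      if(head == null)
      {
         return "unknown";
      }
      // ZIP local file header: 'P', 'K', 3, 4.
      if(head.length >= 4
            && head[0] == 'P' && head[1] == 'K' && head[2] == 3 && head[3] == 4)
      {
         return "zip";
      }
      // GZIP magic 0x8B1F, low byte first; mask because Java bytes are signed.
      if(head.length >= 2
            && (head[0] & 0xFF) == 0x1F && (head[1] & 0xFF) == 0x8B)
      {
         return "gzip";
      }
      return "unknown";
   }

   public static void main(String[] args)
   {
      System.out.println(detect(new byte[] {0120, 0113, 003, 004}));          // prints "zip"
      System.out.println(detect(new byte[] {0x1F, (byte) 0x8B, 0x08, 0x00})); // prints "gzip"
      System.out.println(detect(new byte[] {'a', 'b', 'c', 'd'}));            // prints "unknown"
   }
}
```

A match only means the data starts like a ZIP or GZIP stream; a truncated or corrupted file can still pass this test.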

;JOOP!
0
 

Author Comment

by:ryno71
ID: 17899579
Where did you find this? I've been looking and can't seem to find it!

Thanks a lot!
0
 
LVL 24

Expert Comment

by:sciuriware
ID: 17899869
The JDK is accompanied by "src.zip"

;JOOP!
0
