roger v asked:

ColdFusion and tar.gz files unzipping - experts only

I have a situation where I receive a .tar.gz file from an external source. The archive is a group of PDF files bundled into a single .tar.gz file. I'm using CF 8, and the cfzip tag works for .zip files but not for the .tar.gz file type. I need a way, using cfscript or a Java library, to take this file and extract all its .pdf contents. Can somebody provide me with the script/code?

-roger
SOLUTION
gdemaria
This solution is only available to members.
roger v (ASKER)

No, it's running on a Windows platform.

gdemaria
So first you need to get a product that will unzip the file; I believe WinZip may be able to. It needs to have a command-line capability, and I believe it has that as well, but please verify.
I just found this script that may do it in ColdFusion.

Replace inFilename with your filename/path.

outFilename should be a file path rather than a directory: the script writes a single decompressed file there (the inner .tar archive).

<cfscript>
// Strips only the gzip layer: the result is the inner .tar archive, not the PDFs.
inFilename = "C:\Inetpub\wwwroot\practice\gzip\gzip-1.2.4.tar.gz";
// Must be a file path (FileOutputStream writes one file), not a directory.
outFilename = "C:\Inetpub\wwwroot\practice\gzip\gzip124";
try {
    // Open the compressed file
    file_in = createObject("java", "java.io.FileInputStream").init(inFilename);
    fin = createObject("java", "java.util.zip.GZIPInputStream").init(file_in);

    // Open the output file
    out = createObject("java", "java.io.FileOutputStream").init(outFilename);

    // Copy the decompressed bytes through a 4 KB buffer
    buf = repeatString(" ", 4096).getBytes();
    flen = fin.read(buf);
    while (flen GT 0) {
        out.write(buf, 0, flen);
        flen = fin.read(buf);
    }

    // Close the streams (closing fin also closes file_in)
    fin.close();
    out.close();
} catch (Any e) {
    // Report the failure instead of silently ignoring it
    writeOutput("Decompression failed: " & e.message);
}
</cfscript>

roger v (ASKER)

@gdemaria:
> So first you need to get a product that will unzip the file; I believe WinZip may be able to.

All of this has to be an automated process. For example, the compressed archive (.tar format) containing a bunch of .pdf files is downloaded via FTP onto our server. From there, an automated process (cfzip, if the compression format were zip) would unpack the archive and then take the files and do its thing. Since cfzip will not work with the .tar format, I need an alternative program/code/script that does the same job as cfzip, i.e. unpacks the archive and all the .pdf files in it.

-roger
gdemaria
> All of this has to be an automated process.

Note that I also said

  "It needs to have a command line capability, I believe it has that as well."

The command-line capability will allow ColdFusion to run it using cfexecute.

However, try the script provided first; it uses the built-in Java library and may work.

roger v (ASKER)

I tried that script with a .7z file but got an object instantiation exception:

 Object Instantiation Exception.
An exception occurred when instantiating a Java object. The class must not be an interface or an abstract class. Error: ''.
roger v (ASKER)

Also, my compressed file is in .tar format. Will the above code work with .tar files?
roger v (ASKER)

@gdemaria:

Nope, no luck. Tried that on a .tar file, got an object instantiation exception. Tried on a separate machine with different files just in case, same result!  :(
gdemaria
I did some research, and the error may be caused by an incorrect value in the input file variable:

inFilename = "C:\Inetpub\wwwroot\practice\gzip\gzip-1.2.4.tar.gz"

Can you verify that you are using the full path, that it is correct, and that a file exists there?

You can add <cfif NOT fileExists(inFilename)> <cfthrow message="file does not exist"></cfif> before the createObject() call.
roger v (ASKER)

@gdemaria:

I have the cfm file and the test.tar file in the same folder so there should not be any problem of finding the file. Plus, I don't know how to use cfif  inside a cfscript block.
gdemaria
> I have the cfm file and the test.tar file in the same folder

You need to use the full path to the file. If it's in the same folder and you are just using inFilename = "theFile.tar.gz", then that is likely the problem.
roger v (ASKER)

No, I added the complete path to the file, like "S:\mywebroot\theFile.tar", but it did not work. I downloaded a Java package called javatar that has methods similar to the util package's. Any idea whether, and how, I could use it in conjunction with the util package's methods to open the tar file?

I can't add the entire package so I'm adding the API document.
index.html
ASKER CERTIFIED SOLUTION
This solution is only available to members.
roger v (ASKER)

@azadi,

I am able to use cfexecute (for other programs). But my compressed file comes from an external server (via FTP code that I run before, hopefully, uncompressing the file), and at this point I'm not sure whether it will be named xxxxx99999.tar or xxxxxx99999.tar.gz. Would the above snippet work for both .tar and .tar.gz files? I don't know yet because the first run will be this Monday, and the folks in charge on the other end are unable to tell me exactly what format it will be in. I'd like to have my code working and ready so I can run it as soon as the external file becomes available next week. Thanks,

roger
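
When the incoming file may be either .tar or .tar.gz, one option is to look at the first two bytes instead of the file name: gzip data always begins with 0x1f 0x8b. `java.util.zip.GZIPInputStream` rejects input that lacks those bytes, which is one plausible reason the earlier script failed on the .7z and plain .tar files. The sketch below is only an illustration; `ArchiveSniffer` and its method names are hypothetical and not part of any library mentioned in this thread.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;

/** Hypothetical helper, not part of any library mentioned in this thread. */
final class ArchiveSniffer {

    /** True when the stream starts with the gzip magic bytes 0x1f 0x8b. */
    static boolean looksGzipped(BufferedInputStream in) throws IOException {
        in.mark(2);                 // remember the current position
        int first = in.read();
        int second = in.read();
        in.reset();                 // rewind so the caller still sees every byte
        return first == 0x1f && second == 0x8b;
    }

    /** Returns raw tar bytes, removing the gzip layer only when it is present. */
    static InputStream openTarStream(InputStream raw) throws IOException {
        BufferedInputStream in = new BufferedInputStream(raw);
        return looksGzipped(in) ? new GZIPInputStream(in) : in;
    }
}
```

To call something like this from ColdFusion via createObject, the class would presumably need to be compiled, declared public, and placed on the server's classpath.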
roger v (ASKER)

OK this is what I tried. I skipped the process of extracting the tar from the gz and instead did this:

<!--- Using 7zip via cfexecute  --->
    <cfset exePath = "C:\7zip\7za.exe"><!--- full abs path to 7za.exe --->
    <cfset outputDir = expandpath('unpacked')><!--- full abs path to destination dir --->
    <cfset gzPath = expandpath('USPS.tar')><!--- full abs path to .tar.gz file --->
    <cfset tarFile = 'USPS.tar'><!--- this gets .tar filename from .gz filename --->  
    <!--- unpack .tar archive --->
      <cfexecute name="#exePath#" arguments="x #outputDir & '/' & tarFile# -o#outputDir#" timeout="20"></cfexecute>

The page executed without an error but I don't see any uncompressed pdf files from the tar file!


gdemaria
Not sure if you can concatenate within #s?


 <cfexecute name="#exePath#" arguments="x #outputDir & '/' & tarFile# -o#outputDir#" timeout="20">
                                                                                     ^^^^^^^^^^

How about...

 <cfexecute name="#exePath#" arguments="x #outputDir# & '/' & #tarFile# -o#outputDir#" timeout="20">


roger v (ASKER)

@gdemaria:

Tried that as well. I did not get an error when the page executed, but I could not find any .pdf files (the contents of that .tar file) in the folder of the executing cfm template, nor in a folder called unpacked, which I created just in case.  :(

P.S. The cfm template executes pretty fast for something that should unpack a tar file and write the extracted pdf files to the folder. You'd think it would take at least 3-4 secs to do the whole process. Just sayin...
gdemaria >> Not sure if you can concatenate within #s

of course you can. this is just standard cf string concatenation.
your 'how about' code will work only if you remove the & and ' characters from it: #outputDir#/#tarFile#

roger_v >> The page executed without an error but I don't see any uncompressed pdf files from the tar file!

you got the vars of location of tar file mixed up there.
change cfexecute to this and it should work:
<cfexecute name="#exePath#" arguments="x #gzPath# -o#outputDir#" timeout="20"></cfexecute>

however, if you are not sure whether you will get .tar.gz or just .tar files, you'd better modify my original code to check the archive file extension and make the appropriate <cfexecute> calls.

Azadi
roger v (ASKER)

OK guys, it worked! I'll go ahead and give the points, but after all this my boss now tells me that we're not going to be using cfexecute anymore (security risk). :mad:  He's darn sure that a Java package will do it (he's done it before, he just doesn't have the code that did it), even though the util package that I tried yesterday didn't do the trick (it kept throwing an object instantiation exception).

Is there any way the security risk of cfexecute could be mitigated, or is there a Java package that could do it? Any suggestions/directions/help is greatly appreciated.
-roger
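
For the "Java package instead of cfexecute" route, here is a rough sketch of what unpacking can look like with JDK classes only. The JDK has a gzip reader but no tar reader, so this reads the 512-byte tar headers by hand. `TarGzExtractor` is a hypothetical class, not the javatar package mentioned earlier. It assumes plain or ustar entries (no GNU long-name or pax extensions), does not verify header checksums, and is written in older Java syntax on the assumption that CF 8 runs on an older JVM. A maintained library such as Apache Commons Compress (`TarArchiveInputStream`) would be a more robust choice where it can be installed.

```java
import java.io.BufferedInputStream;
import java.io.EOFException;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;

/** Hypothetical helper: unpacks regular files from a .tar or .tar.gz with JDK classes only. */
final class TarGzExtractor {
    private static final int BLOCK = 512; // tar headers and data are padded to 512-byte blocks

    static List<File> extract(File archive, File outputDir) throws IOException {
        List<File> written = new ArrayList<File>();
        String root = outputDir.getCanonicalPath() + File.separator;
        InputStream in = new BufferedInputStream(new FileInputStream(archive));
        try {
            in.mark(2); // peek at the gzip magic bytes 0x1f 0x8b, then rewind
            boolean gzipped = in.read() == 0x1f && in.read() == 0x8b;
            in.reset();
            if (gzipped) in = new GZIPInputStream(in);

            byte[] header = new byte[BLOCK];
            while (readHeader(in, header) && !isZeroBlock(header)) {
                String name = text(header, 0, 100);
                if (text(header, 257, 6).equals("ustar") && header[345] != 0) {
                    name = text(header, 345, 155) + "/" + name; // POSIX ustar path prefix
                }
                long size = Long.parseLong("0" + text(header, 124, 12).trim(), 8); // octal field
                long padding = (BLOCK - size % BLOCK) % BLOCK;
                OutputStream out = null;
                if (header[156] == '0' || header[156] == 0) { // type flag: regular file
                    File target = new File(outputDir, name);
                    if (!target.getCanonicalPath().startsWith(root)) {
                        throw new IOException("Entry escapes output directory: " + name);
                    }
                    target.getParentFile().mkdirs();
                    out = new FileOutputStream(target);
                    written.add(target);
                }
                try {
                    copy(in, out, size); // out == null means "skip this entry's data"
                } finally {
                    if (out != null) out.close();
                }
                copy(in, null, padding); // skip padding up to the next block boundary
            }
        } finally {
            in.close();
        }
        return written;
    }

    private static boolean readHeader(InputStream in, byte[] buf) throws IOException {
        int off = 0;
        while (off < buf.length) {
            int n = in.read(buf, off, buf.length - off);
            if (n < 0) {
                if (off == 0) return false; // clean end of stream
                throw new EOFException("Truncated tar header");
            }
            off += n;
        }
        return true;
    }

    private static boolean isZeroBlock(byte[] block) {
        for (int i = 0; i < block.length; i++) {
            if (block[i] != 0) return false;
        }
        return true; // an all-zero block marks the end of the archive
    }

    private static String text(byte[] b, int off, int len) throws IOException {
        int end = off;
        while (end < off + len && b[end] != 0) end++; // fields are NUL-terminated
        return new String(b, off, end - off, "ISO-8859-1");
    }

    private static void copy(InputStream in, OutputStream out, long remaining) throws IOException {
        byte[] buf = new byte[8192];
        while (remaining > 0) {
            int n = in.read(buf, 0, (int) Math.min(buf.length, remaining));
            if (n < 0) throw new EOFException("Truncated tar entry");
            if (out != null) out.write(buf, 0, n);
            remaining -= n;
        }
    }
}
```

A call such as `TarGzExtractor.extract(new File("USPS.tar"), new File("unpacked"))` would return the list of files written. The canonical-path check is there because entry names come from an external party and could otherwise point outside the output folder.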
attached edited code does a check for archive file extension (.tar or .tar.gz only) and acts appropriately.

i have also added a -y switch to the cfexecute arguments, which means "assume YES to any questions", and which will force overwrite of any files with unpacked ones. without it <cfexecute> seems to throw a timeout error because 'behind the scenes' 7-zip is sitting waiting for overwrite confirmation...

Azadi
<cfset exePath = "D:\7zip\7za.exe"><!--- full abs path to 7za.exe --->
<cfset outputDir = expandpath('unpacked')><!--- full abs path to destination dir --->
<cfset gzPath = expandpath('temp.tar')><!--- full abs path to .tar or .tar.gz file --->
<cfif listlast(gzPath, ".") is "tar">
  <!--- archive is .tar - unpack to output folder --->
  <cfexecute name="#exePath#" arguments="x #gzPath# -o#outputDir# -y" timeout="20"></cfexecute>
<cfelseif findnocase(".tar.gz", gzPath)>
  <!--- archive is .tar.gz - decompress .gz file to output folder --->
  <cfexecute name="#exePath#" arguments="x #gzPath# -o#outputDir# -y" timeout="20"></cfexecute>
  <!--- unpack .tar archive --->
  <cfexecute name="#exePath#" arguments="x #outputDir & '/' & replacenocase(getfilefrompath(gzPath), '.gz', '')# -o#outputDir# -y" timeout="20"></cfexecute>
<cfelse>
  <!--- archive is not a .tar or .tar.gz - do whatever you need to do --->
</cfif>

actually, better replace
<cfelseif findnocase(".tar.gz", gzPath)>

in there with
<cfelseif refindnocase("\.tar\.gz$", gzPath)>

Azadi
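
The anchored match matters because an unanchored search for ".tar.gz" also matches names where that text is not the final extension, such as x.tar.gz.bak. For anyone doing the same check on the Java side, a minimal sketch of the idea follows; `ArchiveKind` and `classify` are hypothetical names used only for illustration.

```java
import java.util.Locale;

/** Hypothetical helper mirroring the extension check discussed above. */
final class ArchiveKind {

    /** Classifies a path by its final extension, ignoring case. */
    static String classify(String path) {
        String lower = path.toLowerCase(Locale.ENGLISH);
        if (lower.endsWith(".tar.gz")) return "tar.gz"; // gzip-compressed tar
        if (lower.endsWith(".tar")) return "tar";       // plain tar
        return "other";                                 // anything else
    }
}
```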
SOLUTION
This solution is only available to members.
roger v (ASKER)

@Bang-O-Matic:

That would be great. I'm trying to see if I can find a Java package that I could use, but as a standby your solution would work. Thanks,

Oh btw, do I have to post my email here, or is there some other way you could get to it?
yes, what's a good email address to send it to?