roger v asked:

Having trouble with cfscript to uncompress files - experts only

Hello all,

Could someone help with some cfscript issues I’m having? I’ve got most of it working, but for some reason I can’t make a few things work. Basically I’ve got a directory containing a bunch of .tar files, and each tar file has pdf files in it. I need to untar those files, take the pdf files and move them to another directory. Before I do this move, I need to capture the name of each pdf file (without the file extension) and store it in a list that I can then loop over.

-roger
<!---this is the directory the tar files are in; let’s say in this case there are 3 tar files – dop123.tar, dop234.tar, dop345.tar – and each of them has a bunch of pdf files named in a format like 99991111222233334444.pdf (20 numbers for each pdf file name)--->
<cfdirectory action="list" name="getFiles" directory="#tempDir#" filter="dop*.tar" />
<cfset tarFileCount = getFiles.RecordCount>
<!--- Untar the tar file --->
   <cfscript>
		TarEntry = createobject("java", "org.apache.tools.tar.TarEntry");
		TarInputStream = createobject("java", "org.apache.tools.tar.TarInputStream");
		FileInputStream = createobject("java", "java.io.FileInputStream");
		FileOutputStream = createobject("java", "java.io.FileOutputStream");
		FileIO = createobject("java", "java.io.File");
		for(tfc=1;tfc lte tarFileCount;tfc++)<!---this is where I’m having a problem of looping over each .tar file, untar it, then move onto the next .tar file and untar that one --->
		{
			tarfile = expandpath("#tfc#.tar");
			outdir = expandpath("unpacked");

			tin = TarInputStream.init(FileInputStream.init(tarfile));

			te = tin.getNextEntry();
			while (structkeyexists(variables, 'te'))
			{
         		destpath = outdir & "\" & te.getName();
            		if (te.isDirectory())
                		destpath.mkdir();
                	else
                	{
                		fos = FileOutputStream.init(destpath);
                    	tin.copyEntryContents(fos);
                    	fos.close();
                	}
                	te = tin.getNextEntry();
			}
		}
			tin.close();
<!---Once I get the untarring done, this is where I need to loop over all those untarred pdf files, grab the name of each file but not the file extension (99996666555544448888), and then populate them into a list which I can later loop over.--->
	</cfscript>


roger v

ASKER

Correction:

The pdf file names will be in a format like:

dop99998888777744445555.pdf

And I'll need to get just the numbers out of that string.
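
For what it's worth, here is a minimal sketch of pulling the digits out of a name in that format. The variable names pdfName and certNum are made up for illustration, and the pattern assumes every file really is "dop" + digits + ".pdf"; if a name doesn't match, reReplaceNoCase just returns it unchanged.

```cfm
<!--- pdfName is a hypothetical sample value in the format described above --->
<cfset pdfName = "dop99998888777744445555.pdf">
<!--- keep only the run of digits between the "dop" prefix and the ".pdf" extension --->
<cfset certNum = reReplaceNoCase(pdfName, "^dop([0-9]+)\.pdf$", "\1")>
<cfoutput>#certNum#</cfoutput>
```

With the sample value above, certNum should end up holding just the 20 digits.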
ASKER CERTIFIED SOLUTION
duncancumming
This solution is only available to members.
roger v

ASKER

@duncancumming:

Yes the 'untarring' part is working fine. It's putting together the rest of the cfscript/cfml that I'm having trouble with. This is what I have now:

That throws an error "object instantiation exception". Where is the error in that code and is there a better way of doing that? thanks,

-roger
Please Wait...<Br>
<Cfflush> 
<!--- Get file names of TARs to be untarred --->
<cfdirectory action="list" name="getPPS" directory="#ppsdir#" filter="dop*.tar" />
<cfset tarFileCount = getPPS.RecordCount>


<cfset tarFileList = quotedvaluelist(getPPS.name, ",")>

   <cfscript>
		TarEntry = createobject("java", "org.apache.tools.tar.TarEntry");
		TarInputStream = createobject("java", "org.apache.tools.tar.TarInputStream");
		FileInputStream = createobject("java", "java.io.FileInputStream");
		FileOutputStream = createobject("java", "java.io.FileOutputStream");
		FileIO = createobject("java", "java.io.File");
		for(file_idx=1;file_idx lte listlen(tarFileList);file_idx++)
		{
			tarfile = listgetat(tarFileList,file_idx,",");
			tarfile_quotefree = expandpath(ReReplace(tarfile,"'","","ALL"));
			outdir = expandpath("unpacked");
			tin = TarInputStream.init(FileInputStream.init(tarfile_quotefree));
			te = tin.getNextEntry();
			while (structkeyexists(variables, 'te'))
			{
         			destpath = outdir & "\" & te.getName();
            			if (te.isDirectory())
                			destpath.mkdir();
                		else
                		{
                			fos = FileOutputStream.init(destpath);
                    		tin.copyEntryContents(fos);
                    		fos.close();
                		}
                		te = tin.getNextEntry();
				}
			}
				tin.close();
		
	</cfscript>

<cfdirectory action="list" name="getPdfNames" directory="unpacked" filter="DOP*.pdf">
<cfset lstCertNum = quotedvaluelist(getPdfNames.name)>
<cfdump var="#lstCertNum#">

<cfabort>


That error could be happening on any of these lines:
TarEntry = createobject("java", "org.apache.tools.tar.TarEntry");
TarInputStream = createobject("java", "org.apache.tools.tar.TarInputStream");
FileInputStream = createobject("java", "java.io.FileInputStream");
FileOutputStream = createobject("java", "java.io.FileOutputStream");
FileIO = createobject("java", "java.io.File");

or maybe this one:
tin = TarInputStream.init(FileInputStream.init(tarfile_quotefree));

or even
fos = FileOutputStream.init(destpath);

The line number of the error message should tell you which one is the problem.  Alternatively comment them all out and uncomment each one individually until that error is thrown.  
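
Another option, sketched here only as a suggestion, is to wrap the suspect call in cftry/cfcatch and dump the exception, since "object instantiation exception" is a generic wrapper and the dump may show the underlying Java error. This assumes the variable tarfile_quotefree from the code above; the fis variable is made up for the test.

```cfm
<cftry>
	<!--- try to open the file on its own, outside the tar handling --->
	<cfset fis = createObject("java", "java.io.FileInputStream").init(tarfile_quotefree)>
	<cfset fis.close()>
	<cfcatch type="any">
		<!--- the dump includes message, detail and stack trace; a "Caused by" entry may name the real exception --->
		<cfdump var="#cfcatch#">
	</cfcatch>
</cftry>
```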

That REReplace isn't necessary, this would also work:
tarfile_quotefree = expandpath(Replace(tarfile,"'","","ALL"));



SOLUTION
_agx_
This solution is only available to members.
roger v

ASKER

OK folks this is perplexing:

I'm getting an object instantiation exception and, as agx says, it's generic. The actual error is Caused by: java.io.FileNotFoundException: \\xxxxx\xxxx\xxxxx\xxxxx\xxxxxx\xxxxxxxxx\DOP0323100001.tar (The system cannot find the file specified).

But the problem is that when I do a cfdump at the very beginning, after line 5, it shows that there are 3 files, as there should be (with the exact same names, DOP0323100001.tar and so on). But when it comes to executing the following line (shown in bold in the error output, meaning it errored out at that line), it gives me this error message:



 Object Instantiation Exception.
An exception occurred when instantiating a Java object. The class must not be an interface or an abstract class. Error: ''.
 
The error occurred in \\xxxxxxxx\xxxxxx\xxxxxxxx\webroot\imaging\xxxxxxx\step2.cfm: line 35

33 : 			outdir = expandpath("unpacked");
34 : 
35 : 			tin = TarInputStream.init(FileInputStream.init(tarfile_quotefree));
36 : 
37 : 			te = tin.getNextEntry();
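
One possible explanation, offered as a guess because the accepted answers aren't shown here: expandPath() resolves a relative name against the directory of the currently executing page, whereas the cfdirectory call listed #ppsdir#. If those are different folders, the resulting path points at a file that isn't there. A rough sketch of building the path from the cfdirectory query columns instead and checking it before opening (tarPath is a made-up variable name):

```cfm
<cfdirectory action="list" name="getPPS" directory="#ppsdir#" filter="dop*.tar" />
<cfloop query="getPPS">
	<!--- directory and name are both columns returned by cfdirectory --->
	<cfset tarPath = getPPS.directory & "/" & getPPS.name>
	<!--- show whether the server can actually see the file at that path --->
	<cfoutput>#tarPath# exists: #fileExists(tarPath)#<br></cfoutput>
</cfloop>
```

If fileExists() reports YES for each path, that same tarPath value could be passed straight to FileInputStream.init() without going through expandPath() at all.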


roger v

ASKER

All right, I've made some progress but hit a snag. I've managed to unpack all the tar files into a directory. Now I want to create a list or array containing the names of all the pdf files in that directory. Any suggestions on how best to do that?

<cfdirectory action="list" name="getPdfNames" directory="#unpacked#"><!---this is the directory that has all the pdf files that I want in an array--->
<cfset lstCertNum = getPdfNames.RecordCount>

<!---I'm trying to use the list which returns as a query object to loop and populate an array but not able to do so--->

<cfset arrCertNum = ArrayNew(1)>

<cfloop query="getPdfNames" startrow="1" endrow="#lstCertNum#" index="idx">
        <cfset arrCertNum[idx] = ArrayAppend(arrCertNum,#name#)>
</cfloop>

<!---does not work, obviously I'm doing something wrong--->

> does not work, obviously I'm doing something wrong

    Assuming you really need an array... the "index" attribute isn't allowed with a query loop.  
    So remove that.  Also, arrayAppend doesn't return an array.  It modifies the array in place.  
    Just use

    <cfloop query="getPdfNames" startrow="1" endrow="#getPdfNames.recordCount#">
      <cfset arrayAppend(arrCertNum, name)>
    </cfloop>

> the actual error is Caused by: java.io.FileNotFoundException
> but when it come down to executing the following line ....
> it gives me that error message:

FileNotFoundException doesn't always mean the file doesn't exist. It can also mean the file can't be accessed. For example, if the code opened certain types of file objects but forgot to close them, the file might be left in an inaccessible state.


> destpath = outdir & "\" & te.getName();

    Since you already know the file names, you could also store them while inside your "unzip" loop instead.
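
    A rough sketch of both ideas together (closing each archive before opening the next, and remembering the file names while unpacking), not a drop-in replacement. It assumes getPPS is the cfdirectory query from the earlier code, that outdir is an existing absolute directory path, and that the code runs at page level so "te" lives in the variables scope. The names pdfNames and tarPath are made up. It also wraps the path in java.io.File before calling mkdirs(), since a plain CFML string has no mkdir() method.

```cfm
<cfscript>
	pdfNames = arrayNew(1);	// hypothetical array collecting the extracted file names
	for (i = 1; i lte getPPS.recordCount; i++) {
		// build the archive path from the cfdirectory columns
		tarPath = getPPS.directory[i] & "/" & getPPS.name[i];
		tin = createObject("java", "org.apache.tools.tar.TarInputStream").init(
			createObject("java", "java.io.FileInputStream").init(tarPath)
		);
		te = tin.getNextEntry();
		// getNextEntry() returns null at the end of the archive, which leaves "te" undefined in CFML
		while (structKeyExists(variables, "te")) {
			destpath = outdir & "/" & te.getName();
			if (te.isDirectory()) {
				createObject("java", "java.io.File").init(destpath).mkdirs();
			} else {
				fos = createObject("java", "java.io.FileOutputStream").init(destpath);
				tin.copyEntryContents(fos);
				fos.close();
				// store just the file name, without any folder part from the tar entry
				arrayAppend(pdfNames, listLast(te.getName(), "/"));
			}
			te = tin.getNextEntry();
		}
		// close this archive before moving on to the next one
		tin.close();
	}
</cfscript>
```

    Note that if an exception is thrown partway through an archive, this version still leaves that stream open, so some error handling around the loop body would be worth adding.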
As CFDirectory returns a query object, you can just do this:

<cfdirectory action="list" name="getPdfNames"  directory="#unpacked#">
<cfset lstFiles = ValueList(getPdfNames.name)>

If you'd rather they were in an array, do:
<cfset arrFiles = ListToArray(ValueList(getPdfNames.name))>
> As CFDirectory returns a query object, you can just do this:

   Yes, there are easier ways to get the results. Technically they don't even need to do a cfdirectory.
   They could just store the file names as they're unzipping them.  But any of the options mentioned
   would work.

> <cfset arrFiles = ListToArray(ValueList(getPdfNames.name))>

   It may not apply here, but be careful to select the right delimiter with list functions. The default is a ",",
   which isn't always a safe choice. File names/paths can contain commas, which would cause the final
   results to be wrong.
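
   A small sketch of the same ValueList/ListToArray approach with an explicit delimiter. The "|" character is picked here on the assumption that it never appears in these file names, and the *.pdf filter is added for illustration:

```cfm
<cfdirectory action="list" name="getPdfNames" directory="#unpacked#" filter="*.pdf">
<!--- use the same non-default delimiter when building and when splitting the list --->
<cfset lstFiles = valueList(getPdfNames.name, "|")>
<cfset arrFiles = listToArray(lstFiles, "|")>
```

   Looping over the query and calling arrayAppend, as shown earlier, avoids the delimiter question entirely.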
roger v

ASKER

The response time was woeful, by the time I got a response, I'd already solved the problem and moved on to the next one. I believe one of the main reasons for using this venue is to get answers relatively quickly which is as important if not more, of getting an answer. The whole purpose is defeated if that first part is not the case.
"The response time was woeful, by the time I got a response, I'd already solved the problem and moved on to the next one. I believe one of the main reasons for using this venue is to get answers relatively quickly which is as important if not more, of getting an answer. The whole purpose is defeated if that first part is not the case."

Unfortunately some of us have a life outside of this website and aren't able to respond immediately to every question.  I think suffixing your question "experts only" might have put some people off even looking at it.  I'm not sure which part of the response time was woeful: the 7hrs before I made my first comment, or the overall response times from myself and _agx_.  However if response time was so important, why wait 5 days before replying to my initial comment?

From the EE Help pages:
Please remember that all of the Experts are volunteers, so sometimes they have real life issues that arise. They are not paid for answering questions; they earn certificates, t-shirts for those certificates and their membership to Experts Exchange. The most important reason for them being here and participating is because they enjoy being here and helping people.

I look forward to helping you with any future questions!