Having trouble with cfscript to uncompress files - experts only

Hello all,

Could someone help with some cfscript issues I'm having? I've got most of it working, but a few things still fail. I have a directory containing a number of .tar files, each of which holds PDF files. I need to untar those files and move the PDFs to another directory. Before the move, I need to capture each PDF's file name (without the file extension) and store it in a list that I can loop over later.

-roger
<!---this is the directory the tar files are in; let's say in this case there are 3 tar files (dop123.tar, dop234.tar, dop345.tar) and each of them has a bunch of pdf files named like 99991111222233334444.pdf (20 digits per pdf file name)--->
<cfdirectory action="list" name="getFiles" directory="#tempDir#" filter="dop*.tar" />
<cfset tarFileCount = getFiles.RecordCount>
<!--- Untar the tar file --->
<cfscript>
    TarEntry = createobject("java", "org.apache.tools.tar.TarEntry");
    TarInputStream = createobject("java", "org.apache.tools.tar.TarInputStream");
    FileInputStream = createobject("java", "java.io.FileInputStream");
    FileOutputStream = createobject("java", "java.io.FileOutputStream");
    FileIO = createobject("java", "java.io.File");
    for (tfc = 1; tfc lte tarFileCount; tfc++) <!---this is where I'm having a problem: looping over each .tar file, untarring it, then moving on to the next .tar file and untarring that one --->
    {
        tarfile = expandpath("#tfc#.tar");
        outdir = expandpath("unpacked");

        tin = TarInputStream.init(FileInputStream.init(tarfile));

        te = tin.getNextEntry();
        while (structkeyexists(variables, 'te'))
        {
            destpath = outdir & "\" & te.getName();
            if (te.isDirectory())
                destpath.mkdir();
            else
            {
                fos = FileOutputStream.init(destpath);
                tin.copyEntryContents(fos);
                fos.close();
            }
            te = tin.getNextEntry();
        }
    }
    tin.close();
    <!---Once the untarring is done, this is where I need to loop over all those untarred pdf files, grab the name of each file without the file extension (99996666555544448888), and populate a list that I can loop over later.--->
</cfscript>

Asked by: roger v

roger v (Author) Commented:
Correction:

The pdf file will be in the format like:

dop99998888777744445555.pdf

And I'll need to extract just the digits from that string.

duncancumming Commented:
I'd change the for loop slightly.  When you do tarfile = expandpath("#tfc#.tar"); all you're getting is "1.tar", "2.tar" etc, which isn't correct.  Instead you really want to loop over your query.  I'm assuming you want to keep it all in cfscript rather than mix up CFML and cfscript.  So something like this:

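for (tfc = 1; tfc lte tarFileCount; tfc++)
{
    // getFiles.name already ends in ".tar"; the directory column is the folder cfdirectory listed
    tarfile = getFiles.directory[tfc] & "\" & getFiles.name[tfc];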

I assume all the stuff with the TAR unpacking already works?  I don't know much about that, so won't comment on it.  What part of the te/ tin/ fos etc has the names of the PDF files?  What you want to do is put them into an array (or list, but arrays are faster in ColdFusion).  The part for extracting just the filename without the extension isn't too hard.  Various ways to do it, I'd probably just use
filename = ListFirst(yourPDFfileName, ".");

and put it into an array with ArrayAppend().
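
A minimal sketch of that idea, assuming the PDF names look like the corrected format above (dop followed by 20 digits). The pdfNames list and the certNums array are made-up names for illustration, not part of the original code:

```cfml
<cfscript>
// Hypothetical PDF names, standing in for whatever the unpack step produces
pdfNames = "dop99998888777744445555.pdf,dop11112222333344445555.pdf";

certNums = arrayNew(1);
for (i = 1; i lte listLen(pdfNames); i++) {
    pdfName  = listGetAt(pdfNames, i);
    baseName = listFirst(pdfName, ".");                  // drop the ".pdf" extension
    digits   = reReplace(baseName, "[^0-9]", "", "all"); // drop the "dop" prefix, keep the digits
    arrayAppend(certNums, digits);                       // modifies certNums in place
}

writeOutput(arrayToList(certNums));
</cfscript>
```

Note that ListFirst() with "." only works cleanly when the file name contains a single dot, which seems to hold for the names described here.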

roger v (Author) Commented:
@duncancumming:

Yes the 'untarring' part is working fine. It's putting together the rest of the cfscript/cfml that I'm having trouble with. This is what I have now:

That throws an "object instantiation exception" error. Where is the error in that code, and is there a better way of doing this? Thanks,

-roger
Please Wait...<Br>
<Cfflush> 
<!--- Get file names of TARs to be untarred --->
<cfdirectory action="list" name="getPPS" directory="#ppsdir#" filter="dop*.tar" />
<cfset tarFileCount = getPPS.RecordCount>


<cfset tarFileList = quotedvaluelist(getPPS.name, ",")>

<cfscript>
    TarEntry = createobject("java", "org.apache.tools.tar.TarEntry");
    TarInputStream = createobject("java", "org.apache.tools.tar.TarInputStream");
    FileInputStream = createobject("java", "java.io.FileInputStream");
    FileOutputStream = createobject("java", "java.io.FileOutputStream");
    FileIO = createobject("java", "java.io.File");
    for (file_idx = 1; file_idx lte listlen(tarFileList); file_idx++)
    {
        tarfile = listgetat(tarFileList, file_idx, ",");
        tarfile_quotefree = expandpath(ReReplace(tarfile, "'", "", "ALL"));
        outdir = expandpath("unpacked");
        tin = TarInputStream.init(FileInputStream.init(tarfile_quotefree));
        te = tin.getNextEntry();
        while (structkeyexists(variables, 'te'))
        {
            destpath = outdir & "\" & te.getName();
            if (te.isDirectory())
                destpath.mkdir();
            else
            {
                fos = FileOutputStream.init(destpath);
                tin.copyEntryContents(fos);
                fos.close();
            }
            te = tin.getNextEntry();
        }
    }
    tin.close();
</cfscript>

<cfdirectory action="list" name="getPdfNames" directory="unpacked" filter="DOP*.pdf">
<cfset lstCertNum = quotedvaluelist(getPdfNames.name)>
<cfdump var="#lstCertNum#">

<cfabort>

duncancumming Commented:
That error could be happening on any of these lines:
TarEntry = createobject("java", "org.apache.tools.tar.TarEntry");
TarInputStream = createobject("java", "org.apache.tools.tar.TarInputStream");
FileInputStream = createobject("java", "java.io.FileInputStream");
FileOutputStream = createobject("java", "java.io.FileOutputStream");
FileIO = createobject("java", "java.io.File");

or maybe this one:
tin = TarInputStream.init(FileInputStream.init(tarfile_quotefree));

or even
fos = FileOutputStream.init(destpath);

The line number of the error message should tell you which one is the problem.  Alternatively comment them all out and uncomment each one individually until that error is thrown.  

That REReplace isn't necessary, this would also work:
tarfile_quotefree = expandpath(Replace(tarfile,"'","","ALL"));
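
Another option is to avoid the quotes in the first place. The sketch below compares QuotedValueList() with ValueList() on a stand-in query; the query built with queryNew() is hypothetical and only mimics the name column that cfdirectory returns:

```cfml
<cfscript>
// Hypothetical stand-in for the cfdirectory result (only the name column is used)
getPPS = queryNew("name", "varchar");
queryAddRow(getPPS, 2);
querySetCell(getPPS, "name", "DOP0323100001.tar", 1);
querySetCell(getPPS, "name", "DOP0323100002.tar", 2);

quoted = quotedValueList(getPPS.name); // each name wrapped in single quotes
plain  = valueList(getPPS.name);       // names only, nothing to strip afterwards

writeOutput(quoted & "<br>" & plain);
</cfscript>
```

With ValueList(), or by reading getPPS.name[file_idx] directly inside the loop, the quote-stripping step should not be needed at all.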



_agx_ Commented:
> That throws an error "object instantiation exception"

   That's just a generic header message.  The real error is in the stack trace. What does it say?
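
   One way to get at that detail is to catch the exception and print what it carries. This is only a sketch: the file name is a deliberately nonexistent, made-up path, and exactly what appears in the trace may vary by ColdFusion version.

```cfml
<cfscript>
try {
    // Hypothetical path that does not exist, used only to trigger the exception
    fis = createObject("java", "java.io.FileInputStream").init("no-such-file.tar");
    fis.close();
} catch (any e) {
    writeOutput("Message: " & e.message & "<br>");
    writeOutput("Detail: " & e.detail & "<br>");
    // The underlying Java exception may appear in the trace as a "Caused by:" line
    writeOutput("<pre>" & htmlEditFormat(e.stackTrace) & "</pre>");
}
</cfscript>
```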

roger v (Author) Commented:
OK folks, this is perplexing:

I'm getting an object instantiation exception and, as _agx_ says, it's generic; the actual error is Caused by: java.io.FileNotFoundException: \\xxxxx\xxxx\xxxxx\xxxxx\xxxxxx\xxxxxxxxx\DOP0323100001.tar (The system cannot find the file specified).

But when I do a cfdump at the very beginning, after line 5, it shows 3 files, as there should be (with exactly the same names, DOP0323100001.tar and so on). When execution reaches the following line (shown in bold in the error output, meaning it errored out at that line), I get this error message:



 Object Instantiation Exception.
An exception occurred when instantiating a Java object. The class must not be an interface or an abstract class. Error: ''.
 
The error occurred in \\xxxxxxxx\xxxxxx\xxxxxxxx\webroot\imaging\xxxxxxx\step2.cfm: line 35

33 : 			outdir = expandpath("unpacked");
34 : 
35 : 			tin = TarInputStream.init(FileInputStream.init(tarfile_quotefree));
36 : 
37 : 			te = tin.getNextEntry();

roger v (Author) Commented:
All right, I've made some progress but hit a snag. I've managed to unpack all the tar files into a directory. Now I want to create a list or array containing the names of all the pdf files in that directory. Any suggestions on how best to do that?

<cfdirectory action="list" name="getPdfNames" directory="#unpacked#"><!---this is the directory that has all the pdf files that I want in an array--->
<cfset lstCertNum = getPdfNames.RecordCount>

<!---I'm trying to use the list which returns as a query object to loop and populate an array but not able to do so--->

<cfset arrCertNum = ArrayNew(1)>

<cfloop query="getPdfNames" startrow="1" endrow="#lstCertNum#" index="idx">
        <cfset arrCertNum[idx] = ArrayAppend(arrCertNum,#name#)>
</cfloop>

<!---does not work, obviously I'm doing something wrong--->

_agx_ Commented:
> does not work, obviously I'm doing something wrong

    Assuming you really need an array... the "index" attribute isn't allowed with a query loop.  
    So remove that.  Also, arrayAppend doesn't return an array.  It modifies the array in place.  
    Just use

    <cfloop query="getPdfNames" startrow="1" endrow="#getPdfNames.recordCount#">
      <cfset arrayAppend(arrCertNum, name)>
    </cfloop>

> the actual error is Caused by: java.io.FileNotFoundException
> but when it come down to executing the following line ....
> it gives me that error message:

FileNotFoundException doesn't always mean the file doesn't exist.  It can also mean the file can't be accessed. For example, if the code opened certain types of file objects but forgot to close them, the file might be left in an inaccessible state.
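
A rough sketch of how both possibilities could be checked. Everything here is an assumption for illustration: ppsdir is pointed at the temp directory, the file name is borrowed from the error message, and the finally clause assumes a ColdFusion version whose cfscript supports it. One thing possibly worth ruling out (not confirmed anywhere in this thread) is that expandPath() resolves a relative name against the current template's directory, which is not necessarily the directory cfdirectory listed.

```cfml
<cfscript>
// Hypothetical directory and file name, for illustration only
ppsdir  = getTempDirectory();   // returned with a trailing separator
tarName = "DOP0323100001.tar";

// Compare the two candidate paths before opening anything
writeOutput("expandPath gives: " & expandPath(tarName) & "<br>");
fullPath = ppsdir & tarName;
writeOutput("directory-based path: " & fullPath & "<br>");

if (fileExists(fullPath)) {
    fis = createObject("java", "java.io.FileInputStream").init(fullPath);
    try {
        // ... read from the stream here ...
    } catch (any e) {
        writeOutput("Read failed: " & e.message);
    } finally {
        fis.close(); // always release the handle so the file is not left locked
    }
} else {
    writeOutput("Not found: " & fullPath);
}
</cfscript>
```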


_agx_ Commented:
> destpath = outdir & "\" & te.getName();

    Since you already know the file names, you could also store them while inside your "unzip" loop instead.
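
    As a sketch of what that could look like: the entryNames list below is a hypothetical stand-in for the values te.getName() returns on each pass, and pdfFiles is a made-up array name. Entry names can include a folder part, so it is stripped first.

```cfml
<cfscript>
// Hypothetical entry names, standing in for successive te.getName() values
entryNames = "batch1/dop99998888777744445555.pdf,batch1/readme.txt,dop11112222333344445555.pdf";

pdfFiles = arrayNew(1);
for (i = 1; i lte listLen(entryNames); i++) {
    entryName = listGetAt(entryNames, i);
    fileName  = listLast(entryName, "/\");   // strip any folder part of the entry
    if (listLast(fileName, ".") eq "pdf") {  // eq compares without regard to case
        arrayAppend(pdfFiles, fileName);
    }
}

writeOutput(arrayToList(pdfFiles));
</cfscript>
```

    In the real loop, the listLast() and arrayAppend() lines would sit next to the destpath assignment, so no second directory listing is needed.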

duncancumming Commented:
As CFDirectory returns a query object, you can just do this:

<cfdirectory action="list" name="getPdfNames"  directory="#unpacked#">
<cfset lstFiles = ValueList(getPdfNames.name)>

If you'd rather they were in an array, do:
<cfset arrFiles = ListToArray(ValueList(getPdfNames.name))>

_agx_ Commented:
> As CFDirectory returns a query object, you can just do this:

   Yes, there are easier ways to get the results. Technically they don't even need to do a cfdirectory.
   They could just store the file names as they're unzipping them.  But any of the options mentioned
   would work.

> <cfset arrFiles = ListToArray(ValueList(getPdfNames.name))>

   It may not apply here, but be careful to select the right delimiter with list functions. The default is a ","
   which isn't always a safe choice.  File names/paths can contain commas, which would cause the final
   results to be wrong.
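
   A small sketch of the difference, using a hypothetical query with one file name that contains a comma. The "|" delimiter is just an example choice and assumes no file name contains that character:

```cfml
<cfscript>
// Hypothetical stand-in for the cfdirectory result; one name contains a comma
getPdfNames = queryNew("name", "varchar");
queryAddRow(getPdfNames, 2);
querySetCell(getPdfNames, "name", "dop1111,copy.pdf", 1);
querySetCell(getPdfNames, "name", "dop2222.pdf", 2);

byComma = listToArray(valueList(getPdfNames.name));           // splits the first name in two
byPipe  = listToArray(valueList(getPdfNames.name, "|"), "|"); // keeps each name whole

writeOutput(arrayLen(byComma) & " vs " & arrayLen(byPipe));
</cfscript>
```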

roger v (Author) Commented:
The response time was woeful, by the time I got a response, I'd already solved the problem and moved on to the next one. I believe one of the main reasons for using this venue is to get answers relatively quickly which is as important if not more, of getting an answer. The whole purpose is defeated if that first part is not the case.

duncancumming Commented:
"The response time was woeful, by the time I got a response, I'd already  solved the problem and moved on to the next one. I believe one of the  main reasons for using this venue is to get answers relatively quickly  which is as important if not more, of getting an answer. The whole  purpose is defeated if that first part is not the case."

Unfortunately some of us have a life outside of this website and aren't able to respond immediately to every question.  I think suffixing your question "experts only" might have put some people off even looking at it.  I'm not sure which part of the response time was woeful: the 7hrs before I made my first comment, or the overall response times from myself and _agx_.  However if response time was so important, why wait 5 days before replying to my initial comment?

From the EE Help pages:
Please remember that all of the Experts are volunteers, so sometimes  they have real life issues that arise. They are not paid for answering  questions; they earn certificates, t-shirts for those certificates and  their membership to Experts Exchange. The most important reason for them  being here and participating is because they enjoy being here and  helping people.

I look forward to helping you with any future questions!
