tweaver1973

Iterate over a directory

Windows platform
What I have: a directory called MFG that has 79 folders, and these folders have subfolders. The folders and subfolders contain files.
What I want: in the end I need one folder for every manufacturer, containing one INDEX.TXT and all of that manufacturer's PDF files.
example: MFG\KOHLER\INDEX.TXT
              MFG\KOHLER\FOO.PDF
              MFG\KOHLER\BAR.PDF
              MFG\KOHLER\etc...
         
I need to iterate through a directory structure, list the files and folders in each directory in an INDEX.TXT, then MOVE the files to the BASE dir and RM the old folders.

The following script iterates through the dir but will not crawl into subdirs. Second, it does not output to a TXT, or MOVE the files and delete the old dirs...

this is urgent and technical. - 500 points.

(severely lacking, somewhat useful base of a) script:
#!/usr/bin/perl

my $top = "C:/NETSHARE/PDF/MFG";

chdir ($top) || die "Cannot chdir to $top  ($!)";

foreach my $folder (grep { -d } glob "*") {
    print "Folder $folder =>\n";
   
    my $dir = "$top/$folder";
    chdir ($dir) || die "Cannot chdir to $dir  ($!)";
    foreach my $file (grep { -f } glob "*") {     # consider files only
        print "\tFile $file\n";
    }
}
 
kandura

can you tell us how we're supposed to tell which file belongs to which manufacturer?
tweaver1973

ASKER


The PDF files are specifications for items sold by each manufacturer. So, in MFG/Kohler there are some PDFs in the top folder and there are PDFs in subfolders such as "sinks" and "fixtures", i.e. MFG/Kohler/Sinks/foo.pdf.

I need to put foo.pdf in MFG/Kohler and create a TXT file that reads Kohler \t  foo.pdf, like an index. Ultimately, the boss wants an Excel file with two columns, an INDEX of the PDFs in the folder. The first column (something I will create manually) is a part number; the second column is the corresponding PDF where this part is specified.

To answer your question, if I understand it correctly: we know that foo.pdf belongs in Kohler because we found it in the Kohler dir or a Kohler subdir.

New and improved iteration - now I just need help accessing this array..

@FOO = qw(C:/NETSHARE/PDF/MFG);


use File::Find;
find sub { print $File::Find::name, -d && "/", "\n" }, @FOO;
use File::Find ;
find (\&do_it, 'path/to/file') ;
sub do_it {
    if (!-d&&/\.pdf$/) {
        my ($par_dir) = $File::Find::name =~ m/\/(.*?)\/.*?$/ ;  ##extract parent directory
        !-d "C:/MFG/$par_dir" && mkdir "C:/MFG/$par_dir" ;  ##create parent dir if it doesn't exist
        rename $File::Find::name, "C:/MFG/$par_dir/$_" ;  ##move pdf to the created directory
    }
}

$par_dir will exist because we found one or more PDF's in the $par_dir..
I want to be sure I have this right - so, for clarity I have questions:

1. This script starts with if (!-d&&/\.pdf$/) - this means if the string in $_ is a dir AND the string ends with pdf  drop into the if - what does $/ mean?

2. The first line of the if reads my ($par_dir) = $File::Find::name =~ m/\/(.*?)\/.*?$/ ; here you are setting $par_dir to the string in $_ but removing all sub dirs down to C:\NETSHARE. I need the par_dir to be the base dir that I start in, "C:/NETSHARE/PDF/MFG", so $par_dir needs to be "C:/NETSHARE/PDF/MFG"

3. I do not understand the mkdir, because in all cases the dir exists. Maybe you want me to think differently about the task I am performing and you have a better suggestion. As I said, I need to flatten the directories and create an index.txt file.
If you have a better suggestion please guide me.
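
On question 1, a small sketch may help; it is an illustration added here, not code from any poster. With no argument, -d tests $_, and the leading ! negates it, so the condition reads "not a directory AND the name ends in .pdf". The $ is the end-of-string anchor inside the pattern and the / that follows simply closes the pattern, so the special variable $/ is not involved. The helper name wanted_pdf and the /i flag (so FOO.PDF also matches) are assumptions of this sketch.

```perl
use strict;
use warnings;

# Hypothetical helper: the same test as the "if" in do_it, spelled out.
sub wanted_pdf {
    local $_ = shift;
    # !-d $_      : true when $_ is NOT an existing directory
    # /\.pdf$/i   : true when $_ ends in ".pdf"; "$" anchors at the end of
    #               the string, and /i (added here) ignores case
    return (!-d $_ && /\.pdf$/i) ? 1 : 0;
}

for my $name ('foo.pdf', 'FOO.PDF', 'foo.pdf.bak', '.') {
    printf "%-12s %s\n", $name, wanted_pdf($name) ? 'wanted' : 'skipped';
}
```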




so really $par_dir should be "C:/NETSHARE/PDF/MFG\" . "last dir found on this level" - find all pdf's - move to $par_dir, rm all other dirs - move to next dir

then "C:/NETSHARE/PDF/MFG\" . "next dir found on this level"
Need help - daylight is passing. Original question remains unresolved.
manav_mathur, are you still interested? I think the answer is in $par_dir, but I have limited experience with regexp. I have tried to place parens around different areas of the expression and even tried my hand at a few greedy expressions, but I cannot figure out how to get the different pieces of the string..
If I capture (.*\/).*  would this mean grab everything up to the last / ??

so on the following string:
C:/NETSHARE/PDF/MFG/FOO/SUB_FOO1/SUB_FOO2/SUB_FOO3/foo.pdf

$1 would be:
C:/NETSHARE/PDF/MFG/FOO/SUB_FOO1/SUB_FOO2/SUB_FOO3/

What I need to do is put  "foo.pdf" in FOO and then put a text file in FOO that reads "foo.pdf"
and delete the sub directories..
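
To make the regexp question concrete, here is a hedged sketch (added for illustration, not taken from any answer in the thread). Because .* is greedy, (.*\/) does grab everything up to and including the last /. For this task the more useful pieces are the first folder below the base dir and the bare file name. The helper vendor_and_file is a hypothetical name, and the sketch assumes forward slashes throughout, as in the example string.

```perl
use strict;
use warnings;

my $top  = 'C:/NETSHARE/PDF/MFG';
my $path = 'C:/NETSHARE/PDF/MFG/FOO/SUB_FOO1/SUB_FOO2/SUB_FOO3/foo.pdf';

# Greedy match: everything up to and including the LAST slash.
my ($dir_part) = $path =~ m{^(.*/)};
print "$dir_part\n";   # C:/NETSHARE/PDF/MFG/FOO/SUB_FOO1/SUB_FOO2/SUB_FOO3/

# Hypothetical helper: return (vendor folder, file name) for a path below $top,
# or an empty list when the path is not inside a vendor folder.
sub vendor_and_file {
    my ($base, $full) = @_;
    # \Q...\E makes the base dir match literally; ([^/]+) is the first folder
    # below it; (?:.*/)? skips any deeper subfolders; the last ([^/]+) is the file.
    return $full =~ m{^\Q$base\E/([^/]+)/(?:.*/)?([^/]+)$};
}

my ($vendor, $file) = vendor_and_file($top, $path);
print "$vendor\t$file\n";   # FOO<TAB>foo.pdf
```

The pair it returns is what an index line needs (folder, tab, file name), and "$top/$vendor/$file" is where the file would end up after flattening.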

ASKER CERTIFIED SOLUTION
ozo

This solution is only available to members.
Manav was on his way with File::Find, but I suspect he's having a busy day too.
Let me take a brief stab at it:

1. You know that all directories immediately below $top are your vendor directories.
2. I suggest just getting that list with readdir()
3. then we loop over each, and use File::Find and File::Spec to handle all files below $top/$vendor recursively
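
The three steps above could be sketched roughly as follows. This is an illustration written for this outline only: it is not the code that was posted here, nor ozo's accepted solution. The function name flatten_vendors is hypothetical. The sketch assumes PDFs are recognised by a case-insensitive .pdf extension, that a file whose name already exists in the vendor folder should be skipped with a warning rather than overwritten, and that only subfolders left empty after the move should be deleted.

```perl
use strict;
use warnings;
use File::Find;
use File::Spec;
use File::Copy qw(move);

# Hypothetical helper (not from the thread): flatten each vendor folder
# directly below $top and write an INDEX.TXT into it.
sub flatten_vendors {
    my ($top) = @_;

    # Steps 1 and 2: every entry directly below $top is a candidate vendor.
    opendir(my $dh, $top) or die "Cannot open $top ($!)";
    my @entries = grep { !/^\.\.?$/ } readdir $dh;
    closedir $dh;

    for my $vendor (sort @entries) {
        my $vendor_dir = File::Spec->catdir($top, $vendor);
        next unless -d $vendor_dir;
        my $vendor_key = File::Spec->canonpath($vendor_dir);

        # Step 3: collect the PDFs first and move them afterwards, so the
        # tree is not changed while File::Find is still walking it.
        my @pdfs;
        find({ no_chdir => 1, wanted => sub {
            push @pdfs, $File::Find::name if -f $_ && /\.pdf$/i;
        } }, $vendor_dir);

        my %indexed;
        for my $pdf (sort @pdfs) {
            my $name   = (File::Spec->splitpath($pdf))[2];
            my $target = File::Spec->catfile($vendor_dir, $name);
            if (File::Spec->canonpath($pdf) ne File::Spec->canonpath($target)) {
                if (-e $target) {   # assumption: never overwrite on a name clash
                    warn "Skipping $pdf: $target already exists\n";
                    next;
                }
                move($pdf, $target) or die "Cannot move $pdf ($!)";
            }
            $indexed{$name} = 1;
        }

        # Remove subfolders deepest first. rmdir only succeeds on empty
        # directories, so anything still holding files is left in place.
        find({ bydepth => 1, no_chdir => 1, wanted => sub {
            rmdir $_ if -d $_ && File::Spec->canonpath($_) ne $vendor_key;
        } }, $vendor_dir);

        # One "vendor<TAB>file" line per PDF now sitting in the vendor folder.
        my $index = File::Spec->catfile($vendor_dir, 'INDEX.TXT');
        open(my $out, '>', $index) or die "Cannot write $index ($!)";
        print {$out} "$vendor\t$_\n" for sort keys %indexed;
        close $out or die "Cannot close $index ($!)";
    }
    return;
}

# Example call, commented out so nothing is moved by accident:
# flatten_vendors('C:/NETSHARE/PDF/MFG');
```

It would be wise to try something like this on a copy of the tree first, since the moves and deletions are not reversible.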

[...]

But then ozo fixed the problem. hehe!
Yes, Manav was well on the way to a solution. I really like File::Find and I was beginning to learn more about this package. I feel that if I had a better handle on regexp I would have fumbled around enough to get what I wanted from what Manav put together. However, ozo came up with a script that addresses my needs and is a complete solution to my question.

I will be back with a few follow-up questions for cleaning up other areas around this project. Thank you for the input.

T