Solved

Copying millions of files in one directory

Posted on 2004-09-01
Last Modified: 2010-05-18
Currently, we have a lot of raw-data flat files that are generated with every request at our business. These are all dumped into one folder, and there are millions of them. Whenever we try to back up these files, it takes forever! Most are under 1 KB, and the hard drive has a 4 KB block size. On top of that, the entire hard drive is extremely fragmented. The problem lies in the time it takes the hard drive to do millions of seeks (which take about 5 ms each on a SCSI hard drive), so it ends up taking days to copy! Is there any way to speed this up, perhaps by doing a raw sector copy generated from the locations of the files in the MFT, so that the copy does not start and stop on every file transfer and seek so often? Changing the organization of these files is also not really an option, so we'll have to figure something out. Thanks!
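A rough back-of-envelope check of the seek cost described above, as a minimal sketch. The 5 ms seek time comes from the question; the file count and the number of seeks per file (directory entry, MFT record, data, plus writes on the destination) are illustrative assumptions, not measurements from this system.

```python
def estimated_seek_hours(file_count, seeks_per_file, seek_ms=5.0):
    """Return the hours spent purely on seeking, ignoring transfer time."""
    total_ms = file_count * seeks_per_file * seek_ms
    return total_ms / 1000 / 3600


# Assumed figures: 5 million files, 4 seeks per file, 5 ms per seek.
print(round(estimated_seek_hours(5_000_000, 4), 1))  # prints 27.8
```

Even under these modest assumptions the seeks alone cost more than a day, which is consistent with a copy that takes days on a fragmented volume.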
Question by:cdesimone
7 Comments
 

Author Comment

by:cdesimone
ID: 11960200
Also, defragmentation is not really an option either...
 
LVL 93

Expert Comment

by:nobus
ID: 11960904
If rearranging them into more folders is not an option, there is not much you can do, unless you append each new file to the previous ones so that you end up with only one file.
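One way to read the "only one file" suggestion is to pack the small files into a single archive, so that a later backup reads one large sequential file instead of millions of tiny ones. The following is a minimal sketch using Python's standard `tarfile` module; the function name and paths are hypothetical, and the archive is assumed to live outside the source folder.

```python
import os
import tarfile


def pack_directory(src_dir, archive_path):
    """Append every regular file in src_dir to an uncompressed tar archive.

    Mode "a" creates the archive if it is missing and appends otherwise.
    Returns the number of files added.
    """
    count = 0
    with tarfile.open(archive_path, "a") as tar:
        # scandir streams directory entries instead of building a huge list.
        with os.scandir(src_dir) as entries:
            for entry in entries:
                if entry.is_file(follow_symlinks=False):
                    tar.add(entry.path, arcname=entry.name)
                    count += 1
    return count
```

Packing still has to read each small file once, so the first run is not faster than a plain copy. Running it twice on the same folder adds duplicate members, so files would need to be removed or tracked after a successful pack.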
 

Expert Comment

by:infinitydsm
ID: 11961772
Boot into DOS / the command line and copy them through that.
 
LVL 2

Expert Comment

by:TheTinkeringToad
ID: 11962687
If you have millions of them, no matter what you do it will take some time.
But why not try the xcopy command?
E.g., from the DOS prompt:
xcopy /E /V /I d:\temp\scrm d:\test\xcopy\files
Here you see the switches /E /V /I:
/E           Copies directories and subdirectories, including empty ones.
/V           Verifies each new file.
/I           If destination does not exist and copying more than one file,
               assumes that destination must be a directory.
here you see the source
d:\temp\scrm
here is the destination
d:\test\xcopy\files

As you can tell, the destination directories will be created and the files from d:\temp\scrm will be copied to d:\test\xcopy\files.
You can use any destination you choose; all you have to do is make sure that the drive or partition is actually present, and the directories will be created.
If you are overwriting files already present in the destination directory, you may be prompted to confirm the overwrite.
The /Y switch will suppress the overwrite prompts.
Still, with millions of files it's going to take some time.
It may be faster to write the files to a CD-R or CD-RW disc and then use the disc to copy the files over, though I'm not really sure that would be faster. Xcopy works fairly well all by itself, and if the files are just a few KB each it may not take too long. I've never tried this with millions of files, but I have tried it with tens of thousands and it works really well.
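For comparison, here is a rough scripted analogue of the xcopy call above, as a sketch only. It creates the destination if needed (like /I) and compares each copy byte for byte (in the spirit of /V), but it handles a single flat folder and does not recurse like /E. The function name is hypothetical.

```python
import filecmp
import os
import shutil


def copy_and_verify(src_dir, dst_dir):
    """Copy each regular file in src_dir to dst_dir and verify the copy.

    Returns (number_copied, list_of_names_that_failed_verification).
    """
    os.makedirs(dst_dir, exist_ok=True)  # create the destination if missing
    copied = 0
    mismatched = []
    with os.scandir(src_dir) as entries:
        for entry in entries:
            if not entry.is_file(follow_symlinks=False):
                continue
            target = os.path.join(dst_dir, entry.name)
            shutil.copy2(entry.path, target)  # copy2 keeps timestamps
            # shallow=False forces a content comparison, not just a stat check.
            if not filecmp.cmp(entry.path, target, shallow=False):
                mismatched.append(entry.name)
            copied += 1
    return copied, mismatched
```

Like xcopy, this still opens every file individually, so it does not avoid the per-file seek cost.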
 
LVL 4

Expert Comment

by:cyrnel
ID: 11966597
How are the files used? Daily access or just safekeeping? Would it make sense to archive them weekly/monthly/etc to reduce the number of them on your filesystem at any one time? Either xcopy to another volume or zip to an archive file. You could schedule a nightly process to grab anything older than your chosen limit. If that sounds feasible we can talk about details and throw together a batch and the appropriate scheduling method for you.

Dave
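The nightly "grab anything older than your chosen limit" idea could look roughly like the sketch below. It uses each file's modification time as a stand-in for its creation date, which is an assumption that only holds because the files are never rewritten. The function name, the `now` parameter, and the `delete` flag are hypothetical, and the zip file is assumed to sit outside the source folder.

```python
import os
import time
import zipfile


def archive_older_than(src_dir, zip_path, max_age_days, now=None, delete=False):
    """Add files older than max_age_days to zip_path; return how many.

    Mode "a" creates the zip if it is missing and appends otherwise.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    archived = []
    with zipfile.ZipFile(zip_path, "a", zipfile.ZIP_DEFLATED) as zf:
        with os.scandir(src_dir) as entries:
            for entry in entries:
                if (entry.is_file(follow_symlinks=False)
                        and entry.stat().st_mtime < cutoff):
                    zf.write(entry.path, arcname=entry.name)
                    archived.append(entry.path)
    # Delete only after the archive has been closed successfully.
    if delete:
        for path in archived:
            os.remove(path)
    return len(archived)
```

Scheduling it nightly would then be a matter of calling something like `archive_older_than(r"C:\path", r"Z:\archive\old.zip", 30, delete=True)` from a scheduled task; those paths are placeholders.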
 

Author Comment

by:cdesimone
ID: 11978555
cyrnel, this is the idea that we had thought of in case nothing else works. It is the best idea that we have come up with so far. Here are some details:

    The files are autonumbered by a program and stamped with the date created. This date will never change, because once a file is created it is never overwritten. We could first archive all files in groups of 10,000 up to the current date and put those archives on a file server. Then we could do an XCOPY with the date switch /D:m-d-y to grab files dated after a certain point and archive those into a current zip file. The problems with this are:

First, how could we generate the date syntax for the script dynamically?
Next, how would we generate consecutive dates that cover all files, and verify that the files are all there and that the archives are complete?


This would be D2D2T (disk-to-disk-to-tape) once we copy the archives in Backup Exec. We could run the .bat file as a pre-script, with a general layout like this:

First 10,000.......
XCOPY C:\path Z:\archive /D:<GET last archive date>
....
zip and archive files (delete after copy) from the intermediary folder once copied (name of file will be like 230xxxxx_CREATEDDATE.zip)
verify archive

Repeat with next 10,000 until finished..


This will be tough because of the organization of the files into groups of 10,000. It means that we would have to open an incomplete archive, add the day's files, close it again, and back it up.
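One possible way around the incomplete-archive problem, sketched under assumptions: only archive groups that are already full, and leave the trailing partial group in place until it fills up. The sketch assumes the autonumbered names sort correctly as plain strings (equal-length numbers), that the output folder is outside the source folder, and that the archive naming is free to choose. The function name is hypothetical.

```python
import os
import zipfile


def archive_full_groups(src_dir, out_dir, group_size=10_000):
    """Zip complete groups of group_size files; return the zip paths created."""
    names = sorted(
        e.name for e in os.scandir(src_dir) if e.is_file(follow_symlinks=False)
    )
    os.makedirs(out_dir, exist_ok=True)
    created = []
    full = len(names) - len(names) % group_size  # skip the trailing partial group
    for start in range(0, full, group_size):
        group = names[start:start + group_size]
        zip_path = os.path.join(out_dir, f"{group[0]}_to_{group[-1]}.zip")
        with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
            for name in group:
                zf.write(os.path.join(src_dir, name), arcname=name)
        # Verify: every member passes its CRC check and no name is missing.
        with zipfile.ZipFile(zip_path) as zf:
            if zf.testzip() is not None or sorted(zf.namelist()) != group:
                raise RuntimeError(f"verification failed for {zip_path}")
        created.append(zip_path)
    return created
```

The source files are left untouched here; deleting them after a verified archive is a separate step, and without it a rerun simply rebuilds the same archives.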
 
LVL 4

Accepted Solution

by:
cyrnel earned 1500 total points
ID: 11979084
Date components can be represented by numbers. We can loop through numbers. No need to fight with date math for this kind of task. We just loop a week or month at a time.

Are the files somewhat evenly distributed by date? At this point it appears simpler to take chunks of so many days at a time rather than a fixed number of files. This would simplify the loops, and likely the later organization too. Or is there another reason you'd prefer a fixed number of files?
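The "loop through numbers" idea can be sketched as follows: step the year and month as plain integers to get consecutive, non-overlapping windows that cover every date, and format each boundary in the m-d-y form that xcopy's /D switch expects. The function names are hypothetical. Note that /D:m-d-y selects files changed on or after the given date and has no upper bound, so the end of each window has to be enforced some other way.

```python
from datetime import date


def month_windows(first, last):
    """Yield (start, end_exclusive) date pairs, one per month, first..last."""
    year, month = first.year, first.month
    while (year, month) <= (last.year, last.month):
        start = date(year, month, 1)
        # Advance by one month using integer arithmetic only.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
        yield start, date(year, month, 1)


def xcopy_date(d):
    """Format a date as m-d-y for xcopy's /D switch."""
    return f"{d.month}-{d.day}-{d.year}"


for start, end in month_windows(date(2004, 7, 15), date(2004, 9, 1)):
    print(xcopy_date(start), "up to but not including", xcopy_date(end))
```

Because each window ends exactly where the next one begins, every file date falls into one and only one window, which makes it easier to check that the set of archives is complete.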
Question has a verified solution.
