Solved

Super Fast Data Copy

Posted on 2004-08-29
Last Modified: 2008-03-10
Hi

Can anyone come up with suggestions to speed up data copying when doing server and workstation migrations please?

Specifically, I am often required to transfer several GB of data from server shares across a 100 Mbps network to a new server/SAN/NAS drive(s), and this can take many hours. I'm thinking I could use Gigabit LAN, but maybe there are some utilities out there that would significantly speed up the copy process. For example, yesterday it took 90 minutes to copy just 3.5 GB of data (many thousands of files, admittedly) across a short segment of a 100 Mbps LAN. It would be great if this could be done in just a few minutes, not least because some of the data migrations now run to tens and increasingly hundreds of GB.

Any thoughts gratefully received!

thanks

Rob
Question by:WebAdviser
 
LVL 15

Assisted Solution

by:Cyber-Dude
Cyber-Dude earned 100 total points
ID: 11925330
3.5 GB in 90 minutes?
100 Mbps Ethernet is 12.5 MB per second, so at full line rate 3.5 GB needs under 5 minutes, and even with overhead it is supposed to take less than 10 minutes to get all the data across. There are a few factors you need to keep in mind when transferring data over an Ethernet network:
1. The backbone capacity of your switching equipment.
2. The total capacity the network can handle.
3. Can the servers sustain a high data transfer rate?

Also, if the network architecture allows you to transfer large amounts of data between two computers, then think about upgrading to Fibre Channel.

I would like to know whether the data relocation is carried out within the SAN/NAS environment. That environment is high-capacity and purpose-built, so problems may still occur even after upgrading to Fibre Channel. To determine this I will need more information about the architecture you are using. How much data are we talking about? Does your switching system support a large amount of backbone data transfer (i.e. can the switch handle a total of up to 10 Gbps between its ports)? Are both servers connected to the same switch? And so forth.
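
As a rough check on the arithmetic above, here is a minimal Python sketch that compares the best-case copy time on a 100 Mbps link with the 90 minutes reported in the question. The function name is made up for illustration, and it assumes decimal units (1 GB = 1,000,000,000 bytes) and no protocol overhead at all.

```python
def ideal_transfer_seconds(size_gb: float, link_mbps: float) -> float:
    """Best-case seconds to move size_gb gigabytes over a link_mbps link.

    Assumes decimal units and zero protocol or disk overhead.
    """
    bytes_per_second = link_mbps * 1_000_000 / 8  # 8 bits per byte
    return size_gb * 1_000_000_000 / bytes_per_second


ideal = ideal_transfer_seconds(3.5, 100)  # figures from the question
observed = 90 * 60                        # 90 minutes, in seconds

print(f"ideal time: {ideal / 60:.1f} minutes")
print(f"observed efficiency: {ideal / observed:.1%}")
```

With the question's figures the ideal time works out to 280 seconds, so the observed copy used only a small fraction of the link, which points at something other than raw bandwidth.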

Cyber
 
LVL 6

Expert Comment

by:prof666
ID: 11925605
Two questions:

1) What NAS is it? (Make / Model)
2) Have you disabled virus checking for the initial migration? This can have a huge performance effect on mass storage migration onto a NAS. If you turn off the checking for the migration you will get a performance increase, but care must be taken to ensure the source data is virus free, as most NAS devices only virus check on writes to the NAS and not on reads, on the assumption that anything already on the box is virus free.

Proff
 

Author Comment

by:WebAdviser
ID: 11925803
Thanks for your comments guys.

As I spend most of my time transferring data across basic SME servers then I'd like to concentrate on this aspect for the moment please.

To answer the points raised:

The backbone is typically 100 Mbps Cat 5 Ethernet. In the case of yesterday's exercise both the target server and my laptop were on the same 100 Mbps Ethernet switch (unmanaged) over a cable run of about 5 metres in total. The target server spec was a PIII 500 MHz with 512 MB SDRAM, about 4 years old. My laptop is an HP with 1 GB RAM and a 3.0 GHz P4. All workstations were switched off and there was no other traffic across the switch.

One of the purposes of my question is to be able to assess more accurately the time a given job will take. The time that legacy data takes to move from A->B seems to be a major issue. As many of the target servers will be 3-5 years old they are not going to be very quick, but I fear someone coming along with 200 GB of data to shift, as there is no way I can quote to get a job if it is going to take forever to move.

On a slightly different (but related) point, is anyone aware of a utility that will either copy open files or skip them, carry on with the copying task and then create a log for later analysis? Simply copying and pasting data is fine until you hit a problem and then the whole process unwinds. This is a real nightmare sometimes...

Any further thoughts appreciated.

regards

Rob
 
LVL 7

Assisted Solution

by:LimeSMJ
LimeSMJ earned 150 total points
ID: 11927081
There is a lot of overhead if you are copying from within Windows. You might want to try optimizing the TCP/IP settings:

http://techrepublic.com.com/5100-6268-1061241.html

Just make sure you export and save the keys you are editing just in case something goes wrong.
 

Author Comment

by:WebAdviser
ID: 11927097
Thanks LimeSMJ I will read and digest!

Rob
 
LVL 6

Expert Comment

by:durindil
ID: 11927911
I do data migrations for a living and, alas, would love to be able to copy lots of data in minutes...

Remember that you have an extremely high amount of overhead when you deal with small files.  Those read/write/verify operations on smaller files can make the migration of 1 gigabyte of 1 kilobyte files take up to 20 times as long as 10 x 100 megabyte files.  In a typical, real-world migration with two-way traffic and overhead, you should expect to see about 1/4 to 1/3 of your bandwidth in actual data transfer rates.

One thing that can speed you up to a certain degree is multiple copy streams. This would be more effective when you use one or more Gigabit Ethernet interfaces, as you alluded to. And as LimeSMJ said, tune your TCP. If you are copying large files, increase your window and MTU sizes. Decrease them for smaller files. The default MTU of 1500 was a compromise because, obviously, everyone's traffic needs differ.

Robocopy will let you skip those open files and come back to copy them later. I do many large migrations online and, as you have probably seen, this makes them take even longer. In fact, I am planning to move 4.5 TB of data from one NAS to another, and believe that it will take about 22 days. This is a bit unusual, however, since there is a lot of user traffic, and all of the files are really small.
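
To illustrate the "multiple copy streams" idea, here is a minimal Python sketch that copies a folder tree with several files in flight at once and reports anything it could not copy. It is not how Robocopy works internally; the function name and the default of 4 streams are assumptions for the example, and the best stream count would have to be found by testing on the actual network and disks.

```python
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def copy_tree_parallel(src: Path, dst: Path, streams: int = 4) -> list:
    """Copy every file under src into dst using several concurrent streams.

    Returns a list of (source path, error message) pairs for files that
    could not be copied, e.g. because they were locked or unreadable.
    Empty directories are not recreated in this sketch.
    """
    files = [p for p in src.rglob("*") if p.is_file()]

    def copy_one(path: Path):
        target = dst / path.relative_to(src)
        try:
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, target)  # copies data plus timestamps
            return None
        except OSError as exc:
            return (path, str(exc))     # record the failure and carry on

    with ThreadPoolExecutor(max_workers=streams) as pool:
        results = list(pool.map(copy_one, files))
    return [r for r in results if r is not None]
```

A failed file does not stop the run; the returned list can be written to a log and retried later, which is the behaviour asked for earlier in the thread.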
 

Author Comment

by:WebAdviser
ID: 11928915
Thanks durindil - really interesting reading!

In most cases I will be able to add a Gigabit NIC to the target machine and run a crossover cable to my laptop (for, say, < 50 GB of data), or add a USB 2.0 PCI card and plug a local USB 2.0 drive into it. This should give me 10x or 4.8x the nominal speed of 100 Mbps respectively, which will help considerably.

I take the various points raised about MTU sizes but inevitably file sizes within folders will vary considerably in a large file copying operation so on balance I am tempted to leave the MTU set to the default.

Three questions please:

1. Bearing in mind lots of small files take a lot longer to copy than one large file is it realistic to add a whole drive's files to a zip archive (assuming disk space is available) and then copy the zip archive across the network?

2. If it is feasible to hook in a spare large IDE drive to the originating server's mainboard IDE bus (will usually exist for CD drive at least) will this be faster than USB2?

3. A lot of machines I come across are several years old and not exactly fast. As I will always conduct an onsite pre-installation visit is there any formula or rough rule of thumb I can use to work out how long it will take for data copying assuming there is no other network traffic? Most of my clients want a fixed cost for a job so speed of data copying (which takes the longest period of time) is critical to making a realistic profit on a job.

thanks

Rob
 
LVL 6

Expert Comment

by:durindil
ID: 11930967
1)  If moving the data is your priority, then yes, one single file moves faster than a lot of small files.  I don't know how much time it would take to move that data, but on some servers (Unix) I tar (make a single file) and zip log files to send them to an archive server.

2)  IDE is MUCH faster than USB2.

3)  What I do is actually calculate the amount of time to prepare a job, and then to wrap up the job. This should be easier to do, because you have to consider setting up tools, etc. Then I will try to do a sample migration and extrapolate from that. For instance (numbers are for demonstration only!), if I copied 10 GB and it took 1 hour, then I could say that 350 GB would take 35 hours. This is the most accurate way to do it, but if you cannot, then the second best way is to keep a detailed record of the migrations you do. Keep track of the network, server, amount of data, and time. Then you can calculate the average time that your jobs have taken. Add an amount for "unforeseen circumstances", such as 20%.

Actually, the longer the job takes, the easier it is to calculate if you don't have to be there during the whole thing.  You calculate the setup, completion, and time in between, but if you are not on-site during the migration, you can be doing something else for someone else.  
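
The sample-and-extrapolate approach in point 3 can be sketched in a few lines of Python. The function name and the `fixed_hours` parameter (for setup and wrap-up time) are made up for illustration; the 20% contingency and the 10 GB / 350 GB figures are the demonstration numbers from the comment above, not measurements.

```python
def estimate_hours(sample_gb: float, sample_hours: float, total_gb: float,
                   contingency: float = 0.20, fixed_hours: float = 0.0) -> float:
    """Extrapolate a full migration time from a timed sample copy.

    fixed_hours covers setup and wrap-up; contingency is the allowance
    for unforeseen circumstances, applied to the copy time only.
    """
    rate_gb_per_hour = sample_gb / sample_hours
    copy_hours = total_gb / rate_gb_per_hour
    return fixed_hours + copy_hours * (1 + contingency)


# Demonstration numbers only: 10 GB took 1 hour, the job is 350 GB.
print(f"{estimate_hours(10, 1, 350):.1f} hours including contingency")
```

With those numbers the 35 hours of raw copying become about 42 hours once the 20% allowance is added. The estimate is only as good as the sample, so the sample should contain a similar mix of small and large files to the full data set.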

Author Comment

by:WebAdviser
ID: 11932590
Thanks for your further comments durindil.

1) This is fine provided there is enough disk space. The zip archive seems to work in the temp directory whilst it is building. I've found not using compression is the best option for speed but obviously requires the most free disk space.

2) I'm thinking of getting a 400 GB IDE drive just to hook into the server (where possible) to help speed up the transfer. I can then take this out of the old server and put it temporarily into the new one to copy the data across. This is a little bit untidy but should speed things up significantly. Is there such a thing as an external IDE drive enclosure that takes a standard IDE cable and power plug?

3) Excellent commonsense advice - thanks! Experience is of great value here, so I will have to learn from my mistakes if I get the estimates wrong. Is Robocopy a foolproof way of ensuring that files actually do get copied and that the process doesn't crash out at any stage? Is it possible to script a copy/xcopy/robocopy command to start the copying again if it falls over? I'd hate to be copying 200 GB of data and have the thing fall over at 199 GB copied! This is a REAL concern. Any advice on this, or any utilities you know of, would be very welcome!
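
On point 1, bundling many small files into one uncompressed archive before sending it across the network can be sketched with Python's standard `tarfile` module. This uses tar rather than zip, and the function name is made up for illustration. It assumes there is enough free disk space for a second copy of the data and that the archive is written outside the folder being packed.

```python
import tarfile
from pathlib import Path


def pack_uncompressed(src_dir: Path, archive: Path) -> int:
    """Bundle src_dir into one uncompressed tar file and return its size in bytes.

    The archive path must lie outside src_dir, otherwise the archive
    would try to include itself.
    """
    with tarfile.open(archive, "w") as tar:  # mode "w" means no compression
        tar.add(src_dir, arcname=src_dir.name)
    return archive.stat().st_size
```

The resulting single file can then be copied as one large transfer and unpacked on the target. Whether the packing time is repaid by the faster copy depends on the disks at each end, so it is worth timing on a sample first.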

thanks again

Rob
 
LVL 6

Expert Comment

by:durindil
ID: 11933684
2) I have seen the "swappable" enclosures for IDE drives, but I haven't used them personally, so I think someone else here may have better advice on that.

3)  As for Robocopy, nothing is foolproof, but it does give you a log of what was copied and what was not. Also, if you do a copy for a certain point in time, you can later go back and just copy anything that has changed since your initial copy. That makes subsequent copies much faster. Files that Robocopy has already copied stay copied, so if you fail at 199 GB, you would just have to manually copy the 1 GB that is left.
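
The "copy only what has changed, and keep a log" behaviour described here can be illustrated with a small Python sketch. It is not Robocopy and makes no claim about how Robocopy decides what to copy; the function names, the log format and the 2-second timestamp tolerance (to allow for coarse file system timestamps) are all assumptions for the example.

```python
import shutil
from pathlib import Path


def needs_copy(src_file: Path, dst_file: Path, tolerance: float = 2.0) -> bool:
    """True if dst_file is missing, differs in size, or is older than src_file."""
    if not dst_file.exists():
        return True
    s, d = src_file.stat(), dst_file.stat()
    return s.st_size != d.st_size or (s.st_mtime - d.st_mtime) > tolerance


def incremental_copy(src: Path, dst: Path, log_path: Path) -> tuple:
    """Copy new or changed files from src to dst, logging failures.

    Returns (copied, skipped, failed). Rerunning after a crash picks up
    where the last run stopped, because finished files are skipped.
    """
    copied = skipped = failed = 0
    with log_path.open("a", encoding="utf-8") as log:
        for path in src.rglob("*"):
            if not path.is_file():
                continue
            target = dst / path.relative_to(src)
            if not needs_copy(path, target):
                skipped += 1
                continue
            try:
                target.parent.mkdir(parents=True, exist_ok=True)
                shutil.copy2(path, target)  # keeps the source timestamp
                copied += 1
            except OSError as exc:
                failed += 1
                log.write(f"FAILED {path}: {exc}\n")
    return copied, skipped, failed
```

Running it a second time over the same folders should copy nothing and skip everything, so a job that fell over at 199 GB would only have the remainder left to do.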
 
LVL 7

Expert Comment

by:LimeSMJ
ID: 11933990
There really isn't such a thing as an external IDE connector, but that's not to say it can't be done. Although it's risky, you could just run a long IDE cable out the back of the server to a hard drive. Along those lines, there are no external IDE hard drive enclosures that I've seen, only the USB ones, which you should not use because USB 2.0 is fast but not as fast as you probably want it to be.

If there's space in the server's case, you can use durindil's idea of using these:

http://www.vantecusa.com/product-storage.html

You'd have to buy at least two: one for the 1st computer and another for the 2nd. Some of these Vantec drive bays are hot swap too... Vantec is not the only company that makes these either.
 

Author Comment

by:WebAdviser
ID: 11934508
Thanks durindil and LimeSMJ - good points from you both.

I found this:

http://www.pchardware.co.uk/highpointsata1511&rmate.htm

But on reading this article I see that the 32-bit PCI bus is limited to 100Mb. Is this 100 megabits per second or 100 megabytes per second? There's a bit of a difference between the two! If it's 100 megabits/sec then there wouldn't seem to be much point going for USB2 at 480 Mbits/sec, or Gigabit LAN for that matter.


Rob
 
LVL 7

Expert Comment

by:LimeSMJ
ID: 11935476
100 MB/s means megabytes per second (e.g. ATA100 IDE), whereas 100 Mbps means megabits per second (e.g. Ethernet). The upper- and lowercase "b" is kind of confusing, but there's a big difference.
 

Author Comment

by:WebAdviser
ID: 11939033
Thanks LimeSMJ.

What's the formula please for converting Mbits/sec to Mbytes/sec?

Is it divide by 32 for 32bit computers?

Thanks

Rob
 
LVL 7

Expert Comment

by:LimeSMJ
ID: 11939197
There are 8 bits in a byte, regardless of whether the computer is 32-bit, so if something says 100 Mbps (megabits per second) just divide by 8 and you'll get megabytes per second. For USB 2.0: 480 Mbps = 60 MB/s...
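
As a quick illustration of the divide-by-8 rule, this short Python sketch converts the nominal rates of the interfaces mentioned in this thread. The dictionary and function names are made up for the example, and the figures are nominal signalling rates, not what you will see in a real copy.

```python
# Nominal signalling rates in megabits per second for links discussed above.
NOMINAL_MBPS = {
    "100 Mbps Ethernet": 100,
    "USB 2.0": 480,
    "Gigabit Ethernet": 1000,
}


def mbps_to_mbytes_per_sec(mbps: float) -> float:
    """Convert megabits per second to megabytes per second (8 bits per byte)."""
    return mbps / 8


for name, mbps in NOMINAL_MBPS.items():
    print(f"{name}: {mbps} Mbps = {mbps_to_mbytes_per_sec(mbps):g} MB/s")
```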
 
LVL 6

Accepted Solution

by:durindil
durindil earned 250 total points
ID: 11942305
Then for hardware, such as drives, manufacturers think they can simplify things, which complicates things for us! They use a 1000 x 1000 formula to determine the storage capacity (a megabyte counted as 1000 x 1000 bytes rather than 1024 x 1024). This is the reason that a vendor's storage report will differ from a Windows or Linux storage report.

The generally accepted buses and speeds look like this, but will differ slightly depending on the method you use to calculate. Most vendors will use nice round values to make things easier to read and remember.

32 bits x 33 MHz = 133 megabytes/second
64 bits x 33 MHz = 266 megabytes/second
64 bits x 66 MHz = 533 megabytes/second
64 bits x 100 MHz = 800 megabytes/second (Dell's PCI-X)
64 bits x 133 MHz = 1066 megabytes/second (PCI-X)
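
The figures in this list come from multiplying the bus width in bytes by the clock rate. A minimal Python sketch of that calculation follows; it assumes the exact nominal clocks are 33.33, 66.67 and 133.33 MHz, the row labels are mine, and results are truncated to whole numbers to match the values quoted above. These are theoretical peaks for the whole bus, shared by every card on it.

```python
def bus_mbytes_per_sec(width_bits: int, clock_mhz: float) -> float:
    """Theoretical peak bus bandwidth: (width in bytes) x (clock in MHz)."""
    return width_bits / 8 * clock_mhz


# (label, width in bits, nominal clock in MHz); labels are illustrative.
BUSES = [
    ("32-bit / 33 MHz", 32, 100 / 3),
    ("64-bit / 33 MHz", 64, 100 / 3),
    ("64-bit / 66 MHz", 64, 200 / 3),
    ("64-bit / 100 MHz", 64, 100.0),
    ("64-bit / 133 MHz", 64, 400 / 3),
]

for label, width, clock in BUSES:
    # int() truncates, which matches the rounded-down figures in the list
    print(f"{label}: {int(bus_mbytes_per_sec(width, clock))} MB/s")
```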