Best solution to sync HUGE QTY of files across slow WAN?

Last Modified: 2019-03-26
Hello creative experts!

I have 1.7 million files (4 TB) in one of our remote offices.  

We presently use Vice-Versa to replicate the remote office data to headquarters.

Vice-versa is installed on a server here in HQ.

The initial "comparing source vs. destination" part of the run takes 21 hours over our 50 Mbps WAN connection.

The actual file copy of changed files typically takes about 3 hours.

I'm guessing it's so slow because it's having a chatty conversation across a slow WAN connection to determine which files were added and deleted. [This is a total guess, as I don't know how the software is written.]
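A quick back-of-the-envelope check is consistent with that guess. The sketch below uses only the figures quoted above (1.7 million files, 21 hours of comparison, 3 hours of copying, a 50 Mbps link); the function names are made up for illustration. The per-file time comes out in the range of a WAN round trip or two, but the actual latency of this link is not stated here, so treat that comparison as an assumption.

```python
# Figures quoted in the question.
FILE_COUNT = 1_700_000
COMPARE_HOURS = 21
COPY_HOURS = 3
LINK_MBPS = 50


def per_file_ms(file_count: int, hours: float) -> float:
    """Average wall-clock time spent per file, in milliseconds."""
    return hours * 3600 * 1000 / file_count


def max_transfer_gb(link_mbps: float, hours: float) -> float:
    """Upper bound on data moved at full line rate, in decimal GB."""
    bytes_per_second = link_mbps * 1_000_000 / 8
    return bytes_per_second * hours * 3600 / 1e9


compare_ms = per_file_ms(FILE_COUNT, COMPARE_HOURS)
copy_gb = max_transfer_gb(LINK_MBPS, COPY_HOURS)

print(f"{compare_ms:.1f} ms per file during comparison")  # 44.5 ms
print(f"at most {copy_gb:.1f} GB copied in {COPY_HOURS} hours")  # 67.5 GB
```

In other words, the comparison phase spends about 44 ms on every file, changed or not, while the copy phase moves at most about 67.5 GB. The time is going into per-file chatter, not into moving data.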

What's the best solution?

Is there a solution that perhaps has agent software running on the other side such that each side determines local changes and THEN compares notes?
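To make the "each side determines local changes and THEN compares notes" idea concrete, here is a minimal sketch, not how Vice-Versa or any particular product works. Each side would run `build_manifest` against its own local disk, and only the resulting manifests would need to cross the WAN. The names `build_manifest` and `diff_manifests` are hypothetical, and comparing size plus modification time assumes both sides store timestamps at the same precision.

```python
import os
from typing import Dict, List, Tuple

# relative path -> (size in bytes, modification time in nanoseconds)
Manifest = Dict[str, Tuple[int, int]]


def build_manifest(root: str) -> Manifest:
    """Walk a local tree and record size and mtime for every file."""
    manifest: Manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            st = os.stat(full)
            # Use forward slashes so manifests from both sides compare equal.
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            manifest[rel] = (st.st_size, st.st_mtime_ns)
    return manifest


def diff_manifests(source: Manifest, dest: Manifest) -> Tuple[List[str], List[str]]:
    """Return (paths to copy, paths to delete) to make dest match source."""
    to_copy = sorted(p for p, meta in source.items() if dest.get(p) != meta)
    to_delete = sorted(p for p in dest if p not in source)
    return to_copy, to_delete
```

With this shape, the slow link carries one manifest per run instead of one or more round trips per file.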

Is there a solution that works something like OneDrive?  (For example: there's no day-long process evaluating local vs. cloud before figuring out what to sync.  I presume if you delete a local folder, the agent gives the cloud the path of what to delete, and it's deleted.  Likewise, if a file gets added locally, just that file gets uploaded.)

As I type this: I wonder if there's a Microsoft solution which could leverage our E3 Office 365 subscription for our 400 users?

We started getting quotes for cloud backup and it was surprisingly expensive: maybe $90,000 per year for 8 TB. That's super approximate, but it gives me an order of magnitude.

Thanks,
Mike

David Favor, Fractional CTO
CERTIFIED EXPERT
Distinguished Expert 2019

Commented:
Seems like the problem relates to Vice-Versa, which... fails to implement sensible file comparison algorithms.

For example, rsync (standard everywhere) checks file size and modification time first + only syncs files which have changed... then rsync only sends the parts of each file which have changed, rather than the entire file.
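The "only the parts of files which have changed" idea can be sketched roughly as follows. Note this is a deliberately simplified fixed-offset block comparison, not rsync's actual algorithm: rsync uses a rolling checksum so it can also match blocks that have shifted position within a file. The names and the block size are illustrative assumptions.

```python
import hashlib
from typing import List

BLOCK_SIZE = 4096  # illustrative value, not rsync's block size selection


def block_digests(data: bytes, block_size: int = BLOCK_SIZE) -> List[str]:
    """Hash each fixed-size block of the data."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old: bytes, new: bytes, block_size: int = BLOCK_SIZE) -> List[int]:
    """Indexes of blocks in `new` that differ from, or are absent in, `old`."""
    old_digests = block_digests(old, block_size)
    new_digests = block_digests(new, block_size)
    return [
        i for i, digest in enumerate(new_digests)
        if i >= len(old_digests) or old_digests[i] != digest
    ]
```

Only the blocks whose indexes come back from `changed_blocks` would need to cross the wire; the receiver can reuse its existing copy of everything else.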

Walking a directory of 1.7M files should be fairly quick. No more than a few minutes.

As an experiment, install one of the many rsync ports for Windows + test time required to do your file sync.
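One way to run that experiment without touching any data is a timed dry run. The sketch below wraps the standard rsync options `--archive`, `--delete`, `--stats`, and `--dry-run`; the helper names are hypothetical, and it assumes an `rsync` binary is on the PATH of the machine running it (for example, from one of the Windows ports).

```python
import subprocess
import time
from typing import List


def build_rsync_command(source: str, dest: str, dry_run: bool = True) -> List[str]:
    """Build an rsync invocation that mirrors source into dest."""
    cmd = ["rsync", "--archive", "--delete", "--stats"]
    if dry_run:
        # Report what would be transferred or deleted without changing anything.
        cmd.append("--dry-run")
    cmd += [source, dest]
    return cmd


def timed_run(cmd: List[str]) -> float:
    """Run the command and return elapsed wall-clock seconds."""
    start = time.monotonic()
    subprocess.run(cmd, check=True)
    return time.monotonic() - start
```

For example, `timed_run(build_rsync_command("user@remote-office:/data/", "/replica/data/"))` would time the comparison phase alone; the host and paths here are placeholders, not values from this thread. Keep the dry run on until the `--stats` output looks right, since `--delete` removes destination files that no longer exist at the source.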
Alan, Consultant
CERTIFIED EXPERT

Commented:
Plus one for rsync - it would always be my first choice 'go to' for file replication / sync.


Alan.
David Favor, Fractional CTO
CERTIFIED EXPERT
Distinguished Expert 2019

Commented:
rsync - Unsung hero of the Internet.

Author

Commented:
Thanks David.

Yes, it's from a remote office to HQ over our VPLS WAN (50 Mbps).

Were those impressive times you mentioned for rsync across the internet or a WAN (or from a local PC to an external USB 3 hard drive)?

The pricing you mentioned is dirt cheap.  I've never heard of OVH, though, so I'll have to check out raw storage from some more familiar names like Amazon :-)

Thanks so much!

Author

Commented:
Thanks!