mike2401

Best solution to sync HUGE QTY of files across slow WAN?

Hello creative experts!

I have 1.7 million files (4 TB) in one of our remote offices.  

We presently use Vice-Versa to replicate the remote office data to headquarters.

Vice-Versa is installed on a server here at HQ.

The initial "comparing source vs. destination" part of the run takes 21 hours over our 50 Mbps WAN connection.

The actual file copy of changed files typically takes about 3 hours.

I'm guessing it's so slow because it's having a chatty conversation across a slow WAN connection to determine which files were added and deleted. [This is a total guess, as I don't know how the software is written.]
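
A quick back-of-the-envelope check is consistent with this guess. The sketch below uses only the figures from this post (1.7 million files, a 21-hour compare, a 50 Mbps link, a 3-hour copy). Reading the result as "roughly one network round trip per file" is an assumption, not something known about how the software works.

```python
# Figures taken from the post above.
FILES = 1_700_000
COMPARE_HOURS = 21
LINK_MBPS = 50
COPY_HOURS = 3

# Average time spent per file during the compare phase.
ms_per_file = COMPARE_HOURS * 3600 / FILES * 1000

# Upper bound on data moved during the copy phase if the link were saturated.
max_copy_gb = LINK_MBPS * 1_000_000 / 8 * COPY_HOURS * 3600 / 1e9

print(f"compare phase: about {ms_per_file:.1f} ms per file")
print(f"copy phase: at most {max_copy_gb:.1f} GB at line rate")
```

About 44 ms per file is the order of magnitude of one or a few round trips on many WAN links, which would fit a tool that asks for each remote file's metadata individually across the network.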

What's the best solution?

Is there a solution that perhaps has agent software running on the other side such that each side determines local changes and THEN compares notes?
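
To make the "each side determines local changes and THEN compares notes" idea concrete, here is a minimal sketch. It assumes each side runs a small agent that builds a manifest of its own files (relative path, size, modification time) from local disk, so only the manifests need to cross the WAN. The function names `build_manifest` and `diff_manifests` are hypothetical and do not come from any particular product.

```python
import os
import tempfile

def build_manifest(root):
    """Map each file's path (relative to root) to (size, mtime in whole seconds)."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            st = os.stat(full)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            manifest[rel] = (st.st_size, int(st.st_mtime))
    return manifest

def diff_manifests(source, dest):
    """Return (to_copy, to_delete) needed to make dest match source."""
    to_copy = sorted(p for p, meta in source.items() if dest.get(p) != meta)
    to_delete = sorted(p for p in dest if p not in source)
    return to_copy, to_delete

def _write(root, rel, data, mtime):
    """Demo helper: create a file with fixed contents and modification time."""
    path = os.path.join(root, rel)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as f:
        f.write(data)
    os.utime(path, (mtime, mtime))

# Demo with two temporary folders standing in for the remote office and HQ.
with tempfile.TemporaryDirectory() as office, tempfile.TemporaryDirectory() as hq:
    _write(office, "docs/a.txt", b"same", 1_600_000_000)
    _write(hq, "docs/a.txt", b"same", 1_600_000_000)
    _write(office, "docs/b.txt", b"edited at the office", 1_600_000_500)
    _write(hq, "docs/b.txt", b"old", 1_600_000_000)
    _write(office, "new.txt", b"added", 1_600_000_500)
    _write(hq, "gone.txt", b"deleted at the office", 1_600_000_000)

    to_copy, to_delete = diff_manifests(build_manifest(office), build_manifest(hq))

print("copy:", to_copy)
print("delete:", to_delete)
```

In a real deployment each manifest would be built on the machine that holds the files and then sent across as one compact file, instead of querying files one at a time over the link.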

Is there a solution that works something like OneDrive?  (For example: there's no day-long process evaluating local vs. cloud before figuring out what to sync.  I presume if you delete a local folder, the agent sends the cloud the path to delete and it's deleted.  Likewise, if a file gets added locally, just that file gets uploaded.)

As I type this, I wonder if there's a Microsoft solution which could leverage our E3 Office 365 subscription for our 400 users?

We started getting quotes for cloud backup and it was surprisingly expensive: maybe $90,000 per year for 8 TB. That's super approximate but gives me an order of magnitude.

Thanks,
Mike
Tags: Storage, Storage Software, Microsoft 365

8/22/2022 - Mon
David Favor

Seems like the problem relates to Vice-Versa, which... fails to implement sensible file comparison algorithms.

For example, rsync (standard everywhere) checks things like file size and timestamps first + only syncs files which have changed... then rsync only sends the parts of files which have changed, rather than the entire file.
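
As a rough illustration of the "only syncs parts of files" idea, the sketch below compares per-block digests and sends only the blocks that differ. This is a simplified stand-in, not rsync's actual algorithm: it compares blocks at fixed offsets only, whereas rsync uses a rolling checksum so it can match blocks even after data is inserted or removed mid-file. All function names here are hypothetical.

```python
import hashlib

BLOCK_SIZE = 4  # tiny for the demo; real tools use much larger blocks

def block_digests(data, block_size=BLOCK_SIZE):
    """Digest of each fixed-size block, as the receiving side would compute."""
    return [hashlib.sha256(data[i:i + block_size]).hexdigest()
            for i in range(0, len(data), block_size)]

def changed_blocks(new_data, old_digests, block_size=BLOCK_SIZE):
    """Return {block_index: bytes} for blocks the receiver does not already have."""
    delta = {}
    for index, start in enumerate(range(0, len(new_data), block_size)):
        block = new_data[start:start + block_size]
        is_new = index >= len(old_digests)
        if is_new or hashlib.sha256(block).hexdigest() != old_digests[index]:
            delta[index] = block
    return delta

def apply_delta(old_data, delta, new_length, block_size=BLOCK_SIZE):
    """Rebuild the new file from the old copy plus the changed blocks."""
    out = bytearray()
    for index, start in enumerate(range(0, new_length, block_size)):
        out += delta.get(index, old_data[start:start + block_size])
    return bytes(out)[:new_length]

old = b"AAAABBBBCCCCDDDD"    # copy already at the destination
new = b"AAAAXXXXCCCCDDDDEE"  # updated copy at the source
delta = changed_blocks(new, block_digests(old))
rebuilt = apply_delta(old, delta, len(new))
sent = sum(len(b) for b in delta.values())
print(f"sent {sent} of {len(new)} bytes; rebuilt ok: {rebuilt == new}")
```

Only the modified block and the appended tail cross the wire here; the unchanged blocks are reused from the copy the destination already holds.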

Walking a directory of 1.7M files should be fairly quick. No more than a few minutes.
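
One way to check the walk time on your own data is to time a plain directory walk locally on the machine that holds the files. This is a minimal sketch using Python's `os.scandir`; the `root` path is a placeholder to replace, and results will vary with disk type and caching.

```python
import os
import time

def count_files(root):
    """Count files under root using os.scandir, without following symlinks."""
    total = 0
    stack = [root]
    while stack:
        current = stack.pop()
        try:
            with os.scandir(current) as entries:
                for entry in entries:
                    if entry.is_dir(follow_symlinks=False):
                        stack.append(entry.path)
                    elif entry.is_file(follow_symlinks=False):
                        total += 1
        except OSError:
            # Skip folders that cannot be read (permissions, races) rather than abort.
            continue
    return total

if __name__ == "__main__":
    root = "."  # placeholder: replace with the share's local path on the file server
    start = time.perf_counter()
    n = count_files(root)
    elapsed = time.perf_counter() - start
    print(f"{n} files walked in {elapsed:.1f} s")
```

Running the same walk against the share over the WAN, and comparing the two times, would show how much of the 21 hours is network latency and how much is the walk itself.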

As an experiment, install one of the many rsync ports for Windows + test time required to do your file sync.
Alan

Plus one for rsync - it would always be my first choice 'go to' for file replication / sync.


Alan.
ASKER CERTIFIED SOLUTION
David Favor

[Answer text is not available here; it is hidden behind the site's membership login.]
David Favor

rsync - Unsung hero of the Internet.
mike2401

ASKER
Thanks David.

Yes, it's from a remote office to HQ over our VPLS WAN (50 Mbps).

Were those impressive times you mentioned for rsync running across the internet or a WAN (or from a local PC to an external USB3 hard drive)?

The pricing you mentioned is dirt cheap.  Though I've never heard of OVH, I'll have to check out raw storage from some more familiar names like Amazon :-)

Thanks so much!
mike2401

ASKER
Thanks!