Link to home
Create AccountLog in
Avatar of jdressing
jdressingFlag for United States of America

asked on

Best practices for synchronizing differences from the server to the client

I have a VB.Net 2008 windows application that reads data from SQL Server.  Some of the tables have really large number of rows that need to come back.  This data is sent over the internet, so that also affects speed.   So for example, I have a table that has 100,000 rows in it... the first time I would not have anything locally, so I would get all 100,000 rows.   (There may be 20 different people reading and writing to this table, so it is not exclusive to 1 user.)    The second time I read it, I want to recognize that only 5 rows changed and 2 were added and 1 was deleted.   Somehow, I only want to get those 7 rows and know to remove the deleted row from my local cache database.

So here are my questions:
1.  What is the best format to store this data locally?   (Access MDB,  XML or what is the fastest)
2.  How do I only read the differences instead of all the rows?
3. How is the best way to get it from the dataset result coming back from SQL Server into the locally stored cached format?   If I want to store it in XML, I can simply use dataset.ReadXML and dataset.WriteXML    ...seems like the simplest, fastest way to locally read and write the entire result set.

As far as sending data back to the server, I can use GetChanges on a datatable to determine what rows have changed and only send back the differences.

I read and write to the database via a webservice, so the client will not have direct access to the SQL database, the client must pass everything through the webservice.
Avatar of jdressing
jdressing
Flag of United States of America image

ASKER

If I can figure out how to read the differences, how is the best way to merge them back to the local cached data to get a complete set of rows?

By the way there are 100s of tables that I need to deal with.

I have a single table that tells the application what tables the app needs to read.  I was thinking about storing a version number in this table for each table...  then any time a table gets updated, raise that version and only get the tables that have a newer version than the version stored locally.  That would handle me only reading tables that have changed instead of ALL the tables... then ones that it read would just replace the cached tables.   This would cut down on the number of tables being read...  but I wanted to take a step further and only read the ROWs that need to be read and merge those differences to the local cached tables.

I am trying to cut down on traffic over the internet between the application and the database.
ASKER CERTIFIED SOLUTION
Avatar of Nasir Razzaq
Nasir Razzaq
Flag of United Kingdom of Great Britain and Northern Ireland image

Link to home
membership
Create an account to see this answer
Signing up is free. No credit card required.
Create Account
Perfect!!

Thank you!
Glad to help :-)
Can the sync framework work over a webservice?

I am looking at the example and I'm not sure how to use this over a webservice.

I have heard all the hype about cloud computing...  I have no idea what that really means.  Can I incorporate that somehow?
Cloud computing is about hosting your application in a data centre and having the flexibility to reduce or increase the resources that you require. So you could be using a single 1GHz processor with 1GB memory or you could be using 10 3GHz 8GB machines depending on your demand.

I have not used Sync framework myself (only used Sync Toy which uses Sync framework). Do you have direct access to the remote SQL Server?
Yes - I have access to the remote SQL Server.

I was just looking at information about SQLAzure on Technet
http://msdn.microsoft.com/en-us/sync

It sounds like using the MS Sync Framework is the latest and greatest solution to do this.
I think this article sounds like it should work:
http://social.technet.microsoft.com/wiki/contents/articles/sql-server-to-sql-azure-synchronization.aspx

I think I can use the sqlAzure object to let it do all the synchronization behind the scenes for me.  I think it should automatically create a local sql azure database that it will keep in sync with whatever I get from the SQL Server.  I have no idea what I am getting myself into.  Hopefully this will do the trick.  I have not seen any good clear examples on how to do this.   I was hoping to find a simple example that shows how to handle from "SELECT" reading the database to updating the database with changes.  Surely there is a sample out there somewhere that does this.

What do you think about this?
You have to pay for the cloud on per instance per hour basis!

I think synching manually in code would be a real headache for you.
Yeah, I see that.   So it sounds like I might still be able to use the MS Sync Framework, but not do it using Azure because of costs...   So I would just let it create a local Compact SQL Database

I downloaded the sample you had originally:
http://code.msdn.microsoft.com/sync/Release/ProjectReleases.aspx?ReleaseId=615

I am having a hard time understanding how to use it.  ...and somehow need to make it work over a web service.

I currently send a select statement to a webservice method and the webservice method simply returns a dataset.   Somehow I need to be able to get this sync mechanism to work over a webservice.

It simply uses:

Private mda As SqlDataAdapter
Private msqlConn As SqlConnection

mda.SelectCommand = New SqlCommand
mda.SelectCommand.CommandText = strSQL
mda.SelectCommand.Connection = msqlConn
mda.Fill(ds)
Return ds

Any ideas?





What do you think about something like this:

On the WebService, create a function like this:

Public Function GetSyncProvider(ByRef objDataProvider As Microsoft.Synchronization.SyncProvider) As Microsoft.Synchronization.SyncOperationStatistics
        Try
            Dim agent As Microsoft.Synchronization.SyncOrchestrator = New SyncOrchestrator()
            agent.Direction = Microsoft.Synchronization.SyncDirectionOrder.DownloadAndUpload
            agent.LocalProvider = providerA
            agent.RemoteProvider = objDataProvider
            Return agent.Synchronize()
        Catch ex As Exception
        End Try
End Function

I'm not sure how to point a provider to a SQL Server connection.  
I'm also not sure if the overhead on putting this into a webservice would be way too high.

Any thoughts?
I could have a background process on the client that keeps synchronizing every X minutes and it it sees changes, notify the client app any time data changes so the client can dynamically reflect changes on the fly.
It looks like this sample should be similar... it would be nice if the sample was a little simpler and done in VB.Net

http://code.msdn.microsoft.com/sync/Release/ProjectReleases.aspx?ReleaseId=3422
or
http://code.msdn.microsoft.com/sync/Release/ProjectReleases.aspx?ReleaseId=4835

I'm still not sure how to make this work over a web service
I don't see a way to make this work over a web service...  can it be done?

If not, it sounds like I may need to build my own synchronization mechanism.

I was thinking about versioning all tables and all rows and keep this collection locally... then send it to the webservice.  Have the webservice locally on the server determine the current version of each table, then on any changed tables, return all rows that have a newer version.  Any time new rows are added or any row is updated, simply raise the version on the row.... then just get all rows that have a version greater than the version locally stored.   It will definitely be a lot more work, but is doable.

I really would like to reuse existing technology rather than reinventing the wheel...   Any ideas?
I dont think webservice is a good choice for this. Can you not use a stored procedure?
It has to work over the internet through a firewall on port 443    ...I guess I could open SQL server through a port on the firewall.  I don't know if my company will allow that.

It works great over the webservice except for speed.  Maybe I just need to optimize my calls to the database so it doesn't call it as often.   I was hoping to do this local cache synchronization to make it run really fast.

I guess I need to do the versioning approach on the tables and rows and build my own synchronization mechanism that works over web services.
I might be able to move more business logic to the server via stored procedures and have it send less over the internet
Stored procedure would be best approach as you will minimize the network traffic. Also, webservices have to send a lot of metadata for deserialization especially if you return XML.
It ran slow returning a dataset from the webservice, so I save it to an XML file on the server, then ZIP it up and stream the file to the client, then on the client, it unzips it and reads it back from the XML file to a dataset.   The process actually goes pretty fast.   The problem is that I am trying to read too much data at a time and that is why I wanted to start reading the differences instead of everything.

The SQL is all processed on the server in the web service, so it reads the database pretty fast. The slowness is sending it to the client over the web.

The other problem is the client reads the data at too many different places, so instead of making 1 call that does it all, it makes several calls to the webservice for data.  I need to move some of the business logic that causes it to make several calls to a stored procedure.

Yes try to move most code to the server and only keep the presentation at the client. This will help more than trying to create your own caching mechanism.