Interesting Challenge - retrieving large remote recordset quickly

ok first what i'm trying to do - i have one server with records that contain row addresses.  let's say it's a customer and row address 1234.  i also have another db server that contains the application and the actual row addresses.  here that address for that customer should be 1234 - sometime it's not, and the original server is pointing to the wrong one.  i want to identify these records.

the page needs to do this -
get records from server1
get records from server2
compare records and display results.

i have tried this varying ways which resulted in different speeds.  it's working pretty good now but only when the webserver running this is in the same building.  unfortunately my other 2 servers are often located elsewhere, elsewhere being in europe and australia, while the webserver is in illinois.

i figure probably the quickest way to do the compare now is with sql to 2 text file result sets from the servers, (which i haven't mastered yet either) - but getting those resultsets is where my problem lies.  i tried pulling down the resultset and writing it to text, iterating with movenext through my rs.  it was taking forever, so it occured to me that i'm not pulling the whole rs to memory on my webserver, just one record at a time.  

i spoke to someone that recommend i try getting everything all at once as XML, and had high hopes  unfortunately i have some of those running in the background now, and writing that xml file seems to be taking just as long.  i opened my recordset and just did a .save after that, which seems like it may be just slowing streaming that line by line across the world as well.

i am hoping this program to handle up to 50k lines from each source.  each line is only 2 or 3 short fields.  i need to get this resultset into memory or a local file on my webserver to do the compare, i think best would be an sql connection and join onto 2 text files, but building them in a timely manner is really presenting quite a challenge to me.

i do not want to add/alter anything on the source databases, and i have high doubts that i am allowed to at all.  any ideas ??
Who is Participating?

If you think this is the best method and your only concern is the automation then you should be able to create a batch file (*.bat) to do the job for you as follow
1-      run the query on the source
2-      save it to a file
3-      ftp it over to webserve
I use batch files for most of the jobs which require repeated tasks on the command line  
Hi taveirne
If you use Saxon transformation engine for XSLT transformation. You can write a style sheet which interacts with both XML and databases.
The SQL extension supplied Saxon regarded to be the easiest, safest and the fastest for this type of jobs.
You style sheet should connect to the two servers, store the results locally, then compare them and produce what you need    
taveirneAuthor Commented:
wadalhag, my problem really isn't doing the comparison or manipulation of data, it's doing the retrieval of it in a speedy fashion.  i originally thought the bottleneck was in the compare, but is in fact in the retrieval of the resultsets.  right now my working page iterates through one recordset and puts it into an array in memory, then iterates through the second and does a bin search in that array for the associated row, then matches the record.  takes ~70 seconds to do a report on 30k records, server in same datacenter, which is acceptable performance for this to me.  i think the compare can be made faster with an sql connection to 2 local text files and running a join on them as tables, i'd prefer CSV and the only reason i was trying to get the XML was to speed up my data resultset process, however this method does not appear to be any quicker.  in fact it may even be slower with the extra data that XML fluffs out the values with.  right now the best way seems to be to run the query on the source, save it to a file, and ftp it over to my webserver, which is not something i want to do myself (or automatically if i can help it)!
taveirneAuthor Commented:
i didn't want to go through that "fun" if i could avoid it.  a guy from p2p wrox forum actually suggested something that is working great for me - since both source servers are in that same remote location, this compares both and i just bring back the results on the webserver.  he recommended openrowset -

SELECT c.*, o.*
FROM Northwind.dbo.books AS c INNER JOIN
   'SELECT * FROM pubs.dbo.authors ORDER BY au_lname, au_fname')
   AS o
   ON c.au_lname = o.au_lname

it's working great!
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.