SQL Server linked server performance issue

Posted on 2013-11-21
Last Modified: 2014-01-27
We are fetching prod data from two different servers using attached query with linked server setup.

In development server (without linked server setup, two DB's on same server), the attached query will take around 3 minutes to return 10000 records.

But in prod linked server, it was taking around 2 hour 40 minutes of time to return 10000 records.

Please review the attached sql query, indexes and execution plan and advise me how to speed-up the attached query using linked server setup in prod environment.
Question by:sqldba2013

Assisted Solution

BradySQL earned 250 total points
ID: 39668247
I don't know if you are going to love the answer with the size of you query but here you go.

When running a query like that over a linked server your performance is going to be slow due to the nature of a linked server. If you want to get better performance I would advise putting it in a dynamic query and running it directly on the remote machine. Or you can use a stored procedure that you can call on the remote database.


SELECT @sql = 'SELECT * FROM [Database].dbo.Person WHERE Name LIKE ''A%'''
EXEC (@SQL) AT [LinkedServer]


EXECUTE ('SELECT * FROM [Database].dbo.Person WHERE Name LIKE ''A%''') AT [LinkedServer]

OR - If you need to work with it later.

INERT INTO #temp EXECUTE ('SELECT * FROM [Database].dbo.Person WHERE Name LIKE ''A%''') AT [LinkedServer]

When you run a query like SELECT * FROM [LinkedServer].[Database].dbo.Person SQL will actually access the remote table like it would a local table, this is what takes it so long to run, you obviously don't have the same bandwidth to the remote database as you do to a local database, so this approach rarely works. However when you run a query like EXEC ('SELECT * FROM [Database].dbo.Person) AT [LinkedServer] you are instructing sql to run the script locally on the remote database and only return you the results. This speeds things up considerably.

If you want to use these results later the best thing to do is use a temp table or global temp table to hold them depending on your situation.

Hopefully this helps. A liked server is never an easy solution to run queries like this.
LVL 39

Expert Comment

ID: 39670410
I agree with previous post that a SQL Stored proc on the remote server is the best and execute that one to return you the exact record set you need.

Also may I ask what is the purpose for "fetching prod data from two different servers using attached query with linked server setup."?

Same thing can be achieved in many different ways and much faster if data is on the same server so you could bring one set of data into a staging table or temp table from the remote server and use it/aggregate it if needed on one server only.
Try create a SQL Stored proc (or even a view if there are just simple selects) that gives you exactly the record set you need from linked server and run it on that liked server to see how long it takes. Should be quick and if it's like that the use data returned by that sproc to populate a staging table or temp table then use that one rather than distributed query against two servers and databases - those will be the slowest no matter what.

Author Comment

ID: 39670972
We have to fetch data from two DB's and these two DB's are located in different servers. Because of this we are planning to implement linked server.

Is there any other solution instead of linked server setup? Our main goal is to reduce/speed-up the execution of query.
Find Ransomware Secrets With All-Source Analysis

Ransomware has become a major concern for organizations; its prevalence has grown due to past successes achieved by threat actors. While each ransomware variant is different, we’ve seen some common tactics and trends used among the authors of the malware.

LVL 35

Accepted Solution

David Todd earned 250 total points
ID: 39685829

My advice is to develop against a linked server. Linked server queries are difficult for SQL to optimise as well as queries contained within the server. The core problem is that the query has to run on a server. That is, it can run here or there, and if it runs here then it transfers the entire table from there and filters it here.

To do the linked server, you can create a linked server to itself, but I'm not sure if this shows the usual linked server problems.

The other way around some of this, is to not write queries with the four part naming convention (server.database.schema.table) but to use the openquery function. This allows part of the query to run against the linked server, and part to run against the local server.

select blah from openquery( linkedservername, 'select blah from table etc' ) as ll inner join somelocal table t on t.column = ll.column

The key issue is making the openquery dynamic and passing parameters to the where clause, and this is often handled using dynamic SQL.

Other technics:
Create procedures on the linked server and execute them via openquery, to perform filtering and calculations on the distant server.
Bring the results back via openquery (or four part query) to a temp table, and then continue with the local processing.

LVL 51

Expert Comment

ID: 39686483
You are running what is known as "Distributed Queries" and they are guaranteed to be slow if you are doing JOIN's the mix tables from multiple servers.  To make matters worse, you are doing a lot of collation casting which will slow things down even more.

This is an excellent article on the topic:

I'd recommend using the OPENQUERY object that dtodd mentioned.  Done well it can trim the data transmitted from the remote server to local to minimize latency
LVL 32

Expert Comment

ID: 39686826
The most important problem arises when running distributed queries from the fact that you may not use the statistics from the remote tables. See Linked servers: permissions and distributed query performance .

Author Closing Comment

ID: 39812028

Featured Post

How to run any project with ease

Manage projects of all sizes how you want. Great for personal to-do lists, project milestones, team priorities and launch plans.
- Combine task lists, docs, spreadsheets, and chat in one
- View and edit from mobile/offline
- Cut down on emails

Join & Write a Comment

This article describes how to use the timestamp of existing data in a database to allow Tableau to calculate the prior work day instead of relying on case statements or if statements to calculate the days of the week.
Many companies are looking to get out of the datacenter business and to services like Microsoft Azure to provide Infrastructure as a Service (IaaS) solutions for legacy client server workloads, rather than continuing to make capital investments in h…
Video by: Steve
Using examples as well as descriptions, step through each of the common simple join types, explaining differences in syntax, differences in expected outputs and showing how the queries run along with the actual outputs based upon a simple set of dem…
Viewers will learn how to use the INSERT statement to insert data into their tables. It will also introduce the NULL statement, to show them what happens when no value is giving for any given column.

762 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

19 Experts available now in Live!

Get 1:1 Help Now