Solved

SQL Server Query Takes Too Long - Join on indexed tables

Posted on 2010-11-15
Last Modified: 2012-05-10
hi there,

i have a query that takes forever.  it is a join between two tables on two columns.  each table has an index using both of those columns.  one of the tables has 139 million records, and the other has 145 million.  the first table has many records where the indexed columns are the same, the second one does not.

the query updates one field in one of the tables from a field in the other table.  the two tables are on different databases.

any suggestions on how to speed this up?

i'm running sql server 2008 on a dual quad-core server running windows server 2008.  each database's data file and log file are on different physical drives.  one database has the mdf and ldf on separate SAS 15K drives.  the other has the mdf on a SATA 2 drive, and the log file on one of the SAS drives.
Question by:RBECKMN
17 Comments
 
LVL 5

Accepted Solution

by:
jcott28 earned 42 total points
ID: 34139113
You can try partitioning the table.  That's helped me out on large tables.
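For illustration, a rough sketch of what partitioning a large table could look like (the table name, partition column, boundary values, and filegroup mapping below are placeholders -- you'd pick them from your own data distribution, and note that table partitioning requires Enterprise Edition in SQL Server 2008):

-- Hypothetical example: partition by ranges of an integer key column.
CREATE PARTITION FUNCTION pfKeyRange (int)
AS RANGE LEFT FOR VALUES (100, 200, 300);
GO
CREATE PARTITION SCHEME psKeyRange
AS PARTITION pfKeyRange ALL TO ([PRIMARY]);
GO
-- Rebuild the existing index on the partition scheme so lookups on the
-- key column only touch the relevant partition.
CREATE NONCLUSTERED INDEX IX_BigTable ON dbo.BigTable (KeyCol1, KeyCol2)
WITH (DROP_EXISTING = ON)
ON psKeyRange (KeyCol1);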
 
LVL 23

Assisted Solution

by:Racim BOUDJAKDJI
Racim BOUDJAKDJI earned 166 total points
ID: 34139196
What kind of queries are run on these tables?  I mean, do you use random-access predicates (WHERE =) or sequential/range predicates (BETWEEN ... AND, WHERE ... >=)?  Make sure you match the index types to the type of queries (clustered for range scans, nonclustered for random lookups).

Besides, do you have a lot of tempdb work involved?  Finally, check whether the slowdown is due to heavy logging on the target database caused by having the log on the same drive as the data.  To confirm this, you can temporarily reduce logging overhead on the target by setting the database recovery model to SIMPLE.
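A minimal sketch of how to check and switch the recovery model (the database name is a placeholder; remember this breaks the log backup chain if you were running FULL recovery):

-- Check the current recovery model.
SELECT name, recovery_model_desc
FROM sys.databases
WHERE name = 'TargetDB';

-- Temporarily switch to SIMPLE; switch back (and take a full backup)
-- once the heavy update is done.
ALTER DATABASE [TargetDB] SET RECOVERY SIMPLE;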

But this definitely looks like a good candidate for partitioning.

Ideally, please post the table structure and the code being run, for further advice...

Hope this helps...
 

Author Comment

by:RBECKMN
ID: 34139667
the recovery model is set to SIMPLE for all databases.

here is the SQL (which has been running for 15 hours now):

use [DBTwo]
UPDATE    DataFinal
SET       ThruDate = [DBOne].dbo.CL.THRU_DT
FROM      DataFinal INNER JOIN
          [DBOne].dbo.CL ON DataFinal.Datasource = [DBOne].dbo.CL.Datasource AND
          DataFinal.Cnumber = [DBOne].dbo.CL.Cnumber

code used to make the indices (same on both tables other than the tablename of course):

CREATE NONCLUSTERED INDEX IX_DataFinal ON dbo.DataFinal
      (
      Datasource,
      Cnumber
      ) WITH( STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
ALTER TABLE dbo.DataFinal SET (LOCK_ESCALATION = TABLE)
GO
COMMIT


file storage:

DBOne: mdf on drive D, ldf is on drive D
DBTwo:  mdf is on drive Q, ldf on Drive C
tempdb: mdf and ldf on drive D

Drive C and D are SAS 15, Drive Q is SATA 2.  adequate space on all drives (>70g free right now during the operation)


structure:

table DATAFinal, 11 columns:  one varchar(100), one char(5), one char(8), rest int.   Cnumber and Datasource are int.  data space 8g, index space 2.5g, 116,200,200 rows.
table CL, 70 columns, evenly mixed between int, real, char(8),char(1). Cnumber and Datasource are int.  data space is 43g, index space is 3g.  139,400,000 rows.

i do have DBOne split into four filegroups, if that's what you mean by partitioning.  this table is in the PRIMARY filegroup, which is over 100g.
 
LVL 22

Assisted Solution

by:8080_Diver
8080_Diver earned 126 total points
ID: 34139721
Just for grins, I would have tried executing a SELECT with that JOIN first, to get an idea of how many rows are going to be impacted.  Based on your row counts, I would probably also have done another SELECT to get the counts grouped by the join columns.

What I suspect is that you have a heck of a lot of duplication (i.e. you are getting combinations of matches for some entries in the two tables rather than distinct matches).  In effect, that will multiply the number of rows being impacted.
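Something along these lines would show the join fan-out before committing to the UPDATE (a sketch only, using the table names from your post):

use [DBTwo]
-- How many rows will the joined UPDATE actually touch?
SELECT COUNT(*) AS joined_rows
FROM   DataFinal INNER JOIN [DBOne].dbo.CL
       ON DataFinal.Datasource = [DBOne].dbo.CL.Datasource
      AND DataFinal.Cnumber = [DBOne].dbo.CL.Cnumber;

-- Which join keys are duplicated in DataFinal (each duplicate multiplies the work)?
SELECT Datasource, Cnumber, COUNT(*) AS cnt
FROM   DataFinal
GROUP BY Datasource, Cnumber
HAVING COUNT(*) > 1;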
 
LVL 23

Assisted Solution

by:Racim BOUDJAKDJI
Racim BOUDJAKDJI earned 166 total points
ID: 34139780
<<In effect, that will multiply the number of rows being impacted.>>
A wise view.  Keep in mind that an INNER JOIN produces a Cartesian product between the duplicate cardinalities of the two joined tables.  Also, I can see that the script explicitly requests lock escalation on the entire table.  It may happen that the escalation never gets granted if the table is in use.
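If you want to verify or change that setting, something like this (a sketch; note that TABLE is actually the default in SQL Server 2008, and AUTO only adds partition-level escalation for partitioned tables):

-- Check the current lock escalation setting on the table.
SELECT name, lock_escalation_desc
FROM sys.tables
WHERE name = 'DataFinal';

-- AUTO allows partition-level escalation on partitioned tables;
-- TABLE always escalates straight to a full table lock.
ALTER TABLE dbo.DataFinal SET (LOCK_ESCALATION = AUTO);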
 
LVL 69

Assisted Solution

by:ScottPletcher
ScottPletcher earned 166 total points
ID: 34139841
If the volume of UPDATEs is high, I suspect you may also be waiting for the log to grow to accommodate the UPDATEs.

What is the log autogrowth amount on the db being UPDATEd?

What is the log size of the db being updated?  How much space was free *before* the UPDATE started? [Obviously unless you capture this info on some schedule, you can't tell now :-) .]

How fragmented was the log?  How many VLFs?  [Again, only if you've captured such info (which you should, for informational purposes, btw :-) ).]

Typically it helps such queries *a HUGE amount* if you *pre-allocate* the additional log space the UPDATE will need before running it.  Remember, log space *must* be zero-formatted before it can be used, so growing the log during the UPDATE roughly doubles the writes.
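If you want to check those numbers, the standard commands are (a sketch; DBCC LOGINFO reports on the current database, so run it in the db being updated):

-- Log size and percent used for every database on the instance.
DBCC SQLPERF(LOGSPACE);

-- One row per virtual log file (VLF); a very large row count means a
-- badly fragmented log.
USE [DBTwo];
DBCC LOGINFO;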
 
LVL 69

Assisted Solution

by:ScottPletcher
ScottPletcher earned 166 total points
ID: 34139863
Btw, I suggest you *not* cancel the UPDATE now.  The ROLLBACK will likely be far worse than just letting the UPDATE finish.

If the log is indeed autogrowing, you'd be better off adding log file(s) to the db being UPDATEd instead.  If you can get a reasonably large log file or two added to the db before the space is needed, you should be able to speed up the UPDATE.
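A sketch of what adding a log file might look like (logical name, path, and sizes are placeholders -- put it on a fast drive with free space):

ALTER DATABASE [DBTwo]
ADD LOG FILE
(
    NAME = DBTwo_log2,              -- placeholder logical name
    FILENAME = 'D:\DBTwo_log2.ldf', -- placeholder path
    SIZE = 10GB,
    FILEGROWTH = 1GB
);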
 

Author Comment

by:RBECKMN
ID: 34139869
in table CL, Cnumber + Datasource are unique.  in table DataFinal, there are probably about 5-10 rows for each Cnumber+Datasource value.  

i could create another table from DataFinal with a unique set of Cnumber+Datasource, then try the same query (i mean SELECT DISTINCT Cnumber, Datasource, ThruDate INTO xxx FROM DataFinal, then run the same update using xxx instead of DataFinal).  if the problem is duplicates from the inner join, this query should run relatively quickly.
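something like this is what i have in mind (just a sketch, xxx is a placeholder name):

use [DBTwo]
-- build a deduplicated copy of the join keys plus the column to update
SELECT DISTINCT Datasource, Cnumber, ThruDate
INTO   dbo.xxx
FROM   dbo.DataFinal;

-- run the equivalent update against the deduplicated table
UPDATE xxx
SET    ThruDate = [DBOne].dbo.CL.THRU_DT
FROM   dbo.xxx INNER JOIN [DBOne].dbo.CL
       ON xxx.Datasource = [DBOne].dbo.CL.Datasource
      AND xxx.Cnumber = [DBOne].dbo.CL.Cnumber;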

the lock escalation is only in the code that creates the index.  are you suggesting that it stays on after the commit?  it did not take very long to index the tables.
 

Author Comment

by:RBECKMN
ID: 34139884
i guess this is a separate question, but if i stop the query that's now running in order to run that test, is there any way to speed up the cancellation?  the last time i stopped a query like this, it took many hours to stop.
 
LVL 22

Assisted Solution

by:8080_Diver
8080_Diver earned 126 total points
ID: 34140033
If you stop the query now, you can expect it to take approximately as long to roll back as it has taken to run so far.  So I have to agree with Scott that stopping the update is probably a less than optimal solution.  However, it is also possible that you may run into space issues.

 
LVL 69

Assisted Solution

by:ScottPletcher
ScottPletcher earned 166 total points
ID: 34140857
From my experience, it typically takes about *twice* as long to roll back as it did to apply the updates.  In its own way that makes sense, since the log must be read -- I believe *backwards* -- and the updates undone in a less orderly fashion than they were applied.

By all means clean up the UPDATE as much as you can.  But before running a *big* UPDATE, pre-allocate the log space -- the difference in total run time can be *huge*.  Also, bump up the autogrowth increment to a significant amount, at least 200MB.

Please check the db and post what the autogrow amount for that log is.  If it's small, unfortunately you'll be in for an even longer haul :-(.
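To check (and later change) the log autogrowth setting, something like this would do (the logical file name is a placeholder -- take the real one from sys.database_files):

USE [DBTwo];
-- Current size (in MB) and growth settings of the log file.
SELECT name, size/128 AS size_mb, growth, is_percent_growth
FROM sys.database_files
WHERE type_desc = 'LOG';

-- Switch from percentage growth to a fixed 200MB increment.
ALTER DATABASE [DBTwo]
MODIFY FILE (NAME = DBTwo_log, FILEGROWTH = 200MB);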
 

Author Comment

by:RBECKMN
ID: 34141073
the autogrow is set to 10%.  the log is at 6g, so 10% should be about 600mb.   there's plenty of room on the drives.

i won't stop the query, but i wonder how much longer it's going to take - it's at 17 1/2 hours now.

another separate question that i think i know the answer to: i suppose there's no way to run this without any logging?

in regard to free space on the log before the query started, i shrunk it down to the minimum.  i was not able to add another log file nor to make it bigger, nor to change the percentage or switch it to unrestricted growth.

i'm going to write the query i mentioned above - where there is one-to-one match on the indexed fields between the two tables.  if the query doesn't stop by tomorrow morning, i'll try this new query with everything as is.

i appreciate the comments and suggestions.  i'm kind of stumped about this, because most of the queries i've run on these databases have not been slow at all.  

getting back to table partitioning, that's another thing to try, once this dang query stops.  would you recommend partitioning across drives or does it not matter?
 
LVL 69

Assisted Solution

by:ScottPletcher
ScottPletcher earned 166 total points
ID: 34141183
>> in regard to free space on the log before the query started, i shrunk it down to the minimum <<

So the early 10%s were getting very small amounts.  And it's up to 6G?  Yikes.

[Btw:
In the future, before you run a huge UPDATE like that, you should pre-allocate the log, not reduce it.

Make your very best estimate of the UPDATE size, and then increase the log allocation to that much plus a fudge factor, 10% or 20%.

For example, if you estimate the total UPDATE will need ~10G of log, increase the log to 11-12G.  Yes, there will be a pause after you run the command to increase the space, since SQL must format it, but the UPDATE will be *VASTLY* faster.  I've had 30-hour run times reduced to 45 mins by pre-allocating.  I don't know exactly why SQL is sometimes so slow auto-growing the log.  You will certainly get vastly less file fragmentation, and far fewer VLFs, from pre-allocation, which will reduce overhead considerably.
]
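A sketch of the pre-allocation itself (logical name and target size are placeholders, based on the ~10G estimate above):

-- Grow the log to the estimated need plus a fudge factor in one step,
-- rather than letting it autogrow in small chunks during the UPDATE.
ALTER DATABASE [DBTwo]
MODIFY FILE (NAME = DBTwo_log, SIZE = 12GB);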
 
LVL 23

Assisted Solution

by:Racim BOUDJAKDJI
Racim BOUDJAKDJI earned 166 total points
ID: 34142089
For better handling of large-transaction autogrowth by SQL Server, use the Instant File Initialization feature.  It requires granting an additional right to the SQL Server service account (Perform Volume Maintenance Tasks) plus a service restart, but it takes away a significant part of the overhead.

Hope this helps...
 
LVL 23

Assisted Solution

by:Racim BOUDJAKDJI
Racim BOUDJAKDJI earned 166 total points
ID: 34143621
More  info on database instant file initialization:

http://www.mssqltips.com/tip.asp?tip=1572
http://sqlblog.com/blogs/tibor_karaszi/archive/2009/03/09/do-you-have-instant-file-initialization.aspx

Although instant file initialization does not apply to the log file, the overhead of growing the data files to hold the latest transactions can be reduced significantly.
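A common way to check whether instant file initialization is actually in effect (a sketch along the lines of the second link; use a test instance if possible):

-- Trace flag 3004 logs file zeroing operations; 3605 routes them to the error log.
DBCC TRACEON(3004, 3605, -1);
GO
CREATE DATABASE IFITest;  -- throwaway database
GO
-- If the .mdf shows "Zeroing ..." messages, instant file initialization
-- is NOT enabled (the .ldf is always zeroed regardless).
EXEC sys.xp_readerrorlog 0, 1, N'Zeroing';
GO
DROP DATABASE IFITest;
DBCC TRACEOFF(3004, 3605, -1);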
 
LVL 22

Assisted Solution

by:8080_Diver
8080_Diver earned 126 total points
ID: 34145765
<<in regard to free space on the log before the query started, i shrunk it down to the minimum.>>

That was probably not the best idea.  If you had left it as large as it was, you would not have had to grow it as soon, and the growth would have been in larger increments from the start.  In general, unless you have seriously bloated a log file doing some massive update (which this one may qualify as ;-), shrinking your log file to the minimum initial allocation just means that it will have to grow back to its normal daily size.
 

Author Closing Comment

by:RBECKMN
ID: 34155360
thanks for all the help.  i set up another identical table, populated it with 3 million records, and ran the same update on the same server.  this one took six minutes.  so it must have something to do with the number of records, but i'm not sure what.
i've stopped the query that's been running since Saturday, and once it's stopped i'll try out the ideas in your comments.
Thanks again.