Want to win a PS4? Go Premium and enter to win our High-Tech Treats giveaway. Enter to Win


SQL Join Query on large table getting slower

Posted on 2010-09-16
Medium Priority
Last Modified: 2012-05-10
We are trying to update a very large SQL table (about 150 million records), by joining it with a smaller table (90,000 records), using SQL Server Workgroup 2008.

The query we are running is:
Update table1
set Bad_Account = table2.Bad_Account
from table1 inner join table2 on table1.LOCNUM = table2.LOCNUM
where table1.EVENTID >= 438651 and table1.EVENTID <= 439183

Table 1 is indexed with a nonclustered index only on the LOCNUM field. Table2 has a primary key on LOCNUM.

The query takes several hours to run. The reason that we have to select certain EVENTID's is because we have limited space on the server, and the resulting log file output would be too large if we tried to run all EVENTIDs at the same time.

What we do is:
-Run the above query for a range of about 500 EVENTID's
-Shrink the database manually in SQL server management studio
- Run the query again with the next 500 EVENTID's.

We have noticed that the query takes longer and longer to run each time. The first time it took about half an hour. By the time we ran the last query it took over 3 hours.

I suspect that there could be an issue with the indexes becoming fragmented. However, the query to initially create the index took 16 hours, so I don't think it would make sense to re-index very frequently.

Do you have any advice as to why the query may be taking longer each time, or how we could optimize the query in general?

Question by:jnelson219
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 3
  • 2
LVL 11

Expert Comment

ID: 33696850
if you could, I think you should try create new index on table1 to support this query

create index of EVENTID (and check query plan, if it isn't uses this index, we should give it INDEX HINT)
index by 2 fields  LOCNUM and EVENTID

since current situation, you have index only on LOCNUM,
even it help to narrow down the data to delete.
but in case data of each LOCNUM still large amount,
change the index column or got more index column should provide better result.
LVL 11

Expert Comment

ID: 33696902
and  instead of delete base on eventid,
why can't just delete it?

using TOP or SET ROWCOUNT to help
so you can remove (and ignore) WHERE condition of EVENTID

another note is, should not try to SHRINK database every loop of running, it's waste of time and not provide much help

Author Comment

ID: 33697038
That is a good idea. Let's say we wanted to break the query into segments of 10 million rows.
I understand how we could run the query to only execute for the first 10 million.

Set rowcount = 10000000;
Update table1
set Bad_Account = table2.Bad_Account
from table1 inner join table2 on table1.LOCNUM = table2.LOCNUM

But what would happen the second time that we ran the query -- how would we tell SQL to only update the rows that have not yet been updated?

I agree - I would like to not SHRINK the database each time, but the log file is taking up almost all of the space on our small server, and this is the only way I know to reduce the size of the log file (set to simple recovery mode).
LVL 11

Accepted Solution

JoeNuvo earned 750 total points
ID: 33697930
Oh, I get a point now.
I misconception with my own work on large table.
But that time it's the delete task, so it will never delete the same rows twice :-p

for your case. my suggestion is
1) either new index to support the query for EventID or LOCNUM + EventID (and uses your method of block update by eventid)
    but this way, you need to have enough disk space for index.
2) make a loop, get 1 LOCNUM from table2 for each round, and update that LOCNUM on table1
    this is expecting 1 LOCNUM should not have data more than 10 million (base on your rowcount setting)

Anyway, how much of space left for your log?

extra note : since you are perform update over and over.
LOG will consume back the space after you shrink it.
should be better to finish all the update and shrink after.

Author Closing Comment

ID: 33737934
The response did not solve the slow running of the query, but it was helpful information for optimizing the query execution.

Featured Post

Keep up with what's happening at Experts Exchange!

Sign up to receive Decoded, a new monthly digest with product updates, feature release info, continuing education opportunities, and more.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

     When we have to pass multiple rows of data to SQL Server, the developers either have to send one row at a time or come up with other workarounds to meet requirements like using XML to pass data, which is complex and tedious to use. There is a …
How to leverage one TLS certificate to encrypt Microsoft SQL traffic and Remote Desktop Services, versus creating multiple tickets for the same server.
Have you created a query with information for a calendar? ... and then, abra-cadabra, the calendar is done?! I am going to show you how to make that happen. Visualize your data!  ... really see it To use the code to create a calendar from a q…
Is your data getting by on basic protection measures? In today’s climate of debilitating malware and ransomware—like WannaCry—that may not be enough. You need to establish more than basics, like a recovery plan that protects both data and endpoints.…

604 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question