Aleks
asked on
Better a query that takes tooooo long to run
I have the query below, is there a way to change the syntax and get the same results while making it faster ? It takes about 30-40 seconds to return results :( should take a second or two the most. Help is appreciated.
---
SELECT activities.activityid as id, b.firstnm + ' ' + b.lastnm + ' ' + '(' + processcatalog.processcata log + ')' + ' ' + '-' + ' ' + activities.processtep as actdesc,activities.partyli st as longdesc, activities.schdemailon as datestarted, activities.schdemailon as datefinished, activities.actnote as repeat,activities.actnote as apptype FROM dbo.Activities
Inner join cases on activities.caseid = cases.id Inner join atts on cases.id = atts.caseid INNER JOIN Users ON atts.userid = users.userid INNER JOIN Users as b On cases.alienid = b.userid INNER JOIN Processcatalog ON cases.Process = Processcatalog.Processcata logID WHERE activities.FirmId = 71 and ActType ='HISTORY' and sent_by = 1 and atts.userid = 127273 and Schdemailon is not null
UNION ALL SELECT activities.activityid as id, activities.ActName as actdesc, activities.actdesc as longdesc,activities.datein itiated as datestarted,activities.las tmodified as datefinished,activities.pr ocesstep as repeat,activities.Partylis t as apptype
FROM ((dbo.Activities LEFT JOIN dbo.Cases ON dbo.Activities.CaseId = dbo.Cases.Id) ) LEFT JOIN dbo.Atts ON dbo.Atts.CaseId = dbo.Cases.Id INNER JOIN Users ON Activities.sent_by = Users.UserId
WHERE Activities.FirmId = 71 AND (Activities.ActType = 'Calendar') AND sent_by = 127273
----
---
SELECT activities.activityid as id, b.firstnm + ' ' + b.lastnm + ' ' + '(' + processcatalog.processcata
Inner join cases on activities.caseid = cases.id Inner join atts on cases.id = atts.caseid INNER JOIN Users ON atts.userid = users.userid INNER JOIN Users as b On cases.alienid = b.userid INNER JOIN Processcatalog ON cases.Process = Processcatalog.Processcata
UNION ALL SELECT activities.activityid as id, activities.ActName as actdesc, activities.actdesc as longdesc,activities.datein
FROM ((dbo.Activities LEFT JOIN dbo.Cases ON dbo.Activities.CaseId = dbo.Cases.Id) ) LEFT JOIN dbo.Atts ON dbo.Atts.CaseId = dbo.Cases.Id INNER JOIN Users ON Activities.sent_by = Users.UserId
WHERE Activities.FirmId = 71 AND (Activities.ActType = 'Calendar') AND sent_by = 127273
----
ASKER
Actually the top part for this specific query does not have a problem. The problem is in the bottom part.
---
SELECT activities.activityid as id, activities.ActName as actdesc, activities.actdesc as longdesc,activities.datein itiated as datestarted,activities.las tmodified as datefinished,activities.pr ocesstep as repeat,activities.Partylis t as apptype
FROM ((dbo.Activities LEFT JOIN dbo.Cases ON dbo.Activities.CaseId = dbo.Cases.Id) ) LEFT JOIN dbo.Atts ON dbo.Atts.CaseId = dbo.Cases.Id INNER JOIN Users ON Activities.sent_by = Users.UserId
WHERE Activities.FirmId = 71 AND (Activities.ActType = 'Calendar') AND sent_by = 127273
---
That is the part that takes forever.
---
SELECT activities.activityid as id, activities.ActName as actdesc, activities.actdesc as longdesc,activities.datein
FROM ((dbo.Activities LEFT JOIN dbo.Cases ON dbo.Activities.CaseId = dbo.Cases.Id) ) LEFT JOIN dbo.Atts ON dbo.Atts.CaseId = dbo.Cases.Id INNER JOIN Users ON Activities.sent_by = Users.UserId
WHERE Activities.FirmId = 71 AND (Activities.ActType = 'Calendar') AND sent_by = 127273
---
That is the part that takes forever.
Ok. This subquery starts with Cases, and left joins the Activities, Atts, and User tables.
Assuming you have the proper indexes on these tables, you may well see an improvement by filtering the rows from the Activities table before you left join the other tables. See my example below, though I'm just guessing that "sent_by" is in the Activities table.
Kent
Assuming you have the proper indexes on these tables, you may well see an improvement by filtering the rows from the Activities table before you left join the other tables. See my example below, though I'm just guessing that "sent_by" is in the Activities table.
Kent
SELECT
activities.activityid as id,
activities.ActName as actdesc,
activities.actdesc as longdesc,
activities.dateinitiated as datestarted,
activities.lastmodified as datefinished,
activities.processtep as repeat,
activities.Partylist as apptype
FROM
(
SELECT * FROM dbo.Activities
WHERE Activities.FirmId = 71
AND (Activities.ActType = 'Calendar')
AND sent_by = 127273
) t0
LEFT JOIN dbo.Cases
ON t0.CaseId = dbo.Cases.Id
LEFT JOIN dbo.Atts
ON dbo.Atts.CaseId = dbo.Cases.Id
INNER JOIN Users
ON Activities.sent_by = Users.UserId
Sorry, it starts with Activities, and left joins those other tables. :)
ASKER
So, should I try the query above instead ?
Sure. :)
ASKER
Ran the above query and got this error:
Msg 4104, Level 16, State 1, Line 21
The multi-part identifier "Activities.sent_by" could not be bound.
Msg 4104, Level 16, State 1, Line 2
The multi-part identifier "activities.activityid" could not be bound.
Msg 4104, Level 16, State 1, Line 3
The multi-part identifier "activities.ActName" could not be bound.
Msg 4104, Level 16, State 1, Line 4
The multi-part identifier "activities.actdesc" could not be bound.
Msg 4104, Level 16, State 1, Line 5
The multi-part identifier "activities.dateinitiated" could not be bound.
Msg 4104, Level 16, State 1, Line 6
The multi-part identifier "activities.lastmodified" could not be bound.
Msg 4104, Level 16, State 1, Line 7
The multi-part identifier "activities.processtep" could not be bound.
Msg 4104, Level 16, State 1, Line 8
The multi-part identifier "activities.Partylist" could not be bound.
Msg 4104, Level 16, State 1, Line 21
The multi-part identifier "Activities.sent_by" could not be bound.
Msg 4104, Level 16, State 1, Line 2
The multi-part identifier "activities.activityid" could not be bound.
Msg 4104, Level 16, State 1, Line 3
The multi-part identifier "activities.ActName" could not be bound.
Msg 4104, Level 16, State 1, Line 4
The multi-part identifier "activities.actdesc" could not be bound.
Msg 4104, Level 16, State 1, Line 5
The multi-part identifier "activities.dateinitiated"
Msg 4104, Level 16, State 1, Line 6
The multi-part identifier "activities.lastmodified" could not be bound.
Msg 4104, Level 16, State 1, Line 7
The multi-part identifier "activities.processtep" could not be bound.
Msg 4104, Level 16, State 1, Line 8
The multi-part identifier "activities.Partylist" could not be bound.
The query is what I intended, I just misspoke when I described the process. :(
Apologies....
Oops. Take the qualifiers off of the selected items. (Or rename the qualifier to 't0').
ASKER
The query is not working :(
ASKER
Ummmm ... I may sound REALLY stupid but I wouldn't know how to do it, i tried the below query but same errors:
SELECT
activities.activityid as id,
activities.ActName as actdesc,
activities.actdesc as longdesc,
activities.dateinitiated as datestarted,
activities.lastmodified as datefinished,
activities.processtep as repeat,
activities.Partylist as apptype
FROM
(
SELECT * FROM dbo.Activities
WHERE Activities.FirmId = 71
AND (Activities.ActType = 'Calendar')
AND sent_by = 127273
) t9
LEFT JOIN dbo.Cases
ON t9.CaseId = dbo.Cases.Id
LEFT JOIN dbo.Atts
ON dbo.Atts.CaseId = dbo.Cases.Id
INNER JOIN Users
ON Activities.sent_by = Users.UserId
SELECT
activities.activityid as id,
activities.ActName as actdesc,
activities.actdesc as longdesc,
activities.dateinitiated as datestarted,
activities.lastmodified as datefinished,
activities.processtep as repeat,
activities.Partylist as apptype
FROM
(
SELECT * FROM dbo.Activities
WHERE Activities.FirmId = 71
AND (Activities.ActType = 'Calendar')
AND sent_by = 127273
) t9
LEFT JOIN dbo.Cases
ON t9.CaseId = dbo.Cases.Id
LEFT JOIN dbo.Atts
ON dbo.Atts.CaseId = dbo.Cases.Id
INNER JOIN Users
ON Activities.sent_by = Users.UserId
ASKER
Also, if I make changes to this part of the query, will it affect the first part before the UNION ?
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
Thanks, unfortunately it did not make it any faster, it went over 30 seconds then I had to stop it.
This is what I used:
SELECT
t0.activityid as id,
t0.ActName as actdesc,
t0.actdesc as longdesc,
t0.dateinitiated as datestarted,
t0.lastmodified as datefinished,
t0.processtep as repeat,
t0.Partylist as apptype
FROM
(
SELECT * FROM dbo.Activities
WHERE Activities.FirmId = 71
AND (Activities.ActType = 'Calendar')
AND sent_by = 127273
) t0
LEFT JOIN dbo.Cases
ON t0.CaseId = dbo.Cases.Id
LEFT JOIN dbo.Atts
ON dbo.Atts.CaseId = dbo.Cases.Id
INNER JOIN Users
ON sent_by = Users.UserId
-
This is what I used:
SELECT
t0.activityid as id,
t0.ActName as actdesc,
t0.actdesc as longdesc,
t0.dateinitiated as datestarted,
t0.lastmodified as datefinished,
t0.processtep as repeat,
t0.Partylist as apptype
FROM
(
SELECT * FROM dbo.Activities
WHERE Activities.FirmId = 71
AND (Activities.ActType = 'Calendar')
AND sent_by = 127273
) t0
LEFT JOIN dbo.Cases
ON t0.CaseId = dbo.Cases.Id
LEFT JOIN dbo.Atts
ON dbo.Atts.CaseId = dbo.Cases.Id
INNER JOIN Users
ON sent_by = Users.UserId
-
Can you identify the indexes on these tables?
ASKER
There are no indexes. How can I create an index that would help the query above ?
The principal join entity seems to be the ID column in each of these tables. Create an index on each of them
CREATE INDEX idxact001 on dbo.Activities (Activityid);
CREATE INDEX idxcas001 on dbo.Cases (id);
CREATE INDEX idxatt001 on dbo.Atts (Caseid);
CREATE INDEX idxusr001 on dbo.Users (UserId);
ASKER
I am not familiar with indexes, what would the above code do ? why those numbers ?
I just need to ask this so I fully understand what this will do since it is a LIVE database. My provider recommended the query below, it looks very different from the recomendation above, why would this be ?
What are the 'create statistics' statements for ?
CREATE CLUSTERED INDEX [_dta_index_Activities_c_1 1_17022971 24__K17_K5 ] ON [dbo].[Activities]
(
[sent_by] ASC,
[ActType] ASC
)WITH (SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF) ON [PRIMARY]
go
CREATE STATISTICS [_dta_stat_1702297124_1_9_ 10] ON [dbo].[Activities]([Activi tyId], [DateInitiated], [LastModified])
go
CREATE STATISTICS [_dta_stat_1702297124_1_19 _17] ON [dbo].[Activities]([Activi tyId], [Schdemailon], [sent_by])
go
CREATE STATISTICS [_dta_stat_1702297124_17_1 _9_10] ON [dbo].[Activities]([sent_b y], [ActivityId], [DateInitiated], [LastModified])
go
CREATE STATISTICS [_dta_stat_1702297124_2_17 _1_9_10] ON [dbo].[Activities]([CaseId ], [sent_by], [ActivityId], [DateInitiated], [LastModified])
go
CREATE STATISTICS [_dta_stat_1702297124_1_19 _9_10_17] ON [dbo].[Activities]([Activi tyId], [Schdemailon], [DateInitiated], [LastModified], [sent_by])
go
CREATE STATISTICS [_dta_stat_1702297124_17_5 _1_19_9_10 ] ON [dbo].[Activities]([sent_b y], [ActType], [ActivityId], [Schdemailon], [DateInitiated], [LastModified])
go
CREATE STATISTICS [_dta_stat_1702297124_17_2 _5_1_19_9_ 10] ON [dbo].[Activities]([sent_b y], [CaseId], [ActType], [ActivityId], [Schdemailon], [DateInitiated], [LastModified])
go
CREATE NONCLUSTERED INDEX [_dta_index_Users_11_39029 2450__K1_6 _7] ON [dbo].[Users]
(
[UserId] ASC
)
INCLUDE ( [LastNm],
[FirstNm]) WITH (SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF) ON [PRIMARY]
go
CREATE STATISTICS [_dta_stat_309576141_1_67_ 16] ON [dbo].[Cases]([Id], [Process], [AlienId])
go
CREATE STATISTICS [_dta_stat_199671759_4_3] ON [dbo].[Atts]([UserId], [CaseId])
go--
I just need to ask this so I fully understand what this will do since it is a LIVE database. My provider recommended the query below, it looks very different from the recomendation above, why would this be ?
What are the 'create statistics' statements for ?
CREATE CLUSTERED INDEX [_dta_index_Activities_c_1
(
[sent_by] ASC,
[ActType] ASC
)WITH (SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF) ON [PRIMARY]
go
CREATE STATISTICS [_dta_stat_1702297124_1_9_
go
CREATE STATISTICS [_dta_stat_1702297124_1_19
go
CREATE STATISTICS [_dta_stat_1702297124_17_1
go
CREATE STATISTICS [_dta_stat_1702297124_2_17
go
CREATE STATISTICS [_dta_stat_1702297124_1_19
go
CREATE STATISTICS [_dta_stat_1702297124_17_5
go
CREATE STATISTICS [_dta_stat_1702297124_17_2
go
CREATE NONCLUSTERED INDEX [_dta_index_Users_11_39029
(
[UserId] ASC
)
INCLUDE ( [LastNm],
[FirstNm]) WITH (SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF) ON [PRIMARY]
go
CREATE STATISTICS [_dta_stat_309576141_1_67_
go
CREATE STATISTICS [_dta_stat_199671759_4_3] ON [dbo].[Atts]([UserId], [CaseId])
go--
ASKER
I checked, the indexes where already there, still takes forever to display. :(
Hi BlueDot,
Let's see how much data we're dealing with. Run this query:
SELECT count(*)
FROM dbo.Activities
WHERE Activities.FirmId = 71
AND (Activities.ActType = 'Calendar')
AND sent_by = 127273
My guess is that it's a pretty big number, (database wise).
Kent
Let's see how much data we're dealing with. Run this query:
SELECT count(*)
FROM dbo.Activities
WHERE Activities.FirmId = 71
AND (Activities.ActType = 'Calendar')
AND sent_by = 127273
My guess is that it's a pretty big number, (database wise).
Kent
ASKER
The problem where the joins, once I did that then we did have a LOT of data. thx for the help
ASKER
Where can I learn more about indexes and how to create them to make my database faster ?
Interesting question. :)
Indexes that are added to a database after it has been operational for a while are usually the result of a bad design, or of a change in needs. In your example, most of the fields that you're using as your join keys are probably primary keys, meaning that the designers intended for them to be used this way and indexed them.
Creating indexes on a running system can speed up queries, but there may be a serious performance trade-off. Whenever a row is added, ALL of the indexes on the table have to be updated. When a row is modified, every index that covers the changed column(s) must be updated. Add a lot of indexes and you might find that inserts and updates take too long for your application to be effective. (That's one of the reasons that OLTP (recording databases) and OLAP (reporting databases) systems are so different. The OLTP puts a premium on efficiently storing data while the OLAP focuses on the retrieval of data.
You might start by reading about OLTP and OLAP systems. A basic understanding of them will go a long way toward understanding how data is organized and how indexes can help/hurt a database.
Kent
Indexes that are added to a database after it has been operational for a while are usually the result of a bad design, or of a change in needs. In your example, most of the fields that you're using as your join keys are probably primary keys, meaning that the designers intended for them to be used this way and indexed them.
Creating indexes on a running system can speed up queries, but there may be a serious performance trade-off. Whenever a row is added, ALL of the indexes on the table have to be updated. When a row is modified, every index that covers the changed column(s) must be updated. Add a lot of indexes and you might find that inserts and updates take too long for your application to be effective. (That's one of the reasons that OLTP (recording databases) and OLAP (reporting databases) systems are so different. The OLTP puts a premium on efficiently storing data while the OLAP focuses on the retrieval of data.
You might start by reading about OLTP and OLAP systems. A basic understanding of them will go a long way toward understanding how data is organized and how indexes can help/hurt a database.
Kent
Run a small test and see which half of the query is taking all of the time.
Run the top half (above the UNION ALL) and see how long it takes. Repeat for the lower half (below the UNION ALL). Post the timings.
Kent