Solved

IMprove sql performace

Posted on 2015-02-18
13
56 Views
Last Modified: 2015-02-19
I Need help improving sql query. It consume a lot of RAM and have so many reads.
SELECT 
	U.userKey
	, DateDiff(yy, CAST(U.dateOfBirth AS DATETIME), GETDATE()) AS studentAge
	, 1 AS InClassIn
	, 1 AS BTWIn
	, 1 AS RSIn
	, SM.sessionKey AS DCKey
	, (
		SELECT TOP 1 SM.sessionKey
		FROM sessionMap SM 
		WHERE U.userKey = SM.userKey
			AND SM.type IN ('BW','SBW')
			AND SM.sessionKey != 0
		ORDER BY SM.type, SM.sessionMapKey
	) AS BWKey
	, (
		SELECT TOP 1 SMX.sessionKey
		FROM sessionMap SMX 
			INNER JOIN sessionUnit SU  ON SMX.sessionKey = SU.sessionKey
		WHERE U.userKey = SMX.userKey
			AND SMX.type IN ('BW','SBW')
			AND EXISTS (
				SELECT 1
				FROM userScore US 
				WHERE SU.sessionUnitKey = US.sessionUnitKey
					AND SMX.userKey = US.userKey
					AND US.userScoreType = 'VASkill'
			)
		ORDER BY SU.sessionStart desc
	) AS RSKey
FROM users U
	INNER JOIN sessionMap SM  ON U.userKey = SM.userKey
		AND SM.type = 'DC'
	INNER JOIN session S  ON SM.sessionKey = S.sessionKey
WHERE U.availStateKey = 2
	AND ISNULL(U.VAFailIn, 0) = 0
	AND U.TrackReqIn = 1
	AND U.InClassCompleteIn = 0
UNION
SELECT 
	U.userKey
	, DateDiff(yy, CAST(U.dateOfBirth AS DATETIME), GETDATE()) AS studentAge
	, 0 AS InClassIn
	, 1 AS BTWIn
	, 1 AS RSIn
	, NULL
	, (
		SELECT TOP 1 SM.sessionKey
		FROM sessionMap SM 
		WHERE U.userKey = SM.userKey
			AND SM.type IN ('BW','SBW')
			AND SM.sessionKey != 0
		ORDER BY SM.type, SM.sessionMapKey
	)
	, (
		SELECT TOP 1 SMX.sessionKey
		FROM sessionMap SMX 
			INNER JOIN sessionUnit SU  ON SMX.sessionKey = SU.sessionKey
		WHERE U.userKey = SMX.userKey
			AND SMX.type IN ('BW','SBW')
			AND EXISTS (
				SELECT 1
				FROM userScore US 
				WHERE SU.sessionUnitKey = US.sessionUnitKey
					AND SMX.userKey = US.userKey
					AND US.userScoreType = 'VASkill'
			)
		ORDER BY SU.sessionStart desc
	)
FROM users U
WHERE U.availStateKey = 2
	AND U.InClassCompleteIn = 1
	AND U.TrackReqIn = 1
	AND ISNULL(U.VAFailIn, 0) = 0
UNION
SELECT 
	U.userKey
	, DateDiff(yy, CAST(U.dateOfBirth AS DATETIME), GETDATE()) AS studentAge
	, 1 AS InClassIn
	, 1 AS BTWIn
	, 0 AS RSIn
	, SM.sessionKey
	, (
		SELECT TOP 1 SM.sessionKey
		FROM sessionMap SM 
		WHERE U.userKey = SM.userKey
			AND SM.type IN ('BW','SBW')
			AND SM.sessionKey != 0
		ORDER BY SM.type, SM.sessionMapKey DESC
	) 
	, NULL
FROM users U
	INNER JOIN sessionMap SM  ON U.userKey = SM.userKey
		AND SM.type = 'DC'
	INNER JOIN session S  ON SM.sessionKey = S.sessionKey
WHERE U.availStateKey = 2
	AND U.VAFailIn = 1

Open in new window

0
Comment
Question by:erikTsomik
  • 5
  • 5
  • 2
  • +1
13 Comments
 
LVL 19

Author Comment

by:erikTsomik
ID: 40617120
Any suggestions
0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 40618193
yes.

Example of "correlated subquery"

      , (
            SELECT TOP 1 SM.sessionKey
            FROM sessionMap SM
            WHERE U.userKey = SM.userKey
                  AND SM.type IN ('BW','SBW')
                  AND SM.sessionKey != 0
            ORDER BY SM.type, SM.sessionMapKey
      ) AS BWKey

Alias U is a reference defined "outside" of the subquery (where only the alias SM is defined)
THAT is a "correlation", which means that for each value of U.userKey this subquery is executed!

Then ADD MORE EXPENSE by requiring each of those multiple subquery reads to be ORDERED!

and so any "correlated subquery" will cause multiple reads.

----------------------------------

Now make this potentially worse by containing all of those "correlated subqueries" inside UNION

UNION compares all rows to determine if the overall result will be a unique set of rows, so some rows can be ignored i.e. some (perhaps many) correlated subqueries have been fired off to produce result rows, but then those are eliminated by UNION. A large potential for wasted effort.

----------------------------------

UNION by itself is an "expensive operation"  and effort to avoid or minimize this by should be employed.

Note the "constants" used

1st query
      , 1 AS InClassIn
      , 1 AS BTWIn
      , 1 AS RSIn
2nd query
      , 0 AS InClassIn
      , 1 AS BTWIn
      , 1 AS RSIn
3rd query
      , 1 AS InClassIn
      , 1 AS BTWIn
      , 0 AS RSIn

this means there are no rows of the first query that can be exactly equal to rows of the second or third, etc.
So UNION will only remove "duplicates" WITHIN the first query or WITHIN the second query or WITHIN  the third query

It would be ever so much more efficient to eliminate duplicates in each of the 3 individual queries then use UNION ALL
(nb. UNION ALL is LESS "expensive" because it does not eliminate duplicated rows)
0
 
LVL 19

Author Comment

by:erikTsomik
ID: 40618208
Basically what you suggesting is I need to change to this

 , (
            SELECT TOP 1 SM.sessionKey
            FROM sessionMap SM
            WHERE SM.userKey= U.userKey
                  AND SM.type IN ('BW','SBW')
                  AND SM.sessionKey != 0
            ORDER BY SM.type, SM.sessionMapKey
      ) AS BWKey
0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 40618212
The following query does NOT incorporate everything I have suggested above, but it's a start. In essence I have replaced a correlated subquery performed in each of the 3 queries with a single CTE using ROW_NUMBER, than that is joined (using a left join which mimics the impact of the correlated subquery). This should reduce some effort by the query.

I have also "cheated" a little by replacing both UNIONs with UNION ALL
but the first and third queries now use SELECT DISTINCT
It is only the first & third queries that JOIN to tables that could produce duplicates
(or conversely I don't think the second query can produce duplicates so it does't need select distinct)

See if this helps at all please.
;
WITH CTE
AS (
      SELECT
            SM.userKey
          , SM.sessionKey
          , ROW_NUMBER() OVER (PARTITION BY SM.userKey ORDER BY SM.type, SM.sessionMapKey) AS RN
      FROM sessionMap SM
      WHERE SM.sessionKey != 0
)
SELECT DISTINCT
      U.userKey
    , DATEDIFF(yy, CAST(U.dateOfBirth AS datetime), GETDATE()) AS STUDENTAGE
    , 1 AS INCLASSIN
    , 1 AS BTWIN
    , 1 AS RSIN
    , SM.sessionKey AS DCKEY
    , CTE.sessionKey AS BWKEY
    , (
            SELECT TOP 1
                  SMX.sessionKey
            FROM sessionMap SMX
            INNER JOIN sessionUnit SU ON SMX.sessionKey = SU.sessionKey
            WHERE U.userKey = SMX.userKey
                  AND SMX.type IN ('BW', 'SBW')
                  AND EXISTS (
                        SELECT
                              1
                        FROM userScore US
                        WHERE SU.sessionUnitKey = US.sessionUnitKey
                              AND SMX.userKey = US.userKey
                              AND US.userScoreType = 'VASkill'
                  )
            ORDER BY SU.sessionStart DESC
      ) AS RSKEY
FROM users U
INNER JOIN sessionMap SM ON U.userKey = SM.userKey
            AND SM.type = 'DC'
INNER JOIN session S ON SM.sessionKey = S.sessionKey
LEFT JOIN CTE ON U.userKey = CTE.userKey
            AND CTE.RN = 1
WHERE U.availStateKey = 2
      AND ISNULL(U.VAFailIn, 0) = 0
      AND U.TrackReqIn = 1
      AND U.InClassCompleteIn = 0
UNION ALL
      SELECT
            U.userKey
          , DATEDIFF(yy, CAST(U.dateOfBirth AS datetime), GETDATE()) AS STUDENTAGE
          , 0 AS INCLASSIN
          , 1 AS BTWIN
          , 1 AS RSIN
          , NULL
          , CTE.sessionKey AS BWKEY
          , (
                  SELECT TOP 1
                        SMX.sessionKey
                  FROM sessionMap SMX
                  INNER JOIN sessionUnit SU ON SMX.sessionKey = SU.sessionKey
                  WHERE U.userKey = SMX.userKey
                        AND SMX.type IN ('BW', 'SBW')
                        AND EXISTS (
                              SELECT
                                    1
                              FROM userScore US
                              WHERE SU.sessionUnitKey = US.sessionUnitKey
                                    AND SMX.userKey = US.userKey
                                    AND US.userScoreType = 'VASkill'
                        )
                  ORDER BY SU.sessionStart DESC
            )
      FROM users U
      LEFT JOIN CTE ON U.userKey = CTE.userKey
                  AND CTE.RN = 1
      WHERE U.availStateKey = 2
            AND U.InClassCompleteIn = 1
            AND U.TrackReqIn = 1
            AND ISNULL(U.VAFailIn, 0) = 0
UNION ALL
      SELECT DISTINCT
            U.userKey
          , DATEDIFF(yy, CAST(U.dateOfBirth AS datetime), GETDATE()) AS STUDENTAGE
          , 1 AS INCLASSIN
          , 1 AS BTWIN
          , 0 AS RSIN
          , SM.sessionKey
          , CTE.sessionKey AS BWKEY
          , NULL
      FROM users U
      INNER JOIN sessionMap SM ON U.userKey = SM.userKey
                  AND SM.type = 'DC'
      INNER JOIN session S ON SM.sessionKey = S.sessionKey
      LEFT JOIN CTE ON U.userKey = CTE.userKey
                  AND CTE.RN = 1
      WHERE U.availStateKey = 2
            AND U.VAFailIn = 1
;

Open in new window


If you want more I suggest you produce an execution plan file (.sqlplan) and attach it to this question.
Please note, without knowledge of your tables and data it's quite hard to advise in detail on performance tuning.
0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 40618214
>>Basically what you suggesting is I need to change to this
NO there is MUCH more, pleas re-read the bit about constants. If you don't understand let me know.

but yes that subquery is causing pain I think. I would like to eradicate the other one too!

i.e. I would like to do a lot more, but not having any access to your tables, indexes and data, all of which influence the outcome, I can only make suggestions.

e.g. I HATE "select distinct" and there are usually better ways (typically a group by)
see http://www.experts-exchange.com/Programming/Languages/SQL_Syntax/A_12282-Select-Distinct-is-returning-duplicates.html

I would typically work with those 3 queries individually, optimise each one, and THEN put them together.
0
 
LVL 45

Expert Comment

by:aikimark
ID: 40618228
Your studentage calculation will produce inaccurate results, since DateDiff() subtracts the Year() results of the two date parameters.
0
PRTG Network Monitor: Intuitive Network Monitoring

Network Monitoring is essential to ensure that computer systems and network devices are running. Use PRTG to monitor LANs, servers, websites, applications and devices, bandwidth, virtual environments, remote systems, IoT, and many more. PRTG is easy to set up & use.

 
LVL 48

Expert Comment

by:PortletPaul
ID: 40618234
Sorry for the multiple posts but ideas pop in, so:
 
A DIFFERENT way to handle it could be along these lines:

-- overall select (an "outer query")
select several fields
FROM (
            -- 1st
            select fewer fields, + unique subquery
            from several tables (possible repeated rows)
            GROUP BY to eradicate repetition
            UNION ALL

            -- 2nd query
            select fewer fields
            from single table
            UNION ALL

            --3rd query
            select fewer fields
            from several tables (possible repeated rows)
            GROUP BY to eradicate repetition
    )
 NOW JOIN for the "common subquery" information
 LEFT JOIN (the CTE query goes here ) ON ...

and even that last remaining correleted subquery could be replaced
0
 
LVL 48

Accepted Solution

by:
PortletPaul earned 500 total points
ID: 40618255
right, here's a more complete attempt, fingers crossed (no testing possible)

This replaces all correlated subqueries, no CTEs needed, no select distinct now (group by instead)
the age calc can be improved as noted by akimark but I haven't changed it (but have reduced to one place to change)
I have also introduced sargable predicates instead of this AND ISNULL(U.VAFailIn, 0) = 0
SELECT
      userKey
    , DATEDIFF(yy, CAST(dateOfBirth AS datetime), GETDATE()) AS STUDENTAGE --<< this calc can be improved
    , INCLASSIN
    , BTWIN
    , RSIN
    , DCKEY
    , SM.sessionKey AS BWKEY
    , SUX.sessionKey AS RSKEY
FROM (
            -- 1st query
            SELECT
                  U.userKey
                , U.dateOfBirth
                , 1 AS INCLASSIN
                , 1 AS BTWIN
                , 1 AS RSIN
                , SM.sessionKey AS DCKEY
            FROM users U
            INNER JOIN sessionMap SM ON U.userKey = SM.userKey
                        AND SM.type = 'DC'
            INNER JOIN session S ON SM.sessionKey = S.sessionKey
            WHERE U.availStateKey = 2
                  AND (U.VAFailIn = 0 OR U.VAFailIn IS NULL) --<< sargable
                  AND U.TrackReqIn = 1
                  AND U.InClassCompleteIn = 0
            GROUP BY
                  U.userKey
                , U.dateOfBirth
                , SM.sessionKey

            UNION ALL
            -- 2nd query
                  SELECT
                        U.userKey
                      , U.dateOfBirth
                      , 0 AS INCLASSIN
                      , 1 AS BTWIN
                      , 1 AS RSIN
                      , NULL AS DCKEY
                  FROM users U
                  WHERE U.availStateKey = 2
                        AND U.InClassCompleteIn = 1
                        AND U.TrackReqIn = 1
                        AND (U.VAFailIn = 0 OR U.VAFailIn IS NULL) --<< sargable
            UNION ALL
            -- 3rd query
                  SELECT
                        U.userKey
                      , U.dateOfBirth
                      , 1 AS INCLASSIN
                      , 1 AS BTWIN
                      , 0 AS RSIN
                      , SM.sessionKey AS DCKEY
                  FROM users U
                  INNER JOIN sessionMap SM ON U.userKey = SM.userKey
                              AND SM.type = 'DC'
                  INNER JOIN session S ON SM.sessionKey = S.sessionKey
                  WHERE U.availStateKey = 2
                        AND U.VAFailIn = 1
                  GROUP BY
                        U.userKey
                      , U.dateOfBirth
      ) U123
LEFT JOIN (
            SELECT
                  userKey
                , sessionKey
                , ROW_NUMBER() OVER (PARTITION BY userKey ORDER BY type, sessionMapKey) AS RN
            FROM sessionMap
            WHERE sessionKey != 0
            ) SM ON U123.userKey = SM.userKey
LEFT JOIN (
      SELECT
            SMX.userKey
          , SMX.sessionKey
          , ROW_NUMBER() OVER (PARTITION BY SMX.userKey ORDER BY SU.sessionStart DESC) AS RN
      FROM sessionMap SMX
      INNER JOIN sessionUnit SU ON SMX.sessionKey = SU.sessionKey
      WHERE SMX.type IN ('BW', 'SBW')
            AND EXISTS (
                  SELECT
                        1
                  FROM userScore US
                  WHERE SU.sessionUnitKey = US.sessionUnitKey
                        AND SMX.userKey = US.userKey
                        AND US.userScoreType = 'VASkill'
                  )
      ) SUX ON U123.userKey = SUX.userKey AND SUX.RN = 1 AND U123.RSIN = 1

Open in new window

0
 
LVL 19

Author Comment

by:erikTsomik
ID: 40619071
I ran this query but it return the way more records than the original query. Thge new query returns 13035 records vs old query 2353. I should only get 1 line per userKey
0
 
LVL 45

Expert Comment

by:Vitor Montalvão
ID: 40619211
I've transformed your subqueries in CTEs. Check if the performance is better now and the more important is that returns the same number of rows:
WITH CTE_BWKey (userKey, [type], sessionMapKey, BWKey) AS
(SELECT SM.userKey, SM.type, SM.sessionMapKey, MAX(SM.sessionKey)
 FROM users U
	INNER JOIN sessionMap SM ON U.userKey = SM.userKey
 WHERE SM.type IN ('BW','SBW')
			AND SM.sessionKey != 0
 GROUP BY SM.userKey, SM.type, SM.sessionMapKey),

CTE_RSKey (userKey, RSKey) AS
(SELECT userKey, MAX(SMX.sessionKey)
 FROM users U
	INNER JOIN sessionMap SMX ON U.userKey = SM.userKey 
	INNER JOIN sessionUnit SU ON SMX.sessionKey = SU.sessionKey
 WHERE SMX.type IN ('BW','SBW')
	AND EXISTS (
				SELECT 1
				FROM userScore US 
				WHERE SU.sessionUnitKey = US.sessionUnitKey
					AND SMX.userKey = US.userKey
					AND US.userScoreType = 'VASkill'
				)
 GROUP BY userKey)

SELECT 
	U.userKey
	, DateDiff(yy, CAST(U.dateOfBirth AS DATETIME), GETDATE()) AS studentAge
	, 1 AS InClassIn
	, 1 AS BTWIn
	, 1 AS RSIn
	, SM.sessionKey AS DCKey
	, BW.BWKey
	, RS.RSKey
FROM users U
	INNER JOIN sessionMap SM ON U.userKey = SM.userKey AND SM.type = 'DC'
	INNER JOIN session S  ON SM.sessionKey = S.sessionKey
	INNER JOIN CTE_BWKey BW ON U.userKey = BW.userKey
	INNER JOIN CTE_RSKey RS ON U.userKey = RS.userKey
WHERE U.availStateKey = 2
	AND ISNULL(U.VAFailIn, 0) = 0
	AND U.TrackReqIn = 1
	AND U.InClassCompleteIn = 0
UNION
SELECT 
	U.userKey
	, DateDiff(yy, CAST(U.dateOfBirth AS DATETIME), GETDATE()) AS studentAge
	, 0 AS InClassIn
	, 1 AS BTWIn
	, 1 AS RSIn
	, NULL
	, BW.BWKey
	, RS.RSKey
FROM users U
	INNER JOIN CTE_BWKey BW ON U.userKey = BW.userKey
	INNER JOIN CTE_RSKey RS ON U.userKey = RS.userKey
WHERE U.availStateKey = 2
	AND U.InClassCompleteIn = 1
	AND U.TrackReqIn = 1
	AND ISNULL(U.VAFailIn, 0) = 0
UNION
SELECT 
	U.userKey
	, DateDiff(yy, CAST(U.dateOfBirth AS DATETIME), GETDATE()) AS studentAge
	, 1 AS InClassIn
	, 1 AS BTWIn
	, 0 AS RSIn
	, SM.sessionKey
	, BW.BWKey
	, NULL
FROM users U
	INNER JOIN sessionMap SM  ON U.userKey = SM.userKey AND SM.type = 'DC'
	INNER JOIN session S  ON SM.sessionKey = S.sessionKey
	INNER JOIN CTE_BWKey BW ON U.userKey = BW.userKey
WHERE U.availStateKey = 2
	AND U.VAFailIn = 1

Open in new window

0
 
LVL 19

Author Comment

by:erikTsomik
ID: 40619308
HI Victor your query return multiple user records  as well and qeury takes 3 sec to run and returns 9907 records. I think the problem is in here. If you look at the original post  the BTWKey works differently for each section of the UNION

WITH CTE_BWKey (userKey, [type], sessionMapKey, BWKey) AS
(SELECT SM.userKey, SM.type, SM.sessionMapKey, MAX(SM.sessionKey)
 FROM users U
      INNER JOIN sessionMap SM ON U.userKey = SM.userKey
 WHERE SM.type IN ('BW','SBW')
                  AND SM.sessionKey != 0
 GROUP BY SM.userKey, SM.type, SM.sessionMapKey),
0
 
LVL 45

Expert Comment

by:aikimark
ID: 40619417
Here is a reliable expression I use for calculating patient age when they are scanned in one of my applications.
Int((year(scandate)+(month(scandate)/12 +day(scandate)/10000)) - (year(dob)+(month(dob)/12 +day(dob)/10000)))

Open in new window

0
 
LVL 19

Author Closing Comment

by:erikTsomik
ID: 40619724
Great
0

Featured Post

Find Ransomware Secrets With All-Source Analysis

Ransomware has become a major concern for organizations; its prevalence has grown due to past successes achieved by threat actors. While each ransomware variant is different, we’ve seen some common tactics and trends used among the authors of the malware.

Join & Write a Comment

Confronted with some SQL you don't know can be a daunting task. It can be even more daunting if that SQL carries some of the old secret codes used in the Ye Olde query syntax, such as: (+)     as used in Oracle;     *=     =*    as used in Sybase …
JSON is being used more and more, besides XML, and you surely wanted to parse the data out into SQL instead of doing it in some Javascript. The below function in SQL Server can do the job for you, returning a quick table with the parsed data.
Via a live example, show how to backup a database, simulate a failure backup the tail of the database transaction log and perform the restore.
Viewers will learn how to use the SELECT statement in SQL and will be exposed to the many uses the SELECT statement has.

746 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

12 Experts available now in Live!

Get 1:1 Help Now