Still celebrating National IT Professionals Day with 3 months of free Premium Membership. Use Code ITDAY17

x
?
Solved

CALLING A FUNCTION FROM WHERE CLAUSE ... HOW TO REPLACE IT!

Posted on 2013-01-29
11
Medium Priority
?
297 Views
Last Modified: 2013-02-15
The following function gets called from a query  - from the WHERE CLAUSE - (which is/must be generated dynamically by web page).
 As expected, the function is called millions of times by the app and is causing significant performance degradation.

I want to design the query so that it uses a join to a table (that I want to create ... how?  based on the function ...) rather than expect the result from the UDF.

How do I do so?  
The function basically just returns the max(student_app_id) which is the PKey (clustered) of the main table the UDF queries.

The UDF, the table structure, and the query that calls the UDF are included below, for completness.

---- UDF CODE
CREATE FUNCTION [dbo].[fnStAppId_Prospect](@iuserId int,@iappId int)
RETURNS int
AS
BEGIN
      DECLARE @rv int,@SID int
      Set @rv = 0

      --get latest student_app_id which has been modified.
      select @SID = max(student_app_id)
      from application_status
      where user_id = @iuserId and confirmation_number is null
      AND (date_last_modified = (SELECT MAX(date_last_modified)
                                   FROM dbo.application_status
                                   WHERE user_id = @iuserId))
      AND application_id = @iappId

      IF @SID > 0
            BEGIN
                  IF EXISTS(select student_app_id from personal_info where student_app_id = @SID)
                        return @SID
                  ELSE IF EXISTS(select student_app_id from people_names where student_app_id = @SID)
                        return @SID
                  ELSE IF EXISTS(select student_app_id from special_cccapply_personal_info where student_app_id = @SID)
                        return @SID
            END
      
      RETURN 0

--*********** query
select distinct prospect_info.prospect_status_id as field_1_prospect_status_id, prospect_info.date_stamp as field_5_date_stamp, prospect_info.slots_downloaded
as field_6_slots_downloaded, table_1_people_names.last_name as field_2_last_name, table_1_people_names.first_name
as field_3_first_name, table_2_personal_info.ssn as field_4_ssn
into #org
from prospect_info left outer join system_xap.dbo.people_names as table_1_people_names
on table_1_people_names.user_id=prospect_info.user_id and table_1_people_names.application_id=0 and table_1_people_names.question_number=3
and (((dbo.fnStAppId_Prospect(table_1_people_names.user_id,5))>0 and table_1_people_names.student_app_id = dbo.fnStAppId_Prospect(table_1_people_names.user_id,5))
or ((dbo.fnStAppId_Prospect(table_1_people_names.user_id,5) = 0 and table_1_people_names.student_app_id is null)) )
left outer join system_xap.dbo.personal_info as table_2_personal_info on table_2_personal_info.user_id=prospect_info.user_id
and table_2_personal_info.application_id=0 and (((dbo.fnStAppId_Prospect(table_2_personal_info.user_id,5))>0
and table_2_personal_info.student_app_id = dbo.fnStAppId_Prospect(table_2_personal_info.user_id,5))
or ((dbo.fnStAppId_Prospect(table_2_personal_info.user_id,5) = 0 and table_2_personal_info.student_app_id is null)) )
where 1=1 and prospect_info.application_id=5 and prospect_info.prospect_status_id in
(SELECT prospect_info.prospect_status_id
FROM prospect_info LEFT OUTER JOIN system_xap.dbo.application_status AS application_status
ON prospect_info.user_id=application_status.user_id and prospect_info.application_id=application_status.application_id
WHERE (prospect_info.user_id NOT IN
(SELECT user_id FROM application_status aps WHERE (aps.application_id = 5) AND (confirmation_number IS NOT NULL)))
AND (prospect_info.application_id = 5)) and CAST(prospect_info.date_stamp AS DATETIME) between '1/1/2013' and '1/31/2013 23:59:59'

--- *********** TABLE STRUCTURE
CREATE TABLE [dbo].[application_status](
      [student_app_id] [int] IDENTITY(1,1) NOT FOR REPLICATION NOT NULL,
      [user_id] [int] NOT NULL,
      [application_id] [int] NOT NULL,
      [date_started] [smalldatetime] NULL,
      [date_last_modified] [smalldatetime] NULL,
      [application_type] [char](1) NULL,
      [on_page] [int] NULL,
      [confirmation_number] [char](40) NULL,
      [order_id] [int] NULL,
      [payment_type_id] [int] NULL,
      [date_completed] [smalldatetime] NULL,
      [date_batched] [smalldatetime] NULL,
      [date_downloaded] [smalldatetime] NULL,
      [certification_info] [varchar](64) NULL,
      [date_submitted] [smalldatetime] NULL,
      [submitted_term_year] [varchar](12) NULL,
      [app_method] [char](3) NULL,
      [intended_term] [varchar](12) NULL,
      [date_replica_downloaded] [smalldatetime] NULL,
      [slots_downloaded] [int] NULL,
      [submitted_major] [varchar](96) NULL,
      [date_downloaded2] [smalldatetime] NULL,
 CONSTRAINT [PK_StudAppID] PRIMARY KEY CLUSTERED
(
      [student_app_id] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]

GO
0
Comment
Question by:DBA2000
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 5
  • 4
  • 2
11 Comments
 
LVL 51

Expert Comment

by:Ted Bouskill
ID: 38833487
The overhead for calling the function is minimal.  For example, in this article the overhead averages out to an additional 18 milliseconds.

For one, you are calling a MAX() function in the query and that is an expensive operation especially if you don't have the right indexes created.  I think you need to use SQL's analyzer tools to find the true performance bottlenecks.
0
 

Author Comment

by:DBA2000
ID: 38833937
Thank you, TedBilly

... I forgot to mention that the function gets called over 40 million times in two hour! ... since the function is being called from a WHERE clause of the query, which means that is called for every record that the query returns, thus, the tens of millions of calls.
0
 
LVL 51

Expert Comment

by:Ted Bouskill
ID: 38834062
However, my answer is still valid.  If you move that query within the function out into the WHERE clause it will still execute millions of times using the MAX function.  Trust me, I've been down this road.
0
Concerto Cloud for Software Providers & ISVs

Can Concerto Cloud Services help you focus on evolving your application offerings, while delivering the best cloud experience to your customers? From DevOps to revenue models and customer support, the answer is yes!

Learn how Concerto can help you.

 
LVL 75

Expert Comment

by:Anthony Perkins
ID: 38836525
The following function gets called from a query  - from the WHERE CLAUSE -
What makes you think the function is called from the WHERE clause?  The way I see it fnStAppId_Prospect() is in the JOIN.  

Mind you I am not saying that is good in fact I think it could be re-written, just wanted to get the facts straight.
0
 
LVL 75

Expert Comment

by:Anthony Perkins
ID: 38836549
For example one thing you could do is re-write this:
      select @SID = max(student_app_id) 
      from application_status 
      where user_id = @iuserId and confirmation_number is null 
      AND (date_last_modified = (SELECT MAX(date_last_modified)
                                   FROM dbo.application_status
                                   WHERE user_id = @iuserId))
      AND application_id = @iappId

Open in new window

Like this:
SELECT  @SID = MAX(student_app_id)
FROM    application_status
WHERE   user_id = @iuserId
        AND confirmation_number IS NULL
        AND application_id = @iappId
HAVING  date_last_modified = MAX(date_last_modified)

Open in new window

0
 
LVL 75

Expert Comment

by:Anthony Perkins
ID: 38836624
You may actually be able to change the Multi-Statement UDF to a more efficient inline Function as in something like this:
CREATE FUNCTION [dbo].[fnStAppId_Prospect] (
		@iuserId int,
		@iappId int)

RETURNS int

AS 
BEGIN 
    RETURN (
    SELECT  CASE 
		WHEN a.[SID] <= 0 THEN 0
		ELSE COALESCE(i.student_app_id, n.student_app_id, s.student_app_id)
	     END
    FROM	(
	    SELECT  MAX(student_app_id) [SID]
	    FROM    application_status
	    WHERE   user_id = @iuserId
		    AND confirmation_number IS NULL
		    AND application_id = @iappId
	    HAVING  date_last_modified = MAX(date_last_modified)
	    ) a
	    LEFT JOIN personal_info i ON a.[SID] = i.student_app_id
	    LEFT JOIN people_names n ON a.[SID] = n.student_app_id
	    LEFT JOIN special_cccapply_personal_info s ON a.[SID] = s.student_app_id

	)   
END

Open in new window


But I agree with you that there are far better ways of skinning this cat other than using a UDF in a WHERE clause or even a JOIN.
0
 

Author Comment

by:DBA2000
ID: 38839315
Hi, acperkins

one more question,
... given that SQL does set operations better than function calls, do you think it makes sense to replace the call to the UDF by the body of the modified UDF that you came up with?  (... I, of course, have to workout the syntax when I replace the call to UDF)

as follows ...
 
... and (((dbo.fnStAppId_Prospect(table_1_people_names.user_id,5))>0 and table_1_people_names.student_app_id = dbo.fnStAppId_Prospect(table_1_people_names.user_id,5)) or ((dbo.fnStAppId_Prospect(table_1_people_names.user_id,5) = 0 ...

... with the following ...

... and (((SELECT  CASE WHEN a.[SID] <= 0 THEN 0       ELSE COALESCE(i.student_app_id, n.student_app_id, s.student_app_id)      END      FROM      (     SELECT  MAX(student_app_id) [SID]     FROM    application_status     WHERE   user_id = @iuserId
                AND confirmation_number IS NULL
                AND application_id = @iappId
          HAVING  date_last_modified = MAX(date_last_modified)
          ) a
          LEFT JOIN personal_info i ON a.[SID] = i.student_app_id
          LEFT JOIN people_names n ON a.[SID] = n.student_app_id
          LEFT JOIN special_cccapply_personal_info s ON a.[SID] = s.student_app_id )   > 0
and table_1_people_names.student_app_id = (SELECT  CASE WHEN a.[SID] <= 0 THEN 0       ELSE COALESCE(i.student_app_id, n.student_app_id, s.student_app_id)      END      FROM      (     SELECT  MAX(student_app_id) [SID]     FROM    application_status     WHERE   user_id = @iuserId
                AND confirmation_number IS NULL
                AND application_id = @iappId
          HAVING  date_last_modified = MAX(date_last_modified)
          ) a
          LEFT JOIN personal_info i ON a.[SID] = i.student_app_id
          LEFT JOIN people_names n ON a.[SID] = n.student_app_id
          LEFT JOIN special_cccapply_personal_info s ON a.[SID] = s.student_app_id )
0
 
LVL 75

Expert Comment

by:Anthony Perkins
ID: 38839367
do you think it makes sense to replace the call to the UDF by the body of the modified UDF that you came up with?
In general, yes.  However, I am not necessarily advocating that.  I believe a better approach may be to insert all the values into a temp table and join against that.  That way you will not have to join against the function (or derived table) six times.

Simply put the entire query and not just the function part needs to be re-evaluated and re-written.  It is a mess as it is.
0
 

Author Comment

by:DBA2000
ID: 38842039
Hi, acperkin

The UDF you wrote,  gave me the following error when I execute it.  I tried a number of changes, but none worked.  

Please help.
Thanks,

Msg 8121, Level 16, State 1, Procedure fnStAppId_Prospect_Optim_V1.1, Line 21

Column 'application_status.date_last_modified' is invalid in the HAVING clause because

it is not contained in either an aggregate function or the GROUP BY clause.
0
 
LVL 75

Accepted Solution

by:
Anthony Perkins earned 2000 total points
ID: 38845721
You are right, not sure what I was thinking.  And now I am not exactly sure what you are trying to achieve.  So I would first try getting rid of the HAVING clause entirely as in:
CREATE FUNCTION [dbo].[fnStAppId_Prospect]
    (
     @iuserId int,
     @iappId int
    )
RETURNS int
AS 
    BEGIN 
        RETURN (
            SELECT  CASE WHEN a.[SID] <= 0 THEN 0
                         ELSE COALESCE(i.student_app_id, n.student_app_id, s.student_app_id)
                    END
            FROM    (SELECT MAX(student_app_id) [SID]
                     FROM   application_status
                     WHERE  user_id = @iuserId
                            AND confirmation_number IS NULL
                            AND application_id = @iappId
                    ) a
                    LEFT JOIN personal_info i ON a.[SID] = i.student_app_id
                    LEFT JOIN people_names n ON a.[SID] = n.student_app_id
                    LEFT JOIN special_cccapply_personal_info s ON a.[SID] = s.student_app_id

			)   
    END

Open in new window


If that does not produce the right results, try this instead:
CREATE FUNCTION [dbo].[fnStAppId_Prospect]
    (
     @iuserId int,
     @iappId int
    )
RETURNS int
AS 
BEGIN 
        RETURN (
        SELECT  CASE WHEN a.[SID] <= 0 THEN 0
                     ELSE COALESCE(i.student_app_id, n.student_app_id, s.student_app_id)
                END
        FROM    application_status s
                INNER JOIN (SELECT  MAX(student_app_id) [SID],
                                    MAX(date_last_modified) MAXdate_last_modified
                            FROM    application_status
                            WHERE   user_id = @iuserId
                                    AND confirmation_number IS NULL
                                    AND application_id = @iappId
                           ) a ON s.date_last_modified = a.MAXdate_last_modified
                LEFT JOIN personal_info i ON a.[SID] = i.student_app_id
                LEFT JOIN people_names n ON a.[SID] = n.student_app_id
                LEFT JOIN special_cccapply_personal_info s ON a.[SID] = s.student_app_id
        WHERE   [user_id] = @iuserId
                AND confirmation_number IS NULL
                AND application_id = @iappId

	)   
    END

Open in new window

0
 

Author Closing Comment

by:DBA2000
ID: 38895249
Thank you so much.  
The later version of the UDF works!
0

Featured Post

VIDEO: THE CONCERTO CLOUD FOR HEALTHCARE

Modern healthcare requires a modern cloud. View this brief video to understand how the Concerto Cloud for Healthcare can help your organization.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Long way back, we had to take help from third party tools in order to encrypt and decrypt data.  Gradually Microsoft understood the need for this feature and started to implement it by building functionality into SQL Server. Finally, with SQL 2008, …
There have been several questions about Large Transaction Log Files in SQL Server 2008, and how to get rid of them when disk space has become critical. This article will explain how to disable full recovery and implement simple recovery that carries…
Video by: ITPro.TV
In this episode Don builds upon the troubleshooting techniques by demonstrating how to properly monitor a vSphere deployment to detect problems before they occur. He begins the show using tools found within the vSphere suite as ends the show demonst…
This tutorial will teach you the special effect of super speed similar to the fictional character Wally West aka "The Flash" After Shake : http://www.videocopilot.net/presets/after_shake/ All lightning effects with instructions : http://www.mediaf…

715 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question