Solved

CALLING A FUNCTION FROM WHERE CLAUSE ... HOW TO REPLACE IT!

Posted on 2013-01-29
11
285 Views
Last Modified: 2013-02-15
The following function gets called from a query  - from the WHERE CLAUSE - (which is/must be generated dynamically by web page).
 As expected, the function is called millions of times by the app and is causing significant performance degradation.

I want to design the query so that it uses a join to a table (that I want to create ... how?  based on the function ...) rather than expect the result from the UDF.

How do I do so?  
The function basically just returns the max(student_app_id) which is the PKey (clustered) of the main table the UDF queries.

The UDF, the table structure, and the query that calls the UDF are included below, for completness.

---- UDF CODE
CREATE FUNCTION [dbo].[fnStAppId_Prospect](@iuserId int,@iappId int)
RETURNS int
AS
BEGIN
      DECLARE @rv int,@SID int
      Set @rv = 0

      --get latest student_app_id which has been modified.
      select @SID = max(student_app_id)
      from application_status
      where user_id = @iuserId and confirmation_number is null
      AND (date_last_modified = (SELECT MAX(date_last_modified)
                                   FROM dbo.application_status
                                   WHERE user_id = @iuserId))
      AND application_id = @iappId

      IF @SID > 0
            BEGIN
                  IF EXISTS(select student_app_id from personal_info where student_app_id = @SID)
                        return @SID
                  ELSE IF EXISTS(select student_app_id from people_names where student_app_id = @SID)
                        return @SID
                  ELSE IF EXISTS(select student_app_id from special_cccapply_personal_info where student_app_id = @SID)
                        return @SID
            END
      
      RETURN 0

--*********** query
select distinct prospect_info.prospect_status_id as field_1_prospect_status_id, prospect_info.date_stamp as field_5_date_stamp, prospect_info.slots_downloaded
as field_6_slots_downloaded, table_1_people_names.last_name as field_2_last_name, table_1_people_names.first_name
as field_3_first_name, table_2_personal_info.ssn as field_4_ssn
into #org
from prospect_info left outer join system_xap.dbo.people_names as table_1_people_names
on table_1_people_names.user_id=prospect_info.user_id and table_1_people_names.application_id=0 and table_1_people_names.question_number=3
and (((dbo.fnStAppId_Prospect(table_1_people_names.user_id,5))>0 and table_1_people_names.student_app_id = dbo.fnStAppId_Prospect(table_1_people_names.user_id,5))
or ((dbo.fnStAppId_Prospect(table_1_people_names.user_id,5) = 0 and table_1_people_names.student_app_id is null)) )
left outer join system_xap.dbo.personal_info as table_2_personal_info on table_2_personal_info.user_id=prospect_info.user_id
and table_2_personal_info.application_id=0 and (((dbo.fnStAppId_Prospect(table_2_personal_info.user_id,5))>0
and table_2_personal_info.student_app_id = dbo.fnStAppId_Prospect(table_2_personal_info.user_id,5))
or ((dbo.fnStAppId_Prospect(table_2_personal_info.user_id,5) = 0 and table_2_personal_info.student_app_id is null)) )
where 1=1 and prospect_info.application_id=5 and prospect_info.prospect_status_id in
(SELECT prospect_info.prospect_status_id
FROM prospect_info LEFT OUTER JOIN system_xap.dbo.application_status AS application_status
ON prospect_info.user_id=application_status.user_id and prospect_info.application_id=application_status.application_id
WHERE (prospect_info.user_id NOT IN
(SELECT user_id FROM application_status aps WHERE (aps.application_id = 5) AND (confirmation_number IS NOT NULL)))
AND (prospect_info.application_id = 5)) and CAST(prospect_info.date_stamp AS DATETIME) between '1/1/2013' and '1/31/2013 23:59:59'

--- *********** TABLE STRUCTURE
CREATE TABLE [dbo].[application_status](
      [student_app_id] [int] IDENTITY(1,1) NOT FOR REPLICATION NOT NULL,
      [user_id] [int] NOT NULL,
      [application_id] [int] NOT NULL,
      [date_started] [smalldatetime] NULL,
      [date_last_modified] [smalldatetime] NULL,
      [application_type] [char](1) NULL,
      [on_page] [int] NULL,
      [confirmation_number] [char](40) NULL,
      [order_id] [int] NULL,
      [payment_type_id] [int] NULL,
      [date_completed] [smalldatetime] NULL,
      [date_batched] [smalldatetime] NULL,
      [date_downloaded] [smalldatetime] NULL,
      [certification_info] [varchar](64) NULL,
      [date_submitted] [smalldatetime] NULL,
      [submitted_term_year] [varchar](12) NULL,
      [app_method] [char](3) NULL,
      [intended_term] [varchar](12) NULL,
      [date_replica_downloaded] [smalldatetime] NULL,
      [slots_downloaded] [int] NULL,
      [submitted_major] [varchar](96) NULL,
      [date_downloaded2] [smalldatetime] NULL,
 CONSTRAINT [PK_StudAppID] PRIMARY KEY CLUSTERED
(
      [student_app_id] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]

GO
0
Comment
Question by:DBA2000
  • 5
  • 4
  • 2
11 Comments
 
LVL 51

Expert Comment

by:tedbilly
Comment Utility
The overhead for calling the function is minimal.  For example, in this article the overhead averages out to an additional 18 milliseconds.

For one, you are calling a MAX() function in the query and that is an expensive operation especially if you don't have the right indexes created.  I think you need to use SQL's analyzer tools to find the true performance bottlenecks.
0
 

Author Comment

by:DBA2000
Comment Utility
Thank you, TedBilly

... I forgot to mention that the function gets called over 40 million times in two hour! ... since the function is being called from a WHERE clause of the query, which means that is called for every record that the query returns, thus, the tens of millions of calls.
0
 
LVL 51

Expert Comment

by:tedbilly
Comment Utility
However, my answer is still valid.  If you move that query within the function out into the WHERE clause it will still execute millions of times using the MAX function.  Trust me, I've been down this road.
0
 
LVL 75

Expert Comment

by:Anthony Perkins
Comment Utility
The following function gets called from a query  - from the WHERE CLAUSE -
What makes you think the function is called from the WHERE clause?  The way I see it fnStAppId_Prospect() is in the JOIN.  

Mind you I am not saying that is good in fact I think it could be re-written, just wanted to get the facts straight.
0
 
LVL 75

Expert Comment

by:Anthony Perkins
Comment Utility
For example one thing you could do is re-write this:
      select @SID = max(student_app_id) 
      from application_status 
      where user_id = @iuserId and confirmation_number is null 
      AND (date_last_modified = (SELECT MAX(date_last_modified)
                                   FROM dbo.application_status
                                   WHERE user_id = @iuserId))
      AND application_id = @iappId

Open in new window

Like this:
SELECT  @SID = MAX(student_app_id)
FROM    application_status
WHERE   user_id = @iuserId
        AND confirmation_number IS NULL
        AND application_id = @iappId
HAVING  date_last_modified = MAX(date_last_modified)

Open in new window

0
Free Trending Threat Insights Every Day

Enhance your security with threat intelligence from the web. Get trending threat insights on hackers, exploits, and suspicious IP addresses delivered to your inbox with our free Cyber Daily.

 
LVL 75

Expert Comment

by:Anthony Perkins
Comment Utility
You may actually be able to change the Multi-Statement UDF to a more efficient inline Function as in something like this:
CREATE FUNCTION [dbo].[fnStAppId_Prospect] (
		@iuserId int,
		@iappId int)

RETURNS int

AS 
BEGIN 
    RETURN (
    SELECT  CASE 
		WHEN a.[SID] <= 0 THEN 0
		ELSE COALESCE(i.student_app_id, n.student_app_id, s.student_app_id)
	     END
    FROM	(
	    SELECT  MAX(student_app_id) [SID]
	    FROM    application_status
	    WHERE   user_id = @iuserId
		    AND confirmation_number IS NULL
		    AND application_id = @iappId
	    HAVING  date_last_modified = MAX(date_last_modified)
	    ) a
	    LEFT JOIN personal_info i ON a.[SID] = i.student_app_id
	    LEFT JOIN people_names n ON a.[SID] = n.student_app_id
	    LEFT JOIN special_cccapply_personal_info s ON a.[SID] = s.student_app_id

	)   
END

Open in new window


But I agree with you that there are far better ways of skinning this cat other than using a UDF in a WHERE clause or even a JOIN.
0
 

Author Comment

by:DBA2000
Comment Utility
Hi, acperkins

one more question,
... given that SQL does set operations better than function calls, do you think it makes sense to replace the call to the UDF by the body of the modified UDF that you came up with?  (... I, of course, have to workout the syntax when I replace the call to UDF)

as follows ...
 
... and (((dbo.fnStAppId_Prospect(table_1_people_names.user_id,5))>0 and table_1_people_names.student_app_id = dbo.fnStAppId_Prospect(table_1_people_names.user_id,5)) or ((dbo.fnStAppId_Prospect(table_1_people_names.user_id,5) = 0 ...

... with the following ...

... and (((SELECT  CASE WHEN a.[SID] <= 0 THEN 0       ELSE COALESCE(i.student_app_id, n.student_app_id, s.student_app_id)      END      FROM      (     SELECT  MAX(student_app_id) [SID]     FROM    application_status     WHERE   user_id = @iuserId
                AND confirmation_number IS NULL
                AND application_id = @iappId
          HAVING  date_last_modified = MAX(date_last_modified)
          ) a
          LEFT JOIN personal_info i ON a.[SID] = i.student_app_id
          LEFT JOIN people_names n ON a.[SID] = n.student_app_id
          LEFT JOIN special_cccapply_personal_info s ON a.[SID] = s.student_app_id )   > 0
and table_1_people_names.student_app_id = (SELECT  CASE WHEN a.[SID] <= 0 THEN 0       ELSE COALESCE(i.student_app_id, n.student_app_id, s.student_app_id)      END      FROM      (     SELECT  MAX(student_app_id) [SID]     FROM    application_status     WHERE   user_id = @iuserId
                AND confirmation_number IS NULL
                AND application_id = @iappId
          HAVING  date_last_modified = MAX(date_last_modified)
          ) a
          LEFT JOIN personal_info i ON a.[SID] = i.student_app_id
          LEFT JOIN people_names n ON a.[SID] = n.student_app_id
          LEFT JOIN special_cccapply_personal_info s ON a.[SID] = s.student_app_id )
0
 
LVL 75

Expert Comment

by:Anthony Perkins
Comment Utility
do you think it makes sense to replace the call to the UDF by the body of the modified UDF that you came up with?
In general, yes.  However, I am not necessarily advocating that.  I believe a better approach may be to insert all the values into a temp table and join against that.  That way you will not have to join against the function (or derived table) six times.

Simply put the entire query and not just the function part needs to be re-evaluated and re-written.  It is a mess as it is.
0
 

Author Comment

by:DBA2000
Comment Utility
Hi, acperkin

The UDF you wrote,  gave me the following error when I execute it.  I tried a number of changes, but none worked.  

Please help.
Thanks,

Msg 8121, Level 16, State 1, Procedure fnStAppId_Prospect_Optim_V1.1, Line 21

Column 'application_status.date_last_modified' is invalid in the HAVING clause because

it is not contained in either an aggregate function or the GROUP BY clause.
0
 
LVL 75

Accepted Solution

by:
Anthony Perkins earned 500 total points
Comment Utility
You are right, not sure what I was thinking.  And now I am not exactly sure what you are trying to achieve.  So I would first try getting rid of the HAVING clause entirely as in:
CREATE FUNCTION [dbo].[fnStAppId_Prospect]
    (
     @iuserId int,
     @iappId int
    )
RETURNS int
AS 
    BEGIN 
        RETURN (
            SELECT  CASE WHEN a.[SID] <= 0 THEN 0
                         ELSE COALESCE(i.student_app_id, n.student_app_id, s.student_app_id)
                    END
            FROM    (SELECT MAX(student_app_id) [SID]
                     FROM   application_status
                     WHERE  user_id = @iuserId
                            AND confirmation_number IS NULL
                            AND application_id = @iappId
                    ) a
                    LEFT JOIN personal_info i ON a.[SID] = i.student_app_id
                    LEFT JOIN people_names n ON a.[SID] = n.student_app_id
                    LEFT JOIN special_cccapply_personal_info s ON a.[SID] = s.student_app_id

			)   
    END

Open in new window


If that does not produce the right results, try this instead:
CREATE FUNCTION [dbo].[fnStAppId_Prospect]
    (
     @iuserId int,
     @iappId int
    )
RETURNS int
AS 
BEGIN 
        RETURN (
        SELECT  CASE WHEN a.[SID] <= 0 THEN 0
                     ELSE COALESCE(i.student_app_id, n.student_app_id, s.student_app_id)
                END
        FROM    application_status s
                INNER JOIN (SELECT  MAX(student_app_id) [SID],
                                    MAX(date_last_modified) MAXdate_last_modified
                            FROM    application_status
                            WHERE   user_id = @iuserId
                                    AND confirmation_number IS NULL
                                    AND application_id = @iappId
                           ) a ON s.date_last_modified = a.MAXdate_last_modified
                LEFT JOIN personal_info i ON a.[SID] = i.student_app_id
                LEFT JOIN people_names n ON a.[SID] = n.student_app_id
                LEFT JOIN special_cccapply_personal_info s ON a.[SID] = s.student_app_id
        WHERE   [user_id] = @iuserId
                AND confirmation_number IS NULL
                AND application_id = @iappId

	)   
    END

Open in new window

0
 

Author Closing Comment

by:DBA2000
Comment Utility
Thank you so much.  
The later version of the UDF works!
0

Featured Post

How to improve team productivity

Quip adds documents, spreadsheets, and tasklists to your Slack experience
- Elevate ideas to Quip docs
- Share Quip docs in Slack
- Get notified of changes to your docs
- Available on iOS/Android/Desktop/Web
- Online/Offline

Join & Write a Comment

In SQL Server, when rows are selected from a table, does it retrieve data in the order in which it is inserted?  Many believe this is the case. Let us try to examine for ourselves with an example. To get started, use the following script, wh…
In this article I will describe the Backup & Restore method as one possible migration process and I will add the extra tasks needed for an upgrade when and where is applied so it will cover all.
Internet Business Fax to Email Made Easy - With eFax Corporate (http://www.enterprise.efax.com), you'll receive a dedicated online fax number, which is used the same way as a typical analog fax number. You'll receive secure faxes in your email, fr…
You have products, that come in variants and want to set different prices for them? Watch this micro tutorial that describes how to configure prices for Magento super attributes. Assigning simple products to configurable: We assigned simple products…

763 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

11 Experts available now in Live!

Get 1:1 Help Now