Link to home
Start Free TrialLog in
Avatar of GordonPrince
GordonPrinceFlag for United States of America

asked on

using function to process string makes query very slow

I'm using SQL 2005. I have a query that returns a string from a table that is usually the name of the County. But a few of the entries have something added to the County field that I want to strip out. So I made a function to process the string. If one of a couple of patterns are matched, an altered string is returned from the function, ELSE the passed in string is returned unaltered. This lets me use the function in several queries and have only one point of maintenance.

But the query takes 100 times longer to run with the function call in the query. Seems like the query is processing all the strings through the function first, then running applying the where clause to restrint the result set. This is the case whether or not the function processed string is included in the where clause.

Any ideas how I can either make this function perform well or otherwise create reusable code to alter these County names, other than more data (another column to the original table with the corrected County name in it, or a lookup table with the wrong name and the corrected name in two columns, etc.)?
Avatar of David Kroll
David Kroll
Flag of United States of America image

Can you show us the query and the code of your function?
Avatar of GordonPrince

ASKER

I've attached both in the CODE window.
SET ANSI_NULLS OFF
GO

SET QUOTED_IDENTIFIER OFF
GO

CREATE FUNCTION [dbo].[FixedCountyName] (@strDesc varchar(50))  
RETURNS varchar(50)
AS  
BEGIN
  DECLARE @Desc varchar(50)
	IF left(@strDesc, 7) = 'SHELBY-' set @Desc='SHELBY'
	ELSE IF left(@strDesc, 10) ='ANDERSON -' OR left(@strDesc, 9) ='ANDERSON-' set @Desc='ANDERSON'
	ELSE IF @strDesc = 'HAMBLENINMO' set @Desc='HAMBLEN'
	ELSE IF left(@strDesc, 8) ='OBION-IN' OR left(@strDesc, 9) ='OBION NOT'  set @Desc='OBION'
	ELSE IF left(@strDesc, 11) ='WILLIAMSON ' OR left(@strDesc, 11) ='WILLIAMSON-' set @Desc='WILLIAMSON'
	ELSE set @Desc=@strDesc
	RETURN(@Desc)
END
/*
SELECT JurisID, jurCounty, dbo.FixedCountyName(jurCounty) FixedCounty, JurDesc, NoTaxes, jurState, jurCity, jurOth
FROM  Jurisdiction
WHERE (jurCounty LIKE 'WILLIAMSON%')

select top 9 * from AllOutOfTown2011 where jurState = 'TN' and jurCounty like 'SHELBY%'
alter view AllOutOfTown2011 SCHEMABINDING 
create index ix_All2011_County on AllOutOfTown2011(State, County)

SELECT JurisID, JurDesc, jurState, jurCounty
FROM  Jurisdiction where JurCounty like 'WILLIAMSON%' 

If strState = "TN" Then
    If Left(strCounty, 6) = "SHELBY" Then
        FixCountyName = "SHELBY"
    ElseIf Left(strCounty, Len("ANDERSON")) = "ANDERSON" Then
        FixCountyName = "ANDERSON"
    ElseIf Left(strCounty, Len("OBION")) = "OBION" Then
        FixCountyName = "OBION"
    ElseIf Left(strCounty, Len("WILLIAMSON")) = "WILLIAMSON" Then
        FixCountyName = "WILLIAMSON"
    Else
        FixCountyName = strCounty
    End If
Else
    FixCountyName = strCounty
End If
*/
GO

Open in new window

Sorry, too much code. The function is at the top of the code window, the first select statement in the comments is all the query you need. The rest is notes to myself while working on using the function, so you can ignore that.
Nothing to do with your code Gordon, but are you in the property tax business?
My client is, yes. Why. Are you?
Yes I am.
In the case of Williamson, it seems that you know the first 10 characters are "WILLIAMSON", so instead of using your function, just return the 10 left characters.


SELECT JurisID, jurCounty, left(jurCounty, 10) as FixedCounty, JurDesc, NoTaxes, jurState, jurCity, jurOth
FROM  Jurisdiction
WHERE (jurCounty LIKE 'WILLIAMSON%')

Open in new window

It looks like your counties have no uniformity, some having dashes or spaces and including extra characters.  Instead of having to use a function full of IF statements for every query, can you run an update once to clean up all of the county names?
The problem is the names of the counties can't be changed. Users came up with the strategy of using this "enhanced" county name for some other business reasons. I'm trying to come up with a way of making clean queries without making another data column somewhere -- one with the clean county name, one with the "enhanced" county name. The function seemed easy enough. But the performance was so bad.
Try changing those IF statements in the function to a CASE statement and see if that has any effect on the speed.
Also, do you have an index on jurCounty?  Also, instead of using like 'Williiamson%', try left(jurcounty, 10) = 'Williamson'
The thing is I don't know which county I'm going to query on. I've created a view that users run queries on. It's not practical to have them select which of the multiple queries to use based on which county they are interested in. Aso, sometimes they want a list with multiple counties in it, and each should be cleaned up. So it needs to be in one place.

I suppose I could put the set of IF statements from the function into the definition of the view as a CASE statement. But then I have to keep copying the CASE statement into multiple views, defeating my strategy of only putting each set of logic in one place.
It seems to me that the non-uniformity of this column is going to cause you major headaches unless you create a new field with the cleaned up county name.  What's going to happen when a new county gets added?  You're going to have to change the code in your function and that doesn't seem very productive to me.  Is there a reason you don't want to add a new column?
Only that the list doesn't change much -- it's a list of Counties. I mean, County names don't change very often. So a few are added every few months, and so far out of 5000 of them, only 6 have funny names that I want to correct. I was just holding off on making another data column since 99.99% of the data is the same in each of the columns. It seemed the call to the function to clean up the 6 or so out of 5,000 items was a low maintenance way of doing this. But I was surprised that it had such a negative impact on the performance.

I would have thought a CASE statement in a SQL statement would have processed about the same as a set of IF statements inside a function. But apparently that's not the case.
ASKER CERTIFIED SOLUTION
Avatar of David Kroll
David Kroll
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
There are 5212 rows in the Jurisdiction table.

Checking the indexes, turns out there's not one of the County field. I added one. Now the query performs great and the view (with 7 tables in the definition) performs great. So I was overlooking the basics. Good idea, dkrollCTN.
No problem Gordon, glad I could help.
Gordon, please make sure to accept my solution and award points. :)