GordonPrince
asked on
using function to process string makes query very slow
I'm using SQL 2005. I have a query that returns a string from a table that is usually the name of the County. But a few of the entries have something added to the County field that I want to strip out. So I made a function to process the string. If one of a couple of patterns are matched, an altered string is returned from the function, ELSE the passed in string is returned unaltered. This lets me use the function in several queries and have only one point of maintenance.
But the query takes 100 times longer to run with the function call in the query. Seems like the query is processing all the strings through the function first, then running applying the where clause to restrint the result set. This is the case whether or not the function processed string is included in the where clause.
Any ideas how I can either make this function perform well or otherwise create reusable code to alter these County names, other than more data (another column to the original table with the corrected County name in it, or a lookup table with the wrong name and the corrected name in two columns, etc.)?
But the query takes 100 times longer to run with the function call in the query. Seems like the query is processing all the strings through the function first, then running applying the where clause to restrint the result set. This is the case whether or not the function processed string is included in the where clause.
Any ideas how I can either make this function perform well or otherwise create reusable code to alter these County names, other than more data (another column to the original table with the corrected County name in it, or a lookup table with the wrong name and the corrected name in two columns, etc.)?
Can you show us the query and the code of your function?
ASKER
I've attached both in the CODE window.
SET ANSI_NULLS OFF
GO
SET QUOTED_IDENTIFIER OFF
GO
CREATE FUNCTION [dbo].[FixedCountyName] (@strDesc varchar(50))
RETURNS varchar(50)
AS
BEGIN
DECLARE @Desc varchar(50)
IF left(@strDesc, 7) = 'SHELBY-' set @Desc='SHELBY'
ELSE IF left(@strDesc, 10) ='ANDERSON -' OR left(@strDesc, 9) ='ANDERSON-' set @Desc='ANDERSON'
ELSE IF @strDesc = 'HAMBLENINMO' set @Desc='HAMBLEN'
ELSE IF left(@strDesc, 8) ='OBION-IN' OR left(@strDesc, 9) ='OBION NOT' set @Desc='OBION'
ELSE IF left(@strDesc, 11) ='WILLIAMSON ' OR left(@strDesc, 11) ='WILLIAMSON-' set @Desc='WILLIAMSON'
ELSE set @Desc=@strDesc
RETURN(@Desc)
END
/*
SELECT JurisID, jurCounty, dbo.FixedCountyName(jurCounty) FixedCounty, JurDesc, NoTaxes, jurState, jurCity, jurOth
FROM Jurisdiction
WHERE (jurCounty LIKE 'WILLIAMSON%')
select top 9 * from AllOutOfTown2011 where jurState = 'TN' and jurCounty like 'SHELBY%'
alter view AllOutOfTown2011 SCHEMABINDING
create index ix_All2011_County on AllOutOfTown2011(State, County)
SELECT JurisID, JurDesc, jurState, jurCounty
FROM Jurisdiction where JurCounty like 'WILLIAMSON%'
If strState = "TN" Then
If Left(strCounty, 6) = "SHELBY" Then
FixCountyName = "SHELBY"
ElseIf Left(strCounty, Len("ANDERSON")) = "ANDERSON" Then
FixCountyName = "ANDERSON"
ElseIf Left(strCounty, Len("OBION")) = "OBION" Then
FixCountyName = "OBION"
ElseIf Left(strCounty, Len("WILLIAMSON")) = "WILLIAMSON" Then
FixCountyName = "WILLIAMSON"
Else
FixCountyName = strCounty
End If
Else
FixCountyName = strCounty
End If
*/
GO
ASKER
Sorry, too much code. The function is at the top of the code window, the first select statement in the comments is all the query you need. The rest is notes to myself while working on using the function, so you can ignore that.
Nothing to do with your code Gordon, but are you in the property tax business?
ASKER
My client is, yes. Why. Are you?
Yes I am.
In the case of Williamson, it seems that you know the first 10 characters are "WILLIAMSON", so instead of using your function, just return the 10 left characters.
SELECT JurisID, jurCounty, left(jurCounty, 10) as FixedCounty, JurDesc, NoTaxes, jurState, jurCity, jurOth
FROM Jurisdiction
WHERE (jurCounty LIKE 'WILLIAMSON%')
It looks like your counties have no uniformity, some having dashes or spaces and including extra characters. Instead of having to use a function full of IF statements for every query, can you run an update once to clean up all of the county names?
ASKER
The problem is the names of the counties can't be changed. Users came up with the strategy of using this "enhanced" county name for some other business reasons. I'm trying to come up with a way of making clean queries without making another data column somewhere -- one with the clean county name, one with the "enhanced" county name. The function seemed easy enough. But the performance was so bad.
Try changing those IF statements in the function to a CASE statement and see if that has any effect on the speed.
Also, do you have an index on jurCounty? Also, instead of using like 'Williiamson%', try left(jurcounty, 10) = 'Williamson'
ASKER
The thing is I don't know which county I'm going to query on. I've created a view that users run queries on. It's not practical to have them select which of the multiple queries to use based on which county they are interested in. Aso, sometimes they want a list with multiple counties in it, and each should be cleaned up. So it needs to be in one place.
I suppose I could put the set of IF statements from the function into the definition of the view as a CASE statement. But then I have to keep copying the CASE statement into multiple views, defeating my strategy of only putting each set of logic in one place.
I suppose I could put the set of IF statements from the function into the definition of the view as a CASE statement. But then I have to keep copying the CASE statement into multiple views, defeating my strategy of only putting each set of logic in one place.
It seems to me that the non-uniformity of this column is going to cause you major headaches unless you create a new field with the cleaned up county name. What's going to happen when a new county gets added? You're going to have to change the code in your function and that doesn't seem very productive to me. Is there a reason you don't want to add a new column?
ASKER
Only that the list doesn't change much -- it's a list of Counties. I mean, County names don't change very often. So a few are added every few months, and so far out of 5000 of them, only 6 have funny names that I want to correct. I was just holding off on making another data column since 99.99% of the data is the same in each of the columns. It seemed the call to the function to clean up the 6 or so out of 5,000 items was a low maintenance way of doing this. But I was surprised that it had such a negative impact on the performance.
I would have thought a CASE statement in a SQL statement would have processed about the same as a set of IF statements inside a function. But apparently that's not the case.
I would have thought a CASE statement in a SQL statement would have processed about the same as a set of IF statements inside a function. But apparently that's not the case.
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
There are 5212 rows in the Jurisdiction table.
Checking the indexes, turns out there's not one of the County field. I added one. Now the query performs great and the view (with 7 tables in the definition) performs great. So I was overlooking the basics. Good idea, dkrollCTN.
Checking the indexes, turns out there's not one of the County field. I added one. Now the query performs great and the view (with 7 tables in the definition) performs great. So I was overlooking the basics. Good idea, dkrollCTN.
No problem Gordon, glad I could help.
Gordon, please make sure to accept my solution and award points. :)