Delboy_UK
asked on
TSql Function in Sql Server
Dear Experts
I am using SQL Server 2008. I am writing a SQL Server function which cleans source system data for the target system. I am stuck. The following is some sample data:
Source Data:
29 30 31 32 39 42
9 37
9,12,19,21
6,7&20
1 3 5 30 31
13 Test Group
11 + 19
7,8,9
1-5 19-21 26-30
4 5 6-8 10 11 13-15 43
1,2,3,4
12,9 & 21
Cleaned Function Data:
29,30,31,32,39,42
09,37
09,12,19,21
06,07,20
01,03,05,30,31
Invalid
11,19
07,08,07
01,02,03,04,05,19,20,21,26 ,27,28,29, 30
04,05,06,08,10,11,13,14,15 ,43
01,02,03,04
12,09,21
This is my initial attempt and I am stuck. Please help:
IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[fn_Clas ses]') AND type in (N'FN', N'IF', N'TF', N'FS', N'FT'))
DROP FUNCTION [dbo].[fn_Classes]
GO
CREATE FUNCTION [dbo].[fn_Classes]
(
@Class varchar(200) -- The Class being interepeted
)
Returns nvarchar(254)
AS
Begin
DECLARE @FinalClass NVARCHAR(254) -- Inprotech Class
SELECT @FinalClass = CASE when (PATINDEX('%[a-zA-Z]%',rtr im(@Class) ) = 0)
and (PATINDEX('%-%',rtrim(@Cla ss)) = 0 )
THEN replace(replace(replace(re place(repl ace(replac e(replace( replace(re place(repl ace(replac e(rtrim(@C lass),'&', ','),'+',' ,'),' ',','),'.',','),'and',',') ,'/',','), ';',','),' *',''),'-' ,','),',,, ',','),',, ',',') -- remove '&',' ','.','*','-' and replace with ',' -- e.g. Non Range values
when (PATINDEX('%[a-zA-Z]%',rtr im(@Class) ) = 0)
and (PATINDEX('%-%',rtrim(@Cla ss)) > 0 ) THEN ' Range' -- Range values e.g. 1-5 19-21 26-30
when (PATINDEX('%[a-zA-Z]%',rtr im(@Class) ) > 0) THEN 'Invalid Class' -- e.g INT.CLASS 12 (CLASS 19)
END
RETURN @FinalClass
END
go
I am using SQL Server 2008. I am writing a SQL Server function which cleans source system data for the target system. I am stuck. The following is some sample data:
Source Data:
29 30 31 32 39 42
9 37
9,12,19,21
6,7&20
1 3 5 30 31
13 Test Group
11 + 19
7,8,9
1-5 19-21 26-30
4 5 6-8 10 11 13-15 43
1,2,3,4
12,9 & 21
Cleaned Function Data:
29,30,31,32,39,42
09,37
09,12,19,21
06,07,20
01,03,05,30,31
Invalid
11,19
07,08,07
01,02,03,04,05,19,20,21,26
04,05,06,08,10,11,13,14,15
01,02,03,04
12,09,21
This is my initial attempt and I am stuck. Please help:
IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[fn_Clas
DROP FUNCTION [dbo].[fn_Classes]
GO
CREATE FUNCTION [dbo].[fn_Classes]
(
@Class varchar(200) -- The Class being interepeted
)
Returns nvarchar(254)
AS
Begin
DECLARE @FinalClass NVARCHAR(254) -- Inprotech Class
SELECT @FinalClass = CASE when (PATINDEX('%[a-zA-Z]%',rtr
and (PATINDEX('%-%',rtrim(@Cla
THEN replace(replace(replace(re
when (PATINDEX('%[a-zA-Z]%',rtr
and (PATINDEX('%-%',rtrim(@Cla
when (PATINDEX('%[a-zA-Z]%',rtr
END
RETURN @FinalClass
END
go
Could you please describe the exact problem? I can't see it on the first sight.
ASKER
The issue is that I will need to parse the Range values e.g. 1-5 19-21 26-30 and I am not sure how to do that.
ASKER
I also need to make the Function future proof. Thus, if I get the following for an example: '1*5 6-8 12 13) the function will automatically work it out to (01,05,06,07,08,12,13)
Use the following function to convert range like 6-8 to values like 06,07,08:
CREATE FUNCTION [dbo].[fn_Range_To_Values]
(
@Range varchar(200)
)
Returns nvarchar(254)
AS
Begin
DECLARE @lowerBound INT
DECLARE @upperBound INT
DECLARE @dashIndex INT
DECLARE @value INT
DECLARE @result NVARCHAR
SELECT @dashIndex = CHARINDEX($Range, '-')
SELECT @lowerBound CAST(SUBSTRING(@Range, 1, @dashIndex - 1) AS INT)
SELECT @upperBound CAST(SUBSTRING(@Range, @dashIndex + 1, LEN(@Range) - @dashIndex - 1) AS INT)
SELECT @value = @lowerBound
SELECT @result = ''
WHILE @value <= @upperBound
BEGIN
IF @value < 10 BEGIN
SELECT @result = @result + ',0' + CAST(@value AS VARCHAR)
ELSE
SELECT @result = @result + ',' + CAST(@value AS VARCHAR)
END
SELECT @value = @value + 1
END
END
go
ASKER
Thanks for your prompt response:
got the following errors running your function:
Msg 126, Level 15, State 1, Procedure fn_Range_To_Values, Line 16
Invalid pseudocolumn "$Range".
Msg 102, Level 15, State 1, Procedure fn_Range_To_Values, Line 18
Incorrect syntax near 'SUBSTRING'.
Msg 102, Level 15, State 1, Procedure fn_Range_To_Values, Line 19
Incorrect syntax near 'SUBSTRING'.
Msg 156, Level 15, State 1, Procedure fn_Range_To_Values, Line 28
Incorrect syntax near the keyword 'ELSE'.
editted it to the following:
CREATE FUNCTION [dbo].[fn_Range_To_Values]
(
@Range varchar(200)
)
Returns nvarchar(254)
AS
Begin
DECLARE @lowerBound INT
DECLARE @upperBound INT
DECLARE @dashIndex INT
DECLARE @value INT
DECLARE @result NVARCHAR
SELECT @dashIndex = CHARINDEX(@Range, '-')
SELECT @lowerBound = CAST(SUBSTRING(@Range, 1, @dashIndex - 1) AS INT)
SELECT @upperBound = CAST(SUBSTRING(@Range, @dashIndex + 1, LEN(@Range) - @dashIndex - 1) AS INT)
SELECT @value = @lowerBound
SELECT @result = ''
WHILE @value <= @upperBound
BEGIN
IF @value < 10
SELECT @result = @result + ',0' + CAST(@value AS VARCHAR)
ELSE
SELECT @result = @result + ',' + CAST(@value AS VARCHAR)
SELECT @value = @value + 1
END
RETURN @result
END
--go
--
i run the following:
Select dbo.[fn_Range_To_Values](' 1-17 19-24 26-34')
and got this error:
Msg 537, Level 16, State 2, Line 1
Invalid length parameter passed to the LEFT or SUBSTRING function.
got the following errors running your function:
Msg 126, Level 15, State 1, Procedure fn_Range_To_Values, Line 16
Invalid pseudocolumn "$Range".
Msg 102, Level 15, State 1, Procedure fn_Range_To_Values, Line 18
Incorrect syntax near 'SUBSTRING'.
Msg 102, Level 15, State 1, Procedure fn_Range_To_Values, Line 19
Incorrect syntax near 'SUBSTRING'.
Msg 156, Level 15, State 1, Procedure fn_Range_To_Values, Line 28
Incorrect syntax near the keyword 'ELSE'.
editted it to the following:
CREATE FUNCTION [dbo].[fn_Range_To_Values]
(
@Range varchar(200)
)
Returns nvarchar(254)
AS
Begin
DECLARE @lowerBound INT
DECLARE @upperBound INT
DECLARE @dashIndex INT
DECLARE @value INT
DECLARE @result NVARCHAR
SELECT @dashIndex = CHARINDEX(@Range, '-')
SELECT @lowerBound = CAST(SUBSTRING(@Range, 1, @dashIndex - 1) AS INT)
SELECT @upperBound = CAST(SUBSTRING(@Range, @dashIndex + 1, LEN(@Range) - @dashIndex - 1) AS INT)
SELECT @value = @lowerBound
SELECT @result = ''
WHILE @value <= @upperBound
BEGIN
IF @value < 10
SELECT @result = @result + ',0' + CAST(@value AS VARCHAR)
ELSE
SELECT @result = @result + ',' + CAST(@value AS VARCHAR)
SELECT @value = @value + 1
END
RETURN @result
END
--go
--
i run the following:
Select dbo.[fn_Range_To_Values]('
and got this error:
Msg 537, Level 16, State 2, Line 1
Invalid length parameter passed to the LEFT or SUBSTRING function.
CREATE FUNCTION [dbo].[fn_Classes]
(
@Class varchar(200) -- The Class being interepeted
)
Returns nvarchar(254)
AS
Begin
DECLARE @FinalClass NVARCHAR(254) -- Inprotech Class
DECLARE @Byte INT
DECLARE @Class_Found VARCHAR(200)
DECLARE @Range_Indicator TINYINT -- 0 = No Range; 1 = Start of Range.
DECLARE @Range_Values VARCHAR(200)
IF (PATINDEX('%[a-zA-Z]%', @Class) > 0)
SET @FinalClass = 'Invalid'
ELSE
BEGIN
SET @Class = LTRIM(@Class)
SET @Range_Indicator = 0
WHILE @Class > ''
BEGIN
SET @Byte = PATINDEX('%[^0-9]%', @Class)
IF @Byte = 0
SET @Class_Found = @Class
ELSE
SET @Class_Found = LEFT(@Class, @Byte - 1)
IF LEN(@Class_Found) < 2
SET @Class_Found = '0' + @Class_Found
IF @Range_Indicator = 0
SET @FinalClass = COALESCE(@FinalClass + ',', '') + @Class_Found
ELSE
IF @Range_Indicator = 1
BEGIN --generate all values in range
SET @Range_Values = ''
;WITH
cteDigits AS (
SELECT 0 AS digit UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL
SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9
),
cteTally AS (
SELECT [100s].digit * 100 + [10s].digit * 10 + [1s].digit AS range_val
FROM cteDigits [1s]
CROSS JOIN cteDigits [10s]
CROSS JOIN cteDigits [100s]
)
SELECT @Range_Values = (
SELECT TOP 100 PERCENT
CASE WHEN range_val < 10 THEN '0' + CAST(range_val AS char(1))
ELSE CAST(range_val AS varchar(3)) END + ','
FROM cteTally
WHERE
range_val BETWEEN RIGHT(',' + @FinalClass, PATINDEX('%[^0-9]%', REVERSE(',' + @FinalClass)) - 1) + 1
AND @Class_Found
ORDER BY
range_val
FOR XML PATH('')
)
SET @FinalClass = COALESCE(@FinalClass + ',', '') + LEFT(@Range_Values, LEN(@Range_Values) - 1)
END --IF
--SELECT @Class AS Class, @Byte AS Byte, @Class_Found AS Class_Found,
-- @Range_Indicator AS Range_Ind, @FinalClass AS FinalClass
IF @Byte = 0
BREAK
IF SUBSTRING(@Class, @Byte, 1) = '-'
SET @Range_Indicator = 1
ELSE
SET @Range_Indicator = 0
SET @Byte = @Byte + PATINDEX('%[0-9]%', SUBSTRING(@Class, @Byte + 1, 200))
SET @Class = LTRIM(SUBSTRING(@Class, @Byte, 200))
END --WHILE
END --ELSE
RETURN @FinalClass
END --FUNCTION
GO
(
@Class varchar(200) -- The Class being interepeted
)
Returns nvarchar(254)
AS
Begin
DECLARE @FinalClass NVARCHAR(254) -- Inprotech Class
DECLARE @Byte INT
DECLARE @Class_Found VARCHAR(200)
DECLARE @Range_Indicator TINYINT -- 0 = No Range; 1 = Start of Range.
DECLARE @Range_Values VARCHAR(200)
IF (PATINDEX('%[a-zA-Z]%', @Class) > 0)
SET @FinalClass = 'Invalid'
ELSE
BEGIN
SET @Class = LTRIM(@Class)
SET @Range_Indicator = 0
WHILE @Class > ''
BEGIN
SET @Byte = PATINDEX('%[^0-9]%', @Class)
IF @Byte = 0
SET @Class_Found = @Class
ELSE
SET @Class_Found = LEFT(@Class, @Byte - 1)
IF LEN(@Class_Found) < 2
SET @Class_Found = '0' + @Class_Found
IF @Range_Indicator = 0
SET @FinalClass = COALESCE(@FinalClass + ',', '') + @Class_Found
ELSE
IF @Range_Indicator = 1
BEGIN --generate all values in range
SET @Range_Values = ''
;WITH
cteDigits AS (
SELECT 0 AS digit UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL
SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9
),
cteTally AS (
SELECT [100s].digit * 100 + [10s].digit * 10 + [1s].digit AS range_val
FROM cteDigits [1s]
CROSS JOIN cteDigits [10s]
CROSS JOIN cteDigits [100s]
)
SELECT @Range_Values = (
SELECT TOP 100 PERCENT
CASE WHEN range_val < 10 THEN '0' + CAST(range_val AS char(1))
ELSE CAST(range_val AS varchar(3)) END + ','
FROM cteTally
WHERE
range_val BETWEEN RIGHT(',' + @FinalClass, PATINDEX('%[^0-9]%', REVERSE(',' + @FinalClass)) - 1) + 1
AND @Class_Found
ORDER BY
range_val
FOR XML PATH('')
)
SET @FinalClass = COALESCE(@FinalClass + ',', '') + LEFT(@Range_Values, LEN(@Range_Values) - 1)
END --IF
--SELECT @Class AS Class, @Byte AS Byte, @Class_Found AS Class_Found,
-- @Range_Indicator AS Range_Ind, @FinalClass AS FinalClass
IF @Byte = 0
BREAK
IF SUBSTRING(@Class, @Byte, 1) = '-'
SET @Range_Indicator = 1
ELSE
SET @Range_Indicator = 0
SET @Byte = @Byte + PATINDEX('%[0-9]%', SUBSTRING(@Class, @Byte + 1, 200))
SET @Class = LTRIM(SUBSTRING(@Class, @Byte, 200))
END --WHILE
END --ELSE
RETURN @FinalClass
END --FUNCTION
GO
That's the complete function. To test:
SELECT
class_string,
[dbo].[fn_Classes](class_s tring)
FROM (
SELECT '29 30 31 32 39 42' AS class_string
UNION ALL SELECT '9 37'
UNION ALL SELECT '9,12,19,21'
UNION ALL SELECT '6,7&20'
UNION ALL SELECT '1 3 5 30 31'
UNION ALL SELECT '13 Test Group'
UNION ALL SELECT '11 + 19'
UNION ALL SELECT '7,8,9'
UNION ALL SELECT '1-5 19-21 26-30'
UNION ALL SELECT '4 5 6-8 10 11 13-15 43'
UNION ALL SELECT '1,2,3,4'
UNION ALL SELECT '12,9 & 21'
) AS data
SELECT
class_string,
[dbo].[fn_Classes](class_s
FROM (
SELECT '29 30 31 32 39 42' AS class_string
UNION ALL SELECT '9 37'
UNION ALL SELECT '9,12,19,21'
UNION ALL SELECT '6,7&20'
UNION ALL SELECT '1 3 5 30 31'
UNION ALL SELECT '13 Test Group'
UNION ALL SELECT '11 + 19'
UNION ALL SELECT '7,8,9'
UNION ALL SELECT '1-5 19-21 26-30'
UNION ALL SELECT '4 5 6-8 10 11 13-15 43'
UNION ALL SELECT '1,2,3,4'
UNION ALL SELECT '12,9 & 21'
) AS data
I deliberately wrote the code to handle, and properly expand, input like this:
'1-5-19-21-26-30'
If you don't want that capability, the code will have to be adjusted to remove it.
'1-5-19-21-26-30'
If you don't want that capability, the code will have to be adjusted to remove it.
ASKER
Thanks. The code looks perfect. I am reviewing it to understand what it is doing. Running it against my dataset to see what it happening. I will update you shortly
CORRECTION: Need to add two lines in case the string ends in a non-num:
...as before...
IF SUBSTRING(@Class, @Byte, 1) = '-'
SET @Range_Indicator = 1
ELSE
SET @Range_Indicator = 0
IF PATINDEX('%[0-9]%', SUBSTRING(@Class, @Byte + 1, 200)) = 0 --add this line
BREAK --add this line
...as before...
I need to make one other correction, in case a space precedes the "-" for a range.
But overall I think the function will then be fairly solid.
...as before...
IF SUBSTRING(@Class, @Byte, 1) = '-'
SET @Range_Indicator = 1
ELSE
SET @Range_Indicator = 0
IF PATINDEX('%[0-9]%', SUBSTRING(@Class, @Byte + 1, 200)) = 0 --add this line
BREAK --add this line
...as before...
I need to make one other correction, in case a space precedes the "-" for a range.
But overall I think the function will then be fairly solid.
CORRECTION: To allow space(s) to precede "-" for range:
...as before...
WHILE SUBSTRING(@Class, @Byte, 1) = SPACE(1) --add this line
SET @Byte = @Byte + 1 --add this line
IF SUBSTRING(@Class, @Byte, 1) = '-'
SET @Range_Indicator = 1
ELSE
SET @Range_Indicator = 0
...as before...
If there are a lot spaces, it could be slightly more efficient to do the check a different way. With few spaces, it's prob more efficient that way. I don't see there being enough spaces often enough to make it worth different code.
...as before...
WHILE SUBSTRING(@Class, @Byte, 1) = SPACE(1) --add this line
SET @Byte = @Byte + 1 --add this line
IF SUBSTRING(@Class, @Byte, 1) = '-'
SET @Range_Indicator = 1
ELSE
SET @Range_Indicator = 0
...as before...
If there are a lot spaces, it could be slightly more efficient to do the check a different way. With few spaces, it's prob more efficient that way. I don't see there being enough spaces often enough to make it worth different code.
OOPS.
Hold it, that last space correction caused PROBLEMS.
Redoing ... :-) .
Hold it, that last space correction caused PROBLEMS.
Redoing ... :-) .
Had to add another condition to the WHILE:
WHILE SUBSTRING(@Class, @Byte, 1) = SPACE(1) AND SUBSTRING(@Class, @Byte + 1, 1) NOT LIKE '[0-9]'
SET @Byte = @Byte + 1
Will now post the entire, corrected function.
WHILE SUBSTRING(@Class, @Byte, 1) = SPACE(1) AND SUBSTRING(@Class, @Byte + 1, 1) NOT LIKE '[0-9]'
SET @Byte = @Byte + 1
Will now post the entire, corrected function.
ASKER
Hi thanks. The results I am getting from running:
SELECT
class_string,
[dbo].[fn_Classes](class_s tring)
FROM (
SELECT '29 30 31 32 39 42' AS class_string
UNION ALL SELECT '9 37'
UNION ALL SELECT '9,12,19,21'
UNION ALL SELECT '6,7&20'
UNION ALL SELECT '1 3 5 30 31'
UNION ALL SELECT '13 Test Group'
UNION ALL SELECT '11 + 19'
UNION ALL SELECT '7,8,9'
UNION ALL SELECT '1-5 19-21 26-30'
UNION ALL SELECT '4 5 6-8 10 11 13-15 43'
UNION ALL SELECT '1,2,3,4'
UNION ALL SELECT '12,9 & 21'
) AS data
are different from what I expected
for example:
29 30 31 32 39 42 29,00,01,02,09,02
9 37 09,07
and I dont want the range expanded. ie 1-3 6-8 = 01,02,03,06,08
SELECT
class_string,
[dbo].[fn_Classes](class_s
FROM (
SELECT '29 30 31 32 39 42' AS class_string
UNION ALL SELECT '9 37'
UNION ALL SELECT '9,12,19,21'
UNION ALL SELECT '6,7&20'
UNION ALL SELECT '1 3 5 30 31'
UNION ALL SELECT '13 Test Group'
UNION ALL SELECT '11 + 19'
UNION ALL SELECT '7,8,9'
UNION ALL SELECT '1-5 19-21 26-30'
UNION ALL SELECT '4 5 6-8 10 11 13-15 43'
UNION ALL SELECT '1,2,3,4'
UNION ALL SELECT '12,9 & 21'
) AS data
are different from what I expected
for example:
29 30 31 32 39 42 29,00,01,02,09,02
9 37 09,07
and I dont want the range expanded. ie 1-3 6-8 = 01,02,03,06,08
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
SELECT
class_string,
[dbo].[fn_Classes](class_s tring)
FROM (
SELECT '29 30 31 32 39 42' AS class_string
UNION ALL SELECT '9 37'
UNION ALL SELECT '9,12,19,21'
UNION ALL SELECT '6,7&20'
UNION ALL SELECT '1 3 5 30 31'
UNION ALL SELECT '13 Test Group'
UNION ALL SELECT '11 + 19'
UNION ALL SELECT '7,8,9'
UNION ALL SELECT '1-5 -19 21-26 30' --<<-- note spaces before "-"
UNION ALL SELECT '4 5 6-8 10 11 13-15 43'
UNION ALL SELECT '1,2,3,4'
UNION ALL SELECT '12,9 & 21'
UNION ALL SELECT '1*5 6-8 12 13)' --<<-- ends in non-numeric
) AS data
class_string,
[dbo].[fn_Classes](class_s
FROM (
SELECT '29 30 31 32 39 42' AS class_string
UNION ALL SELECT '9 37'
UNION ALL SELECT '9,12,19,21'
UNION ALL SELECT '6,7&20'
UNION ALL SELECT '1 3 5 30 31'
UNION ALL SELECT '13 Test Group'
UNION ALL SELECT '11 + 19'
UNION ALL SELECT '7,8,9'
UNION ALL SELECT '1-5 -19 21-26 30' --<<-- note spaces before "-"
UNION ALL SELECT '4 5 6-8 10 11 13-15 43'
UNION ALL SELECT '1,2,3,4'
UNION ALL SELECT '12,9 & 21'
UNION ALL SELECT '1*5 6-8 12 13)' --<<-- ends in non-numeric
) AS data
>> Hi thanks. The results I am getting from running: <<
Yeah, SORRY about that, that was because my second / last correction caused a BIG problem.
I've corrected that in subsequent posts above.
Yeah, SORRY about that, that was because my second / last correction caused a BIG problem.
I've corrected that in subsequent posts above.
ASKER
Yep Thanks. I am testing your solution against my dataset now and will let you know shortly. Thanks for your prompt assistance
No problem. I really think we're good now ... until you find some other odd data pattern :-) .
Note that the extra char(s) before a "-" must all be spaces for it to still count as a range, as coded now.
For example, these:
'5 *- 8'
'5 ,- 8'
'5 /- 8' etc.
would NOT be considered a range.
For example, these:
'5 *- 8'
'5 ,- 8'
'5 /- 8' etc.
would NOT be considered a range.
ASKER
Thanks for you assistance. I am still testing and trying to understand the script you provided and also comment it for the future. I will accept the solutuion though as it has been of great help. Thanks once again.
ASKER
Brilliant solution
Thanks.
Sorry, I really don't have time to comment it myself now.
If you'd like, you can follow up with any qs, of course, on why the code is as it is.
Sorry, I really don't have time to comment it myself now.
If you'd like, you can follow up with any qs, of course, on why the code is as it is.
ASKER
Thanks