?
Solved

Sorting subset strings in a char field in MS SQL 2008 R2

Posted on 2015-01-14
9
Medium Priority
?
244 Views
Last Modified: 2015-01-19
Hi EEE:

A char field contains values like:

 AB  23  43  543
AB 5 111 22
 AB 5 2 3333
XY 234 1  24
  G 112 3434 222 33
45 33 44

In other words, characters (optional) followed by numbers, all separated by variable space(s).
I would like to sort this field by characters first and then the individual number blocks. White spaces of any length determine blocks.


So for example,

45 33 44 would sort before AB 5 2 3333 as it contains no leading set of characters.
AB 5 111 22 would sort before  AB 23  43  543 as 5 comes before 23.
 AB 5 2 3333 would sort before AB 5 111 22 as 2 comes before 111 (all prior characters are
the same)

Is there a way I can do this in Microsoft SQL 2008 R2?
0
Comment
Question by:LuckyLucks
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
9 Comments
 
LVL 70

Expert Comment

by:Qlemo
ID: 40550063
I daresay that not. The dynamic nature of the sort criteria does not allow for it.
0
 
LVL 69

Expert Comment

by:Scott Pletcher
ID: 40550073
You could pass the string into a function and return a sortable value, but the performance will not be great.  How many rows like this do you need to process?
0
 

Author Comment

by:LuckyLucks
ID: 40550112
50K and likely not to exceed the 100K.
0
Microsoft Certification Exam 74-409

Veeam® is happy to provide the Microsoft community with a study guide prepared by MVP and MCT, Orin Thomas. This guide will take you through each of the exam objectives, helping you to prepare for and pass the examination.

 
LVL 69

Accepted Solution

by:
Scott Pletcher earned 2000 total points
ID: 40550155
CREATE TABLE #data (
    string varchar(1000) NOT NULL
    )
INSERT INTO #data
SELECT 'AB  23  43  543' UNION ALL
SELECT ' AB 5 111 22' UNION ALL
SELECT '  AB 5 2 3333' UNION ALL
SELECT ' XY 234 1  24' UNION ALL
SELECT '   G 112 3434 222 33' UNION ALL
SELECT ' 45 33 44'

GO

CREATE FUNCTION dbo.Set_Sort_String (
    @string varchar(1000)
)
RETURNS varchar(1000)
AS
BEGIN
SET NOCOUNT ON;
SELECT
    CASE WHEN [1] LIKE '%[^0-9]%'
         THEN [1] + SPACE(10 - LEN([1]))
         ELSE SPACE(10 - LEN([1]))  + [1]
         END +
    SPACE(10 - ISNULL(LEN([2]), 0))  + ISNULL([2], '') +
    SPACE(10 - ISNULL(LEN([3]), 0))  + ISNULL([3], '') +
    SPACE(10 - ISNULL(LEN([4]), 0))  + ISNULL([4], '') +
    SPACE(10 - ISNULL(LEN([5]), 0))  + ISNULL([5], '') +
    SPACE(10 - ISNULL(LEN([6]), 0))  + ISNULL([6], '') +
    SPACE(10 - ISNULL(LEN([7]), 0))  + ISNULL([6], '') +
    SPACE(10 - ISNULL(LEN([8]), 0))  + ISNULL([7], '') +
    SPACE(10 - ISNULL(LEN([9]), 0))  + ISNULL([8], '') +
    SPACE(10 - ISNULL(LEN([10]), 0)) + ISNULL([9], '') +
    SPACE(10 - ISNULL(LEN([2]), 0)) + ISNULL([10], '')    
FROM (
    SELECT
        --replace 2+ spaces in a row with a single space
        REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(LTRIM(RTRIM(@string)),
            SPACE(33), SPACE(1)), SPACE(17), SPACE(1)), SPACE(9), SPACE(1)), SPACE(5), SPACE(1)),
            SPACE(3), SPACE(1)), SPACE(2), SPACE(1)), SPACE(2), SPACE(1)) AS string
) AS test_data
CROSS APPLY dbo.DelimitedSplit8K ( string, SPACE(1) ) ds
PIVOT (
    MAX(item)
    FOR ItemNumber IN ([1], [2], [3], [4], [5], [6], [7], [8], [9], [10])
) AS pvt
END --IF

GO



SELECT *, dbo.Set_Sort_String ( string )
FROM #data
ORDER BY dbo.Set_Sort_String ( string )
0
 

Author Comment

by:LuckyLucks
ID: 40550228
Hi Scott,

Could you please add some comments so I can understand what the code is trying to do in each block?


Thanks much
0
 
LVL 69

Expert Comment

by:Scott Pletcher
ID: 40550303
First, it's splitting it into separate values with no extra spaces, so:
 G  112    3434  222    33
would become:
G
112
3434
222
33
Then it concatenates all the values into one string, as follows:
If the first value has any nonnumeric char in it -- such as "G", like this one does -- the first 10 bytes of that value is left-aligned in the new string:
Gbbbbbbbbb
where b = blank/space.
Then every entry after that is right-aligned into the next 10 bytes:
Gbbbbbbbbbbbbbbbb112bbbbbb3434bbbbbbb222bbbbbbbb33
That is done for each row.  Then the concatenated value above is used to sort the rows.

I believe this will yield the final sort order that you want.
0
 
LVL 32

Expert Comment

by:awking00
ID: 40551614
What is the required sort order criteria for the alpha portion? Is it like A,B,C,...Z,AA, AB, etc.which would make the string beginning with a single G sort before strings beginning with AB or should the string beginning with a single G sort after strings beginning with AB because G comes after A?
0
 

Author Comment

by:LuckyLucks
ID: 40551986
Thanks Scott, a couple of last questions:

1. What does the below try to accomplish?

SPACE(10 - ISNULL(LEN([2]), 0))  + ISNULL([2], '') +
     SPACE(10 - ISNULL(LEN([3]), 0))  + ISNULL([3], '') +
     SPACE(10 - ISNULL(LEN([4]), 0))  + ISNULL([4], '') +
     SPACE(10 - ISNULL(LEN([5]), 0))  + ISNULL([5], '') +
     SPACE(10 - ISNULL(LEN([6]), 0))  + ISNULL([6], '') +
     SPACE(10 - ISNULL(LEN([7]), 0))  + ISNULL([6], '') +
     SPACE(10 - ISNULL(LEN([8]), 0))  + ISNULL([7], '') +
     SPACE(10 - ISNULL(LEN([9]), 0))  + ISNULL([8], '') +
     SPACE(10 - ISNULL(LEN([10]), 0)) + ISNULL([9], '') +
     SPACE(10 - ISNULL(LEN([2]), 0)) + ISNULL([10], '')    

2. Where is the DelimitedSplit8K function defined?

Thanks
0
 
LVL 69

Expert Comment

by:Scott Pletcher
ID: 40552011
Exactly what I stated above.  It puts each column into a fixed 10 byte string from left to right:
1st column = bytes 01-10
2nd column = bytes 11-20
3rd column = bytes 21-30
etc.

A fairly well-known, highly efficient splitter function.  Sorry, thought I had posted its code too:

SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE FUNCTION [dbo].[DelimitedSplit8K]
--===== Define I/O parameters
        (@pString VARCHAR(8000), @pDelimiter CHAR(1))
--NOTE: (max) data type will at least double the time to do the split!
RETURNS TABLE WITH SCHEMABINDING AS
 RETURN
--===== "Inline" CTE Driven "Tally Table" produces values from 1 up to 10,000...
     -- enough to cover VARCHAR(8000)
  WITH E1(N) AS (
                 SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
                 SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
                 SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
                ),                          --10E+1 or 10 rows
       E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
       E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
 cteTally(N) AS (--==== This provides the "base" CTE and limits the number of rows right up front
                     -- for both a performance gain and prevention of accidental "overruns"
                 SELECT TOP (ISNULL(DATALENGTH(@pString),0)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
                ),
cteStart(N1) AS (--==== This returns N+1 (starting position of each "element" just once for each delimiter)
                 SELECT 1 UNION ALL
                 SELECT t.N+1 FROM cteTally t WHERE SUBSTRING(@pString,t.N,1) = @pDelimiter
                ),
cteLen(N1,L1) AS(--==== Return start and length (for use in substring)
                 SELECT s.N1,
                        ISNULL(NULLIF(CHARINDEX(@pDelimiter,@pString,s.N1),0)-s.N1,8000)
                   FROM cteStart s
                )
--===== Do the actual split. The ISNULL/NULLIF combo handles the length for the final element when no delimiter is found.
 SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY l.N1),
        Item       = SUBSTRING(@pString, l.N1, l.L1)
   FROM cteLen l
;
GO
0

Featured Post

Use Case: Protecting a Hybrid Cloud Infrastructure

Microsoft Azure is rapidly becoming the norm in dynamic IT environments. This document describes the challenges that organizations face when protecting data in a hybrid cloud IT environment and presents a use case to demonstrate how Acronis Backup protects all data.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Ever wondered why sometimes your SQL Server is slow or unresponsive with connections spiking up but by the time you go in, all is well? The following article will show you how to install and configure a SQL job that will send you email alerts includ…
What if you have to shut down the entire Citrix infrastructure for hardware maintenance, software upgrades or "the unknown"? I developed this plan for "the unknown" and hope that it helps you as well. This article explains how to properly shut down …
This videos aims to give the viewer a basic demonstration of how a user can query current session information by using the SYS_CONTEXT function
Using examples as well as descriptions, and references to Books Online, show the documentation available for datatypes, explain the available data types and show how data can be passed into and out of variables.
Suggested Courses

765 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question