We help IT Professionals succeed at work.

SQL - Splitting column values into separate rows

Medium Priority
50 Views
Last Modified: 2020-03-04
I do not think this is possible with SQL but I thought i would ask anyway.   I need to split a column into multiple rows.  

Table A:

Item ID               Kits Reference ID       Kits Reference Qty
---------            -------------------      --------------------
1                         100;500                       1;40
2                          200;820;910               45;32;56

As you can see each item record can have one or more reference ID's each separated by a ";".   In addition, each reference ID has a QTY.

Expected Output

Item ID               Kits Reference ID       Kits Reference Qty
---------            -------------------      --------------------
1                         100                                1
1                          500                               40
2                          200                               45
2                           820                              32
2                           910                              56
Comment
Watch Question

Senior DBA
CERTIFIED EXPERT
Most Valuable Expert 2018
Distinguished Expert 2019
Commented:
Somewhat suprisingly, it's actually easy.  Creating the sample data took me  far longer than the actual code.  It's so much nicer if you provide CREATE TABLE and INSERT statements rather than just a splat of data on the screen.

SELECT k_id.Item, k_qty.Item
FROM #data d
CROSS APPLY dbo.DelimitedSplit8K(d.[Kits Reference ID], ';') AS k_id
CROSS APPLY dbo.DelimitedSplit8K(d.[Kits Reference Qty], ';') AS k_qty
WHERE k_qty.ItemNumber = k_id.ItemNumber

SET ANSI_NULLS ON
SET QUOTED_IDENTIFIER ON
GO
CREATE FUNCTION [dbo].[DelimitedSplit8K] (
    @pString varchar(8000),
    @pDelimiter char(1)
    )
RETURNS TABLE WITH SCHEMABINDING 
AS
/*SELECT * FROM dbo.DelimitedSplit8K('ab/c/def/ghijklm/no/prq/////st/u//', '/')*/
RETURN
/*Inline CTE-driven "tally table" produces values from 0 up to 10,000: enough to cover varchar(8000).*/
WITH E1(N) AS (SELECT N FROM (VALUES(0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) AS Ns(N)),
    E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
    E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max 
    ctetally(N) AS (/* This provides the "zero base" and limits the number of rows right up front,
                       for both a performance gain and prevention of accidental "overruns". */
        SELECT 0 UNION ALL
        SELECT TOP (DATALENGTH(ISNULL(@pString, 1))) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
    ), cteStart(N1) AS ( /* This returns N+1 (starting position of each "element" just once for each delimiter). */
        SELECT t.N+1
        FROM ctetally t
        WHERE (SUBSTRING(@pString,t.N,1) = @pDelimiter OR t.N = 0)
    )
/* Do the actual split. The ISNULL/NULLIF combo handles the length for the final element when no delimiter is found. */
SELECT ROW_NUMBER() OVER(ORDER BY s.N1) AS ItemNumber,
       SUBSTRING(@pString, s.N1, ISNULL(NULLIF(CHARINDEX(@pDelimiter,@pString,s.N1), 0) - s.N1, 8000)) AS Item
FROM cteStart s;
/*end of func*/
GO

Author

Commented:
Thanks Scott!  Is there any alternative instead of using function?
Bill PrewTest your restores, not your backups...
CERTIFIED EXPERT
Expert of the Year 2019
Top Expert 2016

Commented:
What specific SQL platform are you using (Microsoft SQL Server, ORacle, MySQL, ...)?

»bp

Author

Commented:
Microsoft SQL Server
CERTIFIED EXPERT
Distinguished Expert 2019

Commented:
you should use a function that will validate that the number of items in column kits reference ID matches the number of entries in kits reference qty.
Scott PletcherSenior DBA
CERTIFIED EXPERT
Most Valuable Expert 2018
Distinguished Expert 2019

Commented:
Is there any alternative instead of using function?
Yeah, sure.  It won't perform as well, though, for high volumes of data.  My approach would be something like below.  Realistically, you'd have to pre-set some limit to the max number of values you'd allow in the list using this method.  I've done 5 max values below, but I think I've designed it so that it's easy to expand it to allow more values.

Btw, yes, this code did take longer than the data set up code below :-).

SELECT ca_final.*    
FROM #A
CROSS APPLY (
    SELECT
        CHARINDEX(';', [Kits Reference ID] + ';') AS ID_delim1,
        CHARINDEX(';', [Kits Reference Qty] + ';') AS Qty_delim1
) AS ca1
CROSS APPLY (
    SELECT
        CHARINDEX(';', [Kits Reference ID] + ';', ID_delim1 + 1) AS ID_delim2,
        CHARINDEX(';', [Kits Reference Qty] + ';', Qty_delim1 + 1) AS Qty_delim2
) AS ca2
CROSS APPLY (
    SELECT CASE WHEN ID_delim2 = 0 THEN 0 ELSE
        CHARINDEX(';', [Kits Reference ID] + ';', ID_delim2 + 1) END AS ID_delim3,
        CASE WHEN Qty_delim2 = 0 THEN 0 ELSE
        CHARINDEX(';', [Kits Reference Qty] + ';', Qty_delim2 + 1) END AS Qty_delim3
) AS ca3
CROSS APPLY (
    SELECT CASE WHEN ID_delim3 = 0 THEN 0 ELSE
        CHARINDEX(';', [Kits Reference ID] + ';', ID_delim3 + 1) END AS ID_delim4,
        CASE WHEN Qty_delim3 = 0 THEN 0 ELSE
        CHARINDEX(';', [Kits Reference Qty] + ';', Qty_delim3 + 1) END AS Qty_delim4
) AS ca4
CROSS APPLY (
    SELECT CASE WHEN ID_delim4 = 0 THEN 0 ELSE
        CHARINDEX(';', [Kits Reference ID] + ';', ID_delim4 + 1) END AS ID_delim5,
        CASE WHEN Qty_delim4 = 0 THEN 0 ELSE
        CHARINDEX(';', [Kits Reference Qty] + ';', Qty_delim4 + 1) END AS Qty_delim5
) AS ca5
CROSS APPLY (
    SELECT [Item ID],
        LEFT([Kits Reference ID], ID_delim1 - 1) AS [Kits Reference ID],
        LEFT([Kits Reference Qty], Qty_delim1 - 1) AS [Kits Reference Qty]
    UNION ALL
    SELECT [Item ID],
        REPLACE(SUBSTRING([Kits Reference ID], ID_delim1 + 1, ID_delim2 - ID_delim1), ';', ''),
        REPLACE(SUBSTRING([Kits Reference Qty], Qty_delim1 + 1, Qty_delim2 - Qty_delim1), ';', '')
    WHERE ID_delim2 > 0
    UNION ALL
    SELECT [Item ID],        
        REPLACE(SUBSTRING([Kits Reference ID], ID_delim2 + 1, ID_delim3 - ID_delim2), ';', ''),
        REPLACE(SUBSTRING([Kits Reference Qty], Qty_delim2 + 1, Qty_delim3 - Qty_delim2), ';', '')
    WHERE ID_delim3 > 0
    UNION ALL
    SELECT [Item ID],        
        REPLACE(SUBSTRING([Kits Reference ID], ID_delim3 + 1, ID_delim4 - ID_delim3), ';', ''),
        REPLACE(SUBSTRING([Kits Reference Qty], Qty_delim3 + 1, Qty_delim4 - Qty_delim3), ';', '')
    WHERE ID_delim4 > 0
    UNION ALL
    SELECT [Item ID],        
        REPLACE(SUBSTRING([Kits Reference ID], ID_delim4 + 1, ID_delim5 - ID_delim4), ';', ''),
        REPLACE(SUBSTRING([Kits Reference Qty], Qty_delim4 + 1, Qty_delim5 - Qty_delim4), ';', '')
    WHERE ID_delim5 > 0
) AS ca_final
ORDER BY [Item ID], [Kits Reference ID]


Sample data for testing the code above:

CREATE TABLE #A (
    [Item ID] int NOT NULL,
    [Kits Reference ID] varchar(8000) NULL,
    [Kits Reference Qty] varchar(8000) NULL
    )
INSERT INTO #A VALUES
    (1, '100;500', '1;40'),
    (2, '200;820;910', '45;32;56')
INSERT INTO #A VALUES
    (3, '300', '99')
INSERT INTO #A VALUES
    (4, '400;401;402;403', '991;992;993;994'),
    (5, '500;501;502;503;504', '1010;1020;1030;1040;1050')