[Last Call] Learn how to a build a cloud-first strategyRegister Now

x
?
Solved

How to parse through a SQL Comments Column

Posted on 2014-08-28
12
Medium Priority
?
150 Views
Last Modified: 2014-09-02
I have a 'COMMENTS' field (column) as such:

CONVENTIONAL OIL PRICES PER QUART: $3.00 SYNTHETIC BLENDS:$4.00 TIRE ROTATION: $16.00 TIRE BALANCE: $8.00 LD & MD RATE: $72.00 HD RATE: NA

I need to parse this comment out to insert into individual columns.

The text in the above comments will go into a description column, i.e.,  (CONVENTIONAL OIL PRICES PER QUART, SYNTHETIC BLENDS, etc....)  The dollar amounts will go into a "Cost" or "Amount" column.  

I don't think it it is imperative to have the '$' included with the amount, but want to try it both ways.

So would end up with something like:

Description                             Amount
CONVENTIONAL OIL...            $3.00
SYNTHETIC BLENDS               $4.00
TIRE ROTATION                      $16.00

...and so on


So basically the comments column is comprised of a description and associated cost.
0
Comment
Question by:66chawger
  • 6
  • 3
  • 2
  • +1
12 Comments
 
LVL 53

Expert Comment

by:Vitor Montalvão
ID: 40290446
Where is the source for the COMMENTS column?
And the target it's a new table or an existing one?
0
 

Author Comment

by:66chawger
ID: 40290492
Comments is in A user defined table USER_DEFINED8_FW

Target table is USER_DEFINED36_FW.
0
 
LVL 4

Expert Comment

by:Randy Knight, MCM
ID: 40290519
I would use a split function (I can post one later if you dont' have one) and use it twice.  First split with a : as the delimeter, then split the resulting string with a $ as the delimiter.
0
Free Backup Tool for VMware and Hyper-V

Restore full virtual machine or individual guest files from 19 common file systems directly from the backup file. Schedule VM backups with PowerShell scripts. Set desired time, lean back and let the script to notify you via email upon completion.  

 
LVL 53

Accepted Solution

by:
Vitor Montalvão earned 1400 total points
ID: 40290707
Here's one possible solution but I didn't test it hard, so can have some bug on the code.
DECLARE @id INTEGER
DECLARE @descr VARCHAR(100), @Amount MONEY
DECLARE @Comments VARCHAR(1000)
DECLARE @LenComments INTEGER
DECLARE @PosStart INTEGER, @PosEnd INTEGER

DECLARE comments_cursor CURSOR FOR 
SELECT id, comments, LEN(comments) FROM USER_DEFINED38_FW

OPEN comments_cursor

FETCH NEXT FROM comments_cursor INTO @id, @Comments, @LenComments
WHILE @@FETCH_STATUS = 0
BEGIN
	SET @PosStart=1
	
	WHILE @PosStart<@LenComments
	BEGIN
		SET @PosEnd = CHARINDEX(':',@comments,@PosStart)
		SET @descr = SUBSTRING(@comments,@PosStart,@PosEnd-@PosStart)
		
		SET @PosStart = CHARINDEX('$',@comments,@PosEnd)+1
		SET @PosEnd = CHARINDEX(' ',@comments,@PosStart)
		
		IF ISNUMERIC(SUBSTRING(@comments,@PosStart,@PosEnd-@PosStart))=1 
			SET @Amount = SUBSTRING(@comments,@PosStart,@PosEnd-@PosStart)
		ELSE
			SET @Amount = NULL
		
		INSERT INTO USER_DEFINED36_FW
		VALUES (@id, @descr, @Amount)
		
		SET @PosStart = @PosEnd+1
	END
	
	FETCH NEXT FROM comments_cursor INTO @id, @Comments, @LenComments
END

CLOSE comments_cursor
DEALLOCATE comments_cursor

Open in new window

0
 
LVL 70

Assisted Solution

by:Scott Pletcher
Scott Pletcher earned 600 total points
ID: 40290825
This was a bear!  But please try the code below.  First create the split function, then run the main query.  Despite the gyrations in the code, I believe this should run fast overall.

The "id" column is the key or other identifier for the row, so that each row's comments can be accurately assigned back the row it came from.

--main query
SELECT
    ItemGroup,
    MAX(CASE WHEN WhichColumn = 'Description' THEN SUBSTRING(split.Item, CHARINDEX(' ', split.Item + ' ') + 1, 100)
        ELSE '' END) AS Description,
    MAX(CASE WHEN WhichColumn = 'Amount' THEN LEFT(split.Item, CHARINDEX(' ', split.Item + ' ') - 1)
        ELSE '' END) AS Amount
FROM (
    SELECT 1 AS id, 'CONVENTIONAL OIL PRICES PER QUART: $3.00 SYNTHETIC BLENDS:$4.00 TIRE ROTATION: $16.00 TIRE BALANCE: $8.00 LD & MD RATE: $72.00 HD RATE: NA' AS comments
) AS test_data
CROSS APPLY dbo.DelimitedSplit8K ('<dummy> ' + comments, ':') AS split
CROSS JOIN (
    SELECT 'Amount' AS WhichColumn UNION ALL
    SELECT 'Description'
) AS cj1
CROSS APPLY (
    SELECT CASE WHEN WhichColumn = 'Amount' THEN split.ItemNumber - 1 ELSE split.ItemNumber END AS ItemGroup
) AS assign_alias
GROUP BY id, ItemGroup
HAVING
    MAX(CASE WHEN WhichColumn = 'Description' THEN SUBSTRING(split.Item, CHARINDEX(' ', split.Item + ' ') + 1, 100)
        ELSE '' END) > ''
ORDER BY id, ItemGroup


--high-performance splitter function
CREATE FUNCTION [dbo].[DelimitedSplit8K]
--===== Define I/O parameters
        (@pString VARCHAR(8000), @pDelimiter CHAR(1))
--WARNING!!! DO NOT USE MAX DATA-TYPES HERE!  IT WILL KILL PERFORMANCE!
RETURNS TABLE WITH SCHEMABINDING AS
 RETURN
--===== "Inline" CTE Driven "Tally Table" produces values from 1 up to 10,000...
     -- enough to cover VARCHAR(8000)
  WITH E1(N) AS (
                 SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
                 SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
                 SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
                ),                          --10E+1 or 10 rows
       E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
       E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
 cteTally(N) AS (--==== This provides the "base" CTE and limits the number of rows right up front
                     -- for both a performance gain and prevention of accidental "overruns"
                 SELECT TOP (ISNULL(DATALENGTH(@pString),0)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
                ),
cteStart(N1) AS (--==== This returns N+1 (starting position of each "element" just once for each delimiter)
                 SELECT 1 UNION ALL
                 SELECT t.N+1 FROM cteTally t WHERE SUBSTRING(@pString,t.N,1) = @pDelimiter
                ),
cteLen(N1,L1) AS(--==== Return start and length (for use in substring)
                 SELECT s.N1,
                        ISNULL(NULLIF(CHARINDEX(@pDelimiter,@pString,s.N1),0)-s.N1,8000)
                   FROM cteStart s
                )
--===== Do the actual split. The ISNULL/NULLIF combo handles the length for the final element when no delimiter is found.
 SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY l.N1),
        Item       = LTRIM(RTRIM(SUBSTRING(@pString, l.N1, l.L1)))
   FROM cteLen l
;
GO
0
 

Author Comment

by:66chawger
ID: 40291017
Yes, have been working on this for two days.... which is a long time for me... I don't have that much pride to admit it is quirky.  

Vitor and Scott, let me run these against my data and see what I get.   I will tweek as necessary.    Vitor, what you are doing is very similar to what I have been attempting.  I have come close, but not entirely.  Main issue is the comments is free form and you can't depend on a common delimiter, spelling errors, etc...
0
 

Author Comment

by:66chawger
ID: 40291084
Vitor,

Getting  "Msg 537, Level 16, State 2, Line 20 Invalid length parameter passed to the LEFT or SUBSTRING function"


Scott...,. testing yours now.
0
 
LVL 53

Expert Comment

by:Vitor Montalvão
ID: 40292018
66chawger, I wrote that code based in the COMMENTS value you gave me. If there are others text formats for that column please post it here so I can adapt my code to it.
0
 
LVL 70

Expert Comment

by:Scott Pletcher
ID: 40298739
Were you able to do any further testing on this?  Just curious :).
0
 

Author Comment

by:66chawger
ID: 40298992
Apologies, was out of town for the Labor Day weekend.    

I made some modifications to a cursor.  

This is working fairly well. The only issue I have is the very last amount is being left off.

Example:  "CONVENTIONAL OIL QT: N/A SYNTHETIC BLEND OIL QT: N/A 4 TIRE ROTATION: N/A TIRE BALANCE PER TIRE: N/A LABOR RATE HOUR LD: $68 LABOR RATE HOUR MD: $68 LABOR RATE HOUR HD: $68"

So only the "LABOR RATE HOUR HD" is parsed, the amount, in this case $68 is left off.   Again this only happens for the last value at the end of the comment.

Below is what I am currently working with:



DECLARE @valueList varchar(8000)
DECLARE @pos INT
DECLARE @len INT
DECLARE @value varchar(8000)

DECLARE user_defined8_fw_cursor CURSOR
    FOR SELECT comments_fw FROM USER_DEFINED8_FW
    WHERE (ARCHIVE_STATUS_FW = 'N' OR ARCHIVE_STATUS_FW IS NULL)
--      AND UD8.START_DATE_FW < '2014-07-01'
      AND END_DATE_FW >= '2014-07-01'
--      UD8.START_DATE_FW >= UD8.END_DATE_FW
    AND VENDOR_CONTRACT_TYPE_FW = 'A'

OPEN user_defined8_fw_cursor


    FETCH NEXT FROM user_defined8_fw_cursor
    INTO @valueList

    WHILE @@FETCH_STATUS = 0
    BEGIN

        set @pos = 0
        set @len = 0

        WHILE CHARINDEX(':', @valueList, @pos+1)>0
        BEGIN
            set @len = CHARINDEX(':', @valueList, @pos+1) - @pos
            set @value = SUBSTRING(@valueList, @pos, @len)
                           
            PRINT @value
           --  This is where I will put update code

            set @pos = CHARINDEX(':', @valueList, @pos+@len) +1
        END

        FETCH NEXT FROM user_defined8_fw_cursor
        INTO @valueList  

    END

CLOSE user_defined8_fw_cursor
DEALLOCATE user_defined8_fw_cursor
0
 

Author Comment

by:66chawger
ID: 40299623
Ok, figured out the last issue I was having mentioned in my last post.  Looks like I am good to go at this point.  Even though I built my own TSQL w/ cursor, I was able to get some valuable information from your posts.  Thanks!!!
0
 

Author Closing Comment

by:66chawger
ID: 40299637
As stated, even though I did not use either of the experts code exactly, I was able to build my own TSQL w/cursor with information from both of their posts.
0

Featured Post

VIDEO: THE CONCERTO CLOUD FOR HEALTHCARE

Modern healthcare requires a modern cloud. View this brief video to understand how the Concerto Cloud for Healthcare can help your organization.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

A Stored Procedure in Microsoft SQL Server is a powerful feature that it can be used to execute the Data Manipulation Language (DML) or Data Definition Language (DDL). Depending on business requirements, a single Stored Procedure can return differe…
What if you have to shut down the entire Citrix infrastructure for hardware maintenance, software upgrades or "the unknown"? I developed this plan for "the unknown" and hope that it helps you as well. This article explains how to properly shut down …
This videos aims to give the viewer a basic demonstration of how a user can query current session information by using the SYS_CONTEXT function
Viewers will learn how to use the SELECT statement in SQL to return specific rows and columns, with various degrees of sorting and limits in place.
Suggested Courses

826 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question