• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 407
  • Last Modified:

Simple Code Challenge

Dear Experts,

I have the following string: CO_DD_MM_YY_HI
Fields CO, DD, MM, YY, HI represent numbers with leading 0's removed..
e.g. 1_12_93_2_8 ,
      33_4_45_35_2

What I need to do, is split the string up so, I know what CO, DD, MM, YY, HI is into seperate fields..
CO DD MM YY HI
1   12  93   2  8
33 4   45   35 2
Due to the removing of leading 0's patindex has to be used to get each part of the string with a replace (@string,'_','#') for each patindex input.

This has to be a single T-SQL Statement, and cannot be a cursor/loop procedure.
As this may give you a head-ache in producing the code, I've assigned a suitable amount of points for a speedy solution.

This is to reformat a complete table structure of fields and perform the field splits as rqd (and wil be run many times -- hence the need for a good single T-SQL Statement to perform this work).  I do not mind if you can produce a demo piece of code using a single value stored in a variable, as I will modify to run on the table.

To add to the complications, Some ROWS are formated like so:
1129401248  --> This data should not be converted and return 0 rows of data where the input format is incorrectly formated.
(Input format, to be checked by the number of _ present in the string -- should be at least 4 _ as given above).

Regards,

Dan.
0
danblake
Asked:
danblake
  • 5
  • 5
  • 4
  • +2
2 Solutions
 
muzzy2003Commented:
Small amount of prep work, but then the select is in a single line. Any good?

CREATE FUNCTION udfSplit (
      @string nvarchar(20),
      @index int
) RETURNS nvarchar(20)
BEGIN

      WHILE @index > 0
      BEGIN
            IF CHARINDEX('_', @string) > 0
                  SET @string = SUBSTRING(@string, CHARINDEX('_', @string) + 1, LEN(@string))
            SET @index = @index - 1
      END
      IF CHARINDEX('_', @string) > 0
            SET @string = LEFT(@string, CHARINDEX('_', @string) - 1)
      RETURN @string

END
GO

DECLARE @tabTest table (TestString nvarchar(20))

INSERT INTO @tabTest VALUES ('1_12_93_2_8')
INSERT INTO @tabTest VALUES ('33_4_45_35_2')
INSERT INTO @tabTest VALUES ('1129401248')

SELECT      TestString,
      dbo.udfSplit(TestString, 0) CO,
      dbo.udfSplit(TestString, 1) DD,
      dbo.udfSplit(TestString, 2) MM,
      dbo.udfSplit(TestString, 3) YY,
      dbo.udfSplit(TestString, 4) HI
FROM      @tabTest
WHERE      LEN(TestString) - LEN(REPLACE(TestString, '_', '')) >= 4


0
 
Snarf0001Commented:
agree with muzzy that using the function provides a more elegent and easier to follow solution, not sure if you wanted functions involved
if not, and you need the complexity of having everything in one statement, this will do the trick:

it's almost impossible to follow in this textbox, but paste into QA and the line breaks will make it easier to follow

update mytable
set CO=substring(CO_DD_MM_YY_HI, 0, charindex('_', CO_DD_MM_YY_HI, 0)),
DD=substring(CO_DD_MM_YY_HI, charindex('_', CO_DD_MM_YY_HI, 0) + 1, charindex('_', CO_DD_MM_YY_HI, charindex('_', CO_DD_MM_YY_HI, 0) + 1) - charindex('_', CO_DD_MM_YY_HI, 0) - 1),
MM=substring(CO_DD_MM_YY_HI, charindex('_', CO_DD_MM_YY_HI, charindex('_', CO_DD_MM_YY_HI, 0) + 1) + 1, charindex('_', CO_DD_MM_YY_HI, charindex('_', CO_DD_MM_YY_HI, charindex('_', CO_DD_MM_YY_HI, 0) + 1) + 1) - charindex('_', CO_DD_MM_YY_HI, charindex('_', CO_DD_MM_YY_HI, 0) + 1) - 1),
YY=substring(CO_DD_MM_YY_HI, charindex('_', CO_DD_MM_YY_HI, charindex('_', CO_DD_MM_YY_HI, charindex('_', CO_DD_MM_YY_HI, 0) + 1) + 1) + 1, charindex('_', CO_DD_MM_YY_HI, charindex('_', CO_DD_MM_YY_HI, charindex('_', CO_DD_MM_YY_HI, charindex('_', CO_DD_MM_YY_HI, 0) + 1) + 1) + 1) - charindex('_', CO_DD_MM_YY_HI, charindex('_', CO_DD_MM_YY_HI, charindex('_', CO_DD_MM_YY_HI, 0) + 1) + 1) - 1),
HI=substring(CO_DD_MM_YY_HI, charindex('_', CO_DD_MM_YY_HI, charindex('_', CO_DD_MM_YY_HI, charindex('_', CO_DD_MM_YY_HI, charindex('_', CO_DD_MM_YY_HI, 0) + 1) + 1) + 1) + 1, len(CO_DD_MM_YY_HI) - charindex('_', CO_DD_MM_YY_HI, charindex('_', CO_DD_MM_YY_HI, charindex('_', CO_DD_MM_YY_HI, charindex('_', CO_DD_MM_YY_HI, 0) + 1) + 1) + 1))
where len(CO_DD_MM_YY_HI) - len(replace(CO_DD_MM_YY_HI, '_', '')) >= 4
0
 
LowfatspreadCommented:
ok my attempt...

drop table #temp
create table #temp (data varchar(100))
insert into #temp values('1_12_93_2_8')
insert into #temp values('33_4_45_35_2')
insert into #temp values('12345678')


Select co
      ,Substring(Data,1,P1-1) as dd
      ,Substring(data,p1+1,dl-p1-p2) as mm
      ,Reverse(Substring(Reverse(Data),1,P2-1)) as yy
     ,HI

from (
       Select co,hi,Data ,Patindex('%!%',Data) as P1
                  ,Patindex('%!%',Reverse(Data)) as P2
                  ,Datalength(data) as DL
               from (
Select Substring(Data,1,P1-1) as Co
      ,Reverse(Substring(Reverse(Data),1,P2-1)) as HI
      ,Substring(data,p1+1,dl-p1-p2) as data
from (
       Select Data,Patindex('%!%',Data) as P1
                  ,Patindex('%!%',Reverse(Data)) as P2
                  ,Datalength(data) as DL
         from (
               select Replace(Data,'_','!') as Data
                 from #temp
                where Data like '_%!__%!__%!__%!__%' Escape '!'
              ) as x1
     ) as x2
     ) as x3
     ) as X4
 
0
 [eBook] Windows Nano Server

Download this FREE eBook and learn all you need to get started with Windows Nano Server, including deployment options, remote management
and troubleshooting tips and tricks

 
LowfatspreadCommented:
snarf ,

your
set CO=substring(CO_DD_MM_YY_HI, o, charindex('_', CO_DD_MM_YY_HI, 0)),
needs to be
set CO=substring(CO_DD_MM_YY_HI, 1, charindex('_', CO_DD_MM_YY_HI, 0)),

also the where difference should only be 4

good fun anyway...




drop table #temp
create table #temp (co_dd_mm_yy_hi varchar(100),co varchar(3),dd varchar(3),mm varchar(3),yy varchar(3),hi varchar(3))
insert into #temp (co_dd_mm_yy_hi) values('1_12_93_2_8')
insert into #temp (co_dd_mm_yy_hi) values('33_4_45_35_2')
insert into #temp (co_dd_mm_yy_hi) values('12345678')

select * from #temp

update #temp
set CO=substring(CO_DD_MM_YY_HI, 1, charindex('_', CO_DD_MM_YY_HI, 0)),
DD=substring(CO_DD_MM_YY_HI, charindex('_', CO_DD_MM_YY_HI, 0) + 1, charindex('_', CO_DD_MM_YY_HI, charindex('_', CO_DD_MM_YY_HI, 0) + 1) - charindex('_', CO_DD_MM_YY_HI, 0) - 1),
MM=substring(CO_DD_MM_YY_HI, charindex('_', CO_DD_MM_YY_HI, charindex('_', CO_DD_MM_YY_HI, 0) + 1) + 1, charindex('_', CO_DD_MM_YY_HI, charindex('_', CO_DD_MM_YY_HI, charindex('_', CO_DD_MM_YY_HI, 0) + 1) + 1) - charindex('_', CO_DD_MM_YY_HI, charindex('_', CO_DD_MM_YY_HI, 0) + 1) - 1),
YY=substring(CO_DD_MM_YY_HI, charindex('_', CO_DD_MM_YY_HI, charindex('_', CO_DD_MM_YY_HI, charindex('_', CO_DD_MM_YY_HI, 0) + 1) + 1) + 1, charindex('_', CO_DD_MM_YY_HI, charindex('_', CO_DD_MM_YY_HI, charindex('_', CO_DD_MM_YY_HI, charindex('_', CO_DD_MM_YY_HI, 0) + 1) + 1) + 1) - charindex('_', CO_DD_MM_YY_HI, charindex('_', CO_DD_MM_YY_HI, charindex('_', CO_DD_MM_YY_HI, 0) + 1) + 1) - 1),
HI=substring(CO_DD_MM_YY_HI, charindex('_', CO_DD_MM_YY_HI, charindex('_', CO_DD_MM_YY_HI, charindex('_', CO_DD_MM_YY_HI, charindex('_', CO_DD_MM_YY_HI, 0) + 1) + 1) + 1) + 1, len(CO_DD_MM_YY_HI) - charindex('_', CO_DD_MM_YY_HI, charindex('_', CO_DD_MM_YY_HI, charindex('_', CO_DD_MM_YY_HI, charindex('_', CO_DD_MM_YY_HI, 0) + 1) + 1) + 1))
where len(CO_DD_MM_YY_HI) - len(replace(CO_DD_MM_YY_HI, '_', '')) = 4

select * from #temp
0
 
Snarf0001Commented:
nice lowfat.

that's a tiny bit easier to read, i would imagine significantly less intesive on processor as well, since it's not recalculating the same charindex for every column

impressive.

never used escape before

0
 
HilaireCommented:
Thumbs up for Lowfatspread ...
Really a nice one !
0
 
LowfatspreadCommented:
oK possibly a bit better , using the PARSENAME function builtin to SQL SERVER

drop table #temp
create table #temp (data varchar(100))
insert into #temp values('1_12_93_2_8')
insert into #temp values('33_4_45_35_2')
insert into #temp values('12345678')

Select convert(varchar(10),Co) as co
      ,convert(varchar(10),Parsename(data,4)) as dd
      ,convert(varchar(10),Parsename(data,3)) as MM
      ,convert(varchar(10),Parsename(data,2)) as YY
      ,convert(varchar(10),Parsename(data,1)) as HI
from (
Select Substring(Data,1,P1-1) as Co
      ,Substring(Data,p1+1,dl-p1) as data
         from (
               select Replace(Data,'_','.') as Data
                     ,Patindex('%!%',replace(Data,'_','!')) as P1
                     ,DATALENGTH(DATA) as DL
                 from #temp
                where Data like '_%!__%!__%!__%!__%' Escape '!'
              ) as x1
     ) as x2



0
 
muzzy2003Commented:
Now that is nice! Give that man some points.
0
 
danblakeAuthor Commented:
Ok, all .. .I've been catching some zzz's whilst you have had some fun demonstrating your skills -- I'm aim to review the code later today, and award some points !

I can let you know the answer looks like its in LowfatSpreads court, with some assists to others for the good attempts...
Reason -- code looks clean, no UDFs, shortest solution (in code).



0
 
LowfatspreadCommented:
hmm not tested but you probably also need the

where len(CO_DD_MM_YY_HI) - len(replace(CO_DD_MM_YY_HI, '_', '')) = 4

type test in to trap invalid data as '_______'  would probably be accepted otherwise...

or needs some isnumeric tests on the final columns...


0
 
muzzy2003Commented:
Should be >= 4 by the wording of the question, though instinctively = 4 makes sense to me as well.
0
 
danblakeAuthor Commented:
Where about should the where clause be, to ensure that the whole lot validates, in the subquery or outside...
I presume its in the subquery as:

len(data) - len(replace(data,'_','')) >= 4
0
 
danblakeAuthor Commented:
@muzzy2003
-- I've asked Comm Support to add an assist on this question, so you can get some points.
For some reason I cannot find the assist button, to give you points.
0
 
muzzy2003Commented:
No, give them all to Lowfat - I like his solution!
0
 
danblakeAuthor Commented:
@muzzy - -yes I like LFS Solution -- (I think its brill)  for all the reasons quoted above.
You've pointed out the len >= 4 comment/kept on the thread with valid answers -- ahbet not the optimal, hence I want to give you an assist.  (also LFS -- like myself have so many points, that we don't really do this for the points !)
0
 
muzzy2003Commented:
Seriously not bothered, as I've had 98000 SQL Server points this month already, but I won't complain!
0
 
danblakeAuthor Commented:
Note -- isnumeric testing for all parts negates the need for checking the length of the data.
0

Featured Post

NEW Veeam Agent for Microsoft Windows

Backup and recover physical and cloud-based servers and workstations, as well as endpoint devices that belong to remote users. Avoid downtime and data loss quickly and easily for Windows-based physical or public cloud-based workloads!

  • 5
  • 5
  • 4
  • +2
Tackle projects and never again get stuck behind a technical roadblock.
Join Now