Link to home
Start Free TrialLog in
Avatar of smdrzal1
smdrzal1

asked on

updating comma delimited field of mixed data types and messy data

Hi All,

I am trying to update a field with data from another table. Normally this is a no brainer however this time I have a comma delimited field that contains the primary key(s) to data in another table. On top of that the data field isn't always clean. For example the field I'm trying to update can look like any of the following:
-1- 78ebba0bbd2bd440329584a245630d06
-2- Physical Access Control / Building Security
-3- b9000d8bbd571000329584a245630d39,9e000d8bbd571000329584a245630d4a
-4- 83000d8bbd571000329584a245630d90,Patient Admitting and Registration
and some like this one which is only 2 pieces of information not 3
-5- 8e429b0bbd6fd440329584a245630dc6,Scheduling, Scripting and Automation Tools

So far I've tried several things (function, stored procedure, dynamic sql) and always keep coming back at a stalemate when attempting to update this field (I actually have 3 fields with this same issue). I did create a procedure that does seem to work, but only for a single record at a time so I'm sure there has to be a better way - I don't want to use a cursor, and I can't manipulate any kind of table within a function.

I am trying to return a name for the ones with ID's, the name when the names matches, and, ideally, the name with a flag (asterisk or something) for ones don't match on either name or ID.

Below is the procedure I currently run to return "names".  Any thoughts on how to improve this so it operates on the entire table without a cursor are greatly appreciated. Thanks.
DECLARE @AppID VARCHAR(25) SET @AppID = 'APP006670'
DECLARE @SQL VARCHAR(2000)
DECLARE @MApptypeID VARCHAR(500)
DECLARE @Apptype VARCHAR(5000) 

SET @MApptypeID =
  (
SELECT
  '''' + replace
  (
      CASE
        WHEN isnull(cast(atn."u_name" AS VARCHAR (500)),'none') = 'none'
        THEN cast("u_application_type" AS VARCHAR(500))
      ELSE cast(atn."u_name" AS VARCHAR(500))
      END,',',''',''') + ''''
    AS "u_application_type"
  FROM
    dbo.Application app
      left outer join dbo.Apptype atn
      ON cast(app.u_application_type AS VARCHAR(500))=cast(atn.sys_id AS VARCHAR(500))
  WHERE
    cast("u_number" AS VARCHAR(50)) = @AppID
  ) 

PRINT @MApptypeID
TRUNCATE TABLE dbo.MApptype 

SET @SQL =
  'INSERT INTO dbo.MApptype
    SELECT u_name
    FROM [dbo].[Apptype]
    WHERE [sys_id] IN (' + @MApptypeID + ')
    UNION ALL
    SELECT u_name
    FROM [dbo].[Apptype]
    WHERE [u_name] IN (' + @MApptypeID + ')'
EXECUTE (@SQL) 

SELECT @Apptype = coalesce(@Apptype + ',', '') + u_name FROM dbo.MApptype 

PRINT @Apptype

Open in new window

Avatar of Guy Hengel [angelIII / a3]
Guy Hengel [angelIII / a3]
Flag of Luxembourg image

I don't actually follow really what the data "format" is about, as visibly the "," can be both the separator AND part of the data.
in which case you have a BIG issue in the table design, and instead of trying to "fix" this on the symptoms, you really should fix this in the design.

also, i don't see what you are trying to achieve, as the sample data you posted vs the code you posted doesn't explain the result of the "update", and the rules of it.

please try again by posting the "before data", the "after data", and the "input data" that would result in the "update".
Avatar of Mike McCracken
Mike McCracken

There probably isn't much you can do to change the data in the field since there are probably other programs that access the data and rely on there being a comma between the fields.

Generally situations like this are handled in 2 ways.
1.  Choose a character that cannot be in the data and use that as the separator.  Since you seem to have text and probably most printable characters can appear, you could appear you could use a non-printable character like chr(9) or chr(200) as the delimiter

2.  Put " " around the text fields.  In that way an internal comma is not used as a separator.

mlmcc
Avatar of smdrzal1

ASKER

I would love to just fix the source - however I can't.  Until recently I could only see the behavior within that system as I traversed the vendor UI.  Once I got read access to the underlying data I could interpret and formulate a why.  The issue has been reported to the vendor to correct.  So for simplification let's say they correct it and the field now only contains one or more comma delimited id's like this:
b9000d8bbd571000329584a245630d39,9e000d8bbd571000329584a245630d4a

The process goes like this and only applies to about 5000 rows in destination
Vendor Tables in vendor system -
Source Table A (STA) - Column 1 Internal ID (SAC1), Column 2 Reference (SAC2), Column 3 Name (SAC3), Column 4 External ID (SAC4)
Source Table B (STB) - Column 1 ID (SBC1), Column 2 Name (SBC2)

My dataset -
Destination Table A (DTA) - Column 1 ID (DAC1), Column 2 Name (DAC2), Column 3 Category (DAC3)

DTA should have the External ID, Name, and Category in plain english.

DTA is populated with SAC4->DAC1, SAC3->DAC2, SAC2->DAC3 as part of a larger update process that is in place.

DAC1 and DAC2 are fine, but DAC3 isn't and the process I wrote changes the data in DAC3 from  this:
b9000d8bbd571000329584a245630d39,9e000d8bbd571000329584a245630d4a
to
Clinical Support Systems, Security Access

I can use a cursor and have it do this for every ID found in DAC1, but I am trying to avoid that.  What the code above does is insert DAC3 into a variable, then manipulates that variable to return this:
'b9000d8bbd571000329584a245630d39','9e000d8bbd571000329584a245630d4a'

That is fed as an IN parameter to look up SBC2 and return all where SBC1 is in that parameter and dumps it to a table - in this case 2 rows
Clinical Support Systems
Security Access

Then it does a coalesce and returns those two rows as one variable that updates DAC3 to
Clinical Support Systems, Security Access

When using that code as dynamic SQL in a proc which is then called from a function it fails since it is acting on a table

I rewrote the proc to not use a table at all - just variables, and I still get an error about only being able to use other functions and extended procedures when I call it from the function.

It seems to me I should just be able to insert the code in the overall master procedure somehow or maybe there is a way to rewrite the function that I am just not seeing.

Thanks,
its great that you gave a full information on this question, but somehow it is going on top of my head to understand this....

in order to simplify,

Let us say you have one table Table A in which you have some data related to the cryptic codes...
What is the tableB does and Table C....

Can you put that in that way,,, instead of the complicated Jargons like DAC and all.....
Ok forget everything I've said so far except that the code I wrote works - just not as a function and not without a cursor - neither of which is desirable.

I have Table A with a category field that contains 1-10 comma delimited ids.  I need to update the category field in Table A to replace those comma delimited ids with comma delimited names for every record in my data (about 5000).  For this I will use table B which contains the category ids and names.  My code looks up the ids and coalesces them into a comma delimited string of the names found in Table B with corresponding ids.  This is the correct result, however the implementation is not.

What I'm trying to find out is why I can't get it to work as a function or without using a cursor which I should be able to do.
In case anyone is interested I have arrived at a satisfactory method - using 2 functions, and completely discarding the stored procedure I had written previously.  I am still getting an odd truncation at 32 char in some of the data previously identified as "bad" data, but I will figure that out too.
Thanks,
ASKER CERTIFIED SOLUTION
Avatar of smdrzal1
smdrzal1

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
I asked for a way to make the tsql I wrote to work without using a cursor and preferably as a function.  The question received no suggestions or any kind of assistance in this area.  The only suggestions offered questioned why the source data was like that and offered ways to alter the source data (unhelpful).  Some just couldn't understand the issue.

I provided the code for how I solved the problem as maybe someone else is running up against this type of thing and may find it useful.