smdrzal1
asked on
updating comma delimited field of mixed data types and messy data
Hi All,
I am trying to update a field with data from another table. Normally this is a no brainer however this time I have a comma delimited field that contains the primary key(s) to data in another table. On top of that the data field isn't always clean. For example the field I'm trying to update can look like any of the following:
-1- 78ebba0bbd2bd440329584a245 630d06
-2- Physical Access Control / Building Security
-3- b9000d8bbd571000329584a245 630d39,9e0 00d8bbd571 000329584a 245630d4a
-4- 83000d8bbd571000329584a245 630d90,Pat ient Admitting and Registration
and some like this one which is only 2 pieces of information not 3
-5- 8e429b0bbd6fd440329584a245 630dc6,Sch eduling, Scripting and Automation Tools
So far I've tried several things (function, stored procedure, dynamic sql) and always keep coming back at a stalemate when attempting to update this field (I actually have 3 fields with this same issue). I did create a procedure that does seem to work, but only for a single record at a time so I'm sure there has to be a better way - I don't want to use a cursor, and I can't manipulate any kind of table within a function.
I am trying to return a name for the ones with ID's, the name when the names matches, and, ideally, the name with a flag (asterisk or something) for ones don't match on either name or ID.
Below is the procedure I currently run to return "names". Any thoughts on how to improve this so it operates on the entire table without a cursor are greatly appreciated. Thanks.
I am trying to update a field with data from another table. Normally this is a no brainer however this time I have a comma delimited field that contains the primary key(s) to data in another table. On top of that the data field isn't always clean. For example the field I'm trying to update can look like any of the following:
-1- 78ebba0bbd2bd440329584a245
-2- Physical Access Control / Building Security
-3- b9000d8bbd571000329584a245
-4- 83000d8bbd571000329584a245
and some like this one which is only 2 pieces of information not 3
-5- 8e429b0bbd6fd440329584a245
So far I've tried several things (function, stored procedure, dynamic sql) and always keep coming back at a stalemate when attempting to update this field (I actually have 3 fields with this same issue). I did create a procedure that does seem to work, but only for a single record at a time so I'm sure there has to be a better way - I don't want to use a cursor, and I can't manipulate any kind of table within a function.
I am trying to return a name for the ones with ID's, the name when the names matches, and, ideally, the name with a flag (asterisk or something) for ones don't match on either name or ID.
Below is the procedure I currently run to return "names". Any thoughts on how to improve this so it operates on the entire table without a cursor are greatly appreciated. Thanks.
DECLARE @AppID VARCHAR(25) SET @AppID = 'APP006670'
DECLARE @SQL VARCHAR(2000)
DECLARE @MApptypeID VARCHAR(500)
DECLARE @Apptype VARCHAR(5000)
SET @MApptypeID =
(
SELECT
'''' + replace
(
CASE
WHEN isnull(cast(atn."u_name" AS VARCHAR (500)),'none') = 'none'
THEN cast("u_application_type" AS VARCHAR(500))
ELSE cast(atn."u_name" AS VARCHAR(500))
END,',',''',''') + ''''
AS "u_application_type"
FROM
dbo.Application app
left outer join dbo.Apptype atn
ON cast(app.u_application_type AS VARCHAR(500))=cast(atn.sys_id AS VARCHAR(500))
WHERE
cast("u_number" AS VARCHAR(50)) = @AppID
)
PRINT @MApptypeID
TRUNCATE TABLE dbo.MApptype
SET @SQL =
'INSERT INTO dbo.MApptype
SELECT u_name
FROM [dbo].[Apptype]
WHERE [sys_id] IN (' + @MApptypeID + ')
UNION ALL
SELECT u_name
FROM [dbo].[Apptype]
WHERE [u_name] IN (' + @MApptypeID + ')'
EXECUTE (@SQL)
SELECT @Apptype = coalesce(@Apptype + ',', '') + u_name FROM dbo.MApptype
PRINT @Apptype
There probably isn't much you can do to change the data in the field since there are probably other programs that access the data and rely on there being a comma between the fields.
Generally situations like this are handled in 2 ways.
1. Choose a character that cannot be in the data and use that as the separator. Since you seem to have text and probably most printable characters can appear, you could appear you could use a non-printable character like chr(9) or chr(200) as the delimiter
2. Put " " around the text fields. In that way an internal comma is not used as a separator.
mlmcc
Generally situations like this are handled in 2 ways.
1. Choose a character that cannot be in the data and use that as the separator. Since you seem to have text and probably most printable characters can appear, you could appear you could use a non-printable character like chr(9) or chr(200) as the delimiter
2. Put " " around the text fields. In that way an internal comma is not used as a separator.
mlmcc
ASKER
I would love to just fix the source - however I can't. Until recently I could only see the behavior within that system as I traversed the vendor UI. Once I got read access to the underlying data I could interpret and formulate a why. The issue has been reported to the vendor to correct. So for simplification let's say they correct it and the field now only contains one or more comma delimited id's like this:
b9000d8bbd571000329584a245 630d39,9e0 00d8bbd571 000329584a 245630d4a
The process goes like this and only applies to about 5000 rows in destination
Vendor Tables in vendor system -
Source Table A (STA) - Column 1 Internal ID (SAC1), Column 2 Reference (SAC2), Column 3 Name (SAC3), Column 4 External ID (SAC4)
Source Table B (STB) - Column 1 ID (SBC1), Column 2 Name (SBC2)
My dataset -
Destination Table A (DTA) - Column 1 ID (DAC1), Column 2 Name (DAC2), Column 3 Category (DAC3)
DTA should have the External ID, Name, and Category in plain english.
DTA is populated with SAC4->DAC1, SAC3->DAC2, SAC2->DAC3 as part of a larger update process that is in place.
DAC1 and DAC2 are fine, but DAC3 isn't and the process I wrote changes the data in DAC3 from this:
b9000d8bbd571000329584a245 630d39,9e0 00d8bbd571 000329584a 245630d4a
to
Clinical Support Systems, Security Access
I can use a cursor and have it do this for every ID found in DAC1, but I am trying to avoid that. What the code above does is insert DAC3 into a variable, then manipulates that variable to return this:
'b9000d8bbd571000329584a24 5630d39',' 9e000d8bbd 5710003295 84a245630d 4a'
That is fed as an IN parameter to look up SBC2 and return all where SBC1 is in that parameter and dumps it to a table - in this case 2 rows
Clinical Support Systems
Security Access
Then it does a coalesce and returns those two rows as one variable that updates DAC3 to
Clinical Support Systems, Security Access
When using that code as dynamic SQL in a proc which is then called from a function it fails since it is acting on a table
I rewrote the proc to not use a table at all - just variables, and I still get an error about only being able to use other functions and extended procedures when I call it from the function.
It seems to me I should just be able to insert the code in the overall master procedure somehow or maybe there is a way to rewrite the function that I am just not seeing.
Thanks,
b9000d8bbd571000329584a245
The process goes like this and only applies to about 5000 rows in destination
Vendor Tables in vendor system -
Source Table A (STA) - Column 1 Internal ID (SAC1), Column 2 Reference (SAC2), Column 3 Name (SAC3), Column 4 External ID (SAC4)
Source Table B (STB) - Column 1 ID (SBC1), Column 2 Name (SBC2)
My dataset -
Destination Table A (DTA) - Column 1 ID (DAC1), Column 2 Name (DAC2), Column 3 Category (DAC3)
DTA should have the External ID, Name, and Category in plain english.
DTA is populated with SAC4->DAC1, SAC3->DAC2, SAC2->DAC3 as part of a larger update process that is in place.
DAC1 and DAC2 are fine, but DAC3 isn't and the process I wrote changes the data in DAC3 from this:
b9000d8bbd571000329584a245
to
Clinical Support Systems, Security Access
I can use a cursor and have it do this for every ID found in DAC1, but I am trying to avoid that. What the code above does is insert DAC3 into a variable, then manipulates that variable to return this:
'b9000d8bbd571000329584a24
That is fed as an IN parameter to look up SBC2 and return all where SBC1 is in that parameter and dumps it to a table - in this case 2 rows
Clinical Support Systems
Security Access
Then it does a coalesce and returns those two rows as one variable that updates DAC3 to
Clinical Support Systems, Security Access
When using that code as dynamic SQL in a proc which is then called from a function it fails since it is acting on a table
I rewrote the proc to not use a table at all - just variables, and I still get an error about only being able to use other functions and extended procedures when I call it from the function.
It seems to me I should just be able to insert the code in the overall master procedure somehow or maybe there is a way to rewrite the function that I am just not seeing.
Thanks,
its great that you gave a full information on this question, but somehow it is going on top of my head to understand this....
in order to simplify,
Let us say you have one table Table A in which you have some data related to the cryptic codes...
What is the tableB does and Table C....
Can you put that in that way,,, instead of the complicated Jargons like DAC and all.....
in order to simplify,
Let us say you have one table Table A in which you have some data related to the cryptic codes...
What is the tableB does and Table C....
Can you put that in that way,,, instead of the complicated Jargons like DAC and all.....
ASKER
Ok forget everything I've said so far except that the code I wrote works - just not as a function and not without a cursor - neither of which is desirable.
I have Table A with a category field that contains 1-10 comma delimited ids. I need to update the category field in Table A to replace those comma delimited ids with comma delimited names for every record in my data (about 5000). For this I will use table B which contains the category ids and names. My code looks up the ids and coalesces them into a comma delimited string of the names found in Table B with corresponding ids. This is the correct result, however the implementation is not.
What I'm trying to find out is why I can't get it to work as a function or without using a cursor which I should be able to do.
I have Table A with a category field that contains 1-10 comma delimited ids. I need to update the category field in Table A to replace those comma delimited ids with comma delimited names for every record in my data (about 5000). For this I will use table B which contains the category ids and names. My code looks up the ids and coalesces them into a comma delimited string of the names found in Table B with corresponding ids. This is the correct result, however the implementation is not.
What I'm trying to find out is why I can't get it to work as a function or without using a cursor which I should be able to do.
ASKER
In case anyone is interested I have arrived at a satisfactory method - using 2 functions, and completely discarding the stored procedure I had written previously. I am still getting an odd truncation at 32 char in some of the data previously identified as "bad" data, but I will figure that out too.
Thanks,
Thanks,
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
I asked for a way to make the tsql I wrote to work without using a cursor and preferably as a function. The question received no suggestions or any kind of assistance in this area. The only suggestions offered questioned why the source data was like that and offered ways to alter the source data (unhelpful). Some just couldn't understand the issue.
I provided the code for how I solved the problem as maybe someone else is running up against this type of thing and may find it useful.
I provided the code for how I solved the problem as maybe someone else is running up against this type of thing and may find it useful.
in which case you have a BIG issue in the table design, and instead of trying to "fix" this on the symptoms, you really should fix this in the design.
also, i don't see what you are trying to achieve, as the sample data you posted vs the code you posted doesn't explain the result of the "update", and the rules of it.
please try again by posting the "before data", the "after data", and the "input data" that would result in the "update".