Link to home
Start Free TrialLog in
Avatar of bfuchs
bfuchsFlag for United States of America

asked on

Function to merge 2 records (#2).

Hi Experts,

This is in regards to following

https://www.experts-exchange.com/questions/29100944/Function-to-merge-2-records.html?anchorAnswerId=cceptAnswerByMember#acceptAnswerByMember

I would like to have a function/stored proc that loops thru all my records in Employeestbl and for every two records with same email it executes the stored proc below according to SkipImport = 0 being first.

procedure usp_update_employeestbl (@empid1 varchar(10) = NULL, @empid2 varchar(10)  = NULL, @status varchar(100) output)
as
BEGIN


if not exists (select NULL from employeestbl where ID = @empid1) begin set @status = 'ERROR - NO EMPID1' return end
if not exists (select NULL from employeestbl where ID = @empid2) begin set @status = 'ERROR - NO EMPID2' return end

-- first step is to identify columns that might be subject to updates

declare @where varchar(max)

select @where = isnull(@where+char(13)+char(10)+'or ','') + '(E.['+c.[name]+'] is NULL and S.['+c.[name]+'] is not NULL '+case when t.[name] like '%date%' then ' or E.['+c.[name]+'] < S.['+c.[name]+']' else '' end +')'
from sys.columns c
inner join sys.types t on t.user_type_id = c.user_type_id
where object_name(c.object_id) = 'EmployeesTbl'
and (c.is_nullable = 1
or t.[name] like '%date%') 

print @where

-- set up selection criteria to extract just the needed column names where there is a change

declare @select varchar(max)

select @select = isnull(@select+char(13)+char(10)+' + ','') + 'case when E.['+c.[name]+'] is NULL and S.['+c.[name]+'] is not NULL '+case when t.[name] like '%date%' then ' or E.['+c.[name]+'] < S.['+c.[name]+']' else '' end +' then ''['+c.[name]+'] = S.['+c.[name]+']'' else '''' end' 
from sys.columns c
inner join sys.types t on t.user_type_id = c.user_type_id
where object_name(c.object_id) = 'EmployeesTbl'
and (c.is_nullable = 1
or t.[name] like '%date%') 

print @select

-- now to do the hard work of finding differences in the data
-- we will store the column names in a temp table

declare @sql varchar(max)

if object_id('tempdb..#tmp_employeestbl_merge','U') is not null drop table #tmp_employeestbl_merge
create table #tmp_employeestbl_merge (results varchar(max))

set @sql = 'declare @results varchar(max) Select @results = '+@select+' from EmployeesTbl E cross apply (select * from EmployeesTbl s where s.id ='+@empID2 + ') S
where e.ID = '+@empID1+ ' and ('+@where+')
insert #tmp_employeestbl_merge select @results'

print @sql

exec (@sql)

-- if results were found then we can update

if @@rowcount > 0
begin

   if (select len(results) from #tmp_employeestbl_merge) > 3 print 'UPDATES FOUND' else begin set @status = 'NOTHING TO DO - NO CHANGES' return end

   declare @update varchar(max)

   select @update = 'Update E set '+replace(results,'][','],[')+ ' From EmployeesTbl E cross apply (select * from EmployeesTbl s where s.id ='+@empID2 + ') S
where E.ID = '+@empID1 from #tmp_employeestbl_merge

   print @update

   exec (@update)

   set @status = 'UPDATED'
   
end

END

Open in new window


Thanks in advance.
Avatar of Mark Wills
Mark Wills
Flag of Australia image

Trick question for you - when we get 'pairs' of duplicated email addresses - which one is the updateable row, and which is the surrogate providing the source data for the update ?

While I might know the answer, it might help other experts...

Basic approach I would take is a CURSOR to identify pairs and loop through each of the pairs to call the procedure. I know it is RBAR (or pairs of rows) but to keep it simple....

To identify the pairs, something like :
select E.ID as EMPID1, S.ID as EMPID2 from Employeestbl E inner join Employeestbl S on E.[Email] = S.[Email] Where  E.SkipImport = 0 And S.SkipImport = 1

Open in new window


Back a bit later (have to do a presentation)
Avatar of bfuchs

ASKER

Hi Mark,
which one is the updateable row, and which is the surrogate providing the source data for the update ?
Record with SkipImport = 0 is the updateable row and SkipImport = 1 is the surrogate providing the source data for the update..

This is what I meant by
according to SkipImport = 0 being first
Perhaps didnt not express myself clearly enough..

Thanks,
Ben
Avatar of bfuchs

ASKER

BTW,

How do I exclude SkipImport field from being updated in the SP in question?

Thanks,
Ben
ASKER CERTIFIED SOLUTION
Avatar of Mark Wills
Mark Wills
Flag of Australia image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of bfuchs

ASKER

Hi Mark,

Dont think will get to test this before Sun as I'm not coming tom at work.
Will keep you posted.
Appreciate your help.
Have a nice weekend!

Thanks,
Ben
Avatar of bfuchs

ASKER

Hi Mark,
Two questions..
1- Will this also handle nulls, for example if EmdID1.DateField is null and EmpID2.DateField is not null, will it get transferred?
2- What is the meaning of the code "or t.[name] like '%date%') ", are you determining the field type according to its name? if yes, that is not true here..
Thanks,
Ben
1) yes, it will transfer if  if EmdID1.DateField is null and EmpID2.DateField is not null OR  if EmdID1.DateField is less than EmpID2.DateField

2) t.[name] is coming from "sys.types t" it is the name of the datatype. Check by :  select * from sys.types
Avatar of bfuchs

ASKER

Thank you very much Mark.
You're of great help!