Solved

duplicates in the result set

Posted on 2011-03-08
8
247 Views
Last Modified: 2012-05-11
The data set has dupes.  It is data from a call center, where there actually may be several records written for a specific social security number, before the call is done.  My result-set from the logic below is returning the data that is there, where unfortunately, there are often dupes.  I used DISTINCT, but a record may be there for one customer three or four times, with a different phone number, or reference.  So, that makes it not really a dupe -- and it is still returned multiple times in the result-set, even with the DISTINCT.

IF SSN is there more than once, I just want to return one value.  I don't even care which one it is, just one value per ssn.  Any suggestsions on the most optimal way to handle this, would be truly appreciated.


BEGIN
	IF(@DateCreated IS NULL)
	BEGIN
		SELECT @DateCreated = MAX(CONVERT(CHAR(10),tl.DateCreated,121)) FROM dbo.TBL_LoanApplications tl
	END
			
	SELECT 
		  '111' [group_id],
		  '999' [account_id],
		  '444' [location_id],
		  'CC' [inquiry_tradeline_type],
		  'BatchInquiry' [control_file_name],
		  FirstName [first_name],
		  LastName [last_name],
		  SSN [social_security_number],
		  REPLACE([Address],',','') [street_address_1],
		  City [city],
		  [State] [state],
		  Zip [zip_code],
		  HomePhone [home_phone],
		  WorkPhone [work_phone],
		  +' '+CONVERT(VARCHAR(10),CONVERT(DATETIME,DOB,101),121) [date_of_birth],
		  DlState [drivers_license_state],
		  DlNr [drivers_license_number],
		  [net_monthly_income] = CASE WHEN PayFrequency IN ('W','X') THEN CAST(CAST(REPLACE(NetPayPerCheck,'.00','') AS FLOAT) * 52/12 AS INT)
						WHEN PayFrequency = 'B' THEN CAST(CAST(REPLACE(NetPayPerCheck,'.00','') AS FLOAT) * 26/12 AS INT)
						WHEN PayFrequency IN('I','S','F','T') THEN CAST(CAST(REPLACE(NetPayPerCheck,'.00','') AS FLOAT) * 24/12 AS INT)
						WHEN PayFrequency IN ('M','E','O','C','D') THEN CAST(CAST(REPLACE(NetPayPerCheck,'.00','') AS FLOAT) * 26/12 AS INT) 
				ELSE 0 END,
		  Email [email_address],
		  ABANr [bank_routing_number],
		  AccountNr [bank_account_number],
		  AccountType [bank_account_type],
		  REPLACE(EmployerName,',','') [employer_name],
		  EmploymentType [occupation_type],
		  [months_at_current_employer] = CASE WHEN EmploymentDate >DATEADD(dd,DATEDIFF(dd,0,GETDATE()), 0) THEN 0 
											  WHEN EmploymentDate IS NULL THEN 0
									ELSE DATEDIFF(mm,EmploymentDate,GETDATE()) END,
		  [pay_frequency] = CASE WHEN PayFrequency IN('W','X') THEN 'Weekly'
								 WHEN PayFrequency = 'B' THEN 'Biweekly'
								 WHEN PayFrequency IN('I','S','F','T') THEN 'Semimonthly'
								 WHEN PayFrequency IN('M','E','O','C','D') THEN 'Monthly' END, 						 
		  +' '+CONVERT(VARCHAR(10),CONVERT(DATETIME,NextPayDate,101),121) [date_of_next_payday],
		  CASE WHEN Ref1FirstName = '0' THEN '' ELSE Ref1FirstName END [reference_first_name],
		  CASE WHEN Ref1LastName = '0' THEN '' ELSE Ref1LastName END [reference_last_name],
		  Ref1Phone [reference_phone],
		  Ref1Relationship [reference_relationship],
		  +' '+CONVERT(VARCHAR(10),CONVERT(DATETIME,DateCreated,101),121) [inquiry_received_at]
	FROM
		  dbo.TABLENAME WITH (NOLOCK)
	WHERE
		 (@DateCreated IS NULL OR CONVERT(CHAR(10),DateCreated,121) = @DateCreated)
		  AND FirstName <> ''
		  
		  IF @@ERROR <> 0
		  BEGIN
				RAISERROR('Failure executing procedureName.',16,1)
				RETURN;
		  END
END

SET NOCOUNT OFF;

Open in new window

0
Comment
Question by:dbaSQL
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
8 Comments
 
LVL 11

Expert Comment

by:N R
ID: 35071480
Why not put max aggregate on any of the fields like phone number or reference then do a group by on the rest?
0
 
LVL 50

Accepted Solution

by:
Lowfatspread earned 500 total points
ID: 35071496
use a windowing function

e.g. row_number()



BEGIN
	IF(@DateCreated IS NULL)
	BEGIN
		SELECT @DateCreated = MAX(CONVERT(CHAR(10),tl.DateCreated,121)) FROM dbo.TBL_LoanApplications tl
	END
	Select x.* 
	  from (	
	SELECT 
		  '111' [group_id],
		  '999' [account_id],
		  '444' [location_id],
		  'CC' [inquiry_tradeline_type],
		  'BatchInquiry' [control_file_name],
		  FirstName [first_name],
		  LastName [last_name],
		  SSN [social_security_number],
		  REPLACE([Address],',','') [street_address_1],
		  City [city],
		  [State] [state],
		  Zip [zip_code],
		  HomePhone [home_phone],
		  WorkPhone [work_phone],
		  +' '+CONVERT(VARCHAR(10),CONVERT(DATETIME,DOB,101),121) [date_of_birth],
		  DlState [drivers_license_state],
		  DlNr [drivers_license_number],
		  [net_monthly_income] = CASE WHEN PayFrequency IN ('W','X') THEN CAST(CAST(REPLACE(NetPayPerCheck,'.00','') AS FLOAT) * 52/12 AS INT)
						WHEN PayFrequency = 'B' THEN CAST(CAST(REPLACE(NetPayPerCheck,'.00','') AS FLOAT) * 26/12 AS INT)
						WHEN PayFrequency IN('I','S','F','T') THEN CAST(CAST(REPLACE(NetPayPerCheck,'.00','') AS FLOAT) * 24/12 AS INT)
						WHEN PayFrequency IN ('M','E','O','C','D') THEN CAST(CAST(REPLACE(NetPayPerCheck,'.00','') AS FLOAT) * 26/12 AS INT) 
				ELSE 0 END,
		  Email [email_address],
		  ABANr [bank_routing_number],
		  AccountNr [bank_account_number],
		  AccountType [bank_account_type],
		  REPLACE(EmployerName,',','') [employer_name],
		  EmploymentType [occupation_type],
		  [months_at_current_employer] = CASE WHEN EmploymentDate >DATEADD(dd,DATEDIFF(dd,0,GETDATE()), 0) THEN 0 
											  WHEN EmploymentDate IS NULL THEN 0
									ELSE DATEDIFF(mm,EmploymentDate,GETDATE()) END,
		  [pay_frequency] = CASE WHEN PayFrequency IN('W','X') THEN 'Weekly'
								 WHEN PayFrequency = 'B' THEN 'Biweekly'
								 WHEN PayFrequency IN('I','S','F','T') THEN 'Semimonthly'
								 WHEN PayFrequency IN('M','E','O','C','D') THEN 'Monthly' END, 						 
		  +' '+CONVERT(VARCHAR(10),CONVERT(DATETIME,NextPayDate,101),121) [date_of_next_payday],
		  CASE WHEN Ref1FirstName = '0' THEN '' ELSE Ref1FirstName END [reference_first_name],
		  CASE WHEN Ref1LastName = '0' THEN '' ELSE Ref1LastName END [reference_last_name],
		  Ref1Phone [reference_phone],
		  Ref1Relationship [reference_relationship],
		  +' '+CONVERT(VARCHAR(10),CONVERT(DATETIME,DateCreated,101),121) [inquiry_received_at]
		  
		  ,ROW_NUMBER() over (PARTITION by  ssn order by datecreated desc) as rn
	FROM
		  dbo.TABLENAME WITH (NOLOCK)
	WHERE
		 (@DateCreated IS NULL OR CONVERT(CHAR(10),DateCreated,121) = @DateCreated)
		  AND FirstName <> ''
	) as x
	Where rn=1	  
		  IF @@ERROR <> 0
		  BEGIN
				RAISERROR('Failure executing procedureName.',16,1)
				RETURN;
		  END
END

SET NOCOUNT OFF;

Open in new window

0
 
LVL 41

Expert Comment

by:Sharath
ID: 35071526
Try ROW_NUMBER.
declare @DateCreated date
BEGIN
	IF(@DateCreated IS NULL)
	BEGIN
		SELECT @DateCreated = MAX(CONVERT(CHAR(10),tl.DateCreated,121)) FROM dbo.TBL_LoanApplications tl
	END
	
	SELECT * FROM (		
	SELECT 
		  '111' [group_id],
		  '999' [account_id],
		  '444' [location_id],
		  'CC' [inquiry_tradeline_type],
		  'BatchInquiry' [control_file_name],
		  FirstName [first_name],
		  LastName [last_name],
		  SSN [social_security_number],
		  REPLACE([Address],',','') [street_address_1],
		  City [city],
		  [State] [state],
		  Zip [zip_code],
		  HomePhone [home_phone],
		  WorkPhone [work_phone],
		  +' '+CONVERT(VARCHAR(10),CONVERT(DATETIME,DOB,101),121) [date_of_birth],
		  DlState [drivers_license_state],
		  DlNr [drivers_license_number],
		  [net_monthly_income] = CASE WHEN PayFrequency IN ('W','X') THEN CAST(CAST(REPLACE(NetPayPerCheck,'.00','') AS FLOAT) * 52/12 AS INT)
						WHEN PayFrequency = 'B' THEN CAST(CAST(REPLACE(NetPayPerCheck,'.00','') AS FLOAT) * 26/12 AS INT)
						WHEN PayFrequency IN('I','S','F','T') THEN CAST(CAST(REPLACE(NetPayPerCheck,'.00','') AS FLOAT) * 24/12 AS INT)
						WHEN PayFrequency IN ('M','E','O','C','D') THEN CAST(CAST(REPLACE(NetPayPerCheck,'.00','') AS FLOAT) * 26/12 AS INT) 
				ELSE 0 END,
		  Email [email_address],
		  ABANr [bank_routing_number],
		  AccountNr [bank_account_number],
		  AccountType [bank_account_type],
		  REPLACE(EmployerName,',','') [employer_name],
		  EmploymentType [occupation_type],
		  [months_at_current_employer] = CASE WHEN EmploymentDate >DATEADD(dd,DATEDIFF(dd,0,GETDATE()), 0) THEN 0 
											  WHEN EmploymentDate IS NULL THEN 0
									ELSE DATEDIFF(mm,EmploymentDate,GETDATE()) END,
		  [pay_frequency] = CASE WHEN PayFrequency IN('W','X') THEN 'Weekly'
								 WHEN PayFrequency = 'B' THEN 'Biweekly'
								 WHEN PayFrequency IN('I','S','F','T') THEN 'Semimonthly'
								 WHEN PayFrequency IN('M','E','O','C','D') THEN 'Monthly' END, 						 
		  +' '+CONVERT(VARCHAR(10),CONVERT(DATETIME,NextPayDate,101),121) [date_of_next_payday],
		  CASE WHEN Ref1FirstName = '0' THEN '' ELSE Ref1FirstName END [reference_first_name],
		  CASE WHEN Ref1LastName = '0' THEN '' ELSE Ref1LastName END [reference_last_name],
		  Ref1Phone [reference_phone],
		  Ref1Relationship [reference_relationship],
		  +' '+CONVERT(VARCHAR(10),CONVERT(DATETIME,DateCreated,101),121) [inquiry_received_at],
		  row_number() over (partition by SSN order by FirstName) rn
	FROM
		  dbo.TABLENAME WITH (NOLOCK)
	WHERE
		 (@DateCreated IS NULL OR CONVERT(CHAR(10),DateCreated,121) = @DateCreated)
		  AND FirstName <> '') t1 WHERE rn = 1
		  
		  IF @@ERROR <> 0
		  BEGIN
				RAISERROR('Failure executing procedureName.',16,1)
				RETURN;
		  END
END

SET NOCOUNT OFF;

Open in new window

0
Backup Solution for AWS

Read about how CloudBerry Backup fully integrates your backups with Amazon S3 and Amazon Glacier to provide military-grade encryption and dramatically cut storage costs on any platform.

 
LVL 17

Author Comment

by:dbaSQL
ID: 35071589
Thank you, Lowfat!  I am embarrassed that I didn't think of that.  Unfortunately, about 30 seconds into it, I fail with this:

An error occurred while executing batch. Error message is: Exception of type 'System.OutOfMemoryException' was thrown.
0
 
LVL 17

Author Comment

by:dbaSQL
ID: 35071638
The server has 26G, and 20G is my max sql server memory.
0
 
LVL 17

Author Comment

by:dbaSQL
ID: 35071665
Disregard.  I think that was due to my client.  I tried a couple other execs (ie sp_who2, sp_configure), they both failed with the same memory exception.  But, when I connected directly to the server, and executed it on the erver, the errors have subsided.

Let me check the data a bit.  Back shortly.
0
 
LVL 17

Author Comment

by:dbaSQL
ID: 35071714
Perfect, Lowfat!  Thank you each for looking.  
0
 
LVL 17

Author Closing Comment

by:dbaSQL
ID: 35071732
Excellent.
0

Featured Post

Windows Server 2016: All you need to know

Learn about Hyper-V features that increase functionality and usability of Microsoft Windows Server 2016. Also, throughout this eBook, you’ll find some basic PowerShell examples that will help you leverage the scripts in your environments!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Data architecture is an important aspect in Software as a Service (SaaS) delivery model. This article is a study on the database of a single-tenant application that could be extended to support multiple tenants. The application is web-based develope…
SQL Server engine let you use a Windows account or a SQL Server account to connect to a SQL Server instance. This can be configured immediatly during the SQL Server installation or after in the Server Authentication section in the Server properties …
Exchange organizations may use the Journaling Agent of the Transport Service to archive messages going through Exchange. However, if the Transport Service is integrated with some email content management application (such as an antispam), the admini…

735 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question