Link to home
Create AccountLog in
Avatar of Whah
Whah

asked on

Count characters query

MSSQL 2008

Table fields:
FirstName
LastName
PhoneNum
Address
SubmitDate

I want to eliminate duplicate entries from my result set.  The rules I'm trying to follow are below:

First search for duplicate LastNames values.  For duplicate LastNames, count the characters in the FirstName, LastName, PhoneNum, and Address fields.  Whichever duplicate has the largest number of characters in the counted fields should be returned in the result set.  If the duplicates have the same number of characters in the counted fields then return the one with the newest SubmitDate.

Example:
Bill
Bob
3334893348
123 Ocean st.
2012-11-06 12:03:24

William
Bob
3334893348
123 Ocean Street
2012-12-25 09:12:34

Jane
Smith
1289478576
222 Road Ave.
2012-02-22 10:02:00

Jane
Smith
1289478576
222 Road Ave.
2012-05-18 07:34:59

Result set:
William|Bob|3334893348|123 Ocean Street|2012-12-25 09:12:34
This row was returned because it had more characters than the other "Bob" last name duplicate.

Jane|Smith|1289478576|222 Road Ave.|2012-05-18 07:34:59
This one had the same number of characters as the other "Smith" last name duplicate so I looked to the SubmitDate for the newest.  That is why this row was returned.
Avatar of Member_2_861731
Member_2_861731
Flag of Canada image

Give this a try.
Just out of the top of my head it seems like it should work.

DECLARE @t TABLE (
FirstName varchar(50),
LastName varchar(50),
PhoneNum varchar(20),
[Address] varchar(100),
SubmitDate datetime
)

INSERT INTO @t
SELECT 'Bill','Bob','3334893348','123 Ocean st.','2012-11-06 12:03:24'
UNION ALL
SELECT 'William','Bob','3334893348','123 Ocean Street','2012-12-25 09:12:34'
UNION ALL
SELECT 'Jane','Smith','1289478576','222 Road Ave.','2012-02-22 10:02:00'
UNION ALL 
SELECT 'Jane','Smith','1289478576','222 Road Ave.','2012-05-18 07:34:59'
UNION ALL
SELECT 'Christopher','Columbus','8521234567','1201 Any Road',GETDATE()

SELECT *
FROM @T T INNER JOIN (SELECT LastName, 
	MAX(LEN(FirstName+PhoneNum+[Address])) MaxTextLength,
	MAX(SubmitDate) MaxDate
	FROM @t
	GROUP BY LastName
	HAVING COUNT(1)>1) Comp ON
		T.LastName=Comp.LastName
WHERE
	LEN(T.FirstName+T.PhoneNum+T.[Address]) =	CASE 
		WHEN LEN(T.FirstName+T.PhoneNum+T.[Address])=Comp.MaxTextLength 
			THEN LEN(T.FirstName+T.PhoneNum+T.[Address])
		ELSE
			Comp.MaxTextLength
		END
	AND T.SubmitDate = CASE
		WHEN LEN(T.FirstName+T.PhoneNum+T.[Address])=Comp.MaxTextLength 
			THEN COMP.MaxDate
		ELSE
			T.SubmitDate
		END

Open in new window


Let us know.
Avatar of Whah
Whah

ASKER

That works pretty well.  How would I use data in a table rather than supplied values in the "INSERT INTO" statement?
ASKER CERTIFIED SOLUTION
Avatar of Member_2_861731
Member_2_861731
Flag of Canada image

Link to home
membership
Create an account to see this answer
Signing up is free. No credit card required.
Create Account