[Okta Webinar] Learn how to a build a cloud-first strategyRegister Now

x
  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 166
  • Last Modified:

Help with SELECT to find duplicate records

I have the start of a query--but it's not complete (this is where I need help)--that finds duplicate records based on the criteria below.  Basically, I need to identify any companies where the company name, address1, and city are similar and the state and zip code are the same....My criteria and the start of the query are below, but I have no idea how to add in the rest.  Help?

LIKE: Company name (comp_name)
LIKE: Address 1 (addr_address1)
LIKE: City (addr_city)
EQUAL: State (addr_state)
EQUAL: Zip code (addr_postcode)

SELECT Comp_Name, Addr_Address1, Addr_City, Addr_State, Addr_PostCode, Count(*)
FROM vSearchListCompany
WHERE  Comp_Deleted is null
GROUP BY Comp_Name, Addr_Address1, Addr_City, Addr_State, Addr_PostCode
HAVING Count(*) > 1
0
saturation
Asked:
saturation
  • 3
  • 3
1 Solution
 
Jim HornMicrosoft SQL Server Developer, Architect, and AuthorCommented:
>where the company name, address1, and city are similar
Define 'are similar', as you can't group based on a LIKE comparison.

Otherwise, lose the COUNT(*)
SELECT Comp_Name, Addr_Address1, Addr_City, Addr_State, Addr_PostCode,
FROM vSearchListCompany
WHERE  Comp_Deleted is null
GROUP BY Comp_Name, Addr_Address1, Addr_City, Addr_State, Addr_PostCode
HAVING Count(Addr_PostCode) > 1

Open in new window

0
 
saturationAuthor Commented:
I can lose the COUNT(*).   Also, "are similar" would be where the company name, for example, is like any other company name in the system using the LIKE statement and % symbols.  Does that help?  How would I structure the query that way without the count?
0
 
Jim HornMicrosoft SQL Server Developer, Architect, and AuthorCommented:
>Also, "are similar" would be where the company name, for example, is like any other company name in the system using the LIKE statement and % symbols.  Does that help?
No.  Again, you can't GROUP BY using a LIKE.  

Would Microsoft be like Microstrategy?  
Would Experts Exchange be like Expert Sex Change?
What criteria would you use to define 'like'?
0
Concerto's Cloud Advisory Services

Want to avoid the missteps to gaining all the benefits of the cloud? Learn more about the different assessment options from our Cloud Advisory team.

 
saturationAuthor Commented:
I'm unsure, actually--never written a query like this. Any rules you would suggest starting with looking for similar company names?   The company names would be typed in by human beings, we may ignore any "inc.","co.", or "LLC" at the end...Does that help?  I guess that's a start.
0
 
Jim HornMicrosoft SQL Server Developer, Architect, and AuthorCommented:
>we may ignore any "inc.","co.", or "LLC" at the end...Does that help?
That's a start.  You'll have to define this in excruitiatingly clear terms, as SQL has no clue if inc is the same as co.

I believe SSIS has a Fuzzy Grouping task that performs this well
T-SQL SOUNDEX also shows some promise, but I haven't used either of these so I can't comment on it.

Otherwise, If we're just talking 'give me the count of rows where company name fragments in table a are found in the company name of table b, then using a subquery it would go like this
CREATE TABLE #a (company_name varchar(255))
CREATE TABLE #b (company_name varchar(255))
INSERT INTO #a (company_name) values ('goo'), ('foo'), ('boo'), ('bar'), ('baz')
INSERT INTO #b (company_name) values ('goofyriffic'), ('banana'), ('oo'), ('bazinga'), ('googoogaga')

SELECT a_name, COUNT(b_name) as the_count
FROM (
	SELECT #a.company_name as a_name, #b.company_name as b_name
	FROM #a 
		JOIN #b ON #b.company_name LIKE CONCAT('%', #a.company_name, '%')) a
GROUP BY a_name

Open in new window

0
 
saturationAuthor Commented:
I just did a little bit of research on Fuzzy Grouping in SSIS and was able to get what I want--I had no idea anything like that existed, and it was rather easy.  Thank you so much for the recommendation and time!
0

Featured Post

Technology Partners: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

  • 3
  • 3
Tackle projects and never again get stuck behind a technical roadblock.
Join Now