
VB.Net - Designing Duplicate Search Across Multiple Databases

Good Day Experts:

In my VB.Net project I am having a speed/performance issue with my Duplicate record search.  I need to find a better way to do it and I hope you can help.  

I have 5 SQL databases, each representing an Account. All 5 of them belong to the same Company. Accounts are processed once a week, on Friday. I need to take the IDs from Account 1 and look for those IDs in all of the data in Account 2, Account 3, Account 4 and Account 5 to see if they have been used before for a given week. For example, today is 5/8, so I need to take the IDs for Account 1 for the week of 5/8 and search through Account 2, Account 3, Account 4 and Account 5 looking for re-used IDs. Then I have to take the IDs for Account 2 and search all of the other Accounts. Then repeat for the remaining Accounts.

I have the looping constructs in place to handle all the Account iteration, and I also have my query in place to look for the duplicates. The problem is that the query itself is very slow, and running it multiple times is causing trouble.
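For context, here is a minimal sketch of the kind of pair-wise iteration being described. The table names and the RunDuplicateSearch helper are hypothetical placeholders, not the actual project code:

' Minimal sketch of the pair-wise account iteration described above.
' The table names and the RunDuplicateSearch helper are hypothetical placeholders.
Dim accountTables() As String = {"Account1", "Account2", "Account3", "Account4", "Account5"}

For Each sourceTable As String In accountTables
    For Each targetTable As String In accountTables
        If sourceTable <> targetTable Then
            ' Check whether this week's IDs from sourceTable appear in targetTable.
            RunDuplicateSearch(sourceTable, targetTable, txtDate.Text)
        End If
    Next
Next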

Here is what I have going on in my query:
Select [Pro Number],SCAC,[PRO CODE],[SALES ORDER NUMBER],CONVERT(varchar,[AMOUNT PAID])
From " & AccountNumber2Str & " with (nolock)
Where [Pro Number],SCAC,[PRO CODE],[SALES ORDER NUMBER],CONVERT(varchar,[AMOUNT PAID]) in
(Select [pro number],SCAC,[PRO CODE],[SALES ORDER NUMBER],CONVERT(varchar,[AMOUNT PAID])
 from " & AccountNumberStr & " with (nolock) where [process week]= '" & txtDate.Text & "')
Group by [pro number],SCAC,[PRO CODE],[SALES ORDER NUMBER],CONVERT(varchar,[AMOUNT PAID])

Do you think there is a better way to design the query with performance in mind?

Thanks,
jimbo99999
2 Solutions
 
SStory commented:
My first question is why are there duplicate records for sales? Is it just that they have the same ID numbers, and something should be changed to make those universally unique?

If not, my next question is: do you have the Pro Number indexed? Also, any function calls probably increase the time -- not sure how much; it depends on the number of records.
 
Jimbo99999 (Author) commented:
Hello, thanks for replying.

We receive this shipment information from the actual Company, so we are handling it as a 3rd party. As a result, we are checking to make sure the shipment information is not duplicated by accident.

I wonder if some kind of Join might be better?

Thanks,
jimbo99999
 
SStory commented:
I'm still a little confused. Do you compare IDs in a table in one SQL database with a table in another SQL database? Nothing suggests that from the query shown. Or are you just checking in one database for multiple usages of a single ID and then repeating the process in each database? Or is it actually multiple tables?
 
Jimbo99999 (Author) commented:
I check across databases. I just substitute different table/database names into each part of the query.

Select [Pro Number],SCAC,[PRO CODE],[SALES ORDER NUMBER],CONVERT(varchar,[AMOUNT PAID])
From " & AccountNumber2Str & " with (nolock)
Where [Pro Number],SCAC,[PRO CODE],[SALES ORDER NUMBER],CONVERT(varchar,[AMOUNT PAID]) in
(Select [pro number],SCAC,[PRO CODE],[SALES ORDER NUMBER],CONVERT(varchar,[AMOUNT PAID])
 from " & AccountNumberStr & " with (nolock) where [process week]= '" & txtDate.Text & "')
Group by [pro number],SCAC,[PRO CODE],[SALES ORDER NUMBER],CONVERT(varchar,[AMOUNT PAID])
 
SStory commented:
So none of this "[Pro Number],SCAC,[PRO CODE],[SALES ORDER NUMBER]" is a uniquely identifying value?

My thought is that it is not able to use an index effectively because of all the WHERE conditions that must be met.
I would try joining the tables on a single ID number that uniquely represents the order, make sure that field is indexed on both ends, and then add the other comparisons in the WHERE clause to see if that makes it any faster.
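To make that idea concrete, here is a rough sketch, assuming a hypothetical [Order ID] column that uniquely identifies an order and is indexed in both tables; Account1Table and Account2Table stand in for the tables named by AccountNumberStr and AccountNumber2Str:

-- Hypothetical sketch: [Order ID] is an assumed unique, indexed key, and
-- Account1Table/Account2Table stand in for the concatenated table names.
SELECT A2.[Pro Number], A2.SCAC, A2.[PRO CODE], A2.[SALES ORDER NUMBER], A2.[AMOUNT PAID]
FROM Account2Table AS A2 WITH (NOLOCK)
INNER JOIN Account1Table AS A1 WITH (NOLOCK)
    ON A1.[Order ID] = A2.[Order ID]
WHERE A1.[process week] = '5/8'            -- the week being checked
  AND A2.SCAC = A1.SCAC                    -- remaining comparisons moved to the WHERE clause
  AND A2.[AMOUNT PAID] = A1.[AMOUNT PAID];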

MySQL has an EXPLAIN statement you can add in front of the SELECT to see how it is getting values, whether by brute force (the long way) or via an index, etc. I don't think SQL Server has that feature.

I read online where someone says to:
"Execute the command SET SHOWPLAN_TEXT ON and then execute the query. This will display the query execution plan in text format." So you might check SET SHOWPLAN_TEXT ON and see how it is gathering your values.

Indexing where it makes sense (adding indexes as needed) will improve most situations the most. If you don't have to call a function such as CONVERT() each time, that might improve performance too.
 
Vitor Montalvão (MSSQL Senior Engineer) commented:
Yes, there's a better way. Avoid using functions such as CONVERT, since they are performance killers, especially when used on indexed fields, and try to filter with a join so you reduce the number of returned records instead of bringing them all back. So, this is my suggestion:
Select A1.[Pro Number],A1.SCAC,A1.[PRO CODE],A1.[SALES ORDER NUMBER],CONVERT(varchar,A1.[AMOUNT PAID])
From " & AccountNumber2Str & " A1 with (nolock)
Where A1.[process week]= '" & txtDate.Text & "'
	And Exists
		 (Select 1
		 from " & AccountNumberStr & " A2 with (nolock) 
		 where A2.[Pro Number]=A1.[Pro Number]
			And A2.SCAC=A1.SCAC
			And A2.[PRO CODE]=A1.[PRO CODE]
			And A2.[SALES ORDER NUMBER]=A1.[SALES ORDER NUMBER]
			And A2.[AMOUNT PAID]=A1.[AMOUNT PAID]
			And A2.[process week]=A1.[process week])

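As a side note on the VB.Net side, here is a hedged sketch of running this suggestion with the process week passed as a parameter rather than concatenated from txtDate.Text. The connection string variable connStr and the surrounding code are assumptions; the table names still come from the existing AccountNumberStr / AccountNumber2Str variables:

' Assumes: Imports System.Data.SqlClient
' Hedged sketch only: connStr is an assumed connection string variable.
Dim sql As String = _
    "Select A1.[Pro Number], A1.SCAC, A1.[PRO CODE], A1.[SALES ORDER NUMBER], A1.[AMOUNT PAID] " & _
    "From " & AccountNumber2Str & " A1 with (nolock) " & _
    "Where A1.[process week] = @processWeek " & _
    "And Exists (Select 1 From " & AccountNumberStr & " A2 with (nolock) " & _
    "            Where A2.[Pro Number] = A1.[Pro Number] And A2.SCAC = A1.SCAC " & _
    "              And A2.[PRO CODE] = A1.[PRO CODE] " & _
    "              And A2.[SALES ORDER NUMBER] = A1.[SALES ORDER NUMBER] " & _
    "              And A2.[AMOUNT PAID] = A1.[AMOUNT PAID] " & _
    "              And A2.[process week] = A1.[process week])"

Using conn As New SqlConnection(connStr)
    Using cmd As New SqlCommand(sql, conn)
        ' Passing the date as a parameter avoids quoting problems and SQL injection.
        cmd.Parameters.AddWithValue("@processWeek", txtDate.Text)
        conn.Open()
        Using reader As SqlDataReader = cmd.ExecuteReader()
            While reader.Read()
                ' Handle each potential duplicate row here.
            End While
        End Using
    End Using
End Using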

 
Jimbo99999 (Author) commented:
Thanks for responding.  I am looking at it now.
 
Jimbo99999 (Author) commented:
Excellent recommendations! It is far more efficient and working great.

Thanks,
jimbo99999
 
Vitor Montalvão (MSSQL Senior Engineer) commented:
How much of a gain did you get? A few seconds? Many seconds? Minutes?
 
Jimbo99999 (Author) commented:
I went from timing out to about a minute for the one account whose table has 570,000 plus records!
 
SStory commented:
Yes, a key to good SQL speed is to retrieve just what you want and to index wherever it is possible and useful. That generally means the join fields and those in WHERE clauses.
 
Vitor Montalvão (MSSQL Senior Engineer) commented:
Yes, at least [process week] should be indexed.
And add this composite index:
[Pro Number], SCAC, [PRO CODE], [SALES ORDER NUMBER], [AMOUNT PAID], [process week]
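As a rough illustration of that suggestion, the index definitions might look like the following; AccountTable and the index names are placeholders, and the equivalent would be applied to each account's table:

-- Placeholder names; apply the equivalent to each account's table.
CREATE INDEX IX_AccountTable_ProcessWeek
    ON AccountTable ([process week]);

CREATE INDEX IX_AccountTable_DuplicateKey
    ON AccountTable ([Pro Number], SCAC, [PRO CODE],
                     [SALES ORDER NUMBER], [AMOUNT PAID], [process week]);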