VB.Net - Designing Duplicate Search Across Multiple Databases

Good Day Experts:

In my VB.Net project I am having a speed/performance issue with my Duplicate record search.  I need to find a better way to do it and I hope you can help.  

I have 5 SQL databases each representing an Account.  All 5 of them belong to the same Company.  Accounts are processed 1 time a week on Friday.  I need to take ID's from Account 1 and look those ID's in all of the data in Account 2, Account 3, Account 4 and Account 5 to see if they have been used before for a given week.   For example, today is 5/8 so I to take the ID's for Account 1 for the week of 5/8 and search through Account 2, Account 3, Account 4 and Account 5 looking for re-used ID's.  Then  I have to take the ID's for Account 2 and search all of the other Accounts.  Then repeat for the remaining Accounts.

I have the looping constructs in place to handle all the Account iteration.  I also have my query in place to look for the duplicates. The problem is the performance is very slow on the query...then performing it multiple time is causing trouble.  

Here is what I have going in my qry:
Select [Pro Number],SCAC,[PRO CODE],[SALES ORDER NUMBER],CONVERT(varchar,[AMOUNT PAID]
From " & AccountNumber2Str & " with (nolock)
Where [Pro Number],SCAC,[PRO CODE],[SALES ORDER NUMBER],CONVERT(varchar,[AMOUNT PAID]) in
(Select [pro number],SCAC,[PRO CODE],[SALES ORDER NUMBER],CONVERT(varchar,[AMOUNT PAID])
 from " & AccountNumberStr & " with (nolock) where [process week]= '" & txtDate.Text & "')
Group by [pro number],SCAC,[PRO CODE],[SALES ORDER NUMBER],CONVERT(varchar,[AMOUNT PAID])

Do you think there is a better way to design the query with performance in mind?

Thanks,
jimbo99999
Jimbo99999Asked:
Who is Participating?

[Product update] Infrastructure Analysis Tool is now available with Business Accounts.Learn More

x
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

SStoryCommented:
My first question is why are their duplicate records for sales?  Is it just that they have the same ID numbers and something should be changed to make those universally unique?

If not, my next question is do you have the Pro Number indexed?  Also any function calls probably increase the time--not sure how much and depends on number of records.
0
Jimbo99999Author Commented:
Hello Thanks for replying.  

We receive this shipment information from the actual Company so we are doing this 3rd party.    As a result, we are checking to make sure the shipment information is no duplicated on "accident" in that shipment information.  

I wonder if some kind of Join might be better?

Thanks,
jimbo99999
0
SStoryCommented:
I'm still a little confused. Do you compare ID's in a table in one SQL Database with a table in another SQL Database because nothing suggests this from the query shown. Or are you just checking in one database for multiple usages of a single ID and then repeating the process in each database? Or is it actually multiple tables?
0
Big Business Goals? Which KPIs Will Help You

The most successful MSPs rely on metrics – known as key performance indicators (KPIs) – for making informed decisions that help their businesses thrive, rather than just survive. This eBook provides an overview of the most important KPIs used by top MSPs.

Jimbo99999Author Commented:
I check across databases. I just have different table variables in each part of the query.

Select [Pro Number],SCAC,[PRO CODE],[SALES ORDER NUMBER],CONVERT(varchar,[AMOUNT PAID]
From " & AccountNumber2Str & " with (nolock)
Where [Pro Number],SCAC,[PRO CODE],[SALES ORDER NUMBER],CONVERT(varchar,[AMOUNT PAID]) in
(Select [pro number],SCAC,[PRO CODE],[SALES ORDER NUMBER],CONVERT(varchar,[AMOUNT PAID])
 from " & AccountNumberStr & " with (nolock) where [process week]= '" & txtDate.Text & "')
Group by [pro number],SCAC,[PRO CODE],[SALES ORDER NUMBER],CONVERT(varchar,[AMOUNT PAID])
0
SStoryCommented:
So none of this  "[Pro Number],SCAC,[PRO CODE],[SALES ORDER NUMBER]" is a uniquely identifying value?

My thought is that it is not able to utilize quicker indexing because of all the where values that must be met.
I would try joining the tables on a single ID number that uniquely represents the order, make sure that field has indexing set up for it on both ends and then add the other parts in the where clause and see if that makes it any faster.

MySQL has an EXPLAIN statement you can add in from of the SELECT to see how it is getting values, whether the brute force (long way) or via index, etc. I don't think SQL Server has that feature.

I read online where some says to:
" Execute thecommand SET SHOWPLAN_TEXT ON and then execute the query. This will display the  query execution plan in text format." So you might check on the SET SHOWPLAN_TEXT ON and see how it is gathering your values.  

Via indexing where it makes sense (by adding as needed) will improve many situations the most. If you don't have to CONVERT() call a function each time, that might improve performance too.
0
Vitor MontalvãoMSSQL Senior EngineerCommented:
Yes, there's a better way. Avoid using function as CONVERT since they are performance killer specially when used on indexed fields and try to join fields to reduce the number of returned records instead of bringing them all. So, this is my suggestion:
Select A1.[Pro Number],A1.SCAC,A1.[PRO CODE],A1.[SALES ORDER NUMBER],A1.CONVERT(varchar,[AMOUNT PAID]
From " & AccountNumber2Str & " A1 with (nolock)
Where A1.[process week]= '" & txtDate.Text & "'
	And Exists
		 (Select 1
		 from " & AccountNumberStr & " A2 with (nolock) 
		 where A2.[Pro Number]=A1.[Pro Number]
			And A2.SCAC=A1.SCAC
			And A2.[PRO CODE]=A1.[PRO CODE]
			And A2.[SALES ORDER NUMBER]=A1.[SALES ORDER NUMBER]
			And A2.[AMOUNT PAID]=A1.[AMOUNT PAID]
			And A2.[process week]=A1.[process week])

Open in new window

0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
Jimbo99999Author Commented:
Thanks for responding.  I am looking at it now.
0
Jimbo99999Author Commented:
Excellent recommendations! It is far more efficient and working great.

Thanks,
jimbo99999
0
Vitor MontalvãoMSSQL Senior EngineerCommented:
How much gain you had? Few seconds? Large seconds? Minutes?
0
Jimbo99999Author Commented:
I went from timing out to about a minute for the one account whose table has 570,000 plus records!
0
SStoryCommented:
Yes, a key of good SQL speed is to get just what you want and to index any where possible and useful. This is generally on the join fields and those in where clauses.
0
Vitor MontalvãoMSSQL Senior EngineerCommented:
Yes, at least [process week] should be indexed.
And this composed index:
[Pro Number],SCAC,[PRO CODE],[SALES ORDER NUMBER],[AMOUNT PAID],[process week]
0
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Visual Basic.NET

From novice to tech pro — start learning today.