LINQ List searching performance

Hi Experts,
    I recently wrote a LINQ query to search a list of items. Each item in my list contains 50 fields. I have two lists whose setup and fields are identical: List A contains 100 records and List B contains 10,000 records. What I need to do is use each item in List A to search List B for its match.
    That's simple enough and I managed to do it. However, if I increase List B to 100,000 records and use the same List A, the job gets slower. Is there a more efficient way of doing this than what I did?

My linq query:
For Each pmTransaction In ListA
    Dim tran As New Trans
    tran = pmTransaction
    Dim tranExist = (From tr As Trans In ListB Where _
                     tr.internref = tran.internref And tr.ledgerno = tran.ledgerno And _
                     tr.externref = tran.externref And _
                     tr.descriptn = tran.descriptn And _
                     tr.trantype = tran.trantype And tr.accountid = tran.accountid).FirstOrDefault

miketonny Asked:
 
nepaluz Commented:
Actually, I think we were fixated on LINQ (and somehow it clouded our thinking here!). Try:
Dim tranExist = ListA.Intersect(ListB)

That should give you a list of all common occurrences in both lists. As the meerkat says, SIMPLESSSS!
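One caveat with Intersect: without a comparer it uses the default equality comparer, which for a class that does not override Equals and GetHashCode means reference equality, so two separately loaded Trans objects with identical fields would not count as a match. Below is a minimal sketch of passing a comparer, assuming the six match fields are Strings. The Trans stand-in, TransKeyComparer and CommonTransactions are hypothetical names, not from the thread.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

' Hypothetical stand-in for the thread's Trans class. Only the six fields
' used for matching are shown; their String type is an assumption.
Public Class Trans
    Public internref As String
    Public ledgerno As String
    Public externref As String
    Public descriptn As String
    Public trantype As String
    Public accountid As String
End Class

' Hypothetical comparer: two Trans objects are equal when all six fields match.
Public Class TransKeyComparer
    Implements IEqualityComparer(Of Trans)

    Public Overloads Function Equals(ByVal x As Trans, ByVal y As Trans) As Boolean _
        Implements IEqualityComparer(Of Trans).Equals
        If x Is y Then Return True
        If x Is Nothing OrElse y Is Nothing Then Return False
        Return String.Equals(x.internref, y.internref) AndAlso _
               String.Equals(x.ledgerno, y.ledgerno) AndAlso _
               String.Equals(x.externref, y.externref) AndAlso _
               String.Equals(x.descriptn, y.descriptn) AndAlso _
               String.Equals(x.trantype, y.trantype) AndAlso _
               String.Equals(x.accountid, y.accountid)
    End Function

    Public Overloads Function GetHashCode(ByVal t As Trans) As Integer _
        Implements IEqualityComparer(Of Trans).GetHashCode
        If t Is Nothing Then Return 0
        ' Hash one joined key instead of adding hashes together, since integer
        ' addition can overflow under VB's default checked arithmetic.
        Return String.Join("|", New String() {t.internref, t.ledgerno, _
            t.externref, t.descriptn, t.trantype, t.accountid}).GetHashCode()
    End Function
End Class

Module IntersectSketch
    ' Returns the distinct items of listA that have a field-for-field match in listB.
    Function CommonTransactions(ByVal listA As IEnumerable(Of Trans), _
                                ByVal listB As IEnumerable(Of Trans)) As List(Of Trans)
        Return listA.Intersect(listB, New TransKeyComparer()).ToList()
    End Function
End Module
```

Note that Intersect returns the distinct matching items from the first list, not the objects held in ListB, so it is not a drop-in replacement if the ListB object itself is needed afterwards.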
 
nepaluz Commented:
Your code looks incomplete; however, a couple of questions:
1. When you say the job will be slower, do you mean that the speed you have measured IS slower, or that it MAY be slower?
2. What is Trans, and how do you define it in your code?
3. Have you tried a database? (I say this because, by design, databases are meant to hold huge numbers of records and are optimised for speed.)
 
miketonny Author Commented:
Hi nepaluz,
   It is slower. Running 1 week's records took 5 minutes, but 1 month's records took nearly 2 hours, even though the records are spread evenly across the weeks. I'm not sure why it takes longer than 20 minutes when I run 1 month's records.
  "Trans" is a class for transactions. It contains 50 fields, which I read from the database and store in these two lists. Does that answer your question?
 
miketonny Author Commented:
To add some information: when I put the program on a server and test it, the search consumes 100% of one CPU core (on a 4-core machine). So I guess a faster CPU would make it faster? How about multiple cores?
 
nepaluz Commented:
I think it would be better to run your queries on the database rather than load everything into lists. However, if you choose to pursue this avenue: is that the complete loop (i.e. should I assume the next line is Next)? Also, what version of .NET are you running this on?
 
nepaluz Commented:
You can use Parallel.ForEach to run this loop, which should improve performance and spread the work across more CPU cores.
 
nepaluz Commented:
Something like this may improve your performance (and use more of your cores):
Threading.Tasks.Parallel.ForEach(_otherCurrency, Sub(pmTransaction)
                                                     Dim tran As New Trans
                                                     tran = pmTransaction
                                                     Dim tranExist = (From tr As Trans In ListB Where _
                                                                      tr.internref = tran.internref And tr.ledgerno = tran.ledgerno And _
                                                                      tr.externref = tran.externref And _
                                                                      tr.descriptn = tran.descriptn And _
                                                                      tr.trantype = tran.trantype And tr.accountid = tran.accountid).FirstOrDefault
                                                  End Sub)

 
nepaluz Commented:
Also, why not define ListA as a List of Trans, e.g.
Dim ListA As New List(Of Trans)

Also, I erred in the code above; it should be:
Threading.Tasks.Parallel.ForEach(ListA, Sub(pmTransaction)
                                                     Dim tran As New Trans
                                                     tran = pmTransaction
                                                     Dim tranExist = (From tr As Trans In ListB Where _
                                                                      tr.internref = tran.internref And tr.ledgerno = tran.ledgerno And _
                                                                      tr.externref = tran.externref And _
                                                                      tr.descriptn = tran.descriptn And _
                                                                      tr.trantype = tran.trantype And tr.accountid = tran.accountid).FirstOrDefault
                                                  End Sub)

 
miketonny Author Commented:
Sorry, that was my mistake: it is a complete For loop, I just forgot to paste the Next.
I did declare List A as a List(Of Trans).

I'm using VS 2008, which has .NET 3.5.
I was reading about PLINQ on the internet, but a lot of sources say it's only for .NET 4.0. Is that right?
 
miketonny Author Commented:
I asked my colleague how he would handle this. He said a sorted list could be faster than LINQ (he doesn't use LINQ), since a sorted list can be searched with binary search. Could that be a way of dealing with this?
 
nepaluz Commented:
Not sure what you are trying to achieve now. A sorted list to accomplish a search? I must have missed something...
Anyhow, to continue with my suggestion: if you have ListA declared as a List(Of Trans), then you can save some memory by just doing:
Threading.Tasks.Parallel.ForEach(ListA, Sub(tran)
                                            Dim tranExist = (From tr As Trans In ListB Where _
                                                             tr.internref = tran.internref And tr.ledgerno = tran.ledgerno And _
                                                             tr.externref = tran.externref And _
                                                             tr.descriptn = tran.descriptn And _
                                                             tr.trantype = tran.trantype And tr.accountid = tran.accountid).FirstOrDefault
                                        End Sub)
GC.Collect()

Since you are dealing with hundreds of thousands of lines, re-declaring Dim tran As New Trans inside the loop results in a small but additional memory allocation on every iteration, and with your hundreds of thousands of lines, it does add up!
I have also added a GC.Collect() at the end of the routine.
 
miketonny Author Commented:
Hmm, I couldn't find Tasks under Threading. Does that mean I don't have it in .NET 3.5?
But thanks for pointing out that it's better to declare that variable outside the loop; I hadn't realized that.
 
nepaluz Commented:
You are right about 3.5, but did you know that you can actually use .NET 4.0 in VS 2008? Just install the .NET 4.0 SDK (that's if you are not on a work machine!)
Best of luck with the rest then...
 
miketonny Author Commented:
In that case, will all the servers that are going to run the program need .NET 4.0 as well?
I'll dig into this a little, but PLINQ does seem to be a good way to go.
 
CodeCruiser Commented:
You did not answer the question (or I did not find the answer): why lists?

The obvious improvement that can be made (though it may not make a huge difference) is:

For Each tran As Trans In ListA
    ' Iterating with a typed loop variable avoids creating a throwaway Trans per item.
    Dim tranExist = (From tr As Trans In ListB Where _
                     tr.internref = tran.internref And tr.ledgerno = tran.ledgerno And _
                     tr.externref = tran.externref And _
                     tr.descriptn = tran.descriptn And _
                     tr.trantype = tran.trantype And tr.accountid = tran.accountid).FirstOrDefault
Next

 
miketonny Author Commented:
@CodeCruiser, I'm used to using lists for this kind of thing. Is there a more efficient way of doing it, something like querying the database to do the same?

@nepaluz, that does look simple enough! I'll test a little to see how it goes.
 
CodeCruiser Commented:
Yes, this can be done on the DB. Are the lists being populated from a DB?
 
miketonny Author Commented:
Yes, they're all from the same table in a FoxPro database.
So in VB, should I just write one long query to do the same thing when I bring these in?
Would this hold up the database for too long?
 
CodeCruiser Commented:
Oh, FoxPro. If FoxPro supports SQL as other databases do, then this should be straightforward. Otherwise, you can fill a DataTable and use the RowFilter to do this.
 
miketonny Author Commented:
Thank you both for the help with this problem. I learnt something new from it :)
Question has a verified solution.