Solved

LINQ List searching performance

Posted on 2011-09-11
20
Medium Priority
?
273 Views
Last Modified: 2012-05-12
Hi Experts,
    I recently wrote a LINQ query to search a list of items. Each item in my list contains 50 fields. I have two lists with identical structure and fields: List A contains 100 records and List B contains 10,000 records. What I need to do is take each item in List A and search List B for its match.
   That's simple enough and I managed to do it. However, if I increase List B to 100,000 records and use the same List A, the search gets slower. Is there a more efficient way of doing this than what I did?

My LINQ query:
       For Each pmTransaction In ListA
                Dim tran As New Trans
                tran = pmTransaction
                Dim tranExist = (From tr As Trans In ListB Where _
                                 tr.internref = tran.internref And tr.ledgerno = tran.ledgerno And _
                                 tr.externref = tran.externref And _
                                 tr.descriptn = tran.descriptn And _
                                 tr.trantype = tran.trantype And tr.accountid = tran.accountid).FirstOrDefault


0
Comment
Question by:miketonny
  • 9
  • 8
  • 3
20 Comments
 
LVL 17

Expert Comment

by:nepaluz
ID: 36520342
Your code looks incomplete; however, a couple of questions.
1. When you say the speed will be slower, do you mean that the speed you have measured IS slower, or that it MAY be slower?
2. What is Trans, and how do you define it in your code?
3. Have you tried a database? (I say this because, by design, databases are meant to hold huge numbers of records and are optimised for speed.)
0
 
LVL 2

Author Comment

by:miketonny
ID: 36520348
Hi nepaluz,
   It is slower. Running 1 week's records took me 5 minutes, but when I tried 1 month's records it took nearly 2 hours. The records are spread evenly across every week, so I'm not sure why 1 month's records take longer than 20 minutes.
  "Trans" is a class for transactions. It contains 50 fields, which I read from the database and store in these two lists. Does that answer your question?
0
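One possible reason for the jump from 5 minutes to nearly 2 hours, offered as an assumption rather than a diagnosis: if both lists grow when the date range grows, a nested search does roughly (size of A) x (size of B) comparisons, so the work scales with the square of the growth factor, not linearly. The sketch below only works through that arithmetic; the fourfold growth factor is assumed, not taken from the post.

```vbnet
Imports System

Module ScalingEstimate
    Sub Main()
        Dim weekMinutes As Double = 5   ' measured time for one week (from the post)
        Dim growth As Double = 4        ' assumed: one month holds about 4x a week's records

        ' If both lists grow by the same factor, the number of
        ' comparisons grows by that factor squared.
        Dim estimate As Double = weekMinutes * growth * growth

        Console.WriteLine("Estimated minutes for one month: " & estimate)
    End Sub
End Module
```

Under those assumptions the estimate is 5 x 4 x 4 = 80 minutes, which is much closer to the observed 2 hours than to 20 minutes.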
 
LVL 2

Author Comment

by:miketonny
ID: 36520369
To add some information: when I put the program on the server and test it, it consumes 100% of 1 CPU core (on a 4-core machine) while it's doing the search. So I guess if I increase the CPU speed it'll be faster? How about multicore?
0
 
LVL 17

Expert Comment

by:nepaluz
ID: 36520372
I think it would be better to run your queries on the database rather than on lists. However, if you choose to pursue this avenue, is that the complete loop (i.e. should I assume the next line is Next)? Also, what version of .NET are you running this on?
0
 
LVL 17

Expert Comment

by:nepaluz
ID: 36520376
You can utilise Parallel.ForEach to run this loop and improve both performance and CPU usage.
0
 
LVL 17

Expert Comment

by:nepaluz
ID: 36520391
Something like this may improve your performance (and use more of your cores)
Threading.Tasks.Parallel.ForEach(_otherCurrency, Sub(pmTransaction)
                                                     Dim tran As New Trans
                                                     tran = pmTransaction
                                                     Dim tranExist = (From tr As Trans In ListB Where _
                                                                      tr.internref = tran.internref And tr.ledgerno = tran.ledgerno And _
                                                                      tr.externref = tran.externref And _
                                                                      tr.descriptn = tran.descriptn And _
                                                                      tr.trantype = tran.trantype And tr.accountid = tran.accountid).FirstOrDefault
                                                  End Sub)


0
 
LVL 17

Expert Comment

by:nepaluz
ID: 36520401
Also, why do you not define ListA as a list of Trans, e.g.
Dim ListA As New List(Of Trans)


Also, I erred in the code above; it should be:
Threading.Tasks.Parallel.ForEach(ListA, Sub(pmTransaction)
                                                     Dim tran As New Trans
                                                     tran = pmTransaction
                                                     Dim tranExist = (From tr As Trans In ListB Where _
                                                                      tr.internref = tran.internref And tr.ledgerno = tran.ledgerno And _
                                                                      tr.externref = tran.externref And _
                                                                      tr.descriptn = tran.descriptn And _
                                                                      tr.trantype = tran.trantype And tr.accountid = tran.accountid).FirstOrDefault
                                                  End Sub)


0
 
LVL 2

Author Comment

by:miketonny
ID: 36520464
Sorry, my mistake. It's a complete For loop; I forgot to paste the Next.
I did declare my List A as a list of Trans.

I'm using VS 2008, which has .NET 3.5.
I was actually reading about PLINQ on the internet, but a lot of sources say it's only for .NET 4.0. Is that right?
0
 
LVL 2

Author Comment

by:miketonny
ID: 36520468
I asked my colleague how he would handle this. He said a sorted list could be faster than LINQ (he doesn't do LINQ), since a sorted list uses binary search. Could that be a way of dealing with this?
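A sorted list with binary search is one way to avoid scanning List B for every item. A related option, swapped in here because it needs no sorting and is available in .NET 3.5, is a hash-based lookup with Dictionary. The following is a minimal sketch, not the poster's code: the Trans class is a hypothetical stand-in, the six match fields are assumed to be strings, and MakeKey, FindMatches and the ChrW(31) separator are illustrative names and choices.

```vbnet
Imports System
Imports System.Collections.Generic
Imports Microsoft.VisualBasic

' Hypothetical stand-in for the Trans class in the question. Only the six
' fields used for matching are shown, and String is an assumed type.
Public Class Trans
    Public internref As String
    Public ledgerno As String
    Public externref As String
    Public descriptn As String
    Public trantype As String
    Public accountid As String
End Class

Public Module TransLookup
    ' Joins the six match fields into one key. ChrW(31) is assumed
    ' never to appear inside the field values themselves.
    Public Function MakeKey(ByVal t As Trans) As String
        Dim sep As String = ChrW(31)
        Return t.internref & sep & t.ledgerno & sep & t.externref & sep & _
               t.descriptn & sep & t.trantype & sep & t.accountid
    End Function

    ' Returns one entry per ListA item: the first ListB item with the
    ' same key, or Nothing when there is no match.
    Public Function FindMatches(ByVal listA As List(Of Trans), _
                                ByVal listB As List(Of Trans)) As List(Of Trans)
        ' One pass over ListB builds the index.
        Dim index As New Dictionary(Of String, Trans)
        For Each b As Trans In listB
            Dim key As String = MakeKey(b)
            If Not index.ContainsKey(key) Then index.Add(key, b)
        Next

        ' Each ListA item is then a single dictionary lookup.
        Dim results As New List(Of Trans)
        For Each a As Trans In listA
            Dim match As Trans = Nothing
            index.TryGetValue(MakeKey(a), match)
            results.Add(match)
        Next
        Return results
    End Function
End Module
```

The design choice is to pay for one pass over List B up front so that each List A item no longer triggers a full scan. Whether this beats a sorted list in practice would need measuring on the real data.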
0
 
LVL 17

Expert Comment

by:nepaluz
ID: 36522200
Not sure what you are trying to achieve now. A sorted list to accomplish a search? I must have missed something...
Anyhow, to continue with my suggestion: if you have ListA declared as a List(Of Trans), then you can reduce memory use by just doing:
Threading.Tasks.Parallel.ForEach(ListA, Sub(tran)
                                            Dim tranExist = (From tr As Trans In ListB Where _
                                                             tr.internref = tran.internref And tr.ledgerno = tran.ledgerno And _
                                                             tr.externref = tran.externref And _
                                                             tr.descriptn = tran.descriptn And _
                                                             tr.trantype = tran.trantype And tr.accountid = tran.accountid).FirstOrDefault
                                        End Sub)
GC.Collect()


Since you are dealing with hundreds of thousands of lines, re-declaring Dim tran As New Trans inside the loop results in a small but additional use of memory for each declaration, and with your hundreds of thousands of lines, it does add up!
I have also added a GC.Collect() at the end of the routine.
0
 
LVL 2

Author Comment

by:miketonny
ID: 36525835
Umm, I couldn't find Tasks under Threading. Does that mean I don't have it in my .NET 3.5?
But yes, thanks for pointing out that it's better to declare that variable outside the loop; I didn't realize that.
0
 
LVL 17

Expert Comment

by:nepaluz
ID: 36526010
You are right about 3.5, but did you know that you can actually use .NET 4.0 in VS 2008? Just install the .NET 4.0 SDK (that's if you are not on a work machine!).
Best of luck with the rest then...
0
 
LVL 2

Author Comment

by:miketonny
ID: 36526301
In that case, will all the servers that are going to run the program need .NET 4.0?
I'll dig a little into this, but PLINQ does seem to be a good way.
0
 
LVL 83

Expert Comment

by:CodeCruiser
ID: 36526320
You did not answer the question (or I did not find the answer): "why lists?"

The obvious improvement that can be made (though it may not make a huge difference) is:



For Each tran As Trans In ListA
    ' Typed loop variable: no extra Trans is created and reassigned on each pass.
    Dim tranExist = (From tr As Trans In ListB Where _
                     tr.internref = tran.internref And tr.ledgerno = tran.ledgerno And _
                     tr.externref = tran.externref And _
                     tr.descriptn = tran.descriptn And _
                     tr.trantype = tran.trantype And tr.accountid = tran.accountid).FirstOrDefault
Next
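The loop above still scans List B once for every item in List A. Staying within LINQ on .NET 3.5, a Join on the six fields is one alternative, since LINQ to Objects builds a hashed lookup of the inner sequence instead of rescanning it. This is a sketch under assumptions: the Trans class is a hypothetical stand-in with string fields, and TransJoin and FindMatches are illustrative names. Note the behaviour differs from FirstOrDefault: a Join yields one result per matching pair, so duplicates in either list produce repeated entries.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

' Hypothetical stand-in for the Trans class in the question. Only the six
' fields used for matching are shown, and String is an assumed type.
Public Class Trans
    Public internref As String
    Public ledgerno As String
    Public externref As String
    Public descriptn As String
    Public trantype As String
    Public accountid As String
End Class

Public Module TransJoin
    ' Returns the ListB side of every (ListA item, ListB item) pair
    ' whose six match fields are all equal.
    Public Function FindMatches(ByVal listA As List(Of Trans), _
                                ByVal listB As List(Of Trans)) As List(Of Trans)
        Dim matches As IEnumerable(Of Trans) = _
            From a As Trans In listA _
            Join b As Trans In listB _
            On a.internref Equals b.internref And a.ledgerno Equals b.ledgerno And _
               a.externref Equals b.externref And a.descriptn Equals b.descriptn And _
               a.trantype Equals b.trantype And a.accountid Equals b.accountid _
            Select b
        Return matches.ToList()
    End Function
End Module
```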


0
 
LVL 17

Accepted Solution

by:
nepaluz earned 1200 total points
ID: 36526702
Actually, I think we were fixated on LINQ (and somehow it clouded our thinking here!). Try:
Dim tranExist = ListA.Intersect(ListB)


That should give you a list of all common occurrences in both lists. As the meerkat says, SIMPLESSSS!
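One caveat worth checking: without a comparer, Intersect uses the default equality for Trans, which for a class that does not override Equals and GetHashCode means reference equality. If the two lists were loaded separately, that would likely find nothing in common. A minimal sketch of passing a comparer follows; the Trans class is a hypothetical stand-in with string fields, and TransKeyComparer, KeysEqual, KeyHash, Combine and CommonItems are illustrative names, not part of the original code.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

' Hypothetical stand-in for the Trans class; String is an assumed field type.
Public Class Trans
    Public internref As String
    Public ledgerno As String
    Public externref As String
    Public descriptn As String
    Public trantype As String
    Public accountid As String
End Class

' Two Trans objects count as equal when all six match fields are equal.
Public Class TransKeyComparer
    Implements IEqualityComparer(Of Trans)

    Public Function KeysEqual(ByVal x As Trans, ByVal y As Trans) As Boolean _
        Implements IEqualityComparer(Of Trans).Equals
        If x Is y Then Return True
        If x Is Nothing OrElse y Is Nothing Then Return False
        Return x.internref = y.internref AndAlso x.ledgerno = y.ledgerno AndAlso _
               x.externref = y.externref AndAlso x.descriptn = y.descriptn AndAlso _
               x.trantype = y.trantype AndAlso x.accountid = y.accountid
    End Function

    Public Function KeyHash(ByVal t As Trans) As Integer _
        Implements IEqualityComparer(Of Trans).GetHashCode
        If t Is Nothing Then Return 0
        Dim h As Long = 17
        For Each s As String In New String() {t.internref, t.ledgerno, t.externref, _
                                              t.descriptn, t.trantype, t.accountid}
            h = Combine(h, s)
        Next
        Return CInt(h)
    End Function

    ' Mixes one field into the running hash. The mask keeps the value
    ' within Integer range so checked arithmetic cannot overflow.
    Private Shared Function Combine(ByVal h As Long, ByVal s As String) As Long
        Dim code As Long = 0
        ' VB's = treats Nothing and "" as equal, so both must hash alike.
        If Not String.IsNullOrEmpty(s) Then code = s.GetHashCode()
        Return (h * 31 + code) And &H7FFFFFFFL
    End Function
End Class

Public Module TransIntersect
    ' Distinct ListA items that have a field-wise equal item in ListB.
    Public Function CommonItems(ByVal listA As List(Of Trans), _
                                ByVal listB As List(Of Trans)) As List(Of Trans)
        Return listA.Intersect(listB, New TransKeyComparer()).ToList()
    End Function
End Module
```

Intersect returns items from the first list, and only distinct ones, so if the matching List B objects themselves are needed, a lookup or join may fit better.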
0
 
LVL 2

Author Comment

by:miketonny
ID: 36526922
@CodeCruiser, I'm used to using lists for this kind of thing. Is there a more efficient way of doing it? Something like querying the database to do the same?

@nepaluz, that does look simple enough! I'll test a little to see how it goes.
0
 
LVL 83

Expert Comment

by:CodeCruiser
ID: 36532018
Yes this can be done on the DB. Are the lists being populated from DB?
0
 
LVL 2

Author Comment

by:miketonny
ID: 36532324
Yes, they're all from the same table in a FoxPro database.
So in VB, should I just write a long query to do the same thing when I bring these in?
Would this hold up the database for too long?
0
 
LVL 83

Assisted Solution

by:CodeCruiser
CodeCruiser earned 800 total points
ID: 36532345
Oh, FoxPro. If FoxPro supports T-SQL like any other DB, then this should be straightforward. Otherwise, you can fill a DataTable and use the RowFilter to do this.
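A rough sketch of the DataTable route, assuming the table has already been filled and that its columns carry the same names as the Trans fields (both are assumptions). Only two of the six conditions are shown to keep it short; RowFilterDemo, Escape and CountMatches are hypothetical names.

```vbnet
Imports System
Imports System.Data

Public Module RowFilterDemo
    ' Doubles single quotes so a value can sit inside a filter string literal.
    Public Function Escape(ByVal value As String) As String
        Return value.Replace("'", "''")
    End Function

    ' Counts the rows whose internref and accountid columns equal the given
    ' values. Column names and string column types are assumed.
    Public Function CountMatches(ByVal table As DataTable, _
                                 ByVal internref As String, _
                                 ByVal accountid As String) As Integer
        Dim view As New DataView(table)
        view.RowFilter = String.Format("internref = '{0}' AND accountid = '{1}'", _
                                       Escape(internref), Escape(accountid))
        Return view.Count
    End Function
End Module
```

Whether filtering this way is any faster than searching the lists in memory would need to be measured; it is shown only to illustrate the suggestion.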
0
 
LVL 2

Author Comment

by:miketonny
ID: 36719786
Thank you both for the help on this problem. I learnt something new from this :)
0
