Solved

Array vs. List, which is a better option?

Posted on 2012-03-15
18
456 Views
Last Modified: 2012-04-01
Hi, I'm using C#.NET 2010. I have a project, where I'm going to have to use either an Array or a List to store values.

My question is, which is better and why? I understand that a List offers the advantage of not having to pre-declare the size, where a Non-Mutable Array has a fixed size. Barring that, why would one option be better than the other?

Is one more efficient than the other?

Thanks,
Fulano
0
Comment
Question by:Mr_Fulano
  • 6
  • 5
  • 4
  • +2
18 Comments
 
LVL 96

Assisted Solution

by:Bob Learned
Bob Learned earned 100 total points
ID: 37727267
Generics lists let you add and remove elements easily, and you get type safety.  You can still reference the item by index.  You can sort a list easily.  There are LINQ extensions to the List<T> type, that don't exist for arrays.  The List<T> class has a Contains, where you can see if the list contains an object.
0
 
LVL 75

Assisted Solution

by:käµfm³d 👽
käµfm³d   👽 earned 200 total points
ID: 37727479
and you get type safety.
You can get type-safety with an array as well, provided you used the bracket syntax and not the actual Array class.

The List<T> class has a Contains, where you can see if the list contains an object.
You can do this with arrays as well if you are using .NET 3.5+ via the Contains extension method.

The algorithm for growing the internal storage (which is an array itself, IIRC) of a List is that it doubles its capacity whenever it reaches capacity. You could implement your own algorithm if you wanted greater efficiency. Of course you can also specify the initial capacity of a List if you know the count/maximum ahead of time. As TheLearnedOne mentions, Lists are a bit more convenient programming-wise, but arrays have a smaller footprint.
0
 
LVL 96

Expert Comment

by:Bob Learned
ID: 37727536
I love a little course correction *BIG GRIN*.
0
DevOps Toolchain Recommendations

Read this Gartner Research Note and discover how your IT organization can automate and optimize DevOps processes using a toolchain architecture.

 

Author Comment

by:Mr_Fulano
ID: 37727718
Hi Kaufmed, I'm a little confused about what you're saying...when you say:

"The algorithm for growing the internal storage (which is an array itself, IIRC) of a List is that it doubles its capacity whenever it reaches capacity."

What "capacity" are you referring to? In a List one doesn't specify the size, so how would it know it has reached capacity?

I'm also assuming this is what you're referring to when you speak about Array's having a smaller footprint.

Fulano
0
 
LVL 75

Expert Comment

by:käµfm³d 👽
ID: 37727758
What "capacity" are you referring to?
A List has a Count--the number of items it is actually holding--and it has a capacity--the maximum number of items it can hold. It's like have a 5 gallon jug that's only filled with 3 gallons of water. The "Count" of the jug is 3; the "capacity" of the jug is 5. You don't see the capacity of the List because the internal storage is not accessible by your code.

In a List one doesn't specify the size, so how would it know it has reached capacity?
The List manages capacity itself so you don't have to. However, you can manually adjust the capacity if you prefer. In an array, you have to manage the capacity. If you fill the array, but you want to append more items, you have to allocate a new, larger array and copy the items from the existing array to the new array. A List does this internally, and it does so during the call to Add. If the List is at capacity, then internal to Add, the capacity is doubled by means of a new array, and the existing items are copied from the old array to the new array. You wouldn't be cognizant of when this was happening. [OK, you could probably calculate it, but if you were going to go to that much trouble, why not just use an array  ; )   ]

I'm also assuming this is what you're referring to when you speak about Array's having a smaller footprint.
Well since a List has an array internally, it automatically has at least the same amount of "stuff". But since you have additional ways (methods) to interact with the list, and additional properties and such, there is more stuff associated with a List than there is with an array.
0
 
LVL 75

Expert Comment

by:käµfm³d 👽
ID: 37727761
P.S.

I'm not sure I am permitted to post decompiled Framework source code, but you can easily inspect the code for a List (and the Array class) using something like dotPeek or JustDecompile.
0
 
LVL 75

Expert Comment

by:käµfm³d 👽
ID: 37727776
P.P.S.

I just realized I contradicted myself. You can see the capacity of the list via the Capacity property. As I mentioned, the List will adjust its capacity as needed whenever you add items and its internal storage is at capacity. It will do this within the Add method. AFAIK, the List will never shrink of its own accord. You would shrink the capacity of the List by assigning a smaller value to the Capacity property, and you could only do so if the Count of items was less than the new Capacity value (otherwise you'll receive an exception).
0
 

Author Comment

by:Mr_Fulano
ID: 37728011
Kaufmed, very interesting explanation. Thank you! - So, in the end, my take away is that its really a matter of preference. I can either use an Array or a List, depending on which I like to use more. The performance difference is somewhat negligible.

I guess the only question that still haunts me is why would anyone want to use and Array if a List is so much more "user-friendly."

Thanks,
Fulano
0
 
LVL 40

Accepted Solution

by:
Jacques Bourgeois (James Burger) earned 200 total points
ID: 37728577
The main difference between both is a balance between performance and possibilities.

Because of their structure in memory, arrays are a lot faster. Arrays are arranged sequentially, and the system know the lenght of each item. An integer is 4 bytes, a String or an object built on a class is a pointer (32-bits or 64-bits depending on the operating system). Knowing the memory address at which the array begins, the size of each element, that it is sequentially stored in memory, if you want to get say the 123th element of the array, the system only has to do a little multiplication to know at which address it needs to retrieve the element.

A List however is stored as a linked list (binary tree), a series of elements that point one to the other. If you need to retrieve an element, the system has to navigate the whole list up to the 123th element. This takes longer.

However, if you have a lot of manipulations to do, collections are a better choice. If you wanted to insert an element in the middle of an array, you would need to add one at the end, and then move everthing up one place starting at the point of insertion in order to free a space for the new element. Same problem in reverse if you want to remove an element.

In A List, you simply change to pointer of the element previous to the new one so that it point to the new one, and set the new one to point at the following one. Or reset 1 pointer when you remove an element. That is a lot faster.

So the choice between both is primarily a question of knowing what type of manipulations you intend to do. Data in which you want to insert and remove elements regularly, is better served by a collection. Data that is fixed and/or on which you need to do a lot of manipulations, such as matrix on which you need to make mathematical operations, are usually better served by an array.

This being said, collections have another advantage. Arrays always work the same way, because there is only one type of array. List is but one collection. There are a lot of different collections (queue, stack, hashset, sorted just to name a few). Each of them as a unique behaviour. Would would need to program that behaviour if you were using an array. With collections, you simply choose the collection that suits you the best.

Some collections also enables you to use something else than a number to reference an element. That can come very useful in some situations. Think of databases for instance. Calling Columns(2) is dangerous because the order of the fields in the table or the stored procedure could change during maintenance. Calling Column("City") is a lot safer because the name of the fields never change once a database is in production. This is something you could not do with an array.
0
 
LVL 96

Expert Comment

by:Bob Learned
ID: 37728765
My turn *GRIN*

A List however is stored as a linked list (binary tree), a series of elements that point one to the other. If you need to retrieve an element, the system has to navigate the whole list up to the 123th element. This takes longer.

You can post framework code, as it is not licensed code, and freely available to anyone with a decompiler.

List<T> stores elements in an array of T.  Here is the field definition from the List<T> class (using JustDecompile):

 private T[] _items;
0
 
LVL 96

Expert Comment

by:Bob Learned
ID: 37728972
Let's just say that the List<T> class is a wrapper around an array.  That is pretty cool, since it handles the memory allocation, searching, etc.  

There is a difference between length and capacity.  Length is the number of items stored, and capacity is the number of available "slots" for items storage.  If you know ahead of time that you are going to be adding a lot of items, then you can have the list allocate a larger array capacity.

There are a few constructors for the List<T> class, and one takes an integer for the initial capacity.  Another takes an array of items.  Going between an array and List<T> instance is pretty easy:

List<T> constructor takes an array of T
List<T>.AddRange method takes an array of T
List<T>.ToArray method returns an array of T
0
 
LVL 4

Expert Comment

by:Srinivasulu Muppala
ID: 37729126
Advantages on List:
1. Item adding into the list<string> as per type.
2. While insertion of item between/or specific index dimension will extend and auto re-arrage
3.Delete item would same as point 2
4.we can add multiple items to list by array input(AddRange)
5.we could bind list as Datasource
6.we could iterate by for as well as for each
7.sorting is possible
8.we can use LINQ on list
9.List<CustomClass> possible
10.Strongly typed

On Array has limitations on above specified most of the features
0
 
LVL 75

Expert Comment

by:käµfm³d 👽
ID: 37729259
@srinipro

1. Same for array
2. Agree, provided by "re-arrange" you mean elements are copied to new storage, and not moved around (sorted) in some way--this must be invoked manually by calling the Sort method.
3. Disagree (see my previous comment)
4. Agree.
5. Array can be used in this manner as well.
6. Same for array.
7. Same for array--you just have to write the logic to do so (external function). (I suppose you meant that List already provides a mechanism for sorting which you don't have to write.)
8. Same for array.
9. CustomClass[] is possible.
10. Same for array (see #9).
0
 
LVL 75

Expert Comment

by:käµfm³d 👽
ID: 37729412
@Mr_Fulano,

I have to be honest:  I had to search for this. I've been using generics for so long, that I couldn't really think of any solid reason to use an array. (Perhaps for interoperability?) You can even find MS folks who prefer generics over "old-shool" arrays. Perhaps you could run into performance issues when using a generic, but I think you'd have to be doing some pretty intense computation for that. Even then, if you decide to work with an array you have to think:  MS put the code for generics through a good bit of testing (let's hope!) before they shipped the Framework. How much testing are you going to be able to do with an array implementation you write?

The only solid example I could think of for using arrays would be if you were going to write your own generic collection. As has been mentioned, a List uses an array for its internal storage. An array can also be used easily as a Heap (thank you Computer Science class!), so you could write your own Heap collection using an array for your internal storage. There's nothing to stop you from using one of the Framework generics for your internal storage, but you would be adding another layer (invisible to users of your creation) to the collection you are creating.
0
 
LVL 40
ID: 37730923
@TheLearnedOne

Array vs Linked List.

We are both right. I was a little short in my description.

According to sources at Microsoft, that explained a few years back durint a DevDays how most collections are implemented in .NET, the data is, as you say and as described in a few places in the documentation, saved in an array. This makes things easier for the garbage collector.

However, so as not to incur the penalty that inserts and removes involve, the thing is treaded internally as a linked list. When you access an element, it goes through a linked list that holds a pointer or the real index (I do not remember which).

In fact, it is the best of both worlds. You have the speed of an array on storage, but the advantages of a linked list as far as data manipulations are concerned. This is one of the reasons why collections need to provide GetEnumerator. If they were treated as arrays, the enumerator would not be needed.

The difference in performance can be felt however. I had to consult a couple of years ago with actuaries at the government who deal with statistics out of census data covering 100 years. That is a huge amount of data. They had some routines developped for them by a consultant that did everything in collections, although they do not manipulate the data, they only made calculations on it. The thing was taking hours. We moved the data into arrays and without changing anything else, we brought it down to about 10 minutes.
0
 
LVL 96

Expert Comment

by:Bob Learned
ID: 37731068
I love a little course correction (revisited) *BIG GRIN*.

Thanks, James!!
0
 

Author Comment

by:Mr_Fulano
ID: 37794586
Gentlemen, very sorry for the delay. I was away and could not reply. However...WOW, what a great discussion. You've ALL been extremely helpful and have all contributed with such good advice. -- This is one of those times that I wish the point awards were unrestricted or at least far more than 500 points, but I'll so my best to award them in a just manner.

THANK YOU all for such good advice!!!

Fulano
0
 

Author Closing Comment

by:Mr_Fulano
ID: 37794591
One of the best discussions I've had here on EE in years. EXCELLENT!!!
0

Featured Post

Master Your Team's Linux and Cloud Stack

Come see why top tech companies like Mailchimp and Media Temple use Linux Academy to build their employee training programs.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Many of us here at EE write code. Many of us write exceptional code; just as many of us write exception-prone code. As we all should know, exceptions are a mechanism for handling errors which are typically out of our control. From database errors, t…
This article aims to explain the working of CircularLogArchiver. This tool was designed to solve the buildup of log file in cases where systems do not support circular logging or where circular logging is not enabled
Established in 1997, Technology Architects has become one of the most reputable technology solutions companies in the country. TA have been providing businesses with cost effective state-of-the-art solutions and unparalleled service that is designed…
Although Jacob Bernoulli (1654-1705) has been credited as the creator of "Binomial Distribution Table", Gottfried Leibniz (1646-1716) did his dissertation on the subject in 1666; Leibniz you may recall is the co-inventor of "Calculus" and beat Isaac…

809 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question