Solved

Array vs. List, which is a better option?

Posted on 2012-03-15
18
455 Views
Last Modified: 2012-04-01
Hi, I'm using C#.NET 2010. I have a project, where I'm going to have to use either an Array or a List to store values.

My question is, which is better and why? I understand that a List offers the advantage of not having to pre-declare the size, where a Non-Mutable Array has a fixed size. Barring that, why would one option be better than the other?

Is one more efficient than the other?

Thanks,
Fulano
0
Comment
Question by:Mr_Fulano
  • 6
  • 5
  • 4
  • +2
18 Comments
 
LVL 96

Assisted Solution

by:Bob Learned
Bob Learned earned 100 total points
Comment Utility
Generics lists let you add and remove elements easily, and you get type safety.  You can still reference the item by index.  You can sort a list easily.  There are LINQ extensions to the List<T> type, that don't exist for arrays.  The List<T> class has a Contains, where you can see if the list contains an object.
0
 
LVL 74

Assisted Solution

by:käµfm³d 👽
käµfm³d   👽 earned 200 total points
Comment Utility
and you get type safety.
You can get type-safety with an array as well, provided you used the bracket syntax and not the actual Array class.

The List<T> class has a Contains, where you can see if the list contains an object.
You can do this with arrays as well if you are using .NET 3.5+ via the Contains extension method.

The algorithm for growing the internal storage (which is an array itself, IIRC) of a List is that it doubles its capacity whenever it reaches capacity. You could implement your own algorithm if you wanted greater efficiency. Of course you can also specify the initial capacity of a List if you know the count/maximum ahead of time. As TheLearnedOne mentions, Lists are a bit more convenient programming-wise, but arrays have a smaller footprint.
0
 
LVL 96

Expert Comment

by:Bob Learned
Comment Utility
I love a little course correction *BIG GRIN*.
0
 

Author Comment

by:Mr_Fulano
Comment Utility
Hi Kaufmed, I'm a little confused about what you're saying...when you say:

"The algorithm for growing the internal storage (which is an array itself, IIRC) of a List is that it doubles its capacity whenever it reaches capacity."

What "capacity" are you referring to? In a List one doesn't specify the size, so how would it know it has reached capacity?

I'm also assuming this is what you're referring to when you speak about Array's having a smaller footprint.

Fulano
0
 
LVL 74

Expert Comment

by:käµfm³d 👽
Comment Utility
What "capacity" are you referring to?
A List has a Count--the number of items it is actually holding--and it has a capacity--the maximum number of items it can hold. It's like have a 5 gallon jug that's only filled with 3 gallons of water. The "Count" of the jug is 3; the "capacity" of the jug is 5. You don't see the capacity of the List because the internal storage is not accessible by your code.

In a List one doesn't specify the size, so how would it know it has reached capacity?
The List manages capacity itself so you don't have to. However, you can manually adjust the capacity if you prefer. In an array, you have to manage the capacity. If you fill the array, but you want to append more items, you have to allocate a new, larger array and copy the items from the existing array to the new array. A List does this internally, and it does so during the call to Add. If the List is at capacity, then internal to Add, the capacity is doubled by means of a new array, and the existing items are copied from the old array to the new array. You wouldn't be cognizant of when this was happening. [OK, you could probably calculate it, but if you were going to go to that much trouble, why not just use an array  ; )   ]

I'm also assuming this is what you're referring to when you speak about Array's having a smaller footprint.
Well since a List has an array internally, it automatically has at least the same amount of "stuff". But since you have additional ways (methods) to interact with the list, and additional properties and such, there is more stuff associated with a List than there is with an array.
0
 
LVL 74

Expert Comment

by:käµfm³d 👽
Comment Utility
P.S.

I'm not sure I am permitted to post decompiled Framework source code, but you can easily inspect the code for a List (and the Array class) using something like dotPeek or JustDecompile.
0
 
LVL 74

Expert Comment

by:käµfm³d 👽
Comment Utility
P.P.S.

I just realized I contradicted myself. You can see the capacity of the list via the Capacity property. As I mentioned, the List will adjust its capacity as needed whenever you add items and its internal storage is at capacity. It will do this within the Add method. AFAIK, the List will never shrink of its own accord. You would shrink the capacity of the List by assigning a smaller value to the Capacity property, and you could only do so if the Count of items was less than the new Capacity value (otherwise you'll receive an exception).
0
 

Author Comment

by:Mr_Fulano
Comment Utility
Kaufmed, very interesting explanation. Thank you! - So, in the end, my take away is that its really a matter of preference. I can either use an Array or a List, depending on which I like to use more. The performance difference is somewhat negligible.

I guess the only question that still haunts me is why would anyone want to use and Array if a List is so much more "user-friendly."

Thanks,
Fulano
0
 
LVL 40

Accepted Solution

by:
Jacques Bourgeois (James Burger) earned 200 total points
Comment Utility
The main difference between both is a balance between performance and possibilities.

Because of their structure in memory, arrays are a lot faster. Arrays are arranged sequentially, and the system know the lenght of each item. An integer is 4 bytes, a String or an object built on a class is a pointer (32-bits or 64-bits depending on the operating system). Knowing the memory address at which the array begins, the size of each element, that it is sequentially stored in memory, if you want to get say the 123th element of the array, the system only has to do a little multiplication to know at which address it needs to retrieve the element.

A List however is stored as a linked list (binary tree), a series of elements that point one to the other. If you need to retrieve an element, the system has to navigate the whole list up to the 123th element. This takes longer.

However, if you have a lot of manipulations to do, collections are a better choice. If you wanted to insert an element in the middle of an array, you would need to add one at the end, and then move everthing up one place starting at the point of insertion in order to free a space for the new element. Same problem in reverse if you want to remove an element.

In A List, you simply change to pointer of the element previous to the new one so that it point to the new one, and set the new one to point at the following one. Or reset 1 pointer when you remove an element. That is a lot faster.

So the choice between both is primarily a question of knowing what type of manipulations you intend to do. Data in which you want to insert and remove elements regularly, is better served by a collection. Data that is fixed and/or on which you need to do a lot of manipulations, such as matrix on which you need to make mathematical operations, are usually better served by an array.

This being said, collections have another advantage. Arrays always work the same way, because there is only one type of array. List is but one collection. There are a lot of different collections (queue, stack, hashset, sorted just to name a few). Each of them as a unique behaviour. Would would need to program that behaviour if you were using an array. With collections, you simply choose the collection that suits you the best.

Some collections also enables you to use something else than a number to reference an element. That can come very useful in some situations. Think of databases for instance. Calling Columns(2) is dangerous because the order of the fields in the table or the stored procedure could change during maintenance. Calling Column("City") is a lot safer because the name of the fields never change once a database is in production. This is something you could not do with an array.
0
Threat Intelligence Starter Resources

Integrating threat intelligence can be challenging, and not all companies are ready. These resources can help you build awareness and prepare for defense.

 
LVL 96

Expert Comment

by:Bob Learned
Comment Utility
My turn *GRIN*

A List however is stored as a linked list (binary tree), a series of elements that point one to the other. If you need to retrieve an element, the system has to navigate the whole list up to the 123th element. This takes longer.

You can post framework code, as it is not licensed code, and freely available to anyone with a decompiler.

List<T> stores elements in an array of T.  Here is the field definition from the List<T> class (using JustDecompile):

 private T[] _items;
0
 
LVL 96

Expert Comment

by:Bob Learned
Comment Utility
Let's just say that the List<T> class is a wrapper around an array.  That is pretty cool, since it handles the memory allocation, searching, etc.  

There is a difference between length and capacity.  Length is the number of items stored, and capacity is the number of available "slots" for items storage.  If you know ahead of time that you are going to be adding a lot of items, then you can have the list allocate a larger array capacity.

There are a few constructors for the List<T> class, and one takes an integer for the initial capacity.  Another takes an array of items.  Going between an array and List<T> instance is pretty easy:

List<T> constructor takes an array of T
List<T>.AddRange method takes an array of T
List<T>.ToArray method returns an array of T
0
 
LVL 4

Expert Comment

by:Srinivasulu Muppala
Comment Utility
Advantages on List:
1. Item adding into the list<string> as per type.
2. While insertion of item between/or specific index dimension will extend and auto re-arrage
3.Delete item would same as point 2
4.we can add multiple items to list by array input(AddRange)
5.we could bind list as Datasource
6.we could iterate by for as well as for each
7.sorting is possible
8.we can use LINQ on list
9.List<CustomClass> possible
10.Strongly typed

On Array has limitations on above specified most of the features
0
 
LVL 74

Expert Comment

by:käµfm³d 👽
Comment Utility
@srinipro

1. Same for array
2. Agree, provided by "re-arrange" you mean elements are copied to new storage, and not moved around (sorted) in some way--this must be invoked manually by calling the Sort method.
3. Disagree (see my previous comment)
4. Agree.
5. Array can be used in this manner as well.
6. Same for array.
7. Same for array--you just have to write the logic to do so (external function). (I suppose you meant that List already provides a mechanism for sorting which you don't have to write.)
8. Same for array.
9. CustomClass[] is possible.
10. Same for array (see #9).
0
 
LVL 74

Expert Comment

by:käµfm³d 👽
Comment Utility
@Mr_Fulano,

I have to be honest:  I had to search for this. I've been using generics for so long, that I couldn't really think of any solid reason to use an array. (Perhaps for interoperability?) You can even find MS folks who prefer generics over "old-shool" arrays. Perhaps you could run into performance issues when using a generic, but I think you'd have to be doing some pretty intense computation for that. Even then, if you decide to work with an array you have to think:  MS put the code for generics through a good bit of testing (let's hope!) before they shipped the Framework. How much testing are you going to be able to do with an array implementation you write?

The only solid example I could think of for using arrays would be if you were going to write your own generic collection. As has been mentioned, a List uses an array for its internal storage. An array can also be used easily as a Heap (thank you Computer Science class!), so you could write your own Heap collection using an array for your internal storage. There's nothing to stop you from using one of the Framework generics for your internal storage, but you would be adding another layer (invisible to users of your creation) to the collection you are creating.
0
 
LVL 40

Expert Comment

by:Jacques Bourgeois (James Burger)
Comment Utility
@TheLearnedOne

Array vs Linked List.

We are both right. I was a little short in my description.

According to sources at Microsoft, that explained a few years back durint a DevDays how most collections are implemented in .NET, the data is, as you say and as described in a few places in the documentation, saved in an array. This makes things easier for the garbage collector.

However, so as not to incur the penalty that inserts and removes involve, the thing is treaded internally as a linked list. When you access an element, it goes through a linked list that holds a pointer or the real index (I do not remember which).

In fact, it is the best of both worlds. You have the speed of an array on storage, but the advantages of a linked list as far as data manipulations are concerned. This is one of the reasons why collections need to provide GetEnumerator. If they were treated as arrays, the enumerator would not be needed.

The difference in performance can be felt however. I had to consult a couple of years ago with actuaries at the government who deal with statistics out of census data covering 100 years. That is a huge amount of data. They had some routines developped for them by a consultant that did everything in collections, although they do not manipulate the data, they only made calculations on it. The thing was taking hours. We moved the data into arrays and without changing anything else, we brought it down to about 10 minutes.
0
 
LVL 96

Expert Comment

by:Bob Learned
Comment Utility
I love a little course correction (revisited) *BIG GRIN*.

Thanks, James!!
0
 

Author Comment

by:Mr_Fulano
Comment Utility
Gentlemen, very sorry for the delay. I was away and could not reply. However...WOW, what a great discussion. You've ALL been extremely helpful and have all contributed with such good advice. -- This is one of those times that I wish the point awards were unrestricted or at least far more than 500 points, but I'll so my best to award them in a just manner.

THANK YOU all for such good advice!!!

Fulano
0
 

Author Closing Comment

by:Mr_Fulano
Comment Utility
One of the best discussions I've had here on EE in years. EXCELLENT!!!
0

Featured Post

Why You Should Analyze Threat Actor TTPs

After years of analyzing threat actor behavior, it’s become clear that at any given time there are specific tactics, techniques, and procedures (TTPs) that are particularly prevalent. By analyzing and understanding these TTPs, you can dramatically enhance your security program.

Join & Write a Comment

This document covers how to connect to SQL Server and browse its contents.  It is meant for those new to Visual Studio and/or working with Microsoft SQL Server.  It is not a guide to building SQL Server database connections in your code.  This is mo…
Introduction Hi all and welcome to my first article on Experts Exchange. A while ago, someone asked me if i could do some tutorials on object oriented programming. I decided to do them on C#. Now you may ask me, why's that? Well, one of the re…
It is a freely distributed piece of software for such tasks as photo retouching, image composition and image authoring. It works on many operating systems, in many languages.
Illustrator's Shape Builder tool will let you combine shapes visually and interactively. This video shows the Mac version, but the tool works the same way in Windows. To follow along with this video, you can draw your own shapes or download the file…

728 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

10 Experts available now in Live!

Get 1:1 Help Now