Choosing an appropriate data structure

Posted on 2004-08-16
Last Modified: 2010-04-17
Hi all,

I would like to know the criteria for choosing a particular abstract data type for e.g linked list, tree, hash table.

How can one determine which data structure is the best
for an application design ?
(design considerations)

Question by:arut

Expert Comment

Comment Utility
Well, there are a few questions you can ask yourself (this is by no means a complete list).

If accessing speed is the key (ie, time to find a specif element), then a tree would probably be best.  The downside to trees is they incurr a performance hit when adding elements, as you must navigate the tree.  You can also look into self-balancing trees, which are even faster at searching, but even slower at adding.

If you need to be able to loop through ALL your elements regularly, then a linked list is best.  You can change the size of the list on the fly, you can add elements in the middle of the list, and you can even make it backward-searchable.  The downside is that they are very inefficient at searching for individual elements, as you basically have to loop through ALL your elements until you find the right one.  

As for hashtables, they are mainly used when you have a large number of key-value pairs, like a dictionary.  They have an efficient means of storing your keys for reasonably fast searches.  These are the best when you need to retrieve specific elements which usually have a name associated with them.

Hope this helped!

LVL 45

Expert Comment

Comment Utility
Hi arut,

Primary considerations are

- language(s) that you are using and the capabilities they offer

- the application ... if you need a parser you probably want a LIFO structure ... if you want this parser to handle infinitely long strings, you might prefer to have a data structure that can be expanded dynamically, like a malloced array or linked list ...

If you need to insert randomly in the data, linked list may not be the best choice ... If you need range searches more frequently, linked list might be a good idea ... If you need both then you may consider indexing the data ... all depends on data and application

- future expansion and scalability ... you may find one data structure offering better scalability for an application as compared to another ... e.g. linked list of records in a commercial database offers very little scalability


Author Comment

Comment Utility
Lets assume I want to implement a multicast routing table which has an entry like :    3         4
.    6         7

and I want to quickly lookup the entries.

In such a scenario what should be my design choice
How to improve team productivity

Quip adds documents, spreadsheets, and tasklists to your Slack experience
- Elevate ideas to Quip docs
- Share Quip docs in Slack
- Get notified of changes to your docs
- Available on iOS/Android/Desktop/Web
- Online/Offline

LVL 45

Accepted Solution

sunnycoder earned 500 total points
Comment Utility
routing tables have different arrangement of data ... These are more or less variations of trees with some optimizations doe for space and/or search time ... Different data structures are used in different types of routers as they are expected to operate at different speed and handle different volumes of route information

download this thesis, it has some invaluable information which will help you decide your data structures and give you a good insight on the kind of optimizations and approaches used.

Expert Comment

Comment Utility
also keep in mind that dynamic allocation in non-managed memory is relativly expensive, as hardware developed, and memories are now much faster is is cheaper to simply scale by 2/halve a static structure (rewrite whole to new when needed, with a large mem margin) than alloc/free needed mem in a place/time of entry first/last usage

Expert Comment

Comment Utility
It depends very much on the data size. Sometimes the "slowest" technique can actually be the fastest because the data set is very small.

When I need extremely fast lookup speeds I usually choose between a sorted vector (a sorted array of consecutively stored elements) or a hash table.

A binary search is extremely fast and gets faster as the data grows. For a binary search, the search time is O(log2 N).

Here is a chart that shows the maximum number of comparisons needed for some array sizes:

Items: max comparisons needed

         128: 7 comparisons max
       1024: 10 comparisons max
16777216: 24 comparisons max

Binary search provides extremely fast lookup speed, but inserting an element in a binary search can be slow because the array must be kept in sorted order. An insertion near the top requires pushing all the subsequent elements forward. This can be minimized by using an array of pointers for the lookup array.

The other choice is a hash table. You can make your searches faster (as fast as O(1), instant) by using a larger hash table. You can save memory (and slow the searches) by using a smaller hash table.

Featured Post

How to run any project with ease

Manage projects of all sizes how you want. Great for personal to-do lists, project milestones, team priorities and launch plans.
- Combine task lists, docs, spreadsheets, and chat in one
- View and edit from mobile/offline
- Cut down on emails

Join & Write a Comment

Suggested Solutions

I know it’s not a new topic to discuss and it has lots of online contents already available over the net. But Then I thought it would be useful to this site’s visitors and can have online repository on vim most commonly used commands. This post h…
If you’re thinking to yourself “That description sounds a lot like two people doing the work that one could accomplish,” you’re not alone.
In this fourth video of the Xpdf series, we discuss and demonstrate the PDFinfo utility, which retrieves the contents of a PDF's Info Dictionary, as well as some other information, including the page count. We show how to isolate the page count in a…
In this fifth video of the Xpdf series, we discuss and demonstrate the PDFdetach utility, which is able to list and, more importantly, extract attachments that are embedded in PDF files. It does this via a command line interface, making it suitable …

771 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

10 Experts available now in Live!

Get 1:1 Help Now