How to design collection class(es) holding tabular data?

Posted on 2008-06-16
Last Modified: 2013-11-12
I realize this question is a bit large and general, but I hope to get some ideas that put me on the right track rather than a full solution. Links to articles on the subject of collection classes would also be welcome. Googling the subject just gives too many generic results.

In the past I have mainly been doing database design and related work. It's time to move on, so I recently started to learn C# and found an interesting project that I'm partly using as a learning exercise. It's about analyzing results from surveys in order to identify and price-tag bad management, which I think is a deserving cause. ;-) The methodology works well, but I realized that all analysis was done manually. I figured that an application could save lots of time, ease the process, and add some new features that would be too time-consuming to do manually. We will look into the data entry and database design later on, but right now I get the data from an HTML-formatted Excel sheet.

The data are in columns and the most important ones are:
Assessed, Assessor, Assessor_Role, Variable1, Variable2...
* Assessed = The person who was assessed.
* Assessor = The person who did the assessment. ("Anonymized", but with additional data like age, sex, salary etc.)
* Assessor_Role = The relation to the assessed person. Currently we have only three fixed roles: Boss, Co-worker, myself.
* VariableX = can be some 200 columns with response values. Values are always integers between 1 and 6.

I can read the data in, and I can do the analysis, graphing etc., but I have not yet figured out a good way to keep the data inside the program, and this is where I need some help. I suppose I need to create some collection classes that can hold the data for me and make it easy to access. I have hardly used collections so far, so this subject is quite new to me.

What is the best approach? HashTable, List, what else?

I have pasted a code snippet that shows some ideas on how I want to set/get the information and to use it for graphing and further calculations. This could be entirely wrong but is what I have come up with so far.

// Need to be able to add values...
if (!assessedPerson["John Doe"].Exist)
	assessedPerson.add("John Doe");
assessedPerson["John Doe"].addValue("BOSS", "VAR_1234", 3);
// Need to read the grand totals for each variable...
double totalMean = assessedPerson["John Doe"].variable["VAR_1234"].Mean;
double totalStDev = assessedPerson["John Doe"].variable["VAR_1234"].StDev;
double totalCount = assessedPerson["John Doe"].variable["VAR_1234"].Count;
// Need to read "subtotals" for each variable based on the assessor...
double bossMean = assessedPerson["John Doe"].assessor["BOSS"].variable["VAR_1234"].Mean;
double bossStDev = assessedPerson["John Doe"].assessor["BOSS"].variable["VAR_1234"].StDev;
double bossCount = assessedPerson["John Doe"].assessor["BOSS"].variable["VAR_1234"].Count;


Question by:Sharp2b

Accepted Solution

REA_ANDREW earned 500 total points
ID: 21792055
Object-oriented programming. :-) I would say it is the best and biggest topic for you if you have started to learn C#. So, to your problem.

We create an Object called DataAnalysisItem

The object DataAnalysisItem has four properties:
Assessed - string
Assessor - string
Assessor_Role - string
Variables - Dictionary<string,int>

See Code Snippet A for the simple proposed Object

You could then use the generic List type with DataAnalysisItem as its element type, i.e.

List<DataAnalysisItem> items = new List<DataAnalysisItem>();

Let's define an instance of DataAnalysisItem and add it to the list

        DataAnalysisItem item1 = new DataAnalysisItem();
        item1.Assessor = "Andy";
        item1.Assessed = "XX";
        item1.Assessor_Role = "ROLE";
        item1.Variables.Add("Variable1", 1);
        item1.Variables.Add("Variable2", 2);
        item1.Variables.Add("Variable3", 3);
        item1.Variables.Add("Variable4", 4);
        item1.Variables.Add("Variable5", 5);
        items.Add(item1); // store the row in the list

I hope this helps, and please post any questions you may have.


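--Code Snippet A--
using System.Collections.Generic;

// One row of survey data: who was assessed, by whom, in which role,
// and the response value (1-6) for each variable.
public class DataAnalysisItem
{
    public DataAnalysisItem()
    {
        // Start with an empty dictionary so Variables.Add() works immediately.
        variables = new Dictionary<string, int>();
    }

    private string assessed;
    public string Assessed
    {
        get { return assessed; }
        set { assessed = value; }
    }

    private string assessor;
    public string Assessor
    {
        get { return assessor; }
        set { assessor = value; }
    }

    private string assessor_Role;
    public string Assessor_Role
    {
        get { return assessor_Role; }
        set { assessor_Role = value; }
    }

    // Variable name -> response value.
    private Dictionary<string, int> variables;
    public Dictionary<string, int> Variables
    {
        get { return variables; }
        set { variables = value; }
    }
}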
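
As a rough sketch of how such a list could then answer the questions from the original post (grand totals per variable, and subtotals per assessor role), the responses can be filtered with LINQ and summarized. This assumes .NET 3.5 or later for System.Linq. The SurveyStats helper and its method names are hypothetical, not part of the answer above, and the class is repeated in compact form only so the example compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Compact stand-in for the DataAnalysisItem class from Code Snippet A.
public class DataAnalysisItem
{
    public string Assessed { get; set; }
    public string Assessor { get; set; }
    public string Assessor_Role { get; set; }
    public Dictionary<string, int> Variables { get; set; }

    public DataAnalysisItem()
    {
        Variables = new Dictionary<string, int>();
    }
}

// Hypothetical helper class (an assumption for illustration only).
public static class SurveyStats
{
    // Collects the responses for one assessed person and one variable,
    // optionally restricted to a single assessor role (null = all roles).
    public static List<int> Values(IEnumerable<DataAnalysisItem> items,
                                   string assessed, string variable, string role)
    {
        return items
            .Where(i => i.Assessed == assessed)
            .Where(i => role == null || i.Assessor_Role == role)
            .Where(i => i.Variables.ContainsKey(variable))
            .Select(i => i.Variables[variable])
            .ToList();
    }

    // Arithmetic mean; NaN when there are no values.
    public static double Mean(List<int> values)
    {
        return values.Count == 0 ? double.NaN : values.Average();
    }

    // Sample standard deviation (divides by n - 1); NaN for fewer than two values.
    public static double StDev(List<int> values)
    {
        if (values.Count < 2) return double.NaN;
        double mean = values.Average();
        double sumSq = values.Sum(v => (v - mean) * (v - mean));
        return Math.Sqrt(sumSq / (values.Count - 1));
    }

    // Convenience factory for a row holding a single variable.
    public static DataAnalysisItem Row(string assessed, string role,
                                       string variable, int value)
    {
        DataAnalysisItem item = new DataAnalysisItem();
        item.Assessed = assessed;
        item.Assessor_Role = role;
        item.Variables.Add(variable, value);
        return item;
    }
}

public static class Program
{
    public static void Main()
    {
        List<DataAnalysisItem> items = new List<DataAnalysisItem>();
        items.Add(SurveyStats.Row("John Doe", "BOSS", "VAR_1234", 3));
        items.Add(SurveyStats.Row("John Doe", "CO-WORKER", "VAR_1234", 5));
        items.Add(SurveyStats.Row("John Doe", "CO-WORKER", "VAR_1234", 4));

        // Grand total across all roles, then the subtotal for one role.
        List<int> all = SurveyStats.Values(items, "John Doe", "VAR_1234", null);
        List<int> boss = SurveyStats.Values(items, "John Doe", "VAR_1234", "BOSS");

        Console.WriteLine("Total: mean={0}, stdev={1}, count={2}",
            SurveyStats.Mean(all), SurveyStats.StDev(all), all.Count);
        Console.WriteLine("Boss:  mean={0}, count={1}",
            SurveyStats.Mean(boss), boss.Count);
    }
}
```

Keeping the flat list as the only storage and computing the statistics on demand avoids nesting collections inside collections. If the same lookups are repeated often, the filtered results could be cached in a dictionary keyed by person and role.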


Expert Comment

ID: 21792079
I think ADO.NET is the best if you need to work a lot with snapshots of relational data. You can use DataSet, DataTable etc. These are the "in-memory version" of a database server.

Author Comment

ID: 21792189
Thanks a lot. Looks like a good start. I will need a few hours to play around with this before I get back to you. Yes, OOP is where I need to start (and have), but I'm also very much hands-on. I need to see it done before I can understand the theory. Once I get on track and see the initial how-tos, I can usually figure it out from reading. I think my problem here is that I get confused when I try to think in terms of objects within objects. It's not only the collections; these also need to be nested in classes, which makes me dizzy. Now I have a structure to start with.

Thanks, I use ADO to get the data, so initially I have it in a DataSet. The thing is that I want to do several calculations and have easy access to the results, which is why I figured I would put it in a class. OK, I could probably use LINQ to do most of what I want, but I'm not sure I want to do all the ad-hoc SQL stuff over and over again. For sure, I will keep this in mind when I consider direct access to the db in the next phase of the project. I have spent about 10 years doing Oracle work, so I also wanted to do something non-databaseish. ;-) Otherwise I could have done most of this in SQL*Plus, except for the graphing... I also need something very user friendly.

Expert Comment

ID: 21792218

Expert Comment

ID: 21819459
Well, then ADO.NET sure is database-ish :)
But it is quite powerful and you can do most things that can be done on a SQL server, even aggregate functions (the DataTable.Compute method). You can even subclass DataTable to suit your data structure if you want to. I believe you can also read from an Excel spreadsheet, provided you have the right ODBC connection (never tried this myself though ...).
But at the end of the day, the choice is yours ...
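
For illustration, here is a minimal sketch of the DataTable.Compute approach, assuming a table whose column names (Assessed, Assessor_Role, VAR_1234) mirror the spreadsheet described in the question. The ComputeDemo class and its methods are hypothetical names. The variable column is declared as double here so that AVG returns a fractional result rather than depending on how integer columns are averaged.

```csharp
using System;
using System.Data;

// Hypothetical demo class (an assumption for illustration only).
public static class ComputeDemo
{
    // Builds a small in-memory table shaped like the survey sheet.
    public static DataTable BuildTable()
    {
        DataTable table = new DataTable("Survey");
        table.Columns.Add("Assessed", typeof(string));
        table.Columns.Add("Assessor_Role", typeof(string));
        table.Columns.Add("VAR_1234", typeof(double));

        table.Rows.Add("John Doe", "BOSS", 3.0);
        table.Rows.Add("John Doe", "CO-WORKER", 5.0);
        table.Rows.Add("John Doe", "CO-WORKER", 4.0);
        return table;
    }

    // Runs one aggregate expression over the rows matching the filter.
    // Compute returns DBNull when the aggregate has no result, so the
    // caller should only convert results it expects to exist.
    public static double Aggregate(DataTable table, string expression, string filter)
    {
        object result = table.Compute(expression, filter);
        return Convert.ToDouble(result);
    }

    public static void Main()
    {
        DataTable table = BuildTable();
        string person = "Assessed = 'John Doe'";
        string boss = person + " AND Assessor_Role = 'BOSS'";

        // Grand totals for one variable across all assessor roles.
        double totalMean = Aggregate(table, "AVG(VAR_1234)", person);
        double totalStDev = Aggregate(table, "StDev(VAR_1234)", person);
        double totalCount = Aggregate(table, "Count(VAR_1234)", person);

        // Subtotal restricted to one assessor role.
        double bossMean = Aggregate(table, "AVG(VAR_1234)", boss);

        Console.WriteLine("Total: mean={0}, stdev={1}, count={2}",
            totalMean, totalStDev, totalCount);
        Console.WriteLine("Boss:  mean={0}", bossMean);
    }
}
```

The filter string plays the role of a SQL WHERE clause, which may feel familiar coming from a database background. The trade-off is that column names and expressions are plain strings, so mistakes only show up at run time.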

Author Comment

ID: 21819710
Thanks for the comments and the links. I looked briefly at them and they sure seem helpful.
Unfortunately, I just got a lot of pressure to finish some of my paid work, so no time for the fun stuff ;-( I have to put this aside until after the weekend, but I will be back at the beginning of next week.

Author Closing Comment

ID: 31467523
Thanks a lot! I still didn't have enough time to test it fully but it looks promising so I'll assign you the points and close. I might be back with more detailed questions later.

Expert Comment

ID: 21898291
Not sure I agree with the Grade B, but hey ho!

Author Comment

ID: 21898353
Sorry, by habit I'm very restrictive when grading things, so that was kind of the default for me. On the other hand, the question was very vague. Actually, after submitting, I realised that I should also have considered the links you posted. Very useful!!! I didn't see a way to go back and change it.

Expert Comment

ID: 21898411
np. :-)
