?
Solved

normal distribution plots of column of excel data

Posted on 2012-08-13
10
Medium Priority
?
743 Views
Last Modified: 2012-08-13
Hi,

I have numeric data in an excel spreadsheet that I would like to disaggregate on a categorical variable and plot the resulting sets of data as probability distributions.  

The worksheet contains about 1958 rows of data.  The worksheet will be disaggregated on the dependent variable called "radon" (an integer) based on the categorical variable "code" (a text field).  Both are highlighted in the attached spreadsheet.  This will produce several sets of rows of data (I think there are about 20+ different "code" values).  For each set of rows, each set corresponding to a different code, I need to compute the median, max, min, mean, and standard deviation for that set of rows, along with a plot of its probability distribution bell curve.  This will be repreated for each of the disaggregated codes (that is, one set of statistics and one plot per code).  Finally, all of the plots should be placed on top of one another to allow me to observe the spread of each of the datasets.  

Attach is a spreadsheet with the data.

Thanks!

TC
expertexchange-radon-vs-code.xls
0
Comment
Question by:trcamp
  • 5
  • 5
10 Comments
 
LVL 24

Expert Comment

by:Steve
ID: 38289205
Have you tried Minitab? It is specifically designed to do just what you want there and has a 30 day free trial available.
0
 

Author Comment

by:trcamp
ID: 38289214
No, but thanks for the heads up!  I am hoping to have a permanent solution using excel.  

Question still open.
0
 
LVL 24

Expert Comment

by:Steve
ID: 38289247
As the file is saved xls I assume Excel 2003?
As this can be done simply enough with sparklines in 2010.
0
Technology Partners: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 
LVL 24

Expert Comment

by:Steve
ID: 38289335
Attached is a rough use of PivotTable and some charting.

I am guessing that the data is not normal, so would likely require a Box-Cox to get it into a normalised form.

However, this may get you started.
expertexchange-radon-vs-code.xls
0
 

Author Comment

by:trcamp
ID: 38289387
Nice quick work batman.  a couple of thoughts.

1.  the plot's y axis only goes up to 25.  It should go up to around 5000 or 8000 (the mean of the set of highest radons).  

2.  Could you include the Box-Cox for more points (the max 500)?

3.  Could you allow the x axis to show the category variable ("code")?
0
 

Author Comment

by:trcamp
ID: 38289400
One correction.  The X axis is the one that should have the 5000 to 8000 mean value, with spreads of 2 SDs....not the y axis.
0
 
LVL 24

Expert Comment

by:Steve
ID: 38289535
OK, have "done some work" on the file.

1) I have added the Radons to the axis

2) I have double clicked one of the Pivot lines to extract just that "code"
    This I have performed a "basic" Box-Cox Log normal transform.
    ou should be able to do the same with other lines and then copy the formula and graphs.
    Should get you going in the right direction

3) The "code" is each series line... have added a key table, but this may be more confusing than helpful.
expertexchange-radon-vs-code.xls
0
 

Author Comment

by:trcamp
ID: 38289581
Looks great.  One last question.  What are the "divide by zero" cells in the pivot worksheet?  I just want to make sure that those are not being calculated incorrectly or are otherwise important in some way.

Thanks.
0
 
LVL 24

Accepted Solution

by:
Steve earned 2000 total points
ID: 38289606
to calculate standard deviation you must have more than one data point.
With only one data point you get the div0 error.
So without a standard deviation you cannot plot normal distribution.
0
 

Author Comment

by:trcamp
ID: 38289622
Great.  Thanks Batman.  Nice, efficient, quick work.
0

Featured Post

Hire Technology Freelancers with Gigs

Work with freelancers specializing in everything from database administration to programming, who have proven themselves as experts in their field. Hire the best, collaborate easily, pay securely, and get projects done right.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

This is an article on how to answer questions, earn points and become an expert.
With its various features, Office 365 can not only help you with your day-to-day business tasks, it can also do wonders for your marketing campaign.
Finds all prime numbers in a range requested and places them in a public primes() array. I've demostrated a template size of 30 (2 * 3 * 5) but larger templates can be built such 210  (2 * 3 * 5 * 7) or 2310  (2 * 3 * 5 * 7 * 11). The larger templa…
If you’ve ever visited a web page and noticed a cool font that you really liked the look of, but couldn’t figure out which font it was so that you could use it for your own work, then this video is for you! In this Micro Tutorial, you'll learn yo…

578 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question