Solved

Posted on 2012-08-13

Hi,

I have numeric data in an excel spreadsheet that I would like to disaggregate on a categorical variable and plot the resulting sets of data as probability distributions.

The worksheet contains about 1958 rows of data. The worksheet will be disaggregated on the dependent variable called "radon" (an integer) based on the categorical variable "code" (a text field). Both are highlighted in the attached spreadsheet. This will produce several sets of rows of data (I think there are about 20+ different "code" values). For each set of rows, each set corresponding to a different code, I need to compute the median, max, min, mean, and standard deviation for that set of rows, along with a plot of its probability distribution bell curve. This will be repreated for each of the disaggregated codes (that is, one set of statistics and one plot per code). Finally, all of the plots should be placed on top of one another to allow me to observe the spread of each of the datasets.

Attach is a spreadsheet with the data.

Thanks!

TC

expertexchange-radon-vs-code.xls

I have numeric data in an excel spreadsheet that I would like to disaggregate on a categorical variable and plot the resulting sets of data as probability distributions.

The worksheet contains about 1958 rows of data. The worksheet will be disaggregated on the dependent variable called "radon" (an integer) based on the categorical variable "code" (a text field). Both are highlighted in the attached spreadsheet. This will produce several sets of rows of data (I think there are about 20+ different "code" values). For each set of rows, each set corresponding to a different code, I need to compute the median, max, min, mean, and standard deviation for that set of rows, along with a plot of its probability distribution bell curve. This will be repreated for each of the disaggregated codes (that is, one set of statistics and one plot per code). Finally, all of the plots should be placed on top of one another to allow me to observe the spread of each of the datasets.

Attach is a spreadsheet with the data.

Thanks!

TC

expertexchange-radon-vs-code.xls

10 Comments

Question still open.

As this can be done simply enough with sparklines in 2010.

I am guessing that the data is not normal, so would likely require a Box-Cox to get it into a normalised form.

However, this may get you started.

expertexchange-radon-vs-code.xls

1. the plot's y axis only goes up to 25. It should go up to around 5000 or 8000 (the mean of the set of highest radons).

2. Could you include the Box-Cox for more points (the max 500)?

3. Could you allow the x axis to show the category variable ("code")?

1) I have added the Radons to the axis

2) I have double clicked one of the Pivot lines to extract just that "code"

This I have performed a "basic" Box-Cox Log normal transform.

ou should be able to do the same with other lines and then copy the formula and graphs.

Should get you going in the right direction

3) The "code" is each series line... have added a key table, but this may be more confusing than helpful.

expertexchange-radon-vs-code.xls

Thanks.

By clicking you are agreeing to Experts Exchange's Terms of Use.

Title | # Comments | Views | Activity |
---|---|---|---|

In Excel how to add formula for Quintile? | 4 | 23 | |

Microsoft Access Table name | 3 | 19 | |

Formula to count names row by row | 2 | 32 | |

How to Autofill Across to next value | 3 | 20 |

Join the community of 500,000 technology professionals and ask your questions.

Connect with top rated Experts

**16** Experts available now in Live!