You should be aware that the statistical functions in Excel were reworked in Excel 2010 and 2011. The older versions had a checkered reputation in the academic community because casual users could easily run into erroneous results. The functions had already been reworked once for Excel 2003, but this latest revision is an attempt to answer the criticism once and for all.

I assume that this is a homework assignment, so to avoid getting you in trouble with the instructor I will not post my version of the workbook.

When answering the first question, I used the TTEST function in an array formula that tested whether the attribute was present or not:

=TTEST(IF(B$2:B$36=0,$A$2:$A$36,""),IF(B$2:B$36<>0,$A$2:$A$36,""),2,3)

The TTEST function returns the probability associated with the t-test: roughly, how likely it is that you would see a difference in means this large if the two sets of data came from the same population. A number close to 0 means that they are likely to come from different populations. Variables like Color, Age and Mileage aren't simple present-or-absent attributes, so they aren't really suitable for this type of test; don't be alarmed by the error value that is returned for them. Note: you could have sorted the data by column B and entered the two resulting ranges separately in a regular formula. The array formula gives the same answer and saves time by letting you copy it across.
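To make that note concrete, here is a sketch of the regular (non-array) version. It assumes, purely for illustration, that after sorting by column B the rows with 0 land in rows 2 to 20 and the nonzero rows in rows 21 to 36; your split point will almost certainly be different.

```excel
=TTEST($A$2:$A$20,$A$21:$A$36,2,3)
```

The 2 asks for a two-tailed test, and the 3 asks for a two-sample test that does not assume equal variances. In Excel 2010 the same function is also available under the newer name T.TEST, with the same arguments.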

Array-entering a formula is a little tricky:

1) Click in the formula bar and paste the formula

2) Hold the Control and Shift keys down

3) Hit the Enter key, then release all three keys

Excel should reward you with curly braces { } surrounding the formula. You may see a #VALUE! error value if you didn't follow the directions correctly.

I also tried using the RSQ function to return the R-squared value (square of Pearson's correlation coefficient) for the correlation of price with each of the variables. A value close to 0 means that there is little correlation.

=RSQ($A$2:$A$36,B$2:B$36)
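If you want to cross-check RSQ, squaring the result of CORREL should give the same number. CORREL is a standard worksheet function, but it is my suggestion here rather than something the assignment asks for:

```excel
=CORREL($A$2:$A$36,B$2:B$36)^2
```

Without the ^2, CORREL also keeps the sign, so you can see whether price tends to rise or fall with the variable.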

The combination of TTEST and RSQ led me to eliminate two of the variables, though your judgment and textbook may suggest retaining a different number. I then rearranged the data with the excluded variables off on the right, so that the remaining variables sit in adjacent columns. I could now use LINEST to return the regression equation for the remaining variables. If you exclude two variables, seven remain in columns B through H, so select a range five rows by eight columns (one column per variable plus one for the constant) and array-enter a formula like:

=LINEST(A2:A36,B2:H36,TRUE,TRUE)

The on-line help tells you how to interpret the results. It is worth noting that the constant is at the far right of the top row, and the coefficients for the variables are in reverse order (coefficient for column B appears next to the constant). I like to look at the R-squared for the overall correlation to see how good the fit is; you'll find this in the third row on the left.
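If you only need one or two of those numbers, you can wrap LINEST in INDEX and skip selecting the five-by-eight block. This is a sketch that assumes the same layout as above (price in column A, seven variables in B through H):

```excel
=INDEX(LINEST($A$2:$A$36,$B$2:$H$36,TRUE,TRUE),3,1)
=INDEX(LINEST($A$2:$A$36,$B$2:$H$36,TRUE,TRUE),1,8)
=INDEX(LINEST($A$2:$A$36,$B$2:$H$36,TRUE,TRUE),1,7)
```

The first formula returns R-squared (third row, first column), the second returns the constant (top row, last column), and the third returns the coefficient for column B (top row, next to the constant). These normally work as regular formulas; if you get an error, try array-entering them.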

Brad
