Solved

Posted on 2012-08-22

Hi Experts!

This has been bugging me for years, so I thought I'd see if I could get a definitive answer here. I have three scenarios. I ask respondents whether they like two products, using a 10-point scale. There are three samples:

1) 200 people are asked both questions ("Do you like product A" & "Do you like product B")

2) 75 people are asked question one ONLY, 75 people are asked question two ONLY and 50 people are asked both questions

3) 100 people are asked question one ONLY, 100 people are asked question two ONLY

If I want to know if there is a significant difference between the mean of the answer to question one and the mean of the answer to question two, do I use the same formula to stat test these three samples?

Thanks!

45 Comments

But, as phoffric pointed out, the procedures are not exactly the same, and the two-question test might introduce a bias.

http://en.wikipedia.org/wi

But you already have the data. Are the answers taken by the three methods close to each other? If yes, do not worry; if no, something non-random is going on.

My question is, do I use the same test (ANOVA, t-test) on all three data sets to determine if something non-random is going on? After all, few clients would be willing to pay for three studies, using three different methods of sampling, just to see if the three results are the same or different. What if they were? How would you determine which of the three is "most correct"?

If they were, I would say there is something wrong with the test.

Hence use the same test on each method.

(I still worry a bit about phoffric's comment about two questions vs one, even though you said you rotate the questions. Nevertheless, that is a question about poll design rather than data treatment.)

2) phoffric, you are right that rotating probably doesn't neutralize the bias, but it does make it less "bias-y." Just as a trial lawyer asks a question knowing the jury won't forget it even if the judge tells them to, order bias will exist unless your surveys have only one question, and all you can do is try to blunt its effects.

For instance:

How many people in your household?

What car do you drive?

Would you use a professional gardener?

How would you rate product A?

Have you visited the dentist in the last six months?

Do you listen to the radio?

My point was that if product A is buried in many other questions then the inclusion of product B shouldn't affect the results.

I am assuming that your choice of market sample was made using the same criteria for all groups.

As pointed out by others, this bias can be introduced by asking the two rating questions in the same survey, but this didn't seem to be the object of your question here.

1) Two (or more) groups answered both questions

2) Two (or more) groups had a mix of those who answered both questions & those who answered only one question

3) Half the groups answered only one question & the other half only answered the other question

1) I ask 200 people if they like the President BEFORE he gives a speech. I ask the same 200 people if they like the President AFTER he gives a speech. Is there a significant difference between the Yeses? (all answered both questions)

2) I randomly ask 150 people if they like the President BEFORE he gives a speech. I randomly ask 150 people if they like the President AFTER he gives a speech. Is there a significant difference between the Yeses? (some answered both questions, some answered only one question)

3) I ask 100 people if they like the President BEFORE he gives a speech. I ask the other 100 people if they like the president AFTER he gives a speech. Is there a significant difference between the Yeses? (Half answered the BEFORE question and half answered the AFTER question)

In determining the significance, do I use the same formula or different formulas?

Thanks!

(number of people in sample)*(probability of yes)*(probability of no)
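The expression above is the variance of a binomial count, n·p·(1−p). As a rough sketch of how that variance feeds into a two-proportion z-test for the Yes counts, with made-up numbers (120 and 140 Yeses are invented for illustration):

```python
import math

def two_proportion_z(yes1, n1, yes2, n2):
    """Two-proportion z statistic using the pooled binomial variance n*p*(1-p)."""
    p1, p2 = yes1 / n1, yes2 / n2
    p = (yes1 + yes2) / (n1 + n2)                     # pooled probability of "yes"
    se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))   # standard error of the difference
    return (p1 - p2) / se                             # compare |z| to 1.96 for p < 0.05

# Hypothetical data: 120/200 said yes before the speech, 140/200 after
z = two_proportion_z(120, 200, 140, 200)
print(z)
```

If |z| exceeds 1.96, the difference in Yes proportions is significant at the 5% level.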

Whether the asking of one question might affect the answer to a different question asked of the same person is more a question of psychology than statistics, although statistics can help you to test any hypothesis you may have regarding how the asking of one question might affect the answer to another question.

If this speech is very unlike the others then you could expect a proportion of people to change their answer.

I hope this is a little clearer. Have a great day!

With the two products poll, what question are you trying to answer?

Is product A more popular than B?

Is asking a group of people about two products different from asking two groups about one product each? [This could be two T-tests.]

Similarly with the Presidential poll:

Are you trying to measure the President's popularity?

Are you trying to measure the mood of the public?

Are you trying to measure the effectiveness of a particular speech? [This could be three T-tests.]

This may help:

http://www.sfu.ca/~ber1/ia

To break the Presidential poll into three T-tests, you need to define the comparison groups:

Means and SDs before and after the speech for all respondents.

Mean and SD of the Before-Only group vs the mean and SD of the Before-After group on the Before poll.

Mean and SD of the After-Only group vs the mean and SD of the Before-After group on the After poll.

For the two products poll:

1. Compare the A data vs the B data for all respondents.

2. Compare the data for A-only vs A-both.

3. Compare the data for B-only vs B-both.

If 2 or 3 show a difference, then the results of 1 are suspect.
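Those three comparisons can be sketched with stdlib Python. The ratings below are invented for illustration; check 1 uses a paired t statistic (same people rated both products) while check 2 uses Welch's t (independent groups), and check 3 would mirror check 2 with the B data:

```python
import math
import statistics as st

def welch_t(x, y):
    """Welch's t statistic for two independent samples (unequal variances)."""
    vx, vy = st.variance(x), st.variance(y)
    return (st.mean(x) - st.mean(y)) / math.sqrt(vx / len(x) + vy / len(y))

def paired_t(x, y):
    """Paired t statistic: a one-sample t-test on the per-respondent differences."""
    d = [a - b for a, b in zip(x, y)]
    return st.mean(d) / (st.stdev(d) / math.sqrt(len(d)))

# Hypothetical ratings on the 0-10 scale
a_both = [6, 7, 5, 8, 6, 7, 6, 5]   # product A, people asked about both
b_both = [5, 6, 5, 7, 5, 6, 6, 4]   # product B, same people
a_only = [7, 6, 6, 8, 5, 7]         # product A, people asked only about A

print(paired_t(a_both, b_both))     # check 1: A vs B within the both-group
print(welch_t(a_only, a_both))      # check 2: A-only vs A-both
```

The resulting t values are compared against the critical value for the relevant degrees of freedom.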

1) 200 people have mean of 3.6 before speech and 4.1 after speech. Is this significant?

2) 150 people have mean of 3.7 before speech and 4.0 after speech. Is this significant?

3) 100 people have mean of 3.8 before speech and 4.2 after speech. Is this significant?

After all, the formula doesn't know where it came from or what its user is trying to discover. The formula doesn't change the numbers I have to test, does it? I guess, to use a perhaps silly analogy, does it matter to the gun if I am aiming for the bottle or the pumpkin?

Or are you saying I need one formula (t-test)

1) one formula: Where BEFORE and AFTER saw the speech

2) Two formulas: Where BEFORE and AFTER saw the speech AND where BEFORE and AFTER didn't see the speech

3) Where BEFORE and AFTER didn't see the speech

And in situation 2, if both formulas say the mean is significant, it is. If only one formula or no formula says it is significant, then it isn't?

BTW, thanks for the PDF! I will hand a copy of it out to some of our people.

In the cases you've given so far, I think you can do the necessary analysis by splitting things up into two groups and using the T-test (multiple times if necessary). But you need to know more than the means of the data sets to do a comparison. You also need to know the spreads or variances.

The T-test formula doesn't know or care what the numbers are or where they came from, but it can only answer questions that are posed correctly.
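The point about needing spreads can be made concrete: from the means alone (e.g. 3.6 before and 4.1 after in scenario 1) nothing can be computed, but with SDs and sample sizes a t statistic falls out directly. The SD of 2.0 below is an assumption, since the thread gives only means, and this treats the two samples as independent; the paired before/after design would additionally need the per-person correlation:

```python
import math

def t_from_stats(m1, sd1, n1, m2, sd2, n2):
    """Welch's t computed from summary statistics: means, SDs, and sample sizes."""
    se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)   # standard error of the mean difference
    return (m1 - m2) / se

# Scenario-1 means from the thread (3.6 before, 4.1 after); SDs of 2.0 are assumed
t = t_from_stats(3.6, 2.0, 200, 4.1, 2.0, 200)
print(t)
```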

==========================

Looking at your presidential poll question:

Are you intentionally polling before and after a planned speech?

Or are you conducting a general poll over the course of a week and major speech happens in the middle?

Do the people you talk to before and after the speech know that you will be in touch again after the first call?

Can you ask the after-people if they've seen the speech, since you can't ask the before-people that question?

It is really hard to answer these sorts of questions in the abstract.

I will make all the assumptions I think I need to make. You can let me know if I am mistaken.

You want to gauge the public perception of two products A and B.

You do your polling and come up with two sets of data.

You do a T-test to see if the public perception for these products is the same or different. Call this T1. Assume they are different.

Then you notice that some respondents were asked about both products. And some were only asked about one.

Now you don't care about the products anymore. Now you are worried about the polling method.

Here the question is: Does it make a difference if you ask a person about more than one product?

You do one T-test to compare the A-only data with the A+B data. Call this T2.

You do a second T-test to compare the B-only data with the A+B data. Call this T3.

If T2 and T3 show no significant difference, then maybe you are done.

You can say, with some justification, that it doesn't matter if you ask a person about one or two products in the same poll.

But suppose T2 and/or T3 do show a significant difference. Is there a problem with your protocol?

You could do some more tests with the data you have, but what is the question you want to answer?

Try this one: If you ask one person about two products, does it make a difference which one is mentioned first?

Now you will only look at the data from the people who graded both products.

Hopefully you have randomized and kept track of this level of detail.

So you do a T-test on rating of Product A in the A+B group versus the B+A group. Call this T4.

Finally another T-test on Product B in the A+B group versus the B+A group. Call this T5.

==========================

It is probably much better to design the polling protocol rigorously in advance than to worry about how to fix things afterwards.

If you asked people to answer both questions, and they only answered one, maybe those results should be thrown out.

If you asked them to pick one and answer it, and they answer both anyway, consider throwing out those.

If you asked some people about one product or the other, and asked others about both, then maybe those should be thrown out.

If you know what you are trying to find out with a poll, and design a good protocol, you will know in advance what sort of tests you need to run on the data.

If you have a bunch of data dumped on you, and you're trying to mine it, then you have a different and harder task.

1) You ask 200 people if they like the president BEFORE his speech. You ask the same 200 people if they like the president AFTER a speech. What formula would give you a way of finding out if there was a significant difference in approval after he gives his speech? (SAME PEOPLE)

2) You ask 200 people if they like the president. Now you use the answer to a previous question to see that there are 150 coffee drinkers and 150 tea drinkers. Obviously there will be people who drink both. You want to know if there is a significant difference between coffee and tea drinkers as far as liking the president. (SOME SAME/SOME DIFFERENT). What formula would you use?

3) You ask 150 men and 150 women if they like the president. What formula would test if there is a significant difference between the means/proportions of men vs. women? (DIFFERENT PEOPLE)

Note that the thing I'd like would be the formula/formulas to use. If it is the standard Student's t-test, I can get that from the Web, but I suspect it is not a Swiss Army knife that can be used for ALL tests of means and proportions. Ideally, I'd like to know if the formula(s) would be different if we assume or don't assume a standard population or a small/large population, but that might be pushing it! I also know that if there are more than two objects being compared you have to bring in ANOVA and/or the Marascuilo procedure, so I am limiting my question to two things.

Q2 is trickier. You have to decide exactly what you want to know.

You can use the T-test to compare:

1. Coffee drinkers vs non-coffee drinkers

2. Coffee only drinkers vs tea only drinkers

3. Caffeine drinkers vs caffeine abstainers

You could also use the T-test to do three pairwise comparisons:

Coffee only vs tea only

Coffee only vs coffee+tea

Tea only vs coffee+tea

You are saying I couldn't use them to come up with a significant difference?

I am not saying that at all. If you have the raw data [200 questionnaires with Coffee Y/N, Tea Y/N, President 0-10], then you can use the T-test to check any of the hypotheses I mentioned earlier or any other one you can come up with. Whether a particular test shows significance or not depends on the actual data.
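With raw questionnaires in that layout, the groups for any of these hypotheses are just filters over the records. A stdlib sketch, with invented data in the (Coffee, Tea, President rating) layout described above:

```python
import math
import statistics as st

# Hypothetical questionnaires: (drinks_coffee, drinks_tea, president_rating 0-10)
data = [
    (True,  False, 4), (True,  False, 5), (False, True, 7),
    (False, True,  6), (True,  True,  5), (True,  True, 6),
    (False, False, 3), (True,  False, 6), (False, True, 8),
]

# Carve out the comparison groups by filtering the records
coffee_only = [r for c, t, r in data if c and not t]
tea_only    = [r for c, t, r in data if t and not c]
both        = [r for c, t, r in data if c and t]

def welch_t(x, y):
    """Welch's t statistic for two independent samples."""
    return (st.mean(x) - st.mean(y)) / math.sqrt(
        st.variance(x) / len(x) + st.variance(y) / len(y))

print(welch_t(coffee_only, tea_only))   # coffee-only vs tea-only ratings
```

The other pairwise hypotheses (coffee-only vs both, tea-only vs both) are just different filter pairs fed to the same function.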

1. Coffee drinkers vs non-coffee drinkers

2. Coffee only drinkers vs tea only drinkers

3. Caffeine drinkers vs caffeine abstainers

You have to start with the question you want to answer, then see if you have the relevant data and a method of analyzing it.

There may be different ways of defining the set of observations one is interested in, but testing more hypotheses increases the chance that one of them will accidentally appear significant.

http://xkcd.com/882/
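The xkcd strip's point, that running enough tests makes one look significant by chance, is commonly handled with a multiple-comparison correction. A minimal Bonferroni sketch, with made-up p-values standing in for tests T1 through T5:

```python
def bonferroni(p_values, alpha=0.05):
    """Bonferroni correction: divide alpha by the number of tests run."""
    threshold = alpha / len(p_values)
    return [p < threshold for p in p_values]

# Hypothetical p-values from five tests; with 5 tests the threshold drops to 0.01
print(bonferroni([0.04, 0.20, 0.01, 0.30, 0.008]))
```

Note that a p-value like 0.04, nominally significant on its own, no longer clears the corrected threshold.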

I believe aburr gave the correct answer here first: http:#a38330755

In summary, you realize there may be issues with the way the data was collected, but you

I reiterated his answer with more detail here: http:#40063127

After you do the first analysis, you can use the T-test again to check your assumptions.

But if you are aware of issues in the way the data was collected, then you may want to address them. And one way to address them may be by running additional tests.

There is no special formula or prescription for dealing with bad data, because there are so many ways bias can creep into your data in real world practice. Statistical analysis is as much art as science. You need to figure out what you want to know, what data to collect, how to analyze it, and what can go wrong.

In your Product A vs Product B case:

You notice that some people rated both products and some rated only one.

You are concerned that this might bias the comparison.

So do the appropriate initial analysis (the T-test), and then look a little deeper if you can.

I think the problems come from trying to answer poorly defined questions with hypothetical data which may have hypothetical flaws.

So what is the question? And what is the problem?

Potential questions and problems related to the Presidential survey:

Q: What is the President's popularity?

P: He gave a major speech while the poll was being conducted.

Q: How effective was the President's speech?

P: Some respondents answered only Before or After. Some answered both.

Q: How effective was the President's speech?

P: Some of the respondents drink coffee and some drink tea.

Q: Does drinking Coffee or Tea affect how people feel about the President?

P: Some people drink both, and there was a major speech while the poll was going on.

If you start with a well-formed question, you can design the experiment around it.

For example, this experiment might be well designed:

1) You ask 200 people how they feel about the president BEFORE his speech [0 to 10].

You ask the same 200 people how they feel about the president AFTER his speech [0 to 10].

You decide beforehand to throw out anyone who didn't see the speech.

You decide beforehand to analyze the data using the T-test.
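That protocol can be sketched end to end. The records and the (before, after, saw_speech) layout here are invented for illustration:

```python
import math
import statistics as st

# Hypothetical records: (rating_before, rating_after, saw_speech)
records = [
    (4, 6, True), (5, 5, True), (3, 6, True), (6, 7, True),
    (4, 5, True), (5, 8, True), (2, 2, False), (7, 3, False),
]

# Protocol step 1, decided beforehand: drop anyone who didn't see the speech
kept = [(b, a) for b, a, saw in records if saw]

# Protocol step 2: paired t statistic on the per-person change
d = [a - b for b, a in kept]
t = st.mean(d) / (st.stdev(d) / math.sqrt(len(d)))
print(t)
```

The paired form is appropriate here precisely because the same 200 people answer both times.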

Note that I am not saying that any particular data is "bad."

In fact, I can't tell if the data is good or bad, or how good or how bad it is.

The point is that there is a way to analyze the data, even after it is taken, to see if there is any bias.

==========================

Your initial post for this question listed three ways to run a survey, and each comes with its own potential problems:

S1: 200 people asked about A and B

P1: Is there a difference between A-B vs B-A.

S2: 75 people asked about A. 75 about B. 50 about A and B.

P2: Is there a difference between A-B vs B-A.

Is there a difference between asking about two products vs one.

S3: 100 people asked about A. 100 about B.

P3: The sample sizes may be smaller than you like.

==========================

S3 is the cleanest, if the sample size is large enough.

S1 is probably okay, as long as you make sure to balance the A-B vs B-A order.

And you can do additional tests to see if the order biases the results.

S2 has the most complications, but you still do the main A vs B analysis, and look at the problems as well.

==========================

One way to look at this issue, even if you get the data dumped on you after the survey is done, is to ask how you wish it had been run to eliminate all the potential biases you can think of. Then you can try to think of tests you can run on the data you have to see whether any of those biases have actually crept in.

Not exactly, but close enough for the moment.

>> If the President's speech is good, doesn't that bias the AFTER results?

What is the question? Are you trying to determine presidential popularity or speech effectiveness?

A good speech may or may not affect the president's popularity. You actually have to run the test to find out.

>> And if I use a really good brewing method for the coffee and a really bad brewing method for the tea, am I not biasing the results?

This would be a horrible way to run a taste test, but I thought you were conducting a poll.

You have to describe one survey/experiment completely, then dig into the details.

>> My question is, understanding there is bias, can't I use a test (or tests) to show that the bias is significant?

You may think or worry that there is a bias, but you can't tell for sure until you run the test.

>> If the rating for liking the coffee is 3.6 and the rating for tea is 4.5, can't I say MY SAMPLE likes tea significantly more than they like coffee?

Even if all you are asking your sample about is coffee vs tea, you actually have to do the test before you say anything about significance.

In an earlier post I said

http://www.socialresearchm

Even with very large samples, you can have a large difference in means that is not significant. You can't look at just the two means and say anything about significance. You have to actually do the calculations.
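To make that concrete: the same 0.9-point gap from the coffee/tea example (3.6 vs 4.5) can be significant or not depending entirely on the spread. The SDs and sample sizes below are assumed:

```python
import math

def welch_t(m1, sd1, n1, m2, sd2, n2):
    """Welch's t from means, SDs, and sample sizes."""
    return (m1 - m2) / math.sqrt(sd1**2 / n1 + sd2**2 / n2)

# Same 0.9-point gap in means, two assumed spreads:
print(welch_t(3.6, 1.0, 30, 4.5, 1.0, 30))  # tight spread: |t| well above 1.96
print(welch_t(3.6, 4.0, 30, 4.5, 4.0, 30))  # wide spread:  |t| well below 1.96
```

Only the first case would be called significant, which is why the calculation cannot be skipped.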

==========================

>> If I have a mean of one million coffee drinkers and one million tea drinkers, my

This is really where we disagree:

If you have the data, I think it would be much better to do the T-test calculations than to rely on common sense.

Here again, you have collected and dumped the data without specifying the problem carefully.

If you have that much data, you can include all the people that drink both coffee and tea.

Or you can throw out all the people that drink both.

Or you can look at only the people that drink both.

And you should probably do all three of these tests, and more.
