Correlating test answers to the class in SQL

Hi experts,
My friend gave me this puzzle and I just can't see a way to approach it... (I would like to use SQL)

Let's say I sit a test at school, 20 questions,  and get I 10 answers right and 10 wrong.
The teacher wants to compare my answers to the rest of the class.

They start by finding the other students who got the same questions correct (ie at least the same 10 right answers).

Then we want to examine the other correct answers achieved by that group and find (in order) the questions that were most often answered correctly by the group. So if Bill Fred and Mary in my class all got the same 10 answers as me right, but also got Quest 16 correct (which I got wrong) maybe I should go and bone up on Q16 as it might be easy for me to learn that topic.

Could someone point me in the right direction. I keep getting lost when I think about creating joins because the fields would be so dynamic (I could do it for a small set, but what if there are 5000 possible questions and we have all been asked different subsets over hundreds of classes etc etc). This would be something I would like to do on the fly in a program and I guess has application to all sorts of forecasting - but I'm not a stats person so feeling a bit lost).

Have a great weekend!!

Split_Pin

Who is Participating?

x
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

Commented:
Assuming you have:

1               1                 Correct
1               2                 Incorrect
2               1                 Correct
2               2                 Correct
(and so on)

select a.question_id, count(distinct a.student_id)
from student answers a inner join
(select student_id, count(question_id)
group by student_id
having count(question_id) >= 10) b on a.student_id=b.student_id
group by a.question_id
order by count(distinct a.student_id) desc;

NOTE: untested sql, but it should point you to the right direction
0
Author Commented:
Hi johanntagle
MANY thanks for kick starting this! I like the simple table you have laid out and I could see how the program could build that table as different users progressively enter results.

Am I wrong to say your code will pick up the students with ANY 10 questions right? I am hoping to find the people who have exactly the same questions (by Question ID) correct as me, plus 1 more.

E.g. I know 5 things about Sql. (Correct: Ques 1,4,5,6,11
Mary knows everything I know + just 1 other thing. (Correct Ques 1,2,4,5,6,11)
Johann knows everything Mary knows 100 other things. (Correct 1,2,3,4,5,6,7,8,9....)

The one extra thing Mary knows (Ques 2)  is also known by Johann so that will be count of 2 and the best thing for me to learn next. I guess I am trying to profile my knowledge against the class and find the best fit against those who know what I already know. (Just getting 10 right answers does not make a match, as I could know 10 things and you could know 10 different things entirely and we could get the same mark despite the fact our knowledge is mutually exclusive.

Also, for a few reasons - I don't want to assume that there is any relationship between the questions - question 20 is not harder than question 1, or even necessarily about the same topic.

Kind of hoping someone is going to chime in and say "Ah that sounds like the XYZ problem and this can be solved as follows " (no tables or program exist yet, but I would rather get a handle on how to approach this before I start making tables of sample results etc).

Cheers!
Split_Pin

0
Commented:
Let's say Split_Pin has student_id=1, Mary has student_id=2

Find the questions Split_Pin answered correctly:
``````select question_id from student_answers where student_id=1 and answer_status='Correct'
``````

To slightly simplify the next SQLs, let's make a view for correctly answered questions

``````create view correctly_answered_questions as select * from student_answers where answer_status='Correct';
``````

So to get who has answered all that Split_Pin has answered, plus some:

``````select student_id, count(question_id)
where question_id in (select question_id from correctly_answered_questions where student_id=1)
group by student_id
having count(question_id) > (select count(question_id) from correctly_answered_questions where student_id=1)
order by count(question_id) desc
``````

To list the questions answered correctly by Mary but not Split_Pin:
``````select question_id
where student_id = 2
and question_id not in (select question_id from correctly_answered_questions where student_id=1)
``````
(alternatively you can have a left outer join handle this)

With the above, you can first list who has more correct answers, then use the student_id from the first SQL to find the specific questions answered by the other student but not you.  It is also possible to list everything in one go - but that requires more complex SQL - I'll see if I can post that later when I have time.

0

Experts Exchange Solution brought to you by

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Author Commented:
Hi johanntagle

A series of views works for me as I find it easier to follow the logic - most importantly, from your feedback I can see it is indeed feasible. No doubt this will involve much jiggery pokery along the way as I move this to a demo.

Let's call this case closed for now - thank you again!

0
Commented:
Just want to correct myself. The SQL to determine who has answered all that Split_Pin has answered, plus some should be:

``````select a.student_id, count(a.question_id)
(select student_id, count(question_id)
where question_id in (select question_id from correctly_answered_questions where student_id=1)
group by student_id
having count(question_id) = (select count(question_id) from correctly_answered_questions where student_id=1)) b
where a.student_id=b.student_id
group by a.student_id
having count(a.question_id) > (select count(question_id) from correctly_answered_questions where student_id=1)
order by count(a.question_id) desc
``````

You may want to create another view for "select count(question_id) from..." to further simplify the above SQL
0
Author Commented:
This is great JohannTagle -  thanks for following up. I'm sure I will be poring over your posts in days to come!

Fundamentally I just wanted to get my head around a possible approach and you have got me off to a flying start. I really like to get in and play around with sample data and in doing so often find that my approach and requirements change along the way - but thanks to you I have a direction.

Cheers!
Split_Pin
0
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Microsoft SQL Server 2008

From novice to tech pro — start learning today.