Asked by northward
Data Mining and Analysing Data
I teach in a school where many students take different combinations of subjects over several years. What is the easiest way to mine their marks/grades?
For example, A, B and C are students and X, Y and Z are subjects.
A takes X and Y in 2011 and 2012
B takes Y and Z in 2011 and 2012
C takes X and Z in 2011 and 2012
My data is stored as (student, subject, year, mark):
A, X, 2011, m1
A, Y, 2011, m2
B, Y, 2011, m3
B, Z, 2011, m4
C, X, 2011, m5
C, Z, 2011, m6
A, X, 2012, m7
A, Y, 2012, m8
B, Y, 2012, m9
B, Z, 2012, m10
C, X, 2012, m11
C, Z, 2012, m12
In practice I have many more students and subjects, taken over several years.
I want to go through all the combinations systematically. Say I have n subjects: choose two subject-year pairs (e.g. X in 2011 and Y in 2012, or X in 2011 and X in 2012), check whether more than m students have marks for both (maybe m = 10), run a linear regression, and return the R-squared value along with the gradient and intercept of the fitted line.
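To make the procedure concrete, here is a minimal Python sketch of it, using only the standard library. It assumes the records are (student, subject, year, mark) tuples with numeric marks, and that a "point" is a student who has a mark in both of the chosen subject-year pairs. The names `fit_line` and `regress_all_pairs` are made up for illustration, and pairing every subject-year with every other one (in both directions) is one possible reading of "all the combinations".

```python
from collections import defaultdict
from itertools import product


def fit_line(xs, ys):
    """Least-squares fit of y = gradient * x + intercept.

    Returns (gradient, intercept, r_squared), or None when either
    variable has zero variance and the fit is degenerate.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        return None
    gradient = sxy / sxx
    intercept = mean_y - gradient * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return gradient, intercept, r_squared


def regress_all_pairs(records, min_points=10):
    """Regress every subject-year against every other subject-year.

    records: iterable of (student, subject, year, mark) tuples.
    A pair is kept only if MORE than min_points students have both marks.
    """
    # marks[(subject, year)][student] = mark
    marks = defaultdict(dict)
    for student, subject, year, mark in records:
        marks[(subject, year)][student] = mark

    results = {}
    keys = sorted(marks)
    for key_x, key_y in product(keys, repeat=2):
        if key_x == key_y:
            continue  # regressing a column on itself is not informative
        shared = sorted(marks[key_x].keys() & marks[key_y].keys())
        if len(shared) <= min_points:
            continue
        xs = [marks[key_x][s] for s in shared]
        ys = [marks[key_y][s] for s in shared]
        fit = fit_line(xs, ys)
        if fit is not None:
            results[(key_x, key_y)] = fit
    return results
```

Calling `regress_all_pairs(records, min_points=10)` returns a dictionary keyed by `((subject_x, year_x), (subject_y, year_y))`. Swapping x and y changes the gradient and intercept but not R-squared, so if only R-squared matters, half of the pairs could be skipped.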
The task may be interrupted and continued on another day.
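One way to make the run resumable is to write each finished result to disk straight away and skip anything already recorded on the next start. The sketch below does that with Python's built-in `sqlite3` module. The function name `run_resumable`, the table layout, and the idea of identifying each subject-year pair by a string such as "X-2011|Y-2012" are all assumptions for illustration, not part of any existing tool.

```python
import sqlite3


def run_resumable(db_path, tasks, compute):
    """Run compute(task) for every task not already stored in db_path.

    tasks: iterable of string identifiers, e.g. "X-2011|Y-2012" (hypothetical format).
    compute: function returning (gradient, intercept, r_squared), or None
             if the pair was skipped (e.g. too few points).
    Returns the number of tasks computed during this call.
    """
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS results ("
        "task TEXT PRIMARY KEY, gradient REAL, intercept REAL, r_squared REAL)"
    )
    # Everything already in the table was finished in an earlier run.
    done = {row[0] for row in conn.execute("SELECT task FROM results")}

    newly_computed = 0
    for task in tasks:
        if task in done:
            continue
        fit = compute(task)
        gradient, intercept, r_squared = fit if fit is not None else (None, None, None)
        conn.execute(
            "INSERT INTO results VALUES (?, ?, ?, ?)",
            (task, gradient, intercept, r_squared),
        )
        conn.commit()  # commit per task so an interruption loses at most one result
        newly_computed += 1

    conn.close()
    return newly_computed
```

Skipped pairs are stored with empty values so they are not retried. Because each task is independent and has its own key, the task list could also be split into disjoint slices and run on several machines, with the result tables merged afterwards.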
If it takes too long on one machine, I may decide to run it on a cloud.
What would you suggest I do?
Thanks.
ASKER
Thank you. :)