Solved

Add mean and median columns to mysql table

Posted on 2011-02-21
23
559 Views
Last Modified: 2012-05-11
I have a mysql table with the fields -> name, rating

I want to add another column to the table which shows the mean and median value of all ratings. For example, let us say I have the following 5 records

Name           Date                 Rating

John            2011-02-20          5
John            2011-02-19          4
Ben             2011-02-19          3
John            2011-02-13          4
Ben             2011-02-11          3

I want a table like this

Name           Date                 Rating            Mean Rating          Median

John            2011-02-20          5                       4                         1
John            2011-02-19          4                       4                         0  
Ben             2011-02-19          3                       3                          0  
John            2011-02-13          4                       4                          0
Ben             2011-02-11          3                       3                          0

where mean = average rating
median = mean-current rating

Can anyone please help me with this? It is urgent.
0
Comment
Question by:imperialguy
  • 13
  • 9
23 Comments
 
LVL 27

Expert Comment

by:Lukasz Chmielewski
ID: 34942845
This would be for the mean (replace q1 with your table name and mean_rating field with yours)

UPDATE q1 x
  JOIN (SELECT t1.name
             ,AVG(t2.rating) av 
          FROM q1 t1 
          JOIN q1 t2 
            ON t2.name = t1.name GROUP BY t1.name
        ) y
     ON y.name = x.name
    SET x.mean_rating = av

Open in new window

0
 
LVL 27

Expert Comment

by:Lukasz Chmielewski
ID: 34942883
And the whole I guess

UPDATE q1 x
  JOIN (SELECT t1.name
             ,AVG(t2.rating) av 
          FROM q1 t1 
          JOIN q1 t2 
            ON t2.name = t1.name GROUP BY t1.name
        ) y
     ON y.name = x.name
    SET x.mean_rating = av,
    x.median = x.rating - av

Open in new window

0
 

Author Comment

by:imperialguy
ID: 34946274
Thanks for the solution. It works great for calculating the mean. But, according to definition, the median of an ordered list is the middle value if the number of elements is odd.  If the number of elements is even, then the median is the average of the two middle values. So, that means I have to order the ratings first and pick the middle value. Could you please guide me on how to do this ?
0
Microsoft Certification Exam 74-409

Veeam® is happy to provide the Microsoft community with a study guide prepared by MVP and MCT, Orin Thomas. This guide will take you through each of the exam objectives, helping you to prepare for and pass the examination.

 
LVL 27

Expert Comment

by:Lukasz Chmielewski
ID: 34947072
I guess this would not be possible with single query. Do you need only MySQL query ? Can this be done with a little help of PHP ?
0
 

Author Comment

by:imperialguy
ID: 34947231
I strongly prefer if it can be done with mysql queries (it need not necessary be one single query, it can be a set of queries). But, if you think php is a must, then it is fine.
0
 

Author Comment

by:imperialguy
ID: 34947247
Either way (with or without php), please let me know if you have a solution.
0
 
LVL 27

Expert Comment

by:Lukasz Chmielewski
ID: 34947454
It is easier for me to construct a php function which updates the table with a median value rather then querying.
0
 

Author Comment

by:imperialguy
ID: 34947470
Sure. That is fine. Could you please construct that function for me?

Also, for the mean, I want it based on both first and last names, because first names can be the same for two or more people. Could you tell me how do I achieve the same thing based on both first and last names together ?
0
 

Author Comment

by:imperialguy
ID: 34947543
i am thinking something like concat first and last names and then use them in the sql query you gave. but, i don't know how exactly to do that.
0
 
LVL 27

Expert Comment

by:Lukasz Chmielewski
ID: 34947564
I'm working on it, but have to get back to you in the morning because it's terribly late here
If you want to work with concat, use CONCAT_WS(" ",name, surname)
0
 

Author Comment

by:imperialguy
ID: 34947619
Okay sure. I appreciate it. Please do get back to me. I am not exactly sure of how to use the concat feature, but here is what I wrote:

UPDATE q1 x
  JOIN (SELECT concat_ws(t1.candidate_last_name, t1.candidate_first_name) as t1.name
             ,AVG(t2.candidate_rating) av 
          FROM q1 t1 
          JOIN q1 t2 
            ON concat(t2.candidate_last_name, t2.candidate_first_name) as t2.name = t1.name
        ) y

Open in new window


It didn't work. Any thoughts?
0
 
LVL 27

Expert Comment

by:Lukasz Chmielewski
ID: 34949376
This would do mean rating for the name and surname

UPDATE q1 x
  JOIN (SELECT CONCAT_WS(" ",t1.name,t1.surname) as person
             ,AVG(t2.rating) av 
          FROM q1 t1 
          JOIN q1 t2 
            ON CONCAT_WS(" ",t2.name,t2.surname) = CONCAT_WS(" ",t1.name,t1.surname) GROUP BY CONCAT_WS(" ",t2.name,t2.surname)
        ) y
     ON y.person = CONCAT_WS(" ",x.name,x.surname)
    SET x.mean_rating = av

Open in new window

0
 

Author Comment

by:imperialguy
ID: 34949392
Gotcha. That sure works. I am still facing troubles with finding the median. Any ideas?
0
 

Author Comment

by:imperialguy
ID: 34949399
This link gives some solution, but it is not clear enough for me.
0
 
LVL 27

Expert Comment

by:Lukasz Chmielewski
ID: 34949431
Did you actually try it ?
0
 

Author Comment

by:imperialguy
ID: 34949435
No, I did not understand it properly because I am new to mysql. So, I couldn't try that website's solution.
0
 
LVL 27

Accepted Solution

by:
Lukasz Chmielewski earned 500 total points
ID: 34950004
Well, this should work - a PHP code - it updates the table q1 with a median value

$q = "select CONCAT_WS(' ',name,surname) as person, rating from q1 group by person";
$r = mysql_query($q);
while($w = mysql_fetch_array($r))
{
    echo"$w[person] - ";

    $q1 = "select * from q1 where CONCAT_WS(' ',name,surname) = '".$w[person]."'";
    $r1 = mysql_query($q1);
    $nr = mysql_num_rows($r1);
    $median_row = ceil($nr/2)-1;
    //echo"$median_row";
    
    // prevent the situation where median is limited with 2 in the query

    
    if($nr % 2 == 0){ // even number of rows
        if($nr == 1) $limit = 1;
        else $limit = 2;
        $q2 = "select sum(rating)/2 as median from q1 where CONCAT_WS(' ',name,surname) = '".$w[person]."' LIMIT ".$median_row.",".$limit;
    }
    else{ // odd number
        if($nr == 1) $limit = 1;
        else $limit = 1;
        $q2 = "select rating as median from q1 where CONCAT_WS(' ',name,surname) = '".$w[person]."' LIMIT ".$median_row.",".$limit;    
    }    
    
    $r2 = mysql_query($q2);
    $w2 = mysql_fetch_array($r2);
    
    $median = $w2['median'];
    $q3 = "update q1 set median = $w2[median] where CONCAT_WS(' ',name,surname) = '".$w[person]."'";
    $r3 = mysql_query($q3);
    
}

Open in new window

0
 
LVL 2

Expert Comment

by:ASta
ID: 34952650
May be add view (or table updated by trigger) for "mean rating" and "median" because it's data duplication?
0
 

Author Comment

by:imperialguy
ID: 34955070
Thank you Roads_Roads

if($nr % 2 == 0){ // even number of rows
        if($nr == 1) $limit = 1;
        else $limit = 2;
        $q2 = "select sum(rating)/2 as median from q1 where CONCAT_WS(' ',name,surname) = '".$w[person]."' LIMIT ".$median_row.",".$limit;
    }
    else{ // odd number
        if($nr == 1) $limit = 1;
        else $limit = 1;
        $q2 = "select rating as median from q1 where CONCAT_WS(' ',name,surname) = '".$w[person]."' LIMIT ".$median_row.",".$limit;    
    }

Open in new window


I did not understand how exactly did you calculate the median in the above lines. Let us say we have two series of numbers:

1.) Series 1 -> 2 ,4, 5, 6, 8 (even)
2.) Series 2 -> 2, 4, 5, 6 (odd)

For even series, your code sums up all the ratings and then divides it by 2. I don't understand why is this done. For odd series, the code picks a rating based on the median_row and the limit which makes more sense, because median is all about picking the number and not summing up.

Could you please explain what is the median in each of the above series according to your code and why?
0
 

Author Comment

by:imperialguy
ID: 34955805
Also, I don't see where is the sorting of elements done?
0
 

Assisted Solution

by:imperialguy
imperialguy earned 0 total points
ID: 34957065
I made some changes to your code. It works fine now. Here is the modified code:

$sqlstr = 'select candidate_name, candidate_rating from `'.$db_name.'`.`'.$db_table.'` WHERE candidate_name IS NOT NULL AND candidate_name<>"" AND candidate_rating IS NOT NULL AND candidate_rating<>""  GROUP BY candidate_name;';
$result = mysql_query($sqlstr);
if($result == false){
	die('Query error: '.$sqlstr);
}
while($w = mysql_fetch_array($result))
{	
    $sqlstr = 'select * from `'.$db_name.'`.`'.$db_table.'` where candidate_name = "'.$w["candidate_name"].'" AND candidate_rating is not null;';
    $r1 = mysql_query($sqlstr);
    if($r1 == false){
	die('Query error: '.$sqlstr);
    }
    $nr = mysql_num_rows($r1);
    $median_row = ceil($nr/2)-1;        
    
    // prevent the situation where median is limited with 2 in the query
    
    if($nr % 2 == 0) { // even number of rows
        if($nr == 1) {
		$limit = 1;
	}
	else {
        	$limit = 2;
	}
        $sqlstr = 'select avg(candidate_rating) as median from '.
        	  '(select candidate_rating from`'.$db_name.'`.`'.$db_table.'` WHERE candidate_name = "'.$w["candidate_name"].'" AND candidate_rating is not null ORDER BY candidate_rating LIMIT '.$median_row.','.$limit.') x_tbl;';
    }
    else{ // odd number
        if($nr == 1) {
		$limit = 1;
	}
        else {
		$limit = 1;
	}
        $sqlstr = 'select candidate_rating as median from `'.$db_name.'`.`'.$db_table.'` where candidate_name = "'.$w["candidate_name"].'" AND candidate_rating is not null ORDER BY candidate_rating LIMIT '.$median_row.','.$limit.';';
    }    
    
    $r2 = mysql_query($sqlstr);
    if($r2 == false){
	die('Query error: '.$sqlstr);
    }
    $w2 = mysql_fetch_array($r2);    
    
    $median = $w2["median"];
    $sqlstr = 'update `'.$db_name.'`.`'.$db_table.'` set candidate_median_rating = "'.$median.'" where candidate_name = "'.$w["candidate_name"].'";';
    $r3 = mysql_query($sqlstr);
    if($r3 == false){
	die('Query error: '.$sqlstr);
    }	    
}

Open in new window


I removed the concat function and instead concatenated the first and last names in the database itself. It helps. I think this works good. The only issue is if $median is a double value and I try to insert it into the table using UPDATE SET through, then it gets inserted as 0.99 always. I don't know why. So, I just converted median into varchar in mysql tables and then updated them. It worked. I appreciate your help a lot.
0
 
LVL 27

Expert Comment

by:Lukasz Chmielewski
ID: 34958430
Sorry for so long time without response, I was away - if you need additional help or explanation I will be glad to help.
0
 

Author Closing Comment

by:imperialguy
ID: 35126353
Made a few changes to the provided solution to adapt to my needs.
0

Featured Post

PRTG Network Monitor: Intuitive Network Monitoring

Network Monitoring is essential to ensure that computer systems and network devices are running. Use PRTG to monitor LANs, servers, websites, applications and devices, bandwidth, virtual environments, remote systems, IoT, and many more. PRTG is easy to set up & use.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Introduction In this article, I will by showing a nice little trick for MySQL similar to that of my previous EE Article for SQLite (http://www.sqlite.org/), A SQLite Tidbit: Quick Numbers Table Generation (http://www.experts-exchange.com/A_3570.htm…
Introduction Since I wrote the original article about Handling Date and Time in PHP and MySQL (http://www.experts-exchange.com/articles/201/Handling-Date-and-Time-in-PHP-and-MySQL.html) several years ago, it seemed like now was a good time to updat…
This video shows how to use Hyena, from SystemTools Software, to bulk import 100 user accounts from an external text file. View in 1080p for best video quality.
Established in 1997, Technology Architects has become one of the most reputable technology solutions companies in the country. TA have been providing businesses with cost effective state-of-the-art solutions and unparalleled service that is designed…

832 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question