Want to protect your cyber security and still get fast solutions? Ask a secure question today.Go Premium

x
  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 577
  • Last Modified:

Add mean and median columns to mysql table

I have a mysql table with the fields -> name, rating

I want to add another column to the table which shows the mean and median value of all ratings. For example, let us say I have the following 5 records

Name           Date                 Rating

John            2011-02-20          5
John            2011-02-19          4
Ben             2011-02-19          3
John            2011-02-13          4
Ben             2011-02-11          3

I want a table like this

Name           Date                 Rating            Mean Rating          Median

John            2011-02-20          5                       4                         1
John            2011-02-19          4                       4                         0  
Ben             2011-02-19          3                       3                          0  
John            2011-02-13          4                       4                          0
Ben             2011-02-11          3                       3                          0

where mean = average rating
median = mean-current rating

Can anyone please help me with this? It is urgent.
0
imperialguy
Asked:
imperialguy
  • 13
  • 9
2 Solutions
 
Lukasz ChmielewskiCommented:
This would be for the mean (replace q1 with your table name and mean_rating field with yours)

UPDATE q1 x
  JOIN (SELECT t1.name
             ,AVG(t2.rating) av 
          FROM q1 t1 
          JOIN q1 t2 
            ON t2.name = t1.name GROUP BY t1.name
        ) y
     ON y.name = x.name
    SET x.mean_rating = av

Open in new window

0
 
Lukasz ChmielewskiCommented:
And the whole I guess

UPDATE q1 x
  JOIN (SELECT t1.name
             ,AVG(t2.rating) av 
          FROM q1 t1 
          JOIN q1 t2 
            ON t2.name = t1.name GROUP BY t1.name
        ) y
     ON y.name = x.name
    SET x.mean_rating = av,
    x.median = x.rating - av

Open in new window

0
 
imperialguyAuthor Commented:
Thanks for the solution. It works great for calculating the mean. But, according to definition, the median of an ordered list is the middle value if the number of elements is odd.  If the number of elements is even, then the median is the average of the two middle values. So, that means I have to order the ratings first and pick the middle value. Could you please guide me on how to do this ?
0
Keep up with what's happening at Experts Exchange!

Sign up to receive Decoded, a new monthly digest with product updates, feature release info, continuing education opportunities, and more.

 
Lukasz ChmielewskiCommented:
I guess this would not be possible with single query. Do you need only MySQL query ? Can this be done with a little help of PHP ?
0
 
imperialguyAuthor Commented:
I strongly prefer if it can be done with mysql queries (it need not necessary be one single query, it can be a set of queries). But, if you think php is a must, then it is fine.
0
 
imperialguyAuthor Commented:
Either way (with or without php), please let me know if you have a solution.
0
 
Lukasz ChmielewskiCommented:
It is easier for me to construct a php function which updates the table with a median value rather then querying.
0
 
imperialguyAuthor Commented:
Sure. That is fine. Could you please construct that function for me?

Also, for the mean, I want it based on both first and last names, because first names can be the same for two or more people. Could you tell me how do I achieve the same thing based on both first and last names together ?
0
 
imperialguyAuthor Commented:
i am thinking something like concat first and last names and then use them in the sql query you gave. but, i don't know how exactly to do that.
0
 
Lukasz ChmielewskiCommented:
I'm working on it, but have to get back to you in the morning because it's terribly late here
If you want to work with concat, use CONCAT_WS(" ",name, surname)
0
 
imperialguyAuthor Commented:
Okay sure. I appreciate it. Please do get back to me. I am not exactly sure of how to use the concat feature, but here is what I wrote:

UPDATE q1 x
  JOIN (SELECT concat_ws(t1.candidate_last_name, t1.candidate_first_name) as t1.name
             ,AVG(t2.candidate_rating) av 
          FROM q1 t1 
          JOIN q1 t2 
            ON concat(t2.candidate_last_name, t2.candidate_first_name) as t2.name = t1.name
        ) y

Open in new window


It didn't work. Any thoughts?
0
 
Lukasz ChmielewskiCommented:
This would do mean rating for the name and surname

UPDATE q1 x
  JOIN (SELECT CONCAT_WS(" ",t1.name,t1.surname) as person
             ,AVG(t2.rating) av 
          FROM q1 t1 
          JOIN q1 t2 
            ON CONCAT_WS(" ",t2.name,t2.surname) = CONCAT_WS(" ",t1.name,t1.surname) GROUP BY CONCAT_WS(" ",t2.name,t2.surname)
        ) y
     ON y.person = CONCAT_WS(" ",x.name,x.surname)
    SET x.mean_rating = av

Open in new window

0
 
imperialguyAuthor Commented:
Gotcha. That sure works. I am still facing troubles with finding the median. Any ideas?
0
 
imperialguyAuthor Commented:
This link gives some solution, but it is not clear enough for me.
0
 
Lukasz ChmielewskiCommented:
Did you actually try it ?
0
 
imperialguyAuthor Commented:
No, I did not understand it properly because I am new to mysql. So, I couldn't try that website's solution.
0
 
Lukasz ChmielewskiCommented:
Well, this should work - a PHP code - it updates the table q1 with a median value

$q = "select CONCAT_WS(' ',name,surname) as person, rating from q1 group by person";
$r = mysql_query($q);
while($w = mysql_fetch_array($r))
{
    echo"$w[person] - ";

    $q1 = "select * from q1 where CONCAT_WS(' ',name,surname) = '".$w[person]."'";
    $r1 = mysql_query($q1);
    $nr = mysql_num_rows($r1);
    $median_row = ceil($nr/2)-1;
    //echo"$median_row";
    
    // prevent the situation where median is limited with 2 in the query

    
    if($nr % 2 == 0){ // even number of rows
        if($nr == 1) $limit = 1;
        else $limit = 2;
        $q2 = "select sum(rating)/2 as median from q1 where CONCAT_WS(' ',name,surname) = '".$w[person]."' LIMIT ".$median_row.",".$limit;
    }
    else{ // odd number
        if($nr == 1) $limit = 1;
        else $limit = 1;
        $q2 = "select rating as median from q1 where CONCAT_WS(' ',name,surname) = '".$w[person]."' LIMIT ".$median_row.",".$limit;    
    }    
    
    $r2 = mysql_query($q2);
    $w2 = mysql_fetch_array($r2);
    
    $median = $w2['median'];
    $q3 = "update q1 set median = $w2[median] where CONCAT_WS(' ',name,surname) = '".$w[person]."'";
    $r3 = mysql_query($q3);
    
}

Open in new window

0
 
AStaCommented:
May be add view (or table updated by trigger) for "mean rating" and "median" because it's data duplication?
0
 
imperialguyAuthor Commented:
Thank you Roads_Roads

if($nr % 2 == 0){ // even number of rows
        if($nr == 1) $limit = 1;
        else $limit = 2;
        $q2 = "select sum(rating)/2 as median from q1 where CONCAT_WS(' ',name,surname) = '".$w[person]."' LIMIT ".$median_row.",".$limit;
    }
    else{ // odd number
        if($nr == 1) $limit = 1;
        else $limit = 1;
        $q2 = "select rating as median from q1 where CONCAT_WS(' ',name,surname) = '".$w[person]."' LIMIT ".$median_row.",".$limit;    
    }

Open in new window


I did not understand how exactly did you calculate the median in the above lines. Let us say we have two series of numbers:

1.) Series 1 -> 2 ,4, 5, 6, 8 (even)
2.) Series 2 -> 2, 4, 5, 6 (odd)

For even series, your code sums up all the ratings and then divides it by 2. I don't understand why is this done. For odd series, the code picks a rating based on the median_row and the limit which makes more sense, because median is all about picking the number and not summing up.

Could you please explain what is the median in each of the above series according to your code and why?
0
 
imperialguyAuthor Commented:
Also, I don't see where is the sorting of elements done?
0
 
imperialguyAuthor Commented:
I made some changes to your code. It works fine now. Here is the modified code:

$sqlstr = 'select candidate_name, candidate_rating from `'.$db_name.'`.`'.$db_table.'` WHERE candidate_name IS NOT NULL AND candidate_name<>"" AND candidate_rating IS NOT NULL AND candidate_rating<>""  GROUP BY candidate_name;';
$result = mysql_query($sqlstr);
if($result == false){
	die('Query error: '.$sqlstr);
}
while($w = mysql_fetch_array($result))
{	
    $sqlstr = 'select * from `'.$db_name.'`.`'.$db_table.'` where candidate_name = "'.$w["candidate_name"].'" AND candidate_rating is not null;';
    $r1 = mysql_query($sqlstr);
    if($r1 == false){
	die('Query error: '.$sqlstr);
    }
    $nr = mysql_num_rows($r1);
    $median_row = ceil($nr/2)-1;        
    
    // prevent the situation where median is limited with 2 in the query
    
    if($nr % 2 == 0) { // even number of rows
        if($nr == 1) {
		$limit = 1;
	}
	else {
        	$limit = 2;
	}
        $sqlstr = 'select avg(candidate_rating) as median from '.
        	  '(select candidate_rating from`'.$db_name.'`.`'.$db_table.'` WHERE candidate_name = "'.$w["candidate_name"].'" AND candidate_rating is not null ORDER BY candidate_rating LIMIT '.$median_row.','.$limit.') x_tbl;';
    }
    else{ // odd number
        if($nr == 1) {
		$limit = 1;
	}
        else {
		$limit = 1;
	}
        $sqlstr = 'select candidate_rating as median from `'.$db_name.'`.`'.$db_table.'` where candidate_name = "'.$w["candidate_name"].'" AND candidate_rating is not null ORDER BY candidate_rating LIMIT '.$median_row.','.$limit.';';
    }    
    
    $r2 = mysql_query($sqlstr);
    if($r2 == false){
	die('Query error: '.$sqlstr);
    }
    $w2 = mysql_fetch_array($r2);    
    
    $median = $w2["median"];
    $sqlstr = 'update `'.$db_name.'`.`'.$db_table.'` set candidate_median_rating = "'.$median.'" where candidate_name = "'.$w["candidate_name"].'";';
    $r3 = mysql_query($sqlstr);
    if($r3 == false){
	die('Query error: '.$sqlstr);
    }	    
}

Open in new window


I removed the concat function and instead concatenated the first and last names in the database itself. It helps. I think this works good. The only issue is if $median is a double value and I try to insert it into the table using UPDATE SET through, then it gets inserted as 0.99 always. I don't know why. So, I just converted median into varchar in mysql tables and then updated them. It worked. I appreciate your help a lot.
0
 
Lukasz ChmielewskiCommented:
Sorry for so long time without response, I was away - if you need additional help or explanation I will be glad to help.
0
 
imperialguyAuthor Commented:
Made a few changes to the provided solution to adapt to my needs.
0

Featured Post

[Webinar] Database Backup and Recovery

Does your company store data on premises, off site, in the cloud, or a combination of these? If you answered “yes”, you need a data backup recovery plan that fits each and every platform. Watch now as as Percona teaches us how to build agile data backup recovery plan.

  • 13
  • 9
Tackle projects and never again get stuck behind a technical roadblock.
Join Now