Solved

Add mean and median columns to mysql table

Posted on 2011-02-21
23
547 Views
Last Modified: 2012-05-11
I have a mysql table with the fields -> name, rating

I want to add another column to the table which shows the mean and median value of all ratings. For example, let us say I have the following 5 records

Name           Date                 Rating

John            2011-02-20          5
John            2011-02-19          4
Ben             2011-02-19          3
John            2011-02-13          4
Ben             2011-02-11          3

I want a table like this

Name           Date                 Rating            Mean Rating          Median

John            2011-02-20          5                       4                         1
John            2011-02-19          4                       4                         0  
Ben             2011-02-19          3                       3                          0  
John            2011-02-13          4                       4                          0
Ben             2011-02-11          3                       3                          0

where mean = average rating
median = mean-current rating

Can anyone please help me with this? It is urgent.
0
Comment
Question by:imperialguy
  • 13
  • 9
23 Comments
 
LVL 27

Expert Comment

by:Lukasz Chmielewski
Comment Utility
This would be for the mean (replace q1 with your table name and mean_rating field with yours)

UPDATE q1 x
  JOIN (SELECT t1.name
             ,AVG(t2.rating) av 
          FROM q1 t1 
          JOIN q1 t2 
            ON t2.name = t1.name GROUP BY t1.name
        ) y
     ON y.name = x.name
    SET x.mean_rating = av

Open in new window

0
 
LVL 27

Expert Comment

by:Lukasz Chmielewski
Comment Utility
And the whole I guess

UPDATE q1 x
  JOIN (SELECT t1.name
             ,AVG(t2.rating) av 
          FROM q1 t1 
          JOIN q1 t2 
            ON t2.name = t1.name GROUP BY t1.name
        ) y
     ON y.name = x.name
    SET x.mean_rating = av,
    x.median = x.rating - av

Open in new window

0
 

Author Comment

by:imperialguy
Comment Utility
Thanks for the solution. It works great for calculating the mean. But, according to definition, the median of an ordered list is the middle value if the number of elements is odd.  If the number of elements is even, then the median is the average of the two middle values. So, that means I have to order the ratings first and pick the middle value. Could you please guide me on how to do this ?
0
 
LVL 27

Expert Comment

by:Lukasz Chmielewski
Comment Utility
I guess this would not be possible with single query. Do you need only MySQL query ? Can this be done with a little help of PHP ?
0
 

Author Comment

by:imperialguy
Comment Utility
I strongly prefer if it can be done with mysql queries (it need not necessary be one single query, it can be a set of queries). But, if you think php is a must, then it is fine.
0
 

Author Comment

by:imperialguy
Comment Utility
Either way (with or without php), please let me know if you have a solution.
0
 
LVL 27

Expert Comment

by:Lukasz Chmielewski
Comment Utility
It is easier for me to construct a php function which updates the table with a median value rather then querying.
0
 

Author Comment

by:imperialguy
Comment Utility
Sure. That is fine. Could you please construct that function for me?

Also, for the mean, I want it based on both first and last names, because first names can be the same for two or more people. Could you tell me how do I achieve the same thing based on both first and last names together ?
0
 

Author Comment

by:imperialguy
Comment Utility
i am thinking something like concat first and last names and then use them in the sql query you gave. but, i don't know how exactly to do that.
0
 
LVL 27

Expert Comment

by:Lukasz Chmielewski
Comment Utility
I'm working on it, but have to get back to you in the morning because it's terribly late here
If you want to work with concat, use CONCAT_WS(" ",name, surname)
0
 

Author Comment

by:imperialguy
Comment Utility
Okay sure. I appreciate it. Please do get back to me. I am not exactly sure of how to use the concat feature, but here is what I wrote:

UPDATE q1 x
  JOIN (SELECT concat_ws(t1.candidate_last_name, t1.candidate_first_name) as t1.name
             ,AVG(t2.candidate_rating) av 
          FROM q1 t1 
          JOIN q1 t2 
            ON concat(t2.candidate_last_name, t2.candidate_first_name) as t2.name = t1.name
        ) y

Open in new window


It didn't work. Any thoughts?
0
What is SQL Server and how does it work?

The purpose of this paper is to provide you background on SQL Server. It’s your self-study guide for learning fundamentals. It includes both the history of SQL and its technical basics. Concepts and definitions will form the solid foundation of your future DBA expertise.

 
LVL 27

Expert Comment

by:Lukasz Chmielewski
Comment Utility
This would do mean rating for the name and surname

UPDATE q1 x
  JOIN (SELECT CONCAT_WS(" ",t1.name,t1.surname) as person
             ,AVG(t2.rating) av 
          FROM q1 t1 
          JOIN q1 t2 
            ON CONCAT_WS(" ",t2.name,t2.surname) = CONCAT_WS(" ",t1.name,t1.surname) GROUP BY CONCAT_WS(" ",t2.name,t2.surname)
        ) y
     ON y.person = CONCAT_WS(" ",x.name,x.surname)
    SET x.mean_rating = av

Open in new window

0
 

Author Comment

by:imperialguy
Comment Utility
Gotcha. That sure works. I am still facing troubles with finding the median. Any ideas?
0
 

Author Comment

by:imperialguy
Comment Utility
This link gives some solution, but it is not clear enough for me.
0
 
LVL 27

Expert Comment

by:Lukasz Chmielewski
Comment Utility
Did you actually try it ?
0
 

Author Comment

by:imperialguy
Comment Utility
No, I did not understand it properly because I am new to mysql. So, I couldn't try that website's solution.
0
 
LVL 27

Accepted Solution

by:
Lukasz Chmielewski earned 500 total points
Comment Utility
Well, this should work - a PHP code - it updates the table q1 with a median value

$q = "select CONCAT_WS(' ',name,surname) as person, rating from q1 group by person";
$r = mysql_query($q);
while($w = mysql_fetch_array($r))
{
    echo"$w[person] - ";

    $q1 = "select * from q1 where CONCAT_WS(' ',name,surname) = '".$w[person]."'";
    $r1 = mysql_query($q1);
    $nr = mysql_num_rows($r1);
    $median_row = ceil($nr/2)-1;
    //echo"$median_row";
    
    // prevent the situation where median is limited with 2 in the query

    
    if($nr % 2 == 0){ // even number of rows
        if($nr == 1) $limit = 1;
        else $limit = 2;
        $q2 = "select sum(rating)/2 as median from q1 where CONCAT_WS(' ',name,surname) = '".$w[person]."' LIMIT ".$median_row.",".$limit;
    }
    else{ // odd number
        if($nr == 1) $limit = 1;
        else $limit = 1;
        $q2 = "select rating as median from q1 where CONCAT_WS(' ',name,surname) = '".$w[person]."' LIMIT ".$median_row.",".$limit;    
    }    
    
    $r2 = mysql_query($q2);
    $w2 = mysql_fetch_array($r2);
    
    $median = $w2['median'];
    $q3 = "update q1 set median = $w2[median] where CONCAT_WS(' ',name,surname) = '".$w[person]."'";
    $r3 = mysql_query($q3);
    
}

Open in new window

0
 
LVL 2

Expert Comment

by:ASta
Comment Utility
May be add view (or table updated by trigger) for "mean rating" and "median" because it's data duplication?
0
 

Author Comment

by:imperialguy
Comment Utility
Thank you Roads_Roads

if($nr % 2 == 0){ // even number of rows
        if($nr == 1) $limit = 1;
        else $limit = 2;
        $q2 = "select sum(rating)/2 as median from q1 where CONCAT_WS(' ',name,surname) = '".$w[person]."' LIMIT ".$median_row.",".$limit;
    }
    else{ // odd number
        if($nr == 1) $limit = 1;
        else $limit = 1;
        $q2 = "select rating as median from q1 where CONCAT_WS(' ',name,surname) = '".$w[person]."' LIMIT ".$median_row.",".$limit;    
    }

Open in new window


I did not understand how exactly did you calculate the median in the above lines. Let us say we have two series of numbers:

1.) Series 1 -> 2 ,4, 5, 6, 8 (even)
2.) Series 2 -> 2, 4, 5, 6 (odd)

For even series, your code sums up all the ratings and then divides it by 2. I don't understand why is this done. For odd series, the code picks a rating based on the median_row and the limit which makes more sense, because median is all about picking the number and not summing up.

Could you please explain what is the median in each of the above series according to your code and why?
0
 

Author Comment

by:imperialguy
Comment Utility
Also, I don't see where is the sorting of elements done?
0
 

Assisted Solution

by:imperialguy
imperialguy earned 0 total points
Comment Utility
I made some changes to your code. It works fine now. Here is the modified code:

$sqlstr = 'select candidate_name, candidate_rating from `'.$db_name.'`.`'.$db_table.'` WHERE candidate_name IS NOT NULL AND candidate_name<>"" AND candidate_rating IS NOT NULL AND candidate_rating<>""  GROUP BY candidate_name;';
$result = mysql_query($sqlstr);
if($result == false){
	die('Query error: '.$sqlstr);
}
while($w = mysql_fetch_array($result))
{	
    $sqlstr = 'select * from `'.$db_name.'`.`'.$db_table.'` where candidate_name = "'.$w["candidate_name"].'" AND candidate_rating is not null;';
    $r1 = mysql_query($sqlstr);
    if($r1 == false){
	die('Query error: '.$sqlstr);
    }
    $nr = mysql_num_rows($r1);
    $median_row = ceil($nr/2)-1;        
    
    // prevent the situation where median is limited with 2 in the query
    
    if($nr % 2 == 0) { // even number of rows
        if($nr == 1) {
		$limit = 1;
	}
	else {
        	$limit = 2;
	}
        $sqlstr = 'select avg(candidate_rating) as median from '.
        	  '(select candidate_rating from`'.$db_name.'`.`'.$db_table.'` WHERE candidate_name = "'.$w["candidate_name"].'" AND candidate_rating is not null ORDER BY candidate_rating LIMIT '.$median_row.','.$limit.') x_tbl;';
    }
    else{ // odd number
        if($nr == 1) {
		$limit = 1;
	}
        else {
		$limit = 1;
	}
        $sqlstr = 'select candidate_rating as median from `'.$db_name.'`.`'.$db_table.'` where candidate_name = "'.$w["candidate_name"].'" AND candidate_rating is not null ORDER BY candidate_rating LIMIT '.$median_row.','.$limit.';';
    }    
    
    $r2 = mysql_query($sqlstr);
    if($r2 == false){
	die('Query error: '.$sqlstr);
    }
    $w2 = mysql_fetch_array($r2);    
    
    $median = $w2["median"];
    $sqlstr = 'update `'.$db_name.'`.`'.$db_table.'` set candidate_median_rating = "'.$median.'" where candidate_name = "'.$w["candidate_name"].'";';
    $r3 = mysql_query($sqlstr);
    if($r3 == false){
	die('Query error: '.$sqlstr);
    }	    
}

Open in new window


I removed the concat function and instead concatenated the first and last names in the database itself. It helps. I think this works good. The only issue is if $median is a double value and I try to insert it into the table using UPDATE SET through, then it gets inserted as 0.99 always. I don't know why. So, I just converted median into varchar in mysql tables and then updated them. It worked. I appreciate your help a lot.
0
 
LVL 27

Expert Comment

by:Lukasz Chmielewski
Comment Utility
Sorry for so long time without response, I was away - if you need additional help or explanation I will be glad to help.
0
 

Author Closing Comment

by:imperialguy
Comment Utility
Made a few changes to the provided solution to adapt to my needs.
0

Featured Post

Complete Microsoft Windows PC® & Mac Backup

Backup and recovery solutions to protect all your PCs & Mac– on-premises or in remote locations. Acronis backs up entire PC or Mac with patented reliable disk imaging technology and you will be able to restore workstations to a new, dissimilar hardware in minutes.

Join & Write a Comment

A lot of articles have been written on splitting mysqldump and grabbing the required tables. A long while back, when Shlomi (http://code.openark.org/blog/mysql/on-restoring-a-single-table-from-mysqldump) had suggested a “sed” way, I actually shell …
Introduction Since I wrote the original article about Handling Date and Time in PHP and MySQL (http://www.experts-exchange.com/articles/201/Handling-Date-and-Time-in-PHP-and-MySQL.html) several years ago, it seemed like now was a good time to updat…
Access reports are powerful and flexible. Learn how to create a query and then a grouped report using the wizard. Modify the report design after the wizard is done to make it look better. There will be another video to explain how to put the final p…
In this tutorial you'll learn about bandwidth monitoring with flows and packet sniffing with our network monitoring solution PRTG Network Monitor (https://www.paessler.com/prtg). If you're interested in additional methods for monitoring bandwidt…

743 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

16 Experts available now in Live!

Get 1:1 Help Now