Link to home
Start Free TrialLog in
Avatar of friskee
friskee

asked on

Join on Median, Quartile and Average queries

I would like to combine my query below so that the results look like this:
Quality               Average Score         Quartile 1                  Quartile 3         Median
Promptness         10                          9                                  8                 9.5



SELECT Quality, [Average Score]
FROM
(
select  '1. Promptness'  as Quality, AVG(CASE WHEN prac_promptness NOT IN ('N/A', 'NA') THEN Convert(integer,prac_promptness) ELSE NULL END) as [Average Score]
from dbo.TBL_EDTL_PRACEVAL where prac_term = 'Fall Semester 2006'

) sub
WHERE NOT [Average Score] IS NULL
union
select '1. Promptness'  as Quality, sum(promptness)/count(*) as [Quartile 1]
from(
select top 25 Convert(integer,prac_promptness)as promptness
from dbo.TBL_EDTL_PRACEVAL
where prac_term = 'Fall Semester 2006' and prac_promptness not in ( 'N/A','NA')
order by Convert(integer,prac_promptness)
) as l

union

select '1. Promptness'  as Quality, sum(promptness)/count(*) as [Quartile3]
from(
select top 25 percent Convert(integer,prac_promptness)as promptness
from dbo.TBL_EDTL_PRACEVAL
where prac_term = 'Fall Semester 2006' and prac_promptness not in ( 'N/A','NA')
order by Convert(integer,prac_promptness)
) tt

union

declare @pos1 int <----- error messsage says Incorrect syntax near the keyword 'declare'
declare @pos2 int

declare @oneA float
declare @twoA float

select  @pos1 = (count(*)+1)/2, @pos2 = count(*)/2 + 1 from dbo.TBL_EDTL_PRACEVAL where prac_promptness not in ( 'N/A','NA')and IsNumeric(prac_comm_skills)=1
set rowcount @pos1
select @oneA = prac_promptness from dbo.TBL_EDTL_PRACEVAL where IsNumeric( prac_promptness)=1


set rowcount @pos2
select @twoA = prac_promptness from dbo.TBL_EDTL_PRACEVAL where IsNumeric( prac_promptness)=1
group by prac_promptness
set rowcount 0

select (@oneA + @twoA)/2.0 as Median

please help  thnx

Avatar of Brendt Hess
Brendt Hess
Flag of United States of America image

First - declare any vars before starting the SELECT statements.  Moving those to the top will correct the error you are receiving.

Second - there is no reason to include 1. Promptness as a constant in all of these queries.  

Third - once you start a SELECT statement, you can only include parts of one continuous statement in the SELECT.  The MEDIAN calculation would need to be performed prior to the SELECT statement designed to return your data.

Here is one way to do this entire thing in one SELECT (including the MEDIAN call).  I believe that Quartile1 and Quartile3 need definition.  For example, if Quartile1 would be the high scores, then you need to add a DESC qualifier to the ORDER BY clause.  Similarly, if the Quartile3 value would be from the first set lower than the median, you would need to adjust the DESC qualifier so that it is on the outer query, not the inner query.

There may be some syntax errors, but I don't identify any when testing here.

SELECT '1. Promptness' AS Quality,
      (
      SELECT AVG(
            CASE
                  WHEN prac_promptness NOT IN ('N/A', 'NA')
                  THEN Convert(integer,prac_promptness)
                  ELSE NULL
            END
            )
      FROM dbo.TBL_EDTL_PRACEVAL
      WHERE prac_term = 'Fall Semester 2006'
      ) [Average Score],

      (
      SELECT SUM(promptness) / COUNT(promptness)
      FROM (
            SELECT TOP 25 PERCENT
                  Convert(integer, prac_promptness )as promptness
            FROM dbo.TBL_EDTL_PRACEVAL
            WHERE prac_term = 'Fall Semester 2006'
                  AND prac_promptness not in ( 'N/A','NA')
            ORDER BY Convert(integer, prac_promptness)
            ) AS X
      ) AS Quartile1,

      (
      SELECT SUM(promptness)/COUNT(promptness)
      FROM (
            SELECT TOP 25 PERCENT promptness
            FROM (
                  SELECT TOP 50 PERCENT Convert(integer, prac_promptness )as promptness
                  FROM dbo.TBL_EDTL_PRACEVAL
                  WHERE prac_term = 'Fall Semester 2006'
                        AND prac_promptness not in ( 'N/A','NA')
                  ORDER by Convert(integer, prac_promptness) DESC
                  ) Src
            ORDER BY promptness
            ) tt
      ) AS Quartile3,

      (SELECT
            (SELECT TOP 1 Convert(integer, prac_promptness)
            FROM (
                  SELECT TOP 50 PERCENT prac_promptness
                  FROM dbo.TBL_EDTL_PRACEVAL
                  WHERE prac_promptness NOT IN ( 'N/A','NA')
                  ORDER BY Convert(integer, prac_promptness)
                  ) s1
            ) +
            (SELECT TOP 1 Convert(integer, prac_promptness)
            FROM (
                  SELECT TOP 50 PERCENT prac_promptness
                  FROM dbo.TBL_EDTL_PRACEVAL
                  WHERE prac_promptness NOT IN ( 'N/A','NA')
                  ORDER BY Convert(integer, prac_promptness) DESC
                  ) s2
            )
       / 2.0) AS Media
Avatar of friskee
friskee

ASKER

The suggested solution thast is posted is giving me the correct results for the average, quartile 1 and 3 but for the median, the answer is not correct.  I am trying to use the Group by clause in the last select statement for the median as I've done in the original code above, but am running into an error message about line 1 not being contained in an aggregrate function or grouped by clause  
What median value do you expect?  What do you get from the above query?

Actually, I do not understand why the grouping occurs in your query.  Median has a simple definition - the middle value in a list (interpolated when there are an even number of lines in the list).  So, for this list (14 items):

1
1
1
2
2
3
4
5
6
6
7
7
8
9

The Median is (4+5)/2 = 4.5

If we add an additional '1' to the top of the list, the median is 4.  If we add a '9' at the bottom, the median is 5.

BUT:  if we use your procedure, we end up with a different result:

@Pos1 will be 7
@Pos2 will be 8

@oneA will be 4

However, @twoA will be selected from a grouped list:

1
2
3
4
5
6
7
8
9

This will return an 8, instead of a 5, resulting in a median value of (4+8)/2 = 6, not the 4.5 it should have returned.  I recommend verifying your median routine on a simplified set of data.
Avatar of friskee

ASKER

This is what is in the column:

prac_promptness
12
12
12
7
7
5
N/A

so based on the logic above, there should be a median of 9.5 since there are six values, then we take two values in the middle, add them and devide by 2-
Avatar of friskee

ASKER

With the suggestion submitted above by bhess1, however, I get a median of 11.000 for the values in this column
Interesting.  Looking it over, it appears that I placed the division by 2 in the wrong place.  This code fragment provides the correct value for me from your data:

      0.5 * (
            (SELECT TOP 1 Convert(integer, prac_promptness)
            FROM (
                  SELECT TOP 50 PERCENT prac_promptness
                  FROM dbo.TBL_EDTL_PRACEVAL
                  WHERE prac_promptness NOT IN ( 'N/A','NA')
                  ORDER BY Convert(integer, prac_promptness)
                  ) s1
            ) +
            (SELECT TOP 1 Convert(integer, prac_promptness)
            FROM (
                  SELECT TOP 50 PERCENT prac_promptness
                  FROM dbo.TBL_EDTL_PRACEVAL
                  WHERE prac_promptness NOT IN ( 'N/A','NA')
                  ORDER BY Convert(integer, prac_promptness) DESC
                  ) s2
            )
      ) AS Media

Note that I chose to multiply by 0.5 instead of dividing by 2.0 for two reasons:
    1)  Multiplication, even of fractional values, is faster than dividing.
    2)  Multiplying by 0.5 returns a result like 8.5, whereas dividing by 2.0 returns something like 8.500000

Avatar of friskee

ASKER

Using the online statistical calculator, I get a median of 9.5 for the values entered above, but SQL may be considering the NA value a little differently in the calculation, would I be correct in assuming this?
ASKER CERTIFIED SOLUTION
Avatar of Brendt Hess
Brendt Hess
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of friskee

ASKER

Thank you bhess1, I really like your thoroughness in explaining things you've been a tremendous help.