• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 296
  • Last Modified:

Join on Median, Quartile and Average queries

I would like to combine my query below so that the results look like this:
Quality               Average Score         Quartile 1                  Quartile 3         Median
Promptness         10                          9                                  8                 9.5



SELECT Quality, [Average Score]
FROM
(
select  '1. Promptness'  as Quality, AVG(CASE WHEN prac_promptness NOT IN ('N/A', 'NA') THEN Convert(integer,prac_promptness) ELSE NULL END) as [Average Score]
from dbo.TBL_EDTL_PRACEVAL where prac_term = 'Fall Semester 2006'

) sub
WHERE NOT [Average Score] IS NULL
union
select '1. Promptness'  as Quality, sum(promptness)/count(*) as [Quartile 1]
from(
select top 25 Convert(integer,prac_promptness)as promptness
from dbo.TBL_EDTL_PRACEVAL
where prac_term = 'Fall Semester 2006' and prac_promptness not in ( 'N/A','NA')
order by Convert(integer,prac_promptness)
) as l

union

select '1. Promptness'  as Quality, sum(promptness)/count(*) as [Quartile3]
from(
select top 25 percent Convert(integer,prac_promptness)as promptness
from dbo.TBL_EDTL_PRACEVAL
where prac_term = 'Fall Semester 2006' and prac_promptness not in ( 'N/A','NA')
order by Convert(integer,prac_promptness)
) tt

union

declare @pos1 int <----- error messsage says Incorrect syntax near the keyword 'declare'
declare @pos2 int

declare @oneA float
declare @twoA float

select  @pos1 = (count(*)+1)/2, @pos2 = count(*)/2 + 1 from dbo.TBL_EDTL_PRACEVAL where prac_promptness not in ( 'N/A','NA')and IsNumeric(prac_comm_skills)=1
set rowcount @pos1
select @oneA = prac_promptness from dbo.TBL_EDTL_PRACEVAL where IsNumeric( prac_promptness)=1


set rowcount @pos2
select @twoA = prac_promptness from dbo.TBL_EDTL_PRACEVAL where IsNumeric( prac_promptness)=1
group by prac_promptness
set rowcount 0

select (@oneA + @twoA)/2.0 as Median

please help  thnx

0
friskee
Asked:
friskee
  • 5
  • 4
1 Solution
 
Brendt HessSenior DBACommented:
First - declare any vars before starting the SELECT statements.  Moving those to the top will correct the error you are receiving.

Second - there is no reason to include 1. Promptness as a constant in all of these queries.  

Third - once you start a SELECT statement, you can only include parts of one continuous statement in the SELECT.  The MEDIAN calculation would need to be performed prior to the SELECT statement designed to return your data.

Here is one way to do this entire thing in one SELECT (including the MEDIAN call).  I believe that Quartile1 and Quartile3 need definition.  For example, if Quartile1 would be the high scores, then you need to add a DESC qualifier to the ORDER BY clause.  Similarly, if the Quartile3 value would be from the first set lower than the median, you would need to adjust the DESC qualifier so that it is on the outer query, not the inner query.

There may be some syntax errors, but I don't identify any when testing here.

SELECT '1. Promptness' AS Quality,
      (
      SELECT AVG(
            CASE
                  WHEN prac_promptness NOT IN ('N/A', 'NA')
                  THEN Convert(integer,prac_promptness)
                  ELSE NULL
            END
            )
      FROM dbo.TBL_EDTL_PRACEVAL
      WHERE prac_term = 'Fall Semester 2006'
      ) [Average Score],

      (
      SELECT SUM(promptness) / COUNT(promptness)
      FROM (
            SELECT TOP 25 PERCENT
                  Convert(integer, prac_promptness )as promptness
            FROM dbo.TBL_EDTL_PRACEVAL
            WHERE prac_term = 'Fall Semester 2006'
                  AND prac_promptness not in ( 'N/A','NA')
            ORDER BY Convert(integer, prac_promptness)
            ) AS X
      ) AS Quartile1,

      (
      SELECT SUM(promptness)/COUNT(promptness)
      FROM (
            SELECT TOP 25 PERCENT promptness
            FROM (
                  SELECT TOP 50 PERCENT Convert(integer, prac_promptness )as promptness
                  FROM dbo.TBL_EDTL_PRACEVAL
                  WHERE prac_term = 'Fall Semester 2006'
                        AND prac_promptness not in ( 'N/A','NA')
                  ORDER by Convert(integer, prac_promptness) DESC
                  ) Src
            ORDER BY promptness
            ) tt
      ) AS Quartile3,

      (SELECT
            (SELECT TOP 1 Convert(integer, prac_promptness)
            FROM (
                  SELECT TOP 50 PERCENT prac_promptness
                  FROM dbo.TBL_EDTL_PRACEVAL
                  WHERE prac_promptness NOT IN ( 'N/A','NA')
                  ORDER BY Convert(integer, prac_promptness)
                  ) s1
            ) +
            (SELECT TOP 1 Convert(integer, prac_promptness)
            FROM (
                  SELECT TOP 50 PERCENT prac_promptness
                  FROM dbo.TBL_EDTL_PRACEVAL
                  WHERE prac_promptness NOT IN ( 'N/A','NA')
                  ORDER BY Convert(integer, prac_promptness) DESC
                  ) s2
            )
       / 2.0) AS Media
0
 
friskeeAuthor Commented:
The suggested solution thast is posted is giving me the correct results for the average, quartile 1 and 3 but for the median, the answer is not correct.  I am trying to use the Group by clause in the last select statement for the median as I've done in the original code above, but am running into an error message about line 1 not being contained in an aggregrate function or grouped by clause  
0
 
Brendt HessSenior DBACommented:
What median value do you expect?  What do you get from the above query?

Actually, I do not understand why the grouping occurs in your query.  Median has a simple definition - the middle value in a list (interpolated when there are an even number of lines in the list).  So, for this list (14 items):

1
1
1
2
2
3
4
5
6
6
7
7
8
9

The Median is (4+5)/2 = 4.5

If we add an additional '1' to the top of the list, the median is 4.  If we add a '9' at the bottom, the median is 5.

BUT:  if we use your procedure, we end up with a different result:

@Pos1 will be 7
@Pos2 will be 8

@oneA will be 4

However, @twoA will be selected from a grouped list:

1
2
3
4
5
6
7
8
9

This will return an 8, instead of a 5, resulting in a median value of (4+8)/2 = 6, not the 4.5 it should have returned.  I recommend verifying your median routine on a simplified set of data.
0
Get quick recovery of individual SharePoint items

Free tool – Veeam Explorer for Microsoft SharePoint, enables fast, easy restores of SharePoint sites, documents, libraries and lists — all with no agents to manage and no additional licenses to buy.

 
friskeeAuthor Commented:
This is what is in the column:

prac_promptness
12
12
12
7
7
5
N/A

so based on the logic above, there should be a median of 9.5 since there are six values, then we take two values in the middle, add them and devide by 2-
0
 
friskeeAuthor Commented:
With the suggestion submitted above by bhess1, however, I get a median of 11.000 for the values in this column
0
 
Brendt HessSenior DBACommented:
Interesting.  Looking it over, it appears that I placed the division by 2 in the wrong place.  This code fragment provides the correct value for me from your data:

      0.5 * (
            (SELECT TOP 1 Convert(integer, prac_promptness)
            FROM (
                  SELECT TOP 50 PERCENT prac_promptness
                  FROM dbo.TBL_EDTL_PRACEVAL
                  WHERE prac_promptness NOT IN ( 'N/A','NA')
                  ORDER BY Convert(integer, prac_promptness)
                  ) s1
            ) +
            (SELECT TOP 1 Convert(integer, prac_promptness)
            FROM (
                  SELECT TOP 50 PERCENT prac_promptness
                  FROM dbo.TBL_EDTL_PRACEVAL
                  WHERE prac_promptness NOT IN ( 'N/A','NA')
                  ORDER BY Convert(integer, prac_promptness) DESC
                  ) s2
            )
      ) AS Media

Note that I chose to multiply by 0.5 instead of dividing by 2.0 for two reasons:
    1)  Multiplication, even of fractional values, is faster than dividing.
    2)  Multiplying by 0.5 returns a result like 8.5, whereas dividing by 2.0 returns something like 8.500000

0
 
friskeeAuthor Commented:
Using the online statistical calculator, I get a median of 9.5 for the values entered above, but SQL may be considering the NA value a little differently in the calculation, would I be correct in assuming this?
0
 
Brendt HessSenior DBACommented:
ARRGH!  Can you tell that Monday is not my best day?  Please use the code below instead.

      0.5 *(
            (SELECT TOP 1 Convert(integer, prac_promptness)
            FROM (
                  SELECT TOP 100 PERCENT * FROM (
                        SELECT TOP 50 PERCENT prac_promptness
                    FROM dbo.Test1
                    WHERE prac_promptness NOT IN ( 'N/A','NA')
                    ORDER BY Convert(integer, prac_promptness)
                    ) Src
                  ORDER BY convert(integer, prac_promptness) DESC
                  ) S1
            ) +
            (SELECT TOP 1 Convert(integer, prac_promptness)
            FROM (
                  SELECT TOP 50 PERCENT prac_promptness
                  FROM dbo.Test1
                  WHERE prac_promptness NOT IN ( 'N/A','NA')
                  ORDER BY Convert(integer, prac_promptness) DESC
                  ) s2
            )
      ) AS Media
0
 
friskeeAuthor Commented:
Thank you bhess1, I really like your thoroughness in explaining things you've been a tremendous help.
0

Featured Post

Efficient way to get backups off site to Azure

This user guide provides instructions on how to deploy and configure both a StoneFly Scale Out NAS Enterprise Cloud Drive virtual machine and Veeam Cloud Connect in the Microsoft Azure Cloud.

  • 5
  • 4
Tackle projects and never again get stuck behind a technical roadblock.
Join Now