KP_SoCal asked:

Median Function in QMF SQL Query for DB2

According to this link, DB2 does not offer the "Median" function that is available in MS-Excel for instance.

Does anyone know of an alternative method that I could run in my query to produce the "Median" result?  The SQL below better demonstrates what I'm looking to achieve if "Median" was actually a function that was available to me.

Select FIELD1, Median([FIELD2])
From TABLE1
Group by FIELD1
Kent Olsen

Hi KP,

It will vary a bit depending on which flavor of DB2 that you use, but this is one way:

  with a (rn, field1)
  as
  (
    SELECT row_number () over (), FIELD1 FROM mytable
  )
  SELECT field1 FROM a WHERE rn = (SELECT cast (max(rn) / 2 as int) FROM a);


Good Luck,
Kent
Gary Patterson, CISSP

Kent's solution, unfortunately (and unusually!), has a couple of problems:

1) It derives a single median value from the entire data set, not a series of medians, one for each partition of a partitioned data set, as requested (GROUP BY Field1).
2) It only works for lists with an odd number of values, and I believe it actually selects the wrong value (off by one position) even in that case, though that is easy to fix by making it (max(rn)+1) / 2.

http://mathworld.wolfram.com/StatisticalMedian.html

- Gary Patterson
Hi Gary,

Good catch on all of that.  That's what I get for trying to think after a day of hard labor.  :)  


If he's got the OLAP extensions, this gets a lot easier.  :)



Kent

  with a (rn, field2)
  as
  (
    SELECT row_number () over (partition by FIELD1), FIELD2 FROM mytable
  )
  SELECT field2 FROM a WHERE rn = (SELECT cast ((max(rn) + 1) / 2 as int) FROM a);

KP_SoCal

Guys, thanks for the quick responses.  I'm not sure which DB2 I'm running.  I'm querying the server from my PC via QMF.  My iSeries version is V5R4.  I'm not sure if this information helps.

I was able to run Kent's SQL (listed below), but I didn't get the expected results.  When grouping FIELD1, I have a total of 5 records.  The SQL only returned 4 records.  

I attached an Excel spreadsheet that better illustrates the results I'm looking to achieve.  In my spreadsheet, you'll see that I actually need to group the data on FIELD0 and FIELD1. Hope this makes sense.  I really appreciate the input.  Thanks! ;-)


with a (rn, field2)
  as
  (
    SELECT row_number () over (partition by FIELD1), FIELD2 FROM mytable
  )
  SELECT field2 FROM a WHERE rn = (SELECT cast ((max(rn) + 1) / 2 as int) FROM a);


Results.xls
ScreenPrint.bmp
Thanks guys for all your help on this!  I'll test this out tomorrow.  If I run into any more snags, I'll create a separate post.  Thanks again.

KP
As an FYI, here's a really great article on the subject as well, though it's primarily related to Access.

https://www.experts-exchange.com/Microsoft/Development/MS_Access/A_2529.html
That's closer, but is only correct for data sets with an odd number of members.  

For data sets with an even number of members, you need to calculate the average of the values found at max(rn)/2 and max(rn)/2+1.

- Gary Patterson
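
Putting the points from this thread together, here is a minimal sketch of a per-group median that handles both odd and even counts. It is not the members-only solution posted above, just one way to apply the rule Gary describes. The names mytable, FIELD0, FIELD1 and FIELD2 are the placeholders used earlier in the thread; the DECIMAL(15,4) cast is an assumption about FIELD2 being numeric, and I have not been able to confirm that every construct here runs under QMF against V5R4, so treat it as a starting point.

```sql
-- Number the rows of each (FIELD0, FIELD1) group in FIELD2 order.
-- The ORDER BY inside OVER() matters: without it the row numbers are arbitrary.
WITH ranked (field0, field1, field2, rn) AS
(
  SELECT FIELD0, FIELD1, FIELD2,
         ROW_NUMBER() OVER (PARTITION BY FIELD0, FIELD1 ORDER BY FIELD2)
  FROM mytable
  WHERE FIELD2 IS NOT NULL
),
-- Row count per group, kept separate so each group gets its own midpoint.
counts (field0, field1, cnt) AS
(
  SELECT field0, field1, COUNT(*)
  FROM ranked
  GROUP BY field0, field1
)
-- With integer division, (cnt + 1) / 2 and (cnt + 2) / 2 are the same row
-- when cnt is odd and the two middle rows when cnt is even.
-- The cast (an assumption) keeps AVG from truncating integer input.
-- Note: groups where FIELD0 or FIELD1 is NULL drop out of this join.
SELECT r.field0, r.field1,
       AVG(DECIMAL(r.field2, 15, 4)) AS median_field2
FROM ranked r
JOIN counts c
  ON c.field0 = r.field0
 AND c.field1 = r.field1
WHERE r.rn IN ((c.cnt + 1) / 2, (c.cnt + 2) / 2)
GROUP BY r.field0, r.field1;
```

The counts are taken per group rather than with a single max(rn) over the whole set, because a global max(rn) gives every group the same target row number, and groups with fewer rows than that number return nothing.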