Link to home
Start Free TrialLog in
Avatar of Abi Sa
Abi Sa

asked on

First, Third and Inter Quartile Country wise

Below is the sample data:
Country;     Value
AAA          10
AAA           12
AAA           14
AAA           35
AAA           72
AAA           827
AAA           992
AAA           1000
AAA           1001
AAA           1002
BBB           12
BBB          15
BBB          26
BBB           27
BBB          30
BBB           33
BBB           50
BBB           400
BBB          800

My Final table should be as below
Country     FirstQ     ThirdQ     InterQ
AAA         XXX        YYY        ZZZ
BBB         XXX        YYY        ZZZ

I got the solution partially from this site  by CYBERKIWI

;with tmp as (
      select c=COUNT(*) over (), rn=ROW_NUMBER() over (order by n), n
      from quartile_test)
,qs as (
      select
            q1=1+(c-1.0)/4, q1a=FLOOR(1+(c-1.0)/4), q1b=CEILING(1+(c-1.0)/4),
            q2=1+(c-1.0)/2, q2a=FLOOR(1+(c-1.0)/2), q2b=CEILING(1+(c-1.0)/2),
            q3=1+(c-1.0)*3/4, q3a=FLOOR(1+(c-1.0)*3/4), q3b=CEILING(1+(c-1.0)*3/4)
      from (select top 1 c from tmp) x)
select (1-q1+q1a)*MIN(n)+(q1-q1a)*MAX(n)
from tmp,qs where rn in (q1a,q1b) group by qs.q1,qs.q1a,qs.q1b
union all
select (1-q2+q2a)*MIN(n)+(q2-q2a)*MAX(n)
from tmp,qs where rn in (q2a,q2b) group by qs.q2,qs.q2a,qs.q2b
union all
select (1-q3+q3a)*MIN(n)+(q3-q3a)*MAX(n)
from tmp,qs where rn in (q3a,q3b) group by qs.q3,qs.q3a,qs.q3b
union all
select n
from tmp,qs where rn=c


Can someone help me modify this as per country?
Avatar of Pawan Kumar
Pawan Kumar
Flag of India image

;with tmp as (
      select c=COUNT(*) over (), rn=ROW_NUMBER() over (PARTITION BY Country order by n), n
      from quartile_test)
,qs as (
      select
            q1=1+(c-1.0)/4, q1a=FLOOR(1+(c-1.0)/4), q1b=CEILING(1+(c-1.0)/4),
            q2=1+(c-1.0)/2, q2a=FLOOR(1+(c-1.0)/2), q2b=CEILING(1+(c-1.0)/2),
            q3=1+(c-1.0)*3/4, q3a=FLOOR(1+(c-1.0)*3/4), q3b=CEILING(1+(c-1.0)*3/4)
      from (select top 1 c from tmp) x)
select (1-q1+q1a)*MIN(n)+(q1-q1a)*MAX(n)
from tmp,qs where rn in (q1a,q1b) group by Country ,qs.q1,qs.q1a,qs.q1b
union all
select (1-q2+q2a)*MIN(n)+(q2-q2a)*MAX(n)
from tmp,qs where rn in (q2a,q2b) group by Country ,qs.q2,qs.q2a,qs.q2b
union all
select (1-q3+q3a)*MIN(n)+(q3-q3a)*MAX(n)
from tmp,qs where rn in (q3a,q3b) group by Country ,qs.q3,qs.q3a,qs.q3b
union all
select n
from tmp,qs where rn=c

Open in new window

Avatar of Abi Sa
Abi Sa

ASKER

This is my recent code:

with tmp (c, rn,Country,Value)
     as (
                        SELECT
                            c=COUNT(*) over ()
                            ,rn=ROW_NUMBER() over (order by Value)
                    ,Country
                    ,Value
                        FROM Countrysample
                                                
            ),
        qs (q1,q1a,q1b,q3,q3a,q3b)
     as (            
                       Select
                                q1=1+(c-1.0)/4, q1a=FLOOR(1+(c-1.0)/4), q1b=CEILING(1+(c-1.0)/4),
                                q3=1+(c-1.0)*3/4, q3a=FLOOR(1+(c-1.0)*3/4), q3b=CEILING(1+(c-1.0)*3/4)
                      From
                         (select top 1 c from tmp order by Value) x
            ),
FirstQuartile as (

Select (1-q1+q1a)*MIN(Value)+(q1-q1a)*MAX(Value) FQ
From tmp,qs
Where rn in (q1a,q1b)
Group By qs.q1,qs.q1a,qs.q1b

),
ThirdQuartile as (

Select (1-q3+q3a)*MIN(Value)+(q3-q3a)*MAX(Value) TQ
From tmp,qs
Where rn in (q3a,q3b)
Group By qs.q3,qs.q3a,qs.q3b

),

InterQuartile as (

Select (TQ-FQ) IQR
From FirstQuartile,ThirdQuartile

)


Select Country, FQ, TQ, IQR
From tmp,qs,FirstQuartile,ThirdQuartile,InterQuartile
/*Where rn=c*/
GO


For the above code i get output as below:

Country      FQ      TQ      IQR
AAA        XXX       XXX       XXX

It is giving FQ, TQ and IQR for the whole dataset and not for AAA alone.
As well i need BBB country data in a new line with its corresponding data
You need to use Group By Country
Avatar of Abi Sa

ASKER

It is erroring out since there are no aggregate function used...
Avatar of Abi Sa

ASKER

May be I am overlooking the code or something. Can you help me out here please.
Hv u tried my first suggestion ?
Avatar of Abi Sa

ASKER

Yes.. It is not working out :(
Pls send some data in the excel and the expected output.
Avatar of Abi Sa

ASKER

Hi .. Here's a sample file attached with 2 country data and the expected result.

Thanks in advance!
CountrySample.xlsx
Avatar of Kevin Cross
What about using NTILE?
By the way, while I work out what you are doing to see if it is simpler with NTILE, note that you need to adjust both the COUNT and the ROW_NUMBER with PARTITION BY Country.

Here is what I have to start on the NTILE solution but not sure if the math matches up with yours as I just used AVG for the example.
;WITH tmp(Country, V, Q) AS (
	SELECT [Country],[Value]
		 , NTILE(4) OVER(PARTITION BY [Country] ORDER BY [Value])
	FROM Countrysample
)
SELECT Country, FQ = [1], TQ = [3], IQR = [3] - [1]
FROM tmp t
PIVOT (
	AVG(V) FOR Q IN ([1], [3])
) p

Open in new window

Okay, I figured out how to correct the original solution you have but confused on what the FQ and TQ should represent math wise from your sample.  Currently the challenge is that AAA for example has 19 rows, so you do not get four even splits of five rows each.  Therefore, I believe CyberKiwi came up with a solution to find the row before and after.  However, in your sample data, it is coming up with rows 5 and 6 for AAA.  This is why you get 68.795 because it is 0.5 * 68.770  + 0.5 * 68.820.  Is this correct?  It essentially is MAX value of the quartile if you had number of rows divisible by 4.
ASKER CERTIFIED SOLUTION
Avatar of Kevin Cross
Kevin Cross
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of Abi Sa

ASKER

Thanks Kevin.. Using NTile gave me approximate value which should be fine for my calculation... Thank you :)