Help with an Aggregate and Sum Query

The goal is to display any dictatorid with more than one last name assigned to it along with the total of the billing lines for each of the multiple last names.

This does not quite work:
SELECT   distinct dictatorid, dictatorlastname,reportperiod, cnt,sumx
    FROM (SELECT dictatorid,
                 dictatorlastname,
                 reportperiod,
                 COUNT(DISTINCT dictatorlastname) OVER (PARTITION BY dictatorid) cnt,
                 SUM(BILLINGLINES) OVER (PARTITION BY dictatorlastname) sumx
            FROM ichart.transactions
           WHERE reportperiod = '201109')          
   WHERE cnt > 1 and reportperiod = '201109'
ORDER BY dictatorid ASC;
 
Gives me:
 
DICTATORID DICTATORLASTNAME                         REPORTPERIOD CNT                    SUMX                  
---------- ---------------------------------------- ------------ ---------------------- ----------------------
1893NOC    Felder, K                                201109       2                      6340                  
1893NOC    Zavatsky, J                              201109       2                      7985                  
 
The actual sum of billing lines for 1893NOC in reportperiod '201109' is 6669.  Not sure what I am doing wrong.  The goal is to display any dictatorid with more than one last name assigned along with the total of the billing lines for each of the multiple last names.
 
Can some one help me tweak this?

Thanks in advance!

codefingerAsked:
Who is Participating?
 
sdstuberCommented:
yes, I remember the question

you need to partition/group by both ID and NAME


when you do something like this...
SUM(BILLINGLINES) OVER (PARTITION BY dictatorlastname)
it's reporting the sum of all lines by name, regardless of which id that name might be associated with.

So, if Felder, K  is on ID 1, ID 2 and ID 3,  you're summing all of those.

Is there a problem with the queries in http:#36957600  or http:#36957796   ?

they both return the requested results for the sample data provided.

0
 
HainKurtSr. System AnalystCommented:
can you give a sample data and result that you expect...

maybe there are other ways to write what you want, just a simpler way...
0
 
sdstuberCommented:
if you want to SUM by ID instead of name you need to change the partition clause

 SUM(BILLINGLINES) OVER (PARTITION BY DICTATORID ) sumx

but then you won't get values for each name.

maybe I'm not understanding what you're really looking for
0
Ultimate Tool Kit for Technology Solution Provider

Broken down into practical pointers and step-by-step instructions, the IT Service Excellence Tool Kit delivers expert advice for technology solution providers. Get your free copy now.

 
HainKurtSr. System AnalystCommented:
i could not get either :) thats why I asked for a sample data and result he is after...
0
 
codefingerAuthor Commented:
Here is the requested sample and expected results requested (formatting doesn't hold together well, but hopefully you can still read  it :


SAMPLE:			
DICTATORID	DICTATORLASTNAME	BILLINGLINES	REPORTPERIOD
----------	----------------------------------------	----------------------	------------
5774BRC	Kindred, J	20	201109
5774BRC	Monaco, B	47	201109
5774BRC	Monaco, B	38	201109
3972BRC	Sledge, E	26	201109
5774BRC	Kindred, J	47	201109
5774BRC	Kindred, J	38	201109
3972BRC	Sledge, E	26	201109


EXPECTED RESULT OF QUERY			
		sumx	cnt
5774BRC	Monaco, B	85	2
5774BRC	Kindred, J	105	3

Open in new window

0
 
sdstuberCommented:
the query above in your question  returns these values
I don't see a problem

the column order is different  and you took out the period but the other content is the same
0
 
sdstuberCommented:
oops, nevermind  I had an error  I see the count differs
0
 
sdstuberCommented:
I don't understand why Sledge, E   isn't in the expected results though  did you filter on DICTATORID?
0
 
sdstuberCommented:
assuming you meant to exclude dictator's that only had one distinct billinglines value then try this


SELECT   DISTINCT dictatorid, dictatorlastname, sumx, cnt
    FROM (SELECT dictatorid,
                 dictatorlastname,
                 reportperiod,
                 COUNT(DISTINCT billinglines) OVER (PARTITION BY dictatorid, dictatorlastname) cnt,
                 SUM(billinglines) OVER (PARTITION BY dictatorlastname) sumx
            FROM transactions
           WHERE reportperiod = '201109')
   WHERE cnt > 1
ORDER BY dictatorid ASC, cnt;
0
 
HainKurtSr. System AnalystCommented:
is this what you want?

select DICTATORID, DICTATORLASTNAME, sum(BILLINGLINES) sumx, count(1) cnt
group by DICTATORID, DICTATORLASTNAME
having count(1) > 1

or is it too simple?

0
 
HainKurtSr. System AnalystCommented:
oops, forgot table name :)

select DICTATORID, DICTATORLASTNAME, sum(BILLINGLINES) sumx, count(1) cnt
  from  ichart.transactions
 group by DICTATORID, DICTATORLASTNAME
having count(1) > 1
0
 
sdstuberCommented:
that will return Sledge, E
0
 
sdstuberCommented:
you could however do something like this...


Select DICTATORID, DICTATORLASTNAME, sum(BILLINGLINES) sumx, count(distinct billinglines) cnt
  from  ichart.transactions
 group by DICTATORID, DICTATORLASTNAME
having count(distinct billinglines) > 1
0
 
codefingerAuthor Commented:
No, that brings back way too many.rows.

I want to exclude dictatorid's that have only a one-to-one match with a dictatorlastname.

The goal is to present dictatorid's that are used with more than one lastname (this is incorrect data that will have to be corrected.), along with the count of how many times the combination is used and a sum of the billing lines for each combination.

If a dictatorid appears at all, it should appear at least twice, accompanied by different last names, with a count of at least 2, with a sum of the billing lines for each of the last names for the over-used dictatorid.



0
 
sdstuberCommented:
>>> No, that brings back way too many.rows.


what is "that"  ?  There are 4 queries above since you lasted posted.


 2 of the queries produce what you asked for given the data.

If they aren't correct, Can you expand your sample a little so we can see the pattern?
0
 
HainKurtSr. System AnalystCommented:
according to your data the one I posted (also the one posted by sdstuber - which is almost same) should give the desired result... or come up with a different sample data... the one you posted has all same lastname, or I dont see the difference...
0
 
sdstuberCommented:
>>> according to your data the one I posted


not quite, it returns extra row for Sledge which, according to the sample, isn't correct

my followup to that one does match the sample, but possibly the sample is full enough to reveal problems in either of the queries I posted.
0
 
codefingerAuthor Commented:
sdstuber,

If it helps, this question is related to one I posted a few weeks ago, for which YOU got the points:

http://www.experts-exchange.com/Programming/Languages/SQL_Syntax/Q_27344186.html

I tried to tweak it to get the sum of billing lines for each name:

SELECT   distinct dictatorid, dictatorlastname,sumx
    FROM (SELECT dictatorid,
                 dictatorlastname,
                 COUNT(DISTINCT dictatorlastname) OVER (PARTITION BY dictatorid) cnt,
                 SUM(BILLINGLINES) OVER (PARTITION BY dictatorlastname) sumx
            FROM ichart.transactions
           WHERE reportperiod = '201109')
   WHERE cnt > 1
ORDER BY dictatorid ASC;

...and got back something like this:

DICTATORID DICTATORLASTNAME                         SUMX                  
---------- ---------------------------------------- ----------------------
1893NOC    Felder, K                                6340                  
1893NOC    Zavatsky, J                              7985                  
30164NOC   Gunn, S                                  4239                  
30164NOC   Venturini, A                             604                    
3682BRC    Alleman, L                               767                    
3682BRC    Lindsey, S                               21245            

But when I double check the first dictatorid with an independent query:
Select Sum(billinglines) from ichart.transactions where reportperiod='201109' and dictatorid = '1893NOC';
I get 6669, NOT 14325 (6340 + 7985).
so since the simpler query returns different results, I have to assume there is something wrong with the more complicated one...

I hope this gives enough information. If not, I will keep trying.
(Its not always easy to provide enough information on the first or even subsequent tries.
I appreciate your patience.)



     






0
 
codefingerAuthor Commented:
sdstuber:

DOH! 15 posts, wasting your time and mine (and HainKurt's), waiting most of the day for that simple solution because I did not ask the question properly in the first place....lesson learned....

In answer to your other question:
The expected output from the actual data is 52 rows (that is what I got when I did not try to sum anything).

Both of the queries you reference returned over 300 rows (even after adding the reportperiod criteria to the one where it was left out).  So I will take the blame for that as well and assume there was something wrong with the sample data I provided as compared to the actual data.

At any rate, despite the length of the journey, I have the solution I came for, and that is what is most important.  Once again, THANK YOU!



0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.