# Help with an Aggregate and Sum Query

The goal is to display any dictatorid with more than one last name assigned to it along with the total of the billing lines for each of the multiple last names.

This does not quite work:
SELECT   distinct dictatorid, dictatorlastname,reportperiod, cnt,sumx
FROM (SELECT dictatorid,
dictatorlastname,
reportperiod,
COUNT(DISTINCT dictatorlastname) OVER (PARTITION BY dictatorid) cnt,
SUM(BILLINGLINES) OVER (PARTITION BY dictatorlastname) sumx
FROM ichart.transactions
WHERE reportperiod = '201109')
WHERE cnt > 1 and reportperiod = '201109'
ORDER BY dictatorid ASC;

Gives me:

DICTATORID DICTATORLASTNAME                         REPORTPERIOD CNT                    SUMX
---------- ---------------------------------------- ------------ ---------------------- ----------------------
1893NOC    Felder, K                                201109       2                      6340
1893NOC    Zavatsky, J                              201109       2                      7985

The actual sum of billing lines for 1893NOC in reportperiod '201109' is 6669.  Not sure what I am doing wrong.  The goal is to display any dictatorid with more than one last name assigned along with the total of the billing lines for each of the multiple last names.

Can some one help me tweak this?

###### Who is Participating?

Commented:
yes, I remember the question

you need to partition/group by both ID and NAME

when you do something like this...
SUM(BILLINGLINES) OVER (PARTITION BY dictatorlastname)
it's reporting the sum of all lines by name, regardless of which id that name might be associated with.

So, if Felder, K  is on ID 1, ID 2 and ID 3,  you're summing all of those.

Is there a problem with the queries in http:#36957600  or http:#36957796   ?

they both return the requested results for the sample data provided.

0

Sr. System AnalystCommented:
can you give a sample data and result that you expect...

maybe there are other ways to write what you want, just a simpler way...
0

Commented:
if you want to SUM by ID instead of name you need to change the partition clause

SUM(BILLINGLINES) OVER (PARTITION BY DICTATORID ) sumx

but then you won't get values for each name.

maybe I'm not understanding what you're really looking for
0

Sr. System AnalystCommented:
i could not get either :) thats why I asked for a sample data and result he is after...
0

Author Commented:
Here is the requested sample and expected results requested (formatting doesn't hold together well, but hopefully you can still read  it :

``````SAMPLE:
DICTATORID	DICTATORLASTNAME	BILLINGLINES	REPORTPERIOD
----------	----------------------------------------	----------------------	------------
5774BRC	Kindred, J	20	201109
5774BRC	Monaco, B	47	201109
5774BRC	Monaco, B	38	201109
3972BRC	Sledge, E	26	201109
5774BRC	Kindred, J	47	201109
5774BRC	Kindred, J	38	201109
3972BRC	Sledge, E	26	201109

EXPECTED RESULT OF QUERY
sumx	cnt
5774BRC	Monaco, B	85	2
5774BRC	Kindred, J	105	3
``````
0

Commented:
the query above in your question  returns these values
I don't see a problem

the column order is different  and you took out the period but the other content is the same
0

Commented:
oops, nevermind  I had an error  I see the count differs
0

Commented:
I don't understand why Sledge, E   isn't in the expected results though  did you filter on DICTATORID?
0

Commented:
assuming you meant to exclude dictator's that only had one distinct billinglines value then try this

SELECT   DISTINCT dictatorid, dictatorlastname, sumx, cnt
FROM (SELECT dictatorid,
dictatorlastname,
reportperiod,
COUNT(DISTINCT billinglines) OVER (PARTITION BY dictatorid, dictatorlastname) cnt,
SUM(billinglines) OVER (PARTITION BY dictatorlastname) sumx
FROM transactions
WHERE reportperiod = '201109')
WHERE cnt > 1
ORDER BY dictatorid ASC, cnt;
0

Sr. System AnalystCommented:
is this what you want?

select DICTATORID, DICTATORLASTNAME, sum(BILLINGLINES) sumx, count(1) cnt
group by DICTATORID, DICTATORLASTNAME
having count(1) > 1

or is it too simple?

0

Sr. System AnalystCommented:
oops, forgot table name :)

select DICTATORID, DICTATORLASTNAME, sum(BILLINGLINES) sumx, count(1) cnt
from  ichart.transactions
group by DICTATORID, DICTATORLASTNAME
having count(1) > 1
0

Commented:
that will return Sledge, E
0

Commented:
you could however do something like this...

Select DICTATORID, DICTATORLASTNAME, sum(BILLINGLINES) sumx, count(distinct billinglines) cnt
from  ichart.transactions
group by DICTATORID, DICTATORLASTNAME
having count(distinct billinglines) > 1
0

Author Commented:
No, that brings back way too many.rows.

I want to exclude dictatorid's that have only a one-to-one match with a dictatorlastname.

The goal is to present dictatorid's that are used with more than one lastname (this is incorrect data that will have to be corrected.), along with the count of how many times the combination is used and a sum of the billing lines for each combination.

If a dictatorid appears at all, it should appear at least twice, accompanied by different last names, with a count of at least 2, with a sum of the billing lines for each of the last names for the over-used dictatorid.

0

Commented:
>>> No, that brings back way too many.rows.

what is "that"  ?  There are 4 queries above since you lasted posted.

2 of the queries produce what you asked for given the data.

If they aren't correct, Can you expand your sample a little so we can see the pattern?
0

Sr. System AnalystCommented:
according to your data the one I posted (also the one posted by sdstuber - which is almost same) should give the desired result... or come up with a different sample data... the one you posted has all same lastname, or I dont see the difference...
0

Commented:
>>> according to your data the one I posted

not quite, it returns extra row for Sledge which, according to the sample, isn't correct

my followup to that one does match the sample, but possibly the sample is full enough to reveal problems in either of the queries I posted.
0

Author Commented:
sdstuber,

If it helps, this question is related to one I posted a few weeks ago, for which YOU got the points:

http://www.experts-exchange.com/Programming/Languages/SQL_Syntax/Q_27344186.html

I tried to tweak it to get the sum of billing lines for each name:

SELECT   distinct dictatorid, dictatorlastname,sumx
FROM (SELECT dictatorid,
dictatorlastname,
COUNT(DISTINCT dictatorlastname) OVER (PARTITION BY dictatorid) cnt,
SUM(BILLINGLINES) OVER (PARTITION BY dictatorlastname) sumx
FROM ichart.transactions
WHERE reportperiod = '201109')
WHERE cnt > 1
ORDER BY dictatorid ASC;

...and got back something like this:

DICTATORID DICTATORLASTNAME                         SUMX
---------- ---------------------------------------- ----------------------
1893NOC    Felder, K                                6340
1893NOC    Zavatsky, J                              7985
30164NOC   Gunn, S                                  4239
30164NOC   Venturini, A                             604
3682BRC    Alleman, L                               767
3682BRC    Lindsey, S                               21245

But when I double check the first dictatorid with an independent query:
Select Sum(billinglines) from ichart.transactions where reportperiod='201109' and dictatorid = '1893NOC';
I get 6669, NOT 14325 (6340 + 7985).
so since the simpler query returns different results, I have to assume there is something wrong with the more complicated one...

I hope this gives enough information. If not, I will keep trying.
(Its not always easy to provide enough information on the first or even subsequent tries.

0

Author Commented:
sdstuber:

DOH! 15 posts, wasting your time and mine (and HainKurt's), waiting most of the day for that simple solution because I did not ask the question properly in the first place....lesson learned....

The expected output from the actual data is 52 rows (that is what I got when I did not try to sum anything).

Both of the queries you reference returned over 300 rows (even after adding the reportperiod criteria to the one where it was left out).  So I will take the blame for that as well and assume there was something wrong with the sample data I provided as compared to the actual data.

At any rate, despite the length of the journey, I have the solution I came for, and that is what is most important.  Once again, THANK YOU!

0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.