Solved

# Help with an Aggregate and Sum Query

Posted on 2011-10-12
221 Views
The goal is to display any dictatorid with more than one last name assigned to it along with the total of the billing lines for each of the multiple last names.

This does not quite work:
SELECT   distinct dictatorid, dictatorlastname,reportperiod, cnt,sumx
FROM (SELECT dictatorid,
dictatorlastname,
reportperiod,
COUNT(DISTINCT dictatorlastname) OVER (PARTITION BY dictatorid) cnt,
SUM(BILLINGLINES) OVER (PARTITION BY dictatorlastname) sumx
FROM ichart.transactions
WHERE reportperiod = '201109')
WHERE cnt > 1 and reportperiod = '201109'
ORDER BY dictatorid ASC;

Gives me:

DICTATORID DICTATORLASTNAME                         REPORTPERIOD CNT                    SUMX
---------- ---------------------------------------- ------------ ---------------------- ----------------------
1893NOC    Felder, K                                201109       2                      6340
1893NOC    Zavatsky, J                              201109       2                      7985

The actual sum of billing lines for 1893NOC in reportperiod '201109' is 6669.  Not sure what I am doing wrong.  The goal is to display any dictatorid with more than one last name assigned along with the total of the billing lines for each of the multiple last names.

Can some one help me tweak this?

0
Question by:codefinger

LVL 51

Expert Comment

can you give a sample data and result that you expect...

maybe there are other ways to write what you want, just a simpler way...
0

LVL 73

Expert Comment

if you want to SUM by ID instead of name you need to change the partition clause

SUM(BILLINGLINES) OVER (PARTITION BY DICTATORID ) sumx

but then you won't get values for each name.

maybe I'm not understanding what you're really looking for
0

LVL 51

Expert Comment

i could not get either :) thats why I asked for a sample data and result he is after...
0

Author Comment

Here is the requested sample and expected results requested (formatting doesn't hold together well, but hopefully you can still read  it :

``````SAMPLE:
DICTATORID	DICTATORLASTNAME	BILLINGLINES	REPORTPERIOD
----------	----------------------------------------	----------------------	------------
5774BRC	Kindred, J	20	201109
5774BRC	Monaco, B	47	201109
5774BRC	Monaco, B	38	201109
3972BRC	Sledge, E	26	201109
5774BRC	Kindred, J	47	201109
5774BRC	Kindred, J	38	201109
3972BRC	Sledge, E	26	201109

EXPECTED RESULT OF QUERY
sumx	cnt
5774BRC	Monaco, B	85	2
5774BRC	Kindred, J	105	3
``````
0

LVL 73

Expert Comment

the query above in your question  returns these values
I don't see a problem

the column order is different  and you took out the period but the other content is the same
0

LVL 73

Expert Comment

oops, nevermind  I had an error  I see the count differs
0

LVL 73

Expert Comment

I don't understand why Sledge, E   isn't in the expected results though  did you filter on DICTATORID?
0

LVL 73

Expert Comment

assuming you meant to exclude dictator's that only had one distinct billinglines value then try this

SELECT   DISTINCT dictatorid, dictatorlastname, sumx, cnt
FROM (SELECT dictatorid,
dictatorlastname,
reportperiod,
COUNT(DISTINCT billinglines) OVER (PARTITION BY dictatorid, dictatorlastname) cnt,
SUM(billinglines) OVER (PARTITION BY dictatorlastname) sumx
FROM transactions
WHERE reportperiod = '201109')
WHERE cnt > 1
ORDER BY dictatorid ASC, cnt;
0

LVL 51

Expert Comment

is this what you want?

select DICTATORID, DICTATORLASTNAME, sum(BILLINGLINES) sumx, count(1) cnt
group by DICTATORID, DICTATORLASTNAME
having count(1) > 1

or is it too simple?

0

LVL 51

Expert Comment

oops, forgot table name :)

select DICTATORID, DICTATORLASTNAME, sum(BILLINGLINES) sumx, count(1) cnt
from  ichart.transactions
group by DICTATORID, DICTATORLASTNAME
having count(1) > 1
0

LVL 73

Expert Comment

that will return Sledge, E
0

LVL 73

Expert Comment

you could however do something like this...

Select DICTATORID, DICTATORLASTNAME, sum(BILLINGLINES) sumx, count(distinct billinglines) cnt
from  ichart.transactions
group by DICTATORID, DICTATORLASTNAME
having count(distinct billinglines) > 1
0

Author Comment

No, that brings back way too many.rows.

I want to exclude dictatorid's that have only a one-to-one match with a dictatorlastname.

The goal is to present dictatorid's that are used with more than one lastname (this is incorrect data that will have to be corrected.), along with the count of how many times the combination is used and a sum of the billing lines for each combination.

If a dictatorid appears at all, it should appear at least twice, accompanied by different last names, with a count of at least 2, with a sum of the billing lines for each of the last names for the over-used dictatorid.

0

LVL 73

Expert Comment

>>> No, that brings back way too many.rows.

what is "that"  ?  There are 4 queries above since you lasted posted.

2 of the queries produce what you asked for given the data.

If they aren't correct, Can you expand your sample a little so we can see the pattern?
0

LVL 51

Expert Comment

according to your data the one I posted (also the one posted by sdstuber - which is almost same) should give the desired result... or come up with a different sample data... the one you posted has all same lastname, or I dont see the difference...
0

LVL 73

Expert Comment

>>> according to your data the one I posted

not quite, it returns extra row for Sledge which, according to the sample, isn't correct

my followup to that one does match the sample, but possibly the sample is full enough to reveal problems in either of the queries I posted.
0

Author Comment

sdstuber,

If it helps, this question is related to one I posted a few weeks ago, for which YOU got the points:

http://www.experts-exchange.com/Programming/Languages/SQL_Syntax/Q_27344186.html

I tried to tweak it to get the sum of billing lines for each name:

SELECT   distinct dictatorid, dictatorlastname,sumx
FROM (SELECT dictatorid,
dictatorlastname,
COUNT(DISTINCT dictatorlastname) OVER (PARTITION BY dictatorid) cnt,
SUM(BILLINGLINES) OVER (PARTITION BY dictatorlastname) sumx
FROM ichart.transactions
WHERE reportperiod = '201109')
WHERE cnt > 1
ORDER BY dictatorid ASC;

...and got back something like this:

DICTATORID DICTATORLASTNAME                         SUMX
---------- ---------------------------------------- ----------------------
1893NOC    Felder, K                                6340
1893NOC    Zavatsky, J                              7985
30164NOC   Gunn, S                                  4239
30164NOC   Venturini, A                             604
3682BRC    Alleman, L                               767
3682BRC    Lindsey, S                               21245

But when I double check the first dictatorid with an independent query:
Select Sum(billinglines) from ichart.transactions where reportperiod='201109' and dictatorid = '1893NOC';
I get 6669, NOT 14325 (6340 + 7985).
so since the simpler query returns different results, I have to assume there is something wrong with the more complicated one...

I hope this gives enough information. If not, I will keep trying.
(Its not always easy to provide enough information on the first or even subsequent tries.

0

LVL 73

Accepted Solution

yes, I remember the question

you need to partition/group by both ID and NAME

when you do something like this...
SUM(BILLINGLINES) OVER (PARTITION BY dictatorlastname)
it's reporting the sum of all lines by name, regardless of which id that name might be associated with.

So, if Felder, K  is on ID 1, ID 2 and ID 3,  you're summing all of those.

Is there a problem with the queries in http:#36957600  or http:#36957796   ?

they both return the requested results for the sample data provided.

0

Author Comment

sdstuber:

DOH! 15 posts, wasting your time and mine (and HainKurt's), waiting most of the day for that simple solution because I did not ask the question properly in the first place....lesson learned....

The expected output from the actual data is 52 rows (that is what I got when I did not try to sum anything).

Both of the queries you reference returned over 300 rows (even after adding the reportperiod criteria to the one where it was left out).  So I will take the blame for that as well and assume there was something wrong with the sample data I provided as compared to the actual data.

At any rate, despite the length of the journey, I have the solution I came for, and that is what is most important.  Once again, THANK YOU!

0

## Featured Post

Introduction A previously published article on Experts Exchange ("Joins in Oracle", http://www.experts-exchange.com/Database/Oracle/A_8249-Joins-in-Oracle.html) makes a statement about "Oracle proprietary" joins and mixes the join syntax with genâ€¦
Composite queries are used to retrieve the results from joining multiple queries after applying any filters. UNION, INTERSECT, MINUS, and UNION ALL are some of the operators used to get certain desired results.â€‹
This video shows how to copy a database user from one database to another user DBMS_METADATA.  It also shows how to copy a user's permissions and discusses password hash differences between Oracle 10g and 11g.
This video shows how to Export data from an Oracle database using the Original Export Utility.  The corresponding Import utility, which works the same way is referenced, but not demonstrated.