Link to home
Start Free TrialLog in
Avatar of Haiden Turner
Haiden TurnerFlag for United States of America

asked on

Aggregate Query return incorrect SUM values

I have an aggregate query in Access 2010 that is returning values for a summed field that are extremely higher than what you see when look at the raw data with Excel.

When I was setting up the query in the query designer I noticed that, before turning it into an aggregate query, the records were appearing multiple times in the result; so I added SELECT DISTINCT to get around this. Then I added the aggregate SUM and now it doesn't work correctly.

At first I thought it was somehow ignoring the SELECT DISTINCT, but when made a copy of the query and removed the aggregate and then ran the query and popped it out into Excel and put subtotals on it the values returned in the aggregate query are not exact multiples of the subtotals from Excel. In fact, when I divided the aggregate group by summed value by the respective subtotal from Excel none of the multipliers are the same.

SELECT DISTINCT vd_SalesRep.[SalesRep Name] AS Rep, 
                             Sum(vf_SALESDETAIL.[SalesTxn Document Subtotal Amount]) AS [Total Quotes Amount]
                FROM ((vf_SALESDETAIL INNER JOIN vd_Company ON vf_SALESDETAIL.LinkToCompanyID = vd_Company.LinkForCompanyID)
       INNER JOIN vd_SalesRep ON vf_SALESDETAIL.LinkToSalesRepID = vd_SalesRep.LinkForSalesRepID) INNER JOIN vd_CalYrTxnDate ON vf_SALESDETAIL.LinkToTxnDateID = vd_CalYrTxnDate.LinkForTxnDateID
             WHERE (((vd_Company.[Company Name])="ABC Company, Inc.") AND ((vd_CalYrTxnDate.[CalYr Txn Date]) Between [Forms]![frmLaunchpad]![StartDate] And [Forms]![frmLaunchpad]![EndDate])
                  AND ((vf_SALESDETAIL.[SalesTxn Document Type])="Estimate"))
GROUP BY vd_SalesRep.[SalesRep Name]
ORDER BY vd_SalesRep.[SalesRep Name];

Open in new window

Avatar of ThomasMcA2
ThomasMcA2

The "records appearing multiple times" is probably the problem. You may need to fix the join criteria, or perhaps the tables have bad/unexpected data. Each of your joins only has 1 join criteria, but maybe one of the tables needs another field in the join criteria. Or maybe the vd_SalesRep table has some records where a sales rep number is duplicated.
Avatar of Haiden Turner

ASKER

Seems like they would divide out to an integer if that was going on.
Not true. One instance is if there is bad data where only a handful of records are duplicated. Another is that how the data gets duplicated depends on the records being joined. Records for one sales rep might get duplicated twice, while records for another sales rep could get duplicated 3 or 4 or 5 times.
Alright, how do I get a DISTINCT result and why does it work without the aggregate?
ASKER CERTIFIED SOLUTION
Avatar of Dale Fye
Dale Fye
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Create a query that sums the sales

Select vf_SALESDETAIL.LinkToSalesRepID, vf_SALESDETAIL.LinkToTxnDateID, Sum(vf_SALESDETAIL.[SalesTxn Document Subtotal Amount]) AS [Total Quotes Amount]
From vf_SALESDETAIL
Where vf_SALESDETAIL.[SalesTxn Document Type])="Estimate"
Group By vf_SALESDETAIL.LinkToSalesRepID, vf_SALESDETAIL.LinkToTxnDateID;

You should be able to manually verify the sums for specific sales people for a given date range.

Then create an outer query that joins to the other tables and adds the criteria.  You will need aggregation in this query also since the date range will most likely bring back more than one of the records from the totals query.

PS, using distinct is fine if you join to a many-side table and are not returning enough information to make the rows unique but you can't use it if you need to aggregate.  As others have mentioned, you need to use Group By.

PPS, I'm not sure what value you are getting by abstracting the sales date.  What I do see is that it is forcing an otherwise unnecessary join.  This abstraction is sometimes done in data warehouses where you are aggregating huge sets of data, but I've never seen it done in a transactional database.  Also, having to join to the company table to apply criteria seems inefficient.  If you are using a form to select a company, you will probably have the companyID and so should use that instead.  Both of those changes would allow you to apply criteria to the main table rather than to the lookup tables.
OK, based on Dale's suggestion this works:

SELECT sq.Rep, Sum(sq.[Total Quotes Amount]) AS [SumOfTotal Quotes Amount]
FROM (SELECT DISTINCT vd_SalesRep.[SalesRep Name] AS Rep, vf_SALESDETAIL.[SalesTxn Document Reference Number], vd_CalYrTxnDate.[CalYr Txn Date], vf_SALESDETAIL.[SalesTxn Document Subtotal Amount] AS [Total Quotes Amount] FROM ((vf_SALESDETAIL INNER JOIN vd_Company ON vf_SALESDETAIL.LinkToCompanyID = vd_Company.LinkForCompanyID) INNER JOIN vd_SalesRep ON vf_SALESDETAIL.LinkToSalesRepID = vd_SalesRep.LinkForSalesRepID) INNER JOIN vd_CalYrTxnDate ON vf_SALESDETAIL.LinkToTxnDateID = vd_CalYrTxnDate.LinkForTxnDateID WHERE (((vd_CalYrTxnDate.[CalYr Txn Date]) Between Forms!frmLaunchpad!StartDate And Forms!frmLaunchpad!EndDate) And ((vd_Company.[Company Name])="The Playwell Group, Inc.") And ((vf_SALESDETAIL.[SalesTxn Document Type])="Estimate")))  AS sq
GROUP BY sq.Rep
ORDER BY sq.Rep;

Open in new window


However, I need a front-end on this for a user to pick specific sales reps to be included in a report based on the query. To do this I populate a table with the sales reps names and give the user form with check boxes for each name in the table. The a query gets the selected reps and I have incorporated that into a criteria on the first query:

SELECT sq.Rep, Sum(sq.[Total Quotes Amount]) AS [SumOfTotal Quotes Amount]
FROM (SELECT DISTINCT vd_SalesRep.[SalesRep Name] AS Rep, vf_SALESDETAIL.[SalesTxn Document Reference Number], vd_CalYrTxnDate.[CalYr Txn Date], vf_SALESDETAIL.[SalesTxn Document Subtotal Amount] AS [Total Quotes Amount] FROM ((vf_SALESDETAIL INNER JOIN vd_Company ON vf_SALESDETAIL.LinkToCompanyID = vd_Company.LinkForCompanyID) INNER JOIN vd_SalesRep ON vf_SALESDETAIL.LinkToSalesRepID = vd_SalesRep.LinkForSalesRepID) INNER JOIN vd_CalYrTxnDate ON vf_SALESDETAIL.LinkToTxnDateID = vd_CalYrTxnDate.LinkForTxnDateID WHERE (((vd_CalYrTxnDate.[CalYr Txn Date]) Between Forms!frmLaunchpad!StartDate And Forms!frmLaunchpad!EndDate) And ((vd_Company.[Company Name])="The Playwell Group, Inc.") And ((vf_SALESDETAIL.[SalesTxn Document Type])="Estimate")))  AS sq INNER JOIN qryGetSelectedReps ON sq.Rep = qryGetSelectedReps.[SalesRep Name]
WHERE (((sq.Rep) In ([qryGetSelectedReps]![SalesRep Name])))
GROUP BY sq.Rep
ORDER BY sq.Rep;

Open in new window


For some reason this breaks the DISTINCT and now records that have the same sales rep and the same dollar amount, but different SalesTxn Document Reference Numbers only appear once in the result.

I have not experimented PatHartman's idea, but to your point on the date abstraction - this data is coming from a data warehousing application.
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Thanks many times for your help and patience.
Glad I could help.

I still recommend that you go back and look at all of the tables you are joining and the fields you are joining on in those tables and search for duplicates.  Eliminating duplicates, or creating subqueries that eliminate the duplicates, is really more important than simply getting this one query to work correctly.

Pats idea of summing from the Sales Details table and making that a subquery and then joining to the other tables (might resolve the issue as well), but I'm convinced there are some duplicates in those other table that should be resolved, possibly by adding additional fields to the JOIN clause.

Dale