Solved

How to Sum fields from Duplicate records

Posted on 2014-09-23
14
638 Views
Last Modified: 2014-09-24
I am importing Excel spreadsheets that has duplicate records (Job Numbers).  The Job Number normally has only one record.  However, occasionally there are corrections made to (sales) totals by using the same Job Number causing duplicate records that need to be Summed for the correct total to that record (Job Number).

How do I combine the field [sales] and leave only one record [Field: Job Number]?

I have used DSUM in a Query with some success but I need certain fields in order to have a complete report (Sales, Job Number, Customer Number, Description, Quantity, etc).  As I add in the fields Description and/or Quantity the DSUM no longer works and I'm back to getting my Duplicate records.  I thought that it may be the Field's Properties that was throwing it off but the Customer Number and Description are both Text and Customer Number does not hurt the Query and Description does.  (FYI - Quantity is Number and Sales is Currency)

Are there other attributes that I need to look into and change?  Thank you in advance.
0
Comment
Question by:lordzack
  • 5
  • 5
  • 4
14 Comments
 
LVL 15

Expert Comment

by:Haris Djulic
ID: 40339910
Hello,

one way is to create the query which will contain the Job Number and sum(sales) grouped by Job Number which will then have single records and then you can join those data to the original table to get extra descriptions which you need for the report...
0
 

Author Comment

by:lordzack
ID: 40339940
I've done that with taking the Query of the fields where the DSUM works and the Table that has the additional fields I need and making an additional Query but when I add the field Description I immediately get my duplicate records back.  I link the Job Number(s) on that Query but still no luck.
0
 
LVL 15

Expert Comment

by:Haris Djulic
ID: 40339947
Can you post sample data so that i can recreate the problem to find solution?
0
 
LVL 34

Expert Comment

by:PatHartman
ID: 40340001
You would not use a domain function to do this, you would create a totals query.
1. Open the query that returns the detail records in the QBE.
2. Click the big sigma button.  This adds Group By to the Total cell for all columns.
3. Change Group By to Sum for the columns you want summarized.
0
 

Author Comment

by:lordzack
ID: 40340002
Here is a sample of the data in Excel format.  Highlighted are the Duplicate Job Numbers and you'll notice that the Sales total (as well as Target & Materials) need to be summed.
0
 
LVL 15

Expert Comment

by:Haris Djulic
ID: 40340102
There is no file attached...
0
 

Author Comment

by:lordzack
ID: 40340359
Sorry.  Had it in a .xlsx instead of .xls
Test.xls
0
Better Security Awareness With Threat Intelligence

See how one of the leading financial services organizations uses Recorded Future as part of a holistic threat intelligence program to promote security awareness and proactively and efficiently identify threats.

 
LVL 34

Expert Comment

by:PatHartman
ID: 40340379
The problem with the aggregation is that you are grouping by fields that are unique.  Description for example has different values on each of the "duplicate" rows.  Which description did you want to see?  Instead of using Group By for Description, you need to use First/Last/Min/Max  to pick up one of the values.  There may be other fields that have the same issue.  I can't really tell without seeing the query.

PS - using domain functions in queries is problematic.  Each domain function runs a separate query so in the very small recordset you included, there are 27 rows that means that each domain function will generate 27 SEPARATE queries.  As you can imagine, as your recordset grows, producing it will take ever more time.  That is why I suggested using a totals query instead.  You won't be generating the extraneous queries and performance should be fine even into hundreds of thousands of rows.
0
 
LVL 15

Expert Comment

by:Haris Djulic
ID: 40340388
Hello,

I imported the data in access and processed it and attached is the end result. End result is obtained by running total_end query.
test---Copy.accdb
0
 

Author Comment

by:lordzack
ID: 40341332
samo4fun - your attached file did not translate into an access database.  It was just text gibberish.  Can you send something different?

PatHartman - Under Group By - by using "First" I was able to get the result I need but you are saying that in large tables/queries that will slow down the process greatly? (my database will have nearly 10,000 records) I have no concern for which "Description" that I use as long as one of them are in the Description field. Can you further explain the use of a "totals query"?
0
 
LVL 15

Expert Comment

by:Haris Djulic
ID: 40341336
hi,

here is the code for first query namd total_temp

SELECT SPCJCHST_DSUM_Test.JobNumber, SPCJCHST_DSUM_Test.CustomerNumber, SPCJCHST_DSUM_Test.Salesman, Sum(SPCJCHST_DSUM_Test.Sales) AS SumOfSales, Sum(SPCJCHST_DSUM_Test.Target) AS SumOfTarget, Sum(SPCJCHST_DSUM_Test.Materials) AS SumOfMaterials, Sum(SPCJCHST_DSUM_Test.Quantity) AS SumOfQuantity, SPCJCHST_DSUM_Test.Date
FROM SPCJCHST_DSUM_Test
GROUP BY SPCJCHST_DSUM_Test.JobNumber, SPCJCHST_DSUM_Test.CustomerNumber, SPCJCHST_DSUM_Test.Salesman, SPCJCHST_DSUM_Test.Date;

Open in new window


and after that you can create the second query for the end result total_end

SELECT total_temp.*, SPCJCHST_DSUM_Test.cat1, SPCJCHST_DSUM_Test.cat2, SPCJCHST_DSUM_Test.cat3, SPCJCHST_DSUM_Test.cat4, SPCJCHST_DSUM_Test.Description
FROM SPCJCHST_DSUM_Test INNER JOIN total_temp ON (SPCJCHST_DSUM_Test.Salesman = total_temp.Salesman) AND (SPCJCHST_DSUM_Test.CustomerNumber = total_temp.CustomerNumber) AND (SPCJCHST_DSUM_Test.JobNumber = total_temp.JobNumber)
WHERE SPCJCHST_DSUM_Test.cat1 not like '9';

Open in new window


Just make sure that your table has same name as in queries
0
 
LVL 34

Accepted Solution

by:
PatHartman earned 500 total points
ID: 40341807
Domain functions start with a "d" and they run their own query.  First() is not a domain function.

Select First(Desc) As FirstDesc, Somefld1, Somefld2, Sum(Amt1) as SumAmt1, Avg(Amt2) as AvgAmt2
From YourTAble
Group By Somefld1, Somefld2;

Notice that every field in the Select clause is either included in the Group By clause or has its aggregation method specified in a function.  This is clearer when you look at the query in the QBE since the aggregation method is spelled out explicitly for each column.

Select DSum("Amt1", "yourtable", "SomeKey = " & SomeKey) As SumAmt1, Somefld1, Somefld2
From YourTable

In this case Dsum() runs a separate query for each row in the outer query so if the query returns 10,000 rows, then 10,000!!!!! queries have to run to calculate each SumAmt1 value.

Domain functions are best used in forms where you are looking at one record at a time so you won't notice how slow they are.  You might occasionally need to use a domain function in a RecordSource query for a form because you need some aggregated information and doing the aggregation in the query or joining to a totals query will render the main query not updatable.  For example, I might want to show a running balance for an account on a form much like a checkbook.  I can do this the efficient way by joining to a totals query or I can do it the inefficient way by using a DSum().  If the form doesn't need to be updateable the totals query is preferable but you can get by using the DSum() as long as the query is not returning too many rows.
0
 

Author Closing Comment

by:lordzack
ID: 40341865
This was very helpful.  Not only to answer this particular problem but in design as a whole.  Thank you!
0
 
LVL 34

Expert Comment

by:PatHartman
ID: 40342340
You're welcome.  Many MS examples use domain functions because they are simple to understand and easy to use but they never mention the ramifications or explain alternatives when you have too many records to select.  I discovered the issue quite by accident very early in my Access career.  I wrote a procedure that read 100,000 records and modified fields as part of a conversion.  The update loop ran in about 5 minutes.  I needed to add a code conversion and so I took the easy road and used a DLookup().  The lookup table had only 9 code values but that changed the procedure to take 110 minutes.  The solution, and this is the solution to eliminate almost all DLookup()'s is to modify the base query to join to the lookup table.  Once I did this, the code went back to taking 5 minutes to run.  Then as I was reading the code to figure out why it took so long, I realized that I could have done the complete update in a query.  I didn't have to write a DAO code loop at all.  So, I converted the update to a query and it ran in less than 2 minutes.  I had started out with code because I came to Access as a COBOL programmer and code was the solution to everything in that world.  I learned two things from this exercise,
1. Never use a domain function inside a code loop or a query
2. Never write a code loop when an action query will do the job.
0

Featured Post

How to run any project with ease

Manage projects of all sizes how you want. Great for personal to-do lists, project milestones, team priorities and launch plans.
- Combine task lists, docs, spreadsheets, and chat in one
- View and edit from mobile/offline
- Cut down on emails

Join & Write a Comment

In Debugging – Part 1, you learned the basics of the debugging process. You learned how to avoid bugs, as well as how to utilize the Immediate window in the debugging process. This article takes things to the next level by showing you how you can us…
I see at least one EE question a week that pertains to using temporary tables in MS Access.  But surprisingly, I was unable to find a single article devoted solely to this topic. I don’t intend to describe all of the uses of temporary tables in t…
Familiarize people with the process of utilizing SQL Server stored procedures from within Microsoft Access. Microsoft Access is a very powerful client/server development tool. One of the SQL Server objects that you can interact with from within Micr…
With Microsoft Access, learn how to specify relationships between tables and set various options on the relationship. Add the tables: Create the relationship: Decide if you’re going to set referential integrity: Decide if you want cascade upda…

705 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

19 Experts available now in Live!

Get 1:1 Help Now