Help Grouping Data

Posted on 2014-03-27
Last Modified: 2014-03-28
Hello EE,

I have a set of data that shows what devices customers are using.  I need to group the data to show how many customers are using 1 device, 2 devices, 3 devices in any given month by client. My end product is to determine what % of my customers are are using 1 device, 2 devices or 3 devices by client each month

Here is my sample date
Client      Cust      Date      Device
100      145      7/1/2013      iPad
100      145      8/1/2013      iPad
100      145      9/1/2013      iPad
100      145      10/1/2013      iPad
100      145      11/1/2013      iPad
100      145      12/1/2013      iPad
100      145      1/1/2014      iPad
100      145      2/1/2014      iPad
100      145      7/1/2013      iPhone
100      145      8/1/2013      iPhone
100      145      9/1/2013      iPhone
100      145      10/1/2013      iPhone
100      145      11/1/2013      iPhone
100      145      12/1/2013      iPhone
100      145      1/1/2014      iPhone
100      145      2/1/2014      iPhone
100      145      2/1/2014      Windows
100      149      1/1/2013      Android
100      149      2/1/2013      Android
100      149      3/1/2013      Android
100      149      4/1/2013      Android
100      149      7/1/2013      Android
100      149      8/1/2013      Android
100      149      9/1/2013      Android
100      149      10/1/2013      Android
100      149      11/1/2013      Android
100      149      12/1/2013      Android
100      149      1/1/2014      Android
100      149      2/1/2014      Android
101      156      1/1/2013      ipad
101      156      2/1/2013      ipad
101      156      3/1/2013      ipad
101      156      4/1/2013      ipad
101      156      5/1/2013      ipad
101      156      6/1/2013      ipad
101      156      7/1/2013      ipad
101      156      8/1/2013      ipad
101      156      9/1/2013      ipad
101      156      10/1/2013      ipad
101      156      11/1/2013      ipad
101      156      12/1/2013      ipad
101      156      1/1/2014      ipad
101      156      2/1/2014      ipad
101      156      1/1/2013      iPhone
101      156      2/1/2013      iPhone
101      156      3/1/2013      iPhone
101      156      4/1/2013      iPhone
101      156      5/1/2013      iPhone
101      156      6/1/2013      iPhone
101      156      7/1/2013      iPhone
101      156      8/1/2013      iPhone
101      156      9/1/2013      iPhone
101      156      10/1/2013      iPhone
101      156      11/1/2013      iPhone
101      156      12/1/2013      iPhone
101      156      1/1/2014      iPhone
101      156      2/1/2014      iPhone

Your assistance is greatly appreciated
Question by:fb1990
  • 6
  • 4
  • 2
  • +1
LVL 32

Expert Comment

ID: 39958824
Given your sample data, what do you expect the results to be?
LVL 35

Assisted Solution

PatHartman earned 250 total points
ID: 39959361
You need to start with a query that counts the devices per month.

Select Client, Cust, [Date], Count(*) As DeviceCount
From Your Table
Group by Client, Cust, [Date];

Then you would Count the clients who use each DeviceCount.

Select [Date], DeviceCount, Count(*) As ClientCount
From qry1
Group by [Date], DeviceCount

You can then use that query in a crosstab or other query to calculate the percent.

Author Comment

ID: 39959410
Thanks Pat.  I think you are on to something useful for me.  There can be times when the customer used each device more than 1 time in the month, but i only want to show that device is used one time regardless of how many times the device is used.  So i think an iteration of case statement might be necessary.  Do you have an idea of how this might be implemented?
Ransomware: The New Cyber Threat & How to Stop It

This infographic explains ransomware, type of malware that blocks access to your files or your systems and holds them hostage until a ransom is paid. It also examines the different types of ransomware and explains what you can do to thwart this sinister online threat.  

LVL 35

Expert Comment

ID: 39959459
Your dates were all the first of the month so I though the data was already summarized.  The first step would be to summarize it so that you have one record per user per device type per month.

Select Distinct Client, Cust, Format([Date], "yyyy/mm" As YearMonth, Device
From YourTable;

Queries are based on set theory so you don't do loops.  You define sets of data by using criteria.
LVL 32

Assisted Solution

awking00 earned 50 total points
ID: 39959490
So, if client 100 customer 145 used an iPad on 7/6/2013 and again on 7/19/2013 along with the use shown on 7/1/2013, that would only count as 1 device for client 100 customer 145 for the month of July 2013? I'd still like to see your expected results.

Author Comment

ID: 39959493
Got it..  The distinct is on customer.  I will need to apply your solution in a subquery.

Author Comment

ID: 39959641
Hello awking00, Yes you are correct with the description of my data that you gave above.

Here is my expected result:
Date      CountOfCust      CountOfDevice
1/1/2013      1      1
1/1/2013      1      2
2/1/2013      1      1
2/1/2013      1      2
3/1/2013      1      1
3/1/2013      1      2
4/1/2013      1      1
4/1/2013      1      2
5/1/2013      1      2
6/1/2013      1      2
7/1/2013      1      1
7/1/2013      2      2
8/1/2013      1      1
8/1/2013      2      2
9/1/2013      1      1
9/1/2013      2      2
10/1/2013      1      1
10/1/2013      2      2
11/1/2013      1      1
11/1/2013      2      2
12/1/2013      1      1
12/1/2013      2      2
1/1/2014      1      1
1/1/2014      2      2
2/1/2014      1      1
2/1/2014      1      2
2/1/2014      1      3

For example in Feb 2014, i have 1 customer using 1 device, 1 customer used 2 device and 1 customer used 3 devices.  

I am going about doing this hard way right now.  Any way to simplify the task is much appreciated..

LVL 35

Expert Comment

ID: 39960381
Nesting queries isn't the hard way if that is what you are doing.  You can use sub queries but they are harder to read in the QBE and due to the multiple summarizations that need to happen, they will be nested also.

My preference is the nested query method.  I know it looks cumbersome but it is easier to build since you are focused on one part at a time and they are easier to test since you can test them in phases where it is easier to check the details.  If you do the whole thing in a single query, there is no easy way to break it down for testing.

Author Comment

ID: 39960460
Hello EE,

Here is the query that accomplished the task, but like Pat suggested when this is implemented in my large database, readability becomes an issue.  The sample data was already summarize. The inner most query already summarized. My database looks like awking00 comment above

----So, if client 100 customer 145 used an iPad on 7/6/2013 and again on 7/19/2013 along with the use shown on 7/1/2013, that would only count as 1 device for client 100 customer 145 for the month of July 2013? I'd still like to see your expected results.

Do you guys have any idea to streamline and optimize the code for me.  You already got me thinking:
SELECT t.Date, Count(t.Cust) AS CountOfCust, t.CountOfDevice
FROM (SELECT [Sample].Client, [Sample].Cust, [Sample].Date, Count([Sample].Device) AS CountOfDevice FROM Sample GROUP BY [Sample].Client, [Sample].Cust, [Sample].Date)  AS t
GROUP BY t.Date, t.CountOfDevice

Open in new window

LVL 35

Accepted Solution

Mark Geerlings earned 200 total points
ID: 39962680
That query is close, but it is likely to give you a row for each day (that exists in the data), not one summary row for the month.  You will likely need a "to_char" or a "trunc" and a corresponding "group by" on the date column in your inner query to combine possible separate records into one, like this:

SELECT t.Date, Count(t.Cust) AS CountOfCust, t.CountOfDevice
      FROM (SELECT s.Client, s.Cust, to_char(s.Date,'MM') as Date, Count(s.Device) AS CountOfDevice
        FROM Sample s GROUP BY s.Client, s.Cust, to_char(s.Date,'MM'))  AS t
GROUP BY t.Date, t.CountOfDevice

Also, this does not handle records for different years correctly.  You will either need to add a filter condition to your inner query to limit the records to the current year, or you will need to change the 'MM' to something more like 'YYYY-MM'.

I like the fact that you did not include the "distinct" keyword in your query as PatHartman had suggested.  Using "distinct" in Oracle queries is something that I recommend avoiding.  In this case, using a group operator "count" plus a "group by" on any non-grouped columns gives you a distinct result already.  The "distinct" keyword in Oracle always forces a sort operation (which you may or may not need, but it takes time) and the results may not be as distinct as you expect especially if select multiple columns and one of them contains "date" values.  (They may then be "distinct" down to the second, and not to the day like you might expect.)
LVL 35

Expert Comment

ID: 39962723
The question was posted to the Access forum which is why I answered with Access syntax.  When you are summarizing by month, always include the year unless you have selection criteria that selects only a single year.

When you format the date column into yyyy/mm you also need to alter the group by clause because you don't want to keep grouping by date.  You need to group by the formatted date field.
LVL 35

Expert Comment

by:Mark Geerlings
ID: 39962762
Sorry, I didn't notice which area this was posted in.  I saw it under the Oracle area, so I responded assuming it was an Oracle question.

Author Comment

ID: 39962990
Thanks to everyone that responded.  i am actually getting the data from an oracle database and then using access as secondary since it allows me better view and manipulation of the data.  Pat got me thinking in the right direction, but i also like Mark's response.  I will close this case and accept both of your solutions as helping me solve this problem.  THANK YOU!

Author Closing Comment

ID: 39962994
Thanks everyone

Featured Post

Free Tool: Path Explorer

An intuitive utility to help find the CSS path to UI elements on a webpage. These paths are used frequently in a variety of front-end development and QA automation tasks.

One of a set of tools we're offering as a way of saying thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

I remember the day when someone asked me to create a user for an application developement. The user should be able to create views and materialized views and, so, I used the following syntax: (CODE) This way, I guessed, I would ensure that use…
As tax season makes its return, so does the increase in cyber crime and tax refund phishing that comes with it
Via a live example, show how to restore a database from backup after a simulated disk failure using RMAN.
With Microsoft Access, learn how to specify relationships between tables and set various options on the relationship. Add the tables: Create the relationship: Decide if you’re going to set referential integrity: Decide if you want cascade upda…

809 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question