Counting and distint counting - query help!

James Murphy
James Murphy used Ask the Experts™
on
Hi all,

firstly thank you for taking a look at my question - your help is invaluable!!

basically if I have the following table called a_debt:

row_id      de_account_number      de_listed_date
1      123    1/01/2018
2      124    1/01/2018
3      126    1/01/2018
4      128    1/01/2018
5      129    1/01/2018
6      123    2/01/2018
7      124    2/01/2018
8      126    2/01/2018
9      128    2/01/2018
10      129    2/01/2018
11      123    3/01/2018
12      124    3/01/2018
13      126    3/01/2018
14      129    3/01/2018
15      130    3/01/2018
16      131    3/01/2018
17      132    3/01/2018
18      123    4/01/2018

The data I am trying to pull is this.

column 1: count of the different account numbers loaded for each date
column 2: the date that has been counted
column 3: a count of the files for each date, that do not have a prior listed date, ie for the 4/1/2018 this column would = 1, for the 1/1/2018 the value would be 5, the 2/1/2018 would be 0 etc.

Is this something I could get help with?

many thanks!!

cheers
Comment
Watch Question

Do more with

Expert Office
EXPERT OFFICE® is a registered trademark of EXPERTS EXCHANGE®
Raja Jegan RSQL Server DBA & Architect, EE Solution Guide
Awarded 2009
Distinguished Expert 2018

Commented:
Hope this meets your requirement, if not post the result set query is returning along with expected result set.. Change your table name before executing the script..
select COUNT(DISTINCT de_account_number) , de_listed_date, COUNT(de_account_number)
FROM ur_table
GROUP BY de_listed_date

Open in new window

Author

Commented:
Hi This is the result.
COLUMN1      de_listed_date      COLUMN2
1486      30/01/2018      1486
1491      31/01/2018      1491
1421      1/02/2018      1421
1313      2/02/2018      1313
1429      5/02/2018      1429
2377      6/02/2018      2377


the count is correct, but what the second count needs to do count up the unique files that were loaded today not the files that were loaded up yetserday or prior that have the same de_cl_account as other files.


This is the query as it currently stands.

SELECT
  Count(DISTINCT d.de_cl_account),
  d.de_listed_date,
  Count(d.de_cl_account)
FROM
  a_debtor d
GROUP BY
  d.de_listed_date
Raja Jegan RSQL Server DBA & Architect, EE Solution Guide
Awarded 2009
Distinguished Expert 2018

Commented:
>> what the second count needs to do count up the unique files that were loaded today not the files that were loaded up yetserday or prior that have the same de_cl_account as other files.

Kindly give some sample data to explain this scenario so that I can modify the query accordingly..
Should you be charging more for IT Services?

Do you wonder if your IT business is truly profitable or if you should raise your prices? Learn how to calculate your overhead burden using our free interactive tool and use it to determine the right price for your IT services. Start calculating Now!

Author

Commented:
Hi,

I have attached a file with a few days in it.

Basically each day there a number of files presented each day from a customer.  Say the customer gives us 2780 files on Monday 28th of May, and then on Tuesday we received a file with 3100 files in it, 2500 of these files might be in the list that was given to us on the Monday, so for the Tuesday the count might be 3100, but the unique count for that day (ie files that hadn't been received prior) would be 600 (3100-2500) and so so for each day.

Does that make sense?  sorry it has been a long day, and getting a little weary so my words are starting to fail me a little!
SQL Server DBA & Architect, EE Solution Guide
Awarded 2009
Distinguished Expert 2018
Commented:
No issues, got your requirement well now and below code works fine with your sample data..
Hope this helps and change your table_name before running query.
; with cte as (
select de_listed_date, de_account_number, row_number() over (partition by de_account_number order by de_listed_date) rnum
from ur_table
)
select de_listed_date, count(distinct de_account_number) count_per_day, sum(case when rnum = 1 then 1 else 0 end) new_count_per_day
from cte
group by de_listed_date

Open in new window

Author

Commented:
Hi,

I ran it, but I only got 1's and 0's for the results rather than a count.

Did I copy it in wrongly?

Author

Commented:
I tried this:

 SELECT count(arcdeb.de_cl_account)
FROM   a_debtor arcdeb
 where not exists (SELECT
  a_debtor.de_cl_account as debt_acc
FROM
  a_debtor
WHERE
  a_debtor.de_listed_date BETWEEN '2017-12-01' AND '2018-05-27'
  group by a_debtor.de_cl_account
     ) and arcdeb.de_listed_date = '2018-05-28'

but it didn't work.
Raja Jegan RSQL Server DBA & Architect, EE Solution Guide
Awarded 2009
Distinguished Expert 2018

Commented:
Seems so, query I've provided was working fine with the below result set for the sample data in your question..
de_listed_date      count_per_day      new_count_per_day
2018-01-01 00:00:00.000      5      5
2018-02-01 00:00:00.000      5      0
2018-03-01 00:00:00.000      7      3
2018-04-01 00:00:00.000      1      0
PortletPaulEE Topic Advisor
Most Valuable Expert 2014
Awarded 2013

Commented:
column 3: a count of the files for each date, that do not have a prior listed date, ie for the 4/1/2018 this column would = 1, for the 1/1/2018 the value would be 5, the 2/1/2018 would be 0 etc.
This does not make sense to me.

row_id      de_account_number      de_listed_date
1      123    1/01/2018 <<== account 123 here
2      124    1/01/2018
3      126    1/01/2018
4      128    1/01/2018
5      129    1/01/2018
6      123    2/01/2018 <<== account 123 here
7      124    2/01/2018
8      126    2/01/2018
9      128    2/01/2018
10      129    2/01/2018
11      123    3/01/2018 <<== account 123 here
12      124    3/01/2018
13      126    3/01/2018
14      129    3/01/2018
15      130    3/01/2018
16      131    3/01/2018
17      132    3/01/2018
18      123    4/01/2018  <<== you say 1, but account 123 has been mentioned 3 times before this date

Please explain you arrive at a count of 1 on 04/01/2018

The results by Raja Jegan R look correct to me although I would alter the method ever so slightly. I prefer to use COUNT() when counting. The count function only increments non-null values
SELECT
    de_listed_date
  , COUNT(DISTINCT de_account_number) count_per_day
  , COUNT(CASE WHEN rnum = 1 THEN 1
        ELSE NULL
    END)                              new_count_per_day
FROM (
            SELECT
                de_listed_date
              , de_account_number
              , ROW_NUMBER() OVER (PARTITION BY de_account_number ORDER BY de_listed_date) rnum
            FROM ur_table
        ) derived
GROUP BY
    de_listed_date

Open in new window

I also prefer not to use CTEs unless there is a compelling reason for their use. In this query a CTE is just another form of derived table.

No points required please.

Author

Commented:
i do apolagise!!! I just found the error, when I gave you the data, I must have renamed the field, and it looked correct to me, it is working precisely now as it should!!

damn I feel like an idiot!!!!!
Raja Jegan RSQL Server DBA & Architect, EE Solution Guide
Awarded 2009
Distinguished Expert 2018

Commented:
Thanks, no issues.
Please feel free to close the question if the query is working fine..

Author

Commented:
Thank you! Very much appreciated, you got me out of a bind!! this site is gold!!

Do more with

Expert Office
Submit tech questions to Ask the Experts™ at any time to receive solutions, advice, and new ideas from leading industry professionals.

Start 7-Day Free Trial