We help IT Professionals succeed at work.

how to get sum of distinct records

Hi experts
i have an SQL query that return a distinct records

select distinct VENDOR_NUMBER,VENDOR_NAME,CHECK_DATE,EMAIL_ADDRESS,FAX,check_amount
from MOF_AP_REMITTANCE_GFOS_HEADER_HISTORY
 where VENDOR_NUMBER = 4382501 and DATEPART(yy, CHECK_DATE) = '2015'

Open in new window


vno          vname   check_date    email               fax         check_amount
---------------------------------------------------------------------------------------------------
4382501      Zayani      2015-01-15       xx@bb.com      NULL      1460.000
4382501      Zayani      2015-01-29       xx@bb.com      NULL      5556.000
4382501      Zayani      2015-02-15       xx@bb.com      NULL      730.000
4382501      Zayani      2015-02-15       xx@bb.com      NULL      11781.000

i need to get the sum of each record if i have more than one record
i need the result to look like the following

vno          vname   check_date    email               fax         SUM check_amount
---------------------------------------------------------------------------------------------------
4382501      Zayani      2015-01-15       xx@bb.com      NULL      1460.000
4382501      Zayani      2015-01-29       xx@bb.com      NULL      5556.000
4382501      Zayani      2015-02-15       xx@bb.com      NULL      12511.000

any suggestion
Comment
Watch Question

Do you really mean to select disctinct or do you just want this? This will sum on check_amount.

select VENDOR_NUMBER,VENDOR_NAME,CHECK_DATE,EMAIL_ADDRESS,FAX, SUM(check_amount) AS check_total
from MOF_AP_REMITTANCE_GFOS_HEADER_HISTORY
 where VENDOR_NUMBER = 4382501 and DATEPART(yy, CHECK_DATE) = '2015'

Author

Commented:
i tried to use SUM function but i am getting wrong result

 select VENDOR_NUMBER,VENDOR_NAME,CHECK_DATE,EMAIL_ADDRESS,FAX, sum(check_amount)
from MOF_AP_REMITTANCE_GFOS_HEADER_HISTORY
 where VENDOR_NUMBER = 4382501 and DATEPART(yy, CHECK_DATE) = '2015' 
 group by VENDOR_NUMBER,VENDOR_NAME,CHECK_DATE,EMAIL_ADDRESS,FAX
 order by CHECK_DATE desc

Open in new window


beacuse it dose'nt select distinct record
select DISTINCT VENDOR_NUMBER,VENDOR_NAME,CHECK_DATE,EMAIL_ADDRESS,FAX, check_amount
from MOF_AP_REMITTANCE_GFOS_HEADER_HISTORY
 where VENDOR_NUMBER = 4382501 and DATEPART(yy, CHECK_DATE) = '2015'  INTO TABLE temp_mof

select VENDOR_NUMBER,VENDOR_NAME,CHECK_DATE,EMAIL_ADDRESS,FAX, SUM(check_amount) AS check_total
from temp_mof
GROUP BY VENDOR_NUMBER,VENDOR_NAME,CHECK_DATE,EMAIL_ADDRESS,FAX
Vitor MontalvãoIT Engineer
Distinguished Expert 2017

Commented:
Give us an example of non distinct values returned by the query with the SUM function.
PortletPaulEE Topic Advisor
Most Valuable Expert 2014
Awarded 2013

Commented:
"SELECT DISTINCT" cannot & will not produce non-distinct rows.

Please note that EVERY COLUMN is considered when determining "distinct" so the numbers below are considered.

4382501      Zayani      2015-02-15       xx@bb.com      NULL      730.000
4382501      Zayani      2015-02-15       xx@bb.com      NULL      11781.000

and the two rows ARE distinct because the numbers are different.
--------------------------
recommendation:

don't use "select distinct", use a group by instead. Like this:
SELECT
      VENDOR_NUMBER
    , VENDOR_NAME
    , CHECK_DATE
    , EMAIL_ADDRESS
    , FAX
    , SUM(check_amount) summed_check_amount
FROM MOF_AP_REMITTANCE_GFOS_HEADER_HISTORY
WHERE VENDOR_NUMBER = 4382501
      AND DATEPART(yy, CHECK_DATE) = '2015'
GROUP BY
      VENDOR_NUMBER
    , VENDOR_NAME
    , CHECK_DATE
    , EMAIL_ADDRESS
    , FAX
;

Open in new window

see Select Distinct is returning duplicates
Lokesh B RDeveloper
Top Expert 2015

Commented:
SELECT   VENDOR_NUMBER  , VENDOR_NAME  , CHECK_DATE , EMAIL_ADDRESS  , FAX , SUM(check_amount) AS check_amount
FROM MOF_AP_REMITTANCE_GFOS_HEADER_HISTORY WHERE VENDOR_NUMBER = 4382501 AND DATEPART(YEAR, CHECK_DATE) = '2015'
GROUP BY   VENDOR_NUMBER  , VENDOR_NAME , CHECK_DATE , EMAIL_ADDRESS , FAX

Open in new window

Dheeraj BabooTeam Lead

Commented:
select VENDOR_NUMBER,VENDOR_NAME,CHECK_DATE,EMAIL_ADDRESS,FAX,SUM(check_amount) [Sum check_amount]
from MOF_AP_REMITTANCE_GFOS_HEADER_HISTORY
group by VENDOR_NUMBER,VENDOR_NAME,CHECK_DATE,EMAIL_ADDRESS,FAX
 having VENDOR_NUMBER = 4382501 and DATEPART(yy, CHECK_DATE) = '2015'
PortletPaulEE Topic Advisor
Most Valuable Expert 2014
Awarded 2013

Commented:
@Dheeraj Baboo converting the where clause into a having clause is incorrect.
The having clause is for conditions that exist AFTER the group by
       e.g. having sum(value) > 1000

Author

Commented:
thank you all for the replies
i found solution in another post
select VENDOR_NUMBER,VENDOR_NAME,CHECK_DATE,
       EMAIL_ADDRESS,FAX,SUM(check_amount) as check_amount 
from (select distinct 
             VENDOR_NUMBER,VENDOR_NAME,CHECK_DATE,
             EMAIL_ADDRESS,FAX,check_amount 
      from MOF_AP_REMITTANCE_GFOS_HEADER_HISTORY 
      where VENDOR_NUMBER = 4382501 and DATEPART(yy, CHECK_DATE) = '2015') as fo
GROUP BY VENDOR_NUMBER,VENDOR_NAME,CHECK_DATE,EMAIL_ADDRESS,FAX

Open in new window

PortletPaulEE Topic Advisor
Most Valuable Expert 2014
Awarded 2013

Commented:
except that the results may be incorrect.

You should not apply "select distinct" amongst rows that get summed.

If 2 identical cheques are written for the same amount, using select distinct, the result is only 50% of the true total.

Please reconsider your chosen solution. It has the potential to be incorrect.

Also, for the example data presented in the question, you are not getting any benefit from  select distinct, all it will do is slow down the query.
EE Topic Advisor
Most Valuable Expert 2014
Awarded 2013
Commented:
Your chosen query:
select VENDOR_NUMBER,VENDOR_NAME,CHECK_DATE,
       EMAIL_ADDRESS,FAX,SUM(check_amount) as check_amount 
from (select distinct 
             VENDOR_NUMBER,VENDOR_NAME,CHECK_DATE,
             EMAIL_ADDRESS,FAX,check_amount 
      from MOF_AP_REMITTANCE_GFOS_HEADER_HISTORY 
      where VENDOR_NUMBER = 4382501 and DATEPART(yy, CHECK_DATE) = '2015') as fo
GROUP BY VENDOR_NUMBER,VENDOR_NAME,CHECK_DATE,EMAIL_ADDRESS,FAX
;

Open in new window


Produced this result:
| VENDOR_NUMBER | VENDOR_NAME |         CHECK_DATE | EMAIL_ADDRESS |    FAX | CHECK_AMOUNT |
|---------------|-------------|--------------------|---------------|--------|--------------|
|       4382501 |      Zayani |  January, 15 2015  |     xx@bb.com | (null) |         1460 |
|       4382501 |      Zayani |  January, 29 2015  |     xx@bb.com | (null) |         5556 |
|       4382501 |      Zayani | February, 15 2015  |     xx@bb.com | (null) |        12511 |

Open in new window


and this execution plan:distinct.png
This query (NOT using distinct) produces the EXACT SAME result:
    SELECT
          VENDOR_NUMBER, VENDOR_NAME , CHECK_DATE , EMAIL_ADDRESS , FAX
        , SUM(check_amount) check_amount
    FROM MOF_AP_REMITTANCE_GFOS_HEADER_HISTORY
    WHERE VENDOR_NUMBER = 4382501
          AND DATEPART(yy, CHECK_DATE) = '2015'
    GROUP BY
          VENDOR_NUMBER, VENDOR_NAME , CHECK_DATE , EMAIL_ADDRESS , FAX



| VENDOR_NUMBER | VENDOR_NAME |         CHECK_DATE | EMAIL_ADDRESS |    FAX | CHECK_AMOUNT |
|---------------|-------------|--------------------|---------------|--------|--------------|
|       4382501 |      Zayani |  January, 15 2015  |     xx@bb.com | (null) |         1460 |
|       4382501 |      Zayani |  January, 29 2015  |     xx@bb.com | (null) |         5556 |
|       4382501 |      Zayani | February, 15 2015  |     xx@bb.com | (null) |        12511 |

Open in new window


with this execution planno distinct, same result, cheaper in effort
So, avoiding "select distinct" produces the same result, for less effort.

The above may be re-proven by visiting http://sqlfiddle.com/#!3/39eca/2
details of that:
**MS SQL Server 2008 Schema Setup**:

    
    CREATE TABLE MOF_AP_REMITTANCE_GFOS_HEADER_HISTORY 
    	([VENDOR_NUMBER] int, [VENDOR_NAME] varchar(6), [check_date] datetime
         , [EMAIL_ADDRESS] varchar(9), [fax] varchar(4), [check_amount] int)
    ;
    	
    INSERT INTO MOF_AP_REMITTANCE_GFOS_HEADER_HISTORY 
    	([VENDOR_NUMBER], [VENDOR_NAME], [check_date], [EMAIL_ADDRESS], [fax], [check_amount])
    VALUES
    	(4382501, 'Zayani', '2015-01-15 00:00:00', 'xx@bb.com', NULL, 1460.000),
    	(4382501, 'Zayani', '2015-01-29 00:00:00', 'xx@bb.com', NULL, 5556.000),
    	(4382501, 'Zayani', '2015-02-15 00:00:00', 'xx@bb.com', NULL, 730.000),
    	(4382501, 'Zayani', '2015-02-15 00:00:00', 'xx@bb.com', NULL, 11781.000)
    ;
    
    
    
    CREATE TABLE sample_data 
    	([VENDOR_NUMBER] int, [VENDOR_NAME] varchar(6), [check_date] datetime
         , [EMAIL_ADDRESS] varchar(9), [fax] varchar(4), [check_amount] int)
    ;
    	
    INSERT INTO sample_data
    	([VENDOR_NUMBER], [VENDOR_NAME], [check_date], [EMAIL_ADDRESS], [fax], [check_amount])
    VALUES
    	(4382501, 'Zayani', '2015-01-15 00:00:00', 'xx@bb.com', NULL, 1460.000),
    	(4382501, 'Zayani', '2015-01-29 00:00:00', 'xx@bb.com', NULL, 5556.000),
    	(4382501, 'Zayani', '2015-02-15 00:00:00', 'xx@bb.com', NULL, 730.000),
    	(4382501, 'Zayani', '2015-02-15 00:00:00', 'xx@bb.com', NULL, 730.000)
    ;
    

**Query 1**:

    
    
    select VENDOR_NUMBER,VENDOR_NAME,CHECK_DATE,
           EMAIL_ADDRESS,FAX,SUM(check_amount) as check_amount 
    from (select distinct 
                 VENDOR_NUMBER,VENDOR_NAME,CHECK_DATE,
                 EMAIL_ADDRESS,FAX,check_amount 
          from MOF_AP_REMITTANCE_GFOS_HEADER_HISTORY 
          where VENDOR_NUMBER = 4382501 and DATEPART(yy, CHECK_DATE) = '2015') as fo
    GROUP BY VENDOR_NUMBER,VENDOR_NAME,CHECK_DATE,EMAIL_ADDRESS,FAX
    

**[Results][2]**:
    
    | VENDOR_NUMBER | VENDOR_NAME |         CHECK_DATE | EMAIL_ADDRESS |    FAX | CHECK_AMOUNT |
    |---------------|-------------|--------------------|---------------|--------|--------------|
    |       4382501 |      Zayani |  January, 15 2015  |     xx@bb.com | (null) |         1460 |
    |       4382501 |      Zayani |  January, 29 2015  |     xx@bb.com | (null) |         5556 |
    |       4382501 |      Zayani | February, 15 2015  |     xx@bb.com | (null) |        12511 |


**Query 2**:

    SELECT
          VENDOR_NUMBER, VENDOR_NAME , CHECK_DATE , EMAIL_ADDRESS , FAX
        , SUM(check_amount) check_amount
    FROM MOF_AP_REMITTANCE_GFOS_HEADER_HISTORY
    WHERE VENDOR_NUMBER = 4382501
          AND DATEPART(yy, CHECK_DATE) = '2015'
    GROUP BY
          VENDOR_NUMBER, VENDOR_NAME , CHECK_DATE , EMAIL_ADDRESS , FAX
    

**[Results][3]**:
    
    | VENDOR_NUMBER | VENDOR_NAME |         CHECK_DATE | EMAIL_ADDRESS |    FAX | CHECK_AMOUNT |
    |---------------|-------------|--------------------|---------------|--------|--------------|
    |       4382501 |      Zayani |  January, 15 2015  |     xx@bb.com | (null) |         1460 |
    |       4382501 |      Zayani |  January, 29 2015  |     xx@bb.com | (null) |         5556 |
    |       4382501 |      Zayani | February, 15 2015  |     xx@bb.com | (null) |        12511 |



  [1]: http://sqlfiddle.com/#!3/39eca/2

  [2]: http://sqlfiddle.com/#!3/39eca/2/0

  [3]: http://sqlfiddle.com/#!3/39eca/2/1

Open in new window


Now let's look at when there really is two valid, but identical transactions

CREATE TABLE sample_data
    ([VENDOR_NUMBER] int, [VENDOR_NAME] varchar(6), [check_date] datetime
     , [EMAIL_ADDRESS] varchar(9), [fax] varchar(4), [check_amount] int)
;
   
INSERT INTO sample_data
    ([VENDOR_NUMBER], [VENDOR_NAME], [check_date], [EMAIL_ADDRESS], [fax], [check_amount])
VALUES
    (4382501, 'Zayani', '2015-01-15 00:00:00', 'xx@bb.com', NULL, 1460.000),
    (4382501, 'Zayani', '2015-01-29 00:00:00', 'xx@bb.com', NULL, 5556.000),
   (4382501, 'Zayani', '2015-02-15 00:00:00', 'xx@bb.com', NULL, 730.000), --<< identical but valid
    (4382501, 'Zayani', '2015-02-15 00:00:00', 'xx@bb.com', NULL, 730.000)
 --<< identical but valid
;

Using SELECT DISTINCT (incorrect):
| VENDOR_NUMBER | VENDOR_NAME |         CHECK_DATE | EMAIL_ADDRESS |    FAX | CHECK_AMOUNT |
|---------------|-------------|--------------------|---------------|--------|--------------|
|       4382501 |      Zayani |  January, 15 2015  |     xx@bb.com | (null) |         1460 |
|       4382501 |      Zayani |  January, 29 2015  |     xx@bb.com | (null) |         5556 |
|       4382501 |      Zayani | February, 15 2015  |     xx@bb.com | (null) |          730 |

Open in new window


There is no reason why a person could not write 2 checks for the same amount on the same day. So "select distinct" has ignored a row, and the reult is half of what it should be.

NOT using select distinct (correct):
| VENDOR_NUMBER | VENDOR_NAME |         CHECK_DATE | EMAIL_ADDRESS |    FAX | CHECK_AMOUNT |
|---------------|-------------|--------------------|---------------|--------|--------------|
|       4382501 |      Zayani |  January, 15 2015  |     xx@bb.com | (null) |         1460 |
|       4382501 |      Zayani |  January, 29 2015  |     xx@bb.com | (null) |         5556 |
|       4382501 |      Zayani | February, 15 2015  |     xx@bb.com | (null) |         1460 |

Open in new window


This can be re-proven at: http://sqlfiddle.com/#!3/39eca/3
details of this:
**MS SQL Server 2008 Schema Setup**:

    
    CREATE TABLE MOF_AP_REMITTANCE_GFOS_HEADER_HISTORY 
    	([VENDOR_NUMBER] int, [VENDOR_NAME] varchar(6), [check_date] datetime
         , [EMAIL_ADDRESS] varchar(9), [fax] varchar(4), [check_amount] int)
    ;
    	
    INSERT INTO MOF_AP_REMITTANCE_GFOS_HEADER_HISTORY 
    	([VENDOR_NUMBER], [VENDOR_NAME], [check_date], [EMAIL_ADDRESS], [fax], [check_amount])
    VALUES
    	(4382501, 'Zayani', '2015-01-15 00:00:00', 'xx@bb.com', NULL, 1460.000),
    	(4382501, 'Zayani', '2015-01-29 00:00:00', 'xx@bb.com', NULL, 5556.000),
    	(4382501, 'Zayani', '2015-02-15 00:00:00', 'xx@bb.com', NULL, 730.000),
    	(4382501, 'Zayani', '2015-02-15 00:00:00', 'xx@bb.com', NULL, 11781.000)
    ;
    
    
    
    CREATE TABLE sample_data 
    	([VENDOR_NUMBER] int, [VENDOR_NAME] varchar(6), [check_date] datetime
         , [EMAIL_ADDRESS] varchar(9), [fax] varchar(4), [check_amount] int)
    ;
    	
    INSERT INTO sample_data
    	([VENDOR_NUMBER], [VENDOR_NAME], [check_date], [EMAIL_ADDRESS], [fax], [check_amount])
    VALUES
    	(4382501, 'Zayani', '2015-01-15 00:00:00', 'xx@bb.com', NULL, 1460.000),
    	(4382501, 'Zayani', '2015-01-29 00:00:00', 'xx@bb.com', NULL, 5556.000),
    	(4382501, 'Zayani', '2015-02-15 00:00:00', 'xx@bb.com', NULL, 730.000),
    	(4382501, 'Zayani', '2015-02-15 00:00:00', 'xx@bb.com', NULL, 730.000)
    ;
    

**Query 1**:

    
    
    select VENDOR_NUMBER,VENDOR_NAME,CHECK_DATE,
           EMAIL_ADDRESS,FAX,SUM(check_amount) as check_amount 
    from (select distinct 
                 VENDOR_NUMBER,VENDOR_NAME,CHECK_DATE,
                 EMAIL_ADDRESS,FAX,check_amount 
          from sample_data
          where VENDOR_NUMBER = 4382501 and DATEPART(yy, CHECK_DATE) = '2015') as fo
    GROUP BY VENDOR_NUMBER,VENDOR_NAME,CHECK_DATE,EMAIL_ADDRESS,FAX
    

**[Results][2]**:
    
    | VENDOR_NUMBER | VENDOR_NAME |         CHECK_DATE | EMAIL_ADDRESS |    FAX | CHECK_AMOUNT |
    |---------------|-------------|--------------------|---------------|--------|--------------|
    |       4382501 |      Zayani |  January, 15 2015  |     xx@bb.com | (null) |         1460 |
    |       4382501 |      Zayani |  January, 29 2015  |     xx@bb.com | (null) |         5556 |
    |       4382501 |      Zayani | February, 15 2015  |     xx@bb.com | (null) |          730 |


**Query 2**:

    SELECT
          VENDOR_NUMBER, VENDOR_NAME , CHECK_DATE , EMAIL_ADDRESS , FAX
        , SUM(check_amount) check_amount
    FROM sample_data
    WHERE VENDOR_NUMBER = 4382501
          AND DATEPART(yy, CHECK_DATE) = '2015'
    GROUP BY
          VENDOR_NUMBER, VENDOR_NAME , CHECK_DATE , EMAIL_ADDRESS , FAX
    

**[Results][3]**:
    
    | VENDOR_NUMBER | VENDOR_NAME |         CHECK_DATE | EMAIL_ADDRESS |    FAX | CHECK_AMOUNT |
    |---------------|-------------|--------------------|---------------|--------|--------------|
    |       4382501 |      Zayani |  January, 15 2015  |     xx@bb.com | (null) |         1460 |
    |       4382501 |      Zayani |  January, 29 2015  |     xx@bb.com | (null) |         5556 |
    |       4382501 |      Zayani | February, 15 2015  |     xx@bb.com | (null) |         1460 |



  [1]: http://sqlfiddle.com/#!3/39eca/3

  [2]: http://sqlfiddle.com/#!3/39eca/3/0

  [3]: http://sqlfiddle.com/#!3/39eca/3/1

Open in new window

awking00Information Technology Specialist

Commented:
You can use distinct with analytics -
select distinct VENDOR_NUMBER,VENDOR_NAME,CHECK_DATE,EMAIL_ADDRESS,FAX,
sum(check_amount) over (partition by vendor_number, check_date order by check_date) as check_amount
from MOF_AP_REMITTANCE_GFOS_HEADER_HISTORY
where VENDOR_NUMBER = 4382501 and DATEPART(yy, CHECK_DATE) = '2015'
order by check_date;