SQL Query that uses count(*) over, possible to use distinct?

I have the following query that I use as a part of a custom search engine on my site. I've stripped it down for readability, the joined tables will have columns in the select clause, that's why they're there:

select distinct ( u.userID ), u.lastLogin, totalUsers = COUNT( u.userID ) OVER() 
From   ZipCodes z  
	inner join tblUsers u on z.zipCode = u.zipCode  
	inner join tblUserOffers uo on u.userID = uo.userID 
Where <a big long where clause>
order by u.lastLogin OFFSET 0 
ROWS FETCH NEXT 500 ROWS ONLY 

Open in new window


my problem is with the totalUsers field. It's listing all of the records when I should get results based off of distinct( userID ).

For example, when I run the query with certain parameters, I get back 21 rows of data, however the totalUsers column has a value of 46. If I remove the DISTINCT clause, I get back 46 rows of data, however there are duplicates (and I'm not sure if I'm enough of a sql ninja to tune the complex query to prevent dupes). This query (part of a stored procedure) will be used often, so it needs to be as efficient as possible.

Cheers in advance for any help :)
LVL 34
Big MontySenior Web Developer / CEO of ExchangeTree.org Asked:
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

HainKurtSr. System AnalystCommented:
above query looks weird...
what are you trying to do?

give a sample from input and the result you are looking for...
0
PortletPaulfreelancerCommented:
"select distinct" is NOT A FUNCTION, the parentheses around u.userID have NO EFFECT whatsoever

"select distinct" ensures the output to be different on each row - so when you state that the rows have duplicates technically that is impossible - there has to be something different on each row.

You need to remove your reliance on "select distinct" - this will probably mean a complete re-write of the query. I'd like to help more but without the full query it isn't really possible. Please read the following:
https://www.experts-exchange.com/articles/12282/Select-Distinct-is-returning-duplicates.html

Basically you need to use one or more subqueries, in one of them you will be able to do the count, and then join it all together without need "select distinct" at all. Something like this:
SELECT
      u.userID
    , u.lastLogin
    , z.totalUsers
from tblUsers u
INNER JOIN (
      SELECT
            zipCode
          , COUNT(*) AS totalUsers
      FROM tblUsers
      GROUP BY zipCode
) z ON z.zipCode = u.zipCode
INNER JOIN tblUserOffers uo ON u.userID = uo.userID
WHERE <a big long where clause>
ORDER BY u.lastLogin OFFSET 0
ROWS FETCH NEXT 500 ROWS ONLY

Open in new window

0
Nakul VachhrajaniTechnical Architect, Capgemini IndiaCommented:
Without sample input  and output, I'm flying blind but here goes:

If you are expecting the total column to return 21, you can do the following:

select distinct u.userID, u.lastLogin, totalUsers = COUNT( DISTINCT u.userID ) OVER() 
From   ZipCodes z  
	inner join tblUsers u on z.zipCode = u.zipCode  
	inner join tblUserOffers uo on u.userID = uo.userID 
Where <a big long where clause>
order by u.lastLogin OFFSET 0 
ROWS FETCH NEXT 500 ROWS ONLY 

Open in new window

0
10 Tips to Protect Your Business from Ransomware

Did you know that ransomware is the most widespread, destructive malware in the world today? It accounts for 39% of all security breaches, with ransomware gangsters projected to make $11.5B in profits from online extortion by 2019.

Big MontySenior Web Developer / CEO of ExchangeTree.org Author Commented:
Huseyin - i'm pulling data from my users table, along with related data from other tables (thus the reason for the join). I'm also using paging through the use of OFFSET FETCH NEXT. I tried to simplify the query for readability, that's probably why it looks weird.

Nakul - that was one of the first things I tried. Unfortunately it's invalid syntax, distinct can't be used with COUNT OVER

Paul - I'm aware that distinct isn't a function, the extra parentheses were left over from my trying count(*) inside the distinct, and I just forgot to remove them :) I think you're right, I think the best approach is to eliminate my use of using DISTINCT. I think I've identified the issue, the 2nd join I think is causing the issue:

 inner join tblUserOffers uo on u.userID = uo.userID

This join is here because I have references to it in my full select clause:

select u.userID, fullName = u.firstName + ' ' + u.lastName, u.displayName, u.city, u.stateProvince,
      Round( dbo.CalculateDistance(-71.43520, 42.32320, z.Longitude, z.Latitude), 2 ) As Distance,
      profilePicture = case when isNull( u.profilePicture, '' ) = '' then '/images/person.png' else u.profilePicture end,
      confirmed = case u.confirmed when 1 then '<html string>' else '' end,
      ( SELECT  offer + ', '       FROM tblOffers o             inner join tblUserOffers uo on o.offerID = uo.offerID       WHERE u.userID = uo.userID and       uo.visible = 1 FOR XML PATH ('') )  as offers,       
      ( SELECT count( o.offerID )         FROM tblOffers o             inner join tblUserOffers uo on uo.offerID = o.offerID         WHERE u.userID = uo.userID and uo.visible = 1 FOR XML PATH ('')    )  as offerCount,

      u.lastLogin, totalUsers = COUNT( * ) OVER()

the purpose of the bolded part is to list what each user is offering on the site (the site is an exchange and barter website, where users fill out profiles and what the want to offer on the site).

- tblOffers is a table filled with offer definitions. It allows users to select from a generic list and attach them to their profile
- tblUserOffers is cross table indicating the offerID and who it belongs to, defined by userID.

When I comment out the 2nd join, I get back individual results. However, it returns ALL records found, including the users with no offers, which I want to avoid.

What am I missing here?
0
PortletPaulfreelancerCommented:
could you provide a few rows of data for each table please?

and, from that data what you would expect to get as the result?

this is always the smplest, most productive/fastes method for solving SQL problems
0
Big MontySenior Web Developer / CEO of ExchangeTree.org Author Commented:
sure, here is what I expect the results to look like:

userID      fullName      displayName      city      stateProvince      Distance      profilePicture      confirmed      offers      offerCount      lastLogin      totalUsers
15188      xxx      xxx      Framingham      Massachusetts      0      /uploads/members/15188.jpg      confirmedHTML      Diet & Lifestyle Coaching ,       1      NULL      21
135      xxx      xxx    Framingham      Massachusetts      0      /uploads/members/135.jpg      confirmedHTML      Disability Consulting, Editing Papers/Essays,       2      2016-02-23 11:54:52.390      21
15209      xxx     xxxx    Framingham      Massachusetts      0      /images/person.png            Viola Lessons, Violin Lessons, Ballroom Dance Lessons,       3      2016-12-01 20:37:07.223      21

tblUsers contains the following columns:

userID      fullName      displayName      city      stateProvince       profilePicture      confirmed      lastLogin      

tblOffers looks like:

offerID      offer      description      categoryID      createDate      active      isXTOffer
101      Diet & Lifestyle Coaching147      2015-04-27 16:08:50.027      1      1
102      Disability Consulting      Disability Consulting desc      113      2015-04-27 16:09:22.217      1      1
103      Web Design      web design desc      107      2015-04-27 16:10:42.580      1      1
106      Editing Papers/Essays      Editing Papers/Essays      104      2015-05-08 19:00:54.083      1      1

and tblUserOffers looks like:

userOfferID      offerID      userID
1194               101      100
573                102      5593
605                       103      6679
0
Big MontySenior Web Developer / CEO of ExchangeTree.org Author Commented:
if a spreadsheet would be easier, let me know and I'll build one up
0
PortletPaulfreelancerCommented:
Try to imagine that you want to help somebody who is having trouble - but ALL you know about the problem is on this page.

Also remember there isn't a solitary thing we can do in SQL - to demonstrate or prove something - until we have BOTH tables and data.

Every referenced column of Every table needs to be in the sample - otherwise the query will just fail.

Here is a working example:
DECLARE @tblOffers TABLE
    ([offerID] int, [offer] varchar(28), [description] varchar(26), [categoryID] int, [createDate] varchar(7), [active] varchar(4), [isXTOffer] varchar(4))
;
    
INSERT INTO @tblOffers 	
    ([offerID], [offer], [description], [categoryID], [createDate], [active], [isXTOffer])
VALUES
    (101, 'Diet & Lifestyle Coaching147', '08:50.0', 1, '1', NULL, NULL),
    (102, 'Disability Consulting', 'Disability Consulting desc', 113, '09:22.2', '1', '1'),
    (103, 'Web Design', 'web design desc', 107, '10:42.6', '1', '1')
;

DECLARE @tblUserOffers TABLE
    ([userOfferID] int, [offerID] int, [userID] int, [visible] int)
;
    
INSERT INTO @tblUserOffers 
    ([userOfferID], [offerID], [userID])
VALUES
    (1194, 101, 100),
    (573, 102, 5593),
    (605, 103, 6679)
;

DECLARE @tblUsers TABLE
    ([userID] int, [firstName] varchar(10), [lastName] varchar(10), [displayName] varchar(12), [city] varchar(5), [stateProvince] varchar(14)
     , [profilePicture] varchar(14), [confirmed] int, [lastLogin] datetime)
;
    
INSERT INTO @tblUsers 
    ([userID], [firstName], [lastName], [displayName], [city], [stateProvince], [profilePicture], [confirmed])
VALUES
    (100, 'afirstName', 'alastName', 'adisplayName', 'acity', 'astateProvince', 'profilePicture', 1),
    (5593, 'bfirstName', 'b lastName', 'bdisplayName', 'bcity', 'bstateProvince', 'profilePicture', 1),
    (6679, 'cfirstName', 'clastName', 'cdisplayName', 'ccity', 'cstateProvince', 'profilePicture', 1)
;



SELECT
      u.userID
    , fullName = u.firstName + ' ' + u.lastName
    , u.displayName
    , u.city
    , u.stateProvince
    --, ROUND(dbo.CalculateDistance(-71.43520, 42.32320, z.Longitude, z.Latitude), 2) AS Distance
    , profilePicture = CASE
                            WHEN ISNULL(u.profilePicture, '') = '' THEN '/images/person.png'
                            ELSE u.profilePicture END
    , confirmed = CASE u.confirmed
                       WHEN 1 THEN '<html string>'
                       ELSE '' END
    , (
            SELECT
                  offer + ', '
            FROM @tblOffers o
            INNER JOIN @tblUserOffers uo ON o.offerID = uo.offerID
            WHERE u.userID = uo.userID
            AND uo.visible = 1
            FOR xml PATH ('')
      )
      AS offers
    , (
            SELECT
                  COUNT(o.offerID)
            FROM @tblOffers o
            INNER JOIN @tblUserOffers uo ON uo.offerID = o.offerID
            WHERE u.userID = uo.userID
            AND uo.visible = 1
            FOR xml PATH ('')
      )
      AS offerCount
    , u.lastLogin
    , totalUsers = COUNT(*) OVER ()
FROM @tblUsers as U

Open in new window

That query, using that sample data, produces the following  result:
+---+--------+-----------------------+--------------+-------+----------------+----------------+---------------+--------+------------+-----------+------------+
|   | userID |       fullName        | displayName  | city  | stateProvince  | profilePicture |   confirmed   | offers | offerCount | lastLogin | totalUsers |
+---+--------+-----------------------+--------------+-------+----------------+----------------+---------------+--------+------------+-----------+------------+
| 1 |    100 | afirstName alastName  | adisplayName | acity | astateProvince | profilePicture | <html string> | NULL   |          0 | NULL      |          3 |
| 2 |   5593 | bfirstName b lastName | bdisplayName | bcity | bstateProvince | profilePicture | <html string> | NULL   |          0 | NULL      |          3 |
| 3 |   6679 | cfirstName clastName  | cdisplayName | ccity | cstateProvince | profilePicture | <html string> | NULL   |          0 | NULL      |          3 |
+---+--------+-----------------------+--------------+-------+----------------+----------------+---------------+--------+------------+-----------+------------+

Open in new window

Have a look at this url: http://rextester.com/KAP76290
0
Big MontySenior Web Developer / CEO of ExchangeTree.org Author Commented:
Hey Paul,

I hear you, I'm trying to give (what I think) is the most useful data, and leaving out what isn't necessary so it'll be easier to read. I obviously need to work on this skill :)

I'll try to put something together in an hour or so on RexTester (nifty little site!). In the meantime, your example doesn't list anything for the OFFERS column, do you know why that is?
0
PortletPaulfreelancerCommented:
>>"example doesn't list anything for the OFFERS column, do you know why that is?"
I think it is due to not having any values in a field - I had to add it to the table but I didn't add data.

That missing data is referenced in a case expression. ( AND uo.visible = 1 )
If I comment out that conditoin you get data in the column.

By the way here is a query variant for you:
SELECT
      u.userID
    , fullName = u.firstName + ' ' + u.lastName
    , u.displayName
    , u.city
    , u.stateProvince
    --, ROUND(dbo.CalculateDistance(-71.43520, 42.32320, z.Longitude, z.Latitude), 2) AS Distance
    , profilePicture = CASE
                            WHEN ISNULL(u.profilePicture, '') = '' THEN '/images/person.png'
                            ELSE u.profilePicture END
    , confirmed = CASE u.confirmed
                       WHEN 1 THEN '<html string>'
                       ELSE '' END
    , oa1.offers
    , oa2.offerCount
    , u.lastLogin
    , totalUsers = COUNT(*) OVER ()
FROM @tblUsers as U
OUTER APPLY (
    SELECT
    STUFF((
          SELECT
                 ', ' + offer
            FROM @tblOffers o
            INNER JOIN @tblUserOffers uo ON o.offerID = uo.offerID
            WHERE u.userID = uo.userID
            --AND uo.visible = 1
          FOR XML PATH ('')
          )
         , 1, 1, '')
     ) AS OA1 (offers)
OUTER APPLY (
          SELECT
                 COUNT(o.offerID)
            FROM @tblOffers o
            INNER JOIN @tblUserOffers uo ON o.offerID = uo.offerID
            WHERE u.userID = uo.userID
            --AND uo.visible = 1
     ) AS OA2 (offerCount)

Open in new window

0
Big MontySenior Web Developer / CEO of ExchangeTree.org Author Commented:
using outer apply yielded the same results with null values being returned in the OFFERS column when I added the following to the FROM clause:
From   ZipCodes z  
	inner join tblUsers u on z.zipCode = u.zipCode  
	inner join tblUserOffers uo on u.userID = uo.userID 

Open in new window


I also have the following as my WHERE clause:
Where u.userID <> 100 and 
	Longitude Between -72.41289 And -70.45751 And 
	Latitude Between 41.59901 And 43.04739 And 
	dbo.CalculateDistance(-71.43520, 42.32320, Longitude, Latitude) <= 50 and 
	status = 1  

Open in new window


still working on the clean data set
0
PortletPaulfreelancerCommented:
The queries I supplied don't have any reference to table zipcode - because there were no facts about that table presented.  The second query DOES produce a result in the offers column - from the sample data - if  uo.visible = 1 or you ignore uo.visible completely.
0
Big MontySenior Web Developer / CEO of ExchangeTree.org Author Commented:
I've come up with a sql script that has all of the tables and examples of their data, while keeping the data anonymous.

with this script, I only get back userID 135 if I comment out the where clause, which is weird because the longitude / latitude for that user is outside of the where clause boundaries. If I leave in the where clause, I get no results back. I would expect to get back userIDs 135 and 136, as those are the only records inside tblUserOffers

DECLARE @tblOffers TABLE
    ([offerID] int, [offer] varchar(28), [description] varchar(26), [categoryID] int, [createDate] varchar(7), [active] varchar(4), [isXTOffer] varchar(4))
;
    
INSERT INTO @tblOffers 	
    ([offerID], [offer], [description], [categoryID], [createDate], [active], [isXTOffer])
VALUES
    (1193, 'Diet & Lifestyle Coaching', '08:50.0', 1, '1', NULL, NULL),
    (658, 'Disability Consulting', 'Disability Consulting desc', 113, '09:22.2', '1', '1'),
    (713, 'Web Design', 'web design desc', 107, '10:42.6', '1', '1'),
	(643, 'Web Design2', 'web design desc2', 107, '10:42.6', '1', '1'),
	(220, 'Web Design3', 'web design desc3', 107, '10:42.6', '1', '1')
;

DECLARE @tblUserOffers TABLE
    ([userOfferID] int, [offerID] int, [userID] int, [visible] int)
;
    
INSERT INTO @tblUserOffers 
    ([userOfferID], [offerID], [userID])
VALUES
    (1, 1193, 135),
    (2, 658, 135),
    (3, 713, 135),
    (4, 643, 136),
    (5, 220, 136)
;

DECLARE @tblUsers TABLE
    ([userID] int, [firstName] varchar(10), [lastName] varchar(10), [displayName] varchar(12), [city] varchar(5), [stateProvince] varchar(14)
     , [profilePicture] varchar(14), [confirmed] int, [lastLogin] datetime, [ZipCode] [char](5))
;
    
INSERT INTO @tblUsers 
    ([userID], [firstName], [lastName], [displayName], [city], [stateProvince], [profilePicture], [confirmed], zipCode)
VALUES
    (112, 'afirstName', 'alastName', 'adisplayName', 'acity', 'astateProvince', 'profilePicture', 1, '11111'),
    (122, 'bfirstName', 'blastName', 'bdisplayName', 'bcity', 'bstateProvince', 'profilePicture', 1, '22222'),
    (135, 'cfirstName', 'clastName', 'cdisplayName', 'ccity', 'cstateProvince', 'profilePicture', 1, '33333'),
    (136, 'dfirstName', 'dlastName', 'ddisplayName', 'dcity', 'dstateProvince', 'profilePicture', 1, '44444')
;

declare @ZIPCodes TABLE ( [ZipCode] [char](5), [Latitude] [decimal](12, 4), [Longitude] [decimal](12, 4) );

insert into @ZipCodes 
	( zipCode, latitude, Longitude )
values
	( '11111', 42.59901, -71.41289 ),
	( '22222', 43.12901, -71.46389 ),
	( '33333', 44.59901, -74.12345 )



SELECT
      u.userID
    , fullName = u.firstName + ' ' + u.lastName
    , u.displayName
    , u.city
    , u.stateProvince
    , profilePicture = CASE
                            WHEN ISNULL(u.profilePicture, '') = '' THEN '/images/person.png'
                            ELSE u.profilePicture END
    , confirmed = CASE u.confirmed
                       WHEN 1 THEN '<html string>'
                       ELSE '' END
    , oa1.offers
    , oa2.offerCount
    , u.lastLogin
    , totalUsers = COUNT(*) OVER ()
From   @ZipCodes z  
	inner join @tblUsers u on z.zipCode = u.zipCode  
	inner join @tblUserOffers uo on u.userID = uo.userID 
OUTER APPLY (
    SELECT
    STUFF((
          SELECT
                 ', ' + offer
            FROM @tblOffers o
            INNER JOIN @tblUserOffers uo ON o.offerID = uo.offerID
            WHERE u.userID = uo.userID
            --AND uo.visible = 1
          FOR XML PATH ('')
          )
         , 1, 1, '')
     ) AS OA1 (offers)
OUTER APPLY (
          SELECT
                 COUNT(o.offerID)
            FROM @tblOffers o
            INNER JOIN @tblUserOffers uo ON o.offerID = uo.offerID
            WHERE u.userID = uo.userID
            --AND uo.visible = 1
     ) AS OA2 (offerCount)
--Where Longitude Between -72.41289 And -70.45751 
--And 	Latitude Between 41.59901 And 43.04739 

Open in new window

0
PortletPaulfreelancerCommented:
zipcode 33333 does not match the where clause
userid 135 is in zipcode 333333
useris 136 is i zipcode 44444 but that isn't in the zipcode table
0
Big MontySenior Web Developer / CEO of ExchangeTree.org Author Commented:
good catch, I wanted to have one record in the zipCodes table that didn't fall into the range, and had "44444" in the table originally, then decided I didnt need it. Anyways, I fixed the data and now I'm getting an accurate representation of what I'm getting, meaning duplicate rows:

http://rextester.com/KHAX17922

I really appreciate you sticking with this and helping me out, one of the many reasons I've been an EE member for so long :)
0
PortletPaulfreelancerCommented:
SELECT
      u.userID
    , fullName = u.firstName + ' ' + u.lastName
    , u.displayName
    , u.city
    , u.stateProvince
    , profilePicture = CASE
                            WHEN ISNULL(u.profilePicture, '') = '' THEN '/images/person.png'
                            ELSE u.profilePicture END
    , confirmed = CASE u.confirmed
                       WHEN 1 THEN '<html string>'
                       ELSE '' END
    , oa1.offers
    , uc.offerCount
    , u.lastLogin
    , totalUsers = COUNT(*) OVER ()
    
From @tblUsers u  
    OUTER APPLY (
        SELECT
        STUFF((
              SELECT
                     ', ' + offer
                FROM @tblOffers o
                INNER JOIN @tblUserOffers uo ON o.offerID = uo.offerID
                WHERE u.userID = uo.userID
                --AND uo.visible = 1
              FOR XML PATH ('')
              )
             , 1, 1, '')
         ) AS OA1 (offers)
     inner join (
            SELECT
                  uo.userID
                , count(*) offerCount
            FROM @tblUserOffers uo
            GROUP BY
                uo.userID
             )  uc ON u.userID = uc.userID 
	inner join  @ZipCodes z   on z.zipCode = u.zipCode  
        and z.Longitude Between -72.41289 And -70.45751 
        and z.Latitude Between 41.59901 And 43.04739

Open in new window

+--------+----------------------+--------------+-------+----------------+----------------+---------------+-------------------------------------------------------------------+------------+-----------+------------+
| userID |       fullName       | displayName  | city  | stateProvince  | profilePicture |   confirmed   |                              offers                               | offerCount | lastLogin | totalUsers |
+--------+----------------------+--------------+-------+----------------+----------------+---------------+-------------------------------------------------------------------+------------+-----------+------------+
|    135 | cfirstName clastName | cdisplayName | ccity | cstateProvince | profilePicture | <html string> |  Diet &amp; Lifestyle Coaching, Disability Consulting, Web Design |          3 | NULL      |          2 |
|    136 | dfirstName dlastName | ddisplayName | dcity | dstateProvince | profilePicture | <html string> |  Web Design2, Web Design3                                         |          2 | NULL      |          2 |
+--------+----------------------+--------------+-------+----------------+----------------+---------------+-------------------------------------------------------------------+------------+-----------+------------+

Open in new window

0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
Big MontySenior Web Developer / CEO of ExchangeTree.org Author Commented:
this looks promising....let me test some more and I'll report back soon, but for now, it looks like it's the solution :)
0
Big MontySenior Web Developer / CEO of ExchangeTree.org Author Commented:
thanks for all of the help, big weight off my mind now that this works :)
0
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Query Syntax

From novice to tech pro — start learning today.