Solved

Does select distinct in SQL select the first row of each group

Posted on 2014-04-10
9
7,716 Views
Last Modified: 2014-04-10
I want to select the first date of use of a service.

I have done a query that selects customers who have used a served in a period of one year.  I have ordered those subs by ascending order of date of first use.

I use the above query to as a sub-query to ("select distinct customer_number, date_used
   from (select  customer_number, date_used from [table1] where entry is this year order by date_used"

Does the above query give me the date customer first used the service?

Thank you

Anthony
0
Comment
Question by:Anthony Matovu
  • 2
  • 2
  • 2
  • +2
9 Comments
 
LVL 44

Expert Comment

by:AndyAinscow
ID: 39990943
DISTINCT will select unique rows.
so if you have the following rows:
x, dte1
y, dte1
x, dte2
x, dte2

you will get three rows back, the second duplicate row x, dte2 will not be returned.

if you only require the very first row then use TOP 1 instead of DISTINCT in the sql statement
0
 
LVL 1

Author Comment

by:Anthony Matovu
ID: 39990947
Help me, how do I use top 1 by a group, have never used it
0
 
LVL 142

Assisted Solution

by:Guy Hengel [angelIII / a3]
Guy Hengel [angelIII / a3] earned 167 total points
ID: 39990953
please read up this article, it will help you to understand and solve this kind of "issues":
http://www.experts-exchange.com/Database/Miscellaneous/A_3203-DISTINCT-vs-GROUP-BY-and-why-does-it-not-work-for-my-query.html
0
3 Use Cases for Connected Systems

Our Dev teams are like yours. They’re continually cranking out code for new features/bugs fixes, testing, deploying, testing some more, responding to production monitoring events and more. It’s complex. So, we thought you’d like to see what’s working for us.

 
LVL 44

Expert Comment

by:AndyAinscow
ID: 39990955
I'm not certain if this is what you want:
select customer_number, date_used
   from (select  TOP 1 customer_number, date_used from [table1] where entry is this year order by date_used)
0
 
LVL 1

Author Comment

by:Anthony Matovu
ID: 39991066
I would want some thing like
    Customer      Data first active
1. C1                  12/12/2013
1. C2                  12/12/2013
1. C3                  13/12/2013
1. C4                  12/12/2013
1. C5                  09/12/2013

Thank you for your help so far
0
 
LVL 37

Assisted Solution

by:Geert Gruwez
Geert Gruwez earned 166 total points
ID: 39991071
you could use the first_value aggregate

select distinct 
  customer_number, 
  first_value(date_used) over (partition by customer_number order by date_used asc)  as first_date_used
from table1
where date_used >= to_date('01/01/2014', 'DD/MM/YYY');

Open in new window

0
 
LVL 48

Accepted Solution

by:
PortletPaul earned 167 total points
ID: 39991319
DISTINCT  does NOT give you the first record of any group, in fact you cannot guarantee which record DISTINCT will display within a group (nor does this matter by the way).

DISTINCT is not a piece of magic, it does one thing and one thing only and that is to ensure that every row returned is unique. It is a "row operator" which means it operates over the entire contents of the row, not just the first column but each and every column. As a "row operator" DISTINCT also means that making rows unique happens after everything else in the query is completed first, hence it can slow down queries as well.

I suspect there is more to your question. What I think you are asking for is....

I want a complete record (not just one or two fields)
but I only want that complete record for the  "first date of use of a service"

There are many ways to achieve this, and some are already mentioned - but you have nominated Oracle and SQL Server as topics  - it's not clear which of these you actually want. For example first_value() has been mentioned but this exists in Oracle and not SQL Server.

row_number() is a common method for what I think you are asking for and it exists in both database types (depending on versions of those dbms's though I'm afraid). It may be used like this:


select
   sq.*
from (
           select
              *  , ROW_NUMBER() over (PARTITION BY service ORDER BY date_used ASC) as  rn
           from yourtable
        ) sq
where rn = 1

ROW_NUMBER() simply assigns an integer starting at 1 but to control which records get 1 you can "partition" and "order by"

Here we "partition" for each service, and by ordering on dates (ascending) we ensure that the record with the earliest date, for each serivice, gets the value 1

Once we have calculated that number (aliased as rn) we simply use that in a where clause (hence the need for a subquery as shown above).

Hope this helps and my assumptions about your question are relevant.

oh, if your unfortunate enough not to have row_number() available there are other techniques.
0
 
LVL 37

Expert Comment

by:Geert Gruwez
ID: 39991356
actually it's way easier than that ...

select 
  customer_number, 
  min(date_used) as first_date_used
from table1
where date_used >= to_date('01/01/2014', 'DD/MM/YYY')
group by customer_number

Open in new window

0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 39991381
:) thanks.

Geert, yes your group by is certainly easy ...
but not if you want all the fields from "table1"

In my experience when folks are using "select distinct" but it doesn't seem to work for them they are really asking for "complete records" where some condition is met (such as the earliest date per service)

I even tried to write an article on this:
http://www.experts-exchange.com/Programming/Languages/SQL_Syntax/A_12282-Select-Distinct-is-returning-duplicates.html
0

Featured Post

Best Practices: Disaster Recovery Testing

Besides backup, any IT division should have a disaster recovery plan. You will find a few tips below relating to the development of such a plan and to what issues one should pay special attention in the course of backup planning.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Cursors in Oracle: A cursor is used to process individual rows returned by database system for a query. In oracle every SQL statement executed by the oracle server has a private area. This area contains information about the SQL statement and the…
Real-time is more about the business, not the technology. In day-to-day life, to make real-time decisions like buying or investing, business needs the latest information(e.g. Gold Rate/Stock Rate). Unlike traditional days, you need not wait for a fe…
This video shows syntax for various backup options while discussing how the different basic backup types work.  It explains how to take full backups, incremental level 0 backups, incremental level 1 backups in both differential and cumulative mode a…
This video shows how to copy an entire tablespace from one database to another database using Transportable Tablespace functionality.

774 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question