Avatar of knobbylowboy
knobbylowboy
 asked on

ORDER BY items must appear in the select list if SELECT DISTINCT is specified

I have a table where I store a history of what the user has looked at and I want to retrieve the 5 most recent things the user has looked at, without any duplicates. I tried the query below (table/column names changed for clarity) but get an error: "ORDER BY items must appear in the select list if SELECT DISTINCT is specified". If I add the datstamp column to the select list then that row is no longer distinct and I get duplicates of course. How can I rewrite this query so that it does what I want, and as this query will be run very frequently I am looking for a solution that is efficient.

SELECT DISTINCT TOP 5
table1.id, table2.name, table3.name
FROM table1
LEFT OUTER JOIN table2 ON table2.id = table1.table2id
LEFT OUTER JOIN table3 ON table3.id = table1.table3id
WHERE table1.userid = @userid
ORDER BY table1.datestamp

(Running on SQL Server 2005 Express Edition)
Microsoft SQL Server 2005SQL

Avatar of undefined
Last Comment
Raja Jegan R

8/22/2022 - Mon
vinurajr

IF Distinct is there Need to add the selected Item in the Order by thats it.... Clear from the Error
Remove the Distinct Use Group by and Try ...
Rajkumar Gs

Did you try this query ? If not, try and let me know.

SELECT DISTINCT TOP 5
table1.id, table2.name, table3.name
FROM table1
LEFT OUTER JOIN table2 ON table2.id = table1.table2id
LEFT OUTER JOIN table3 ON table3.id = table1.table3id
WHERE table1.userid = @userid
ORDER BY table1.id, table2.name, table3.name

~Raj
rushShah

hi..
try this..


SELECT  table1.id AS t1ID, table2.name AS t2Name, table3.name AS t3Name
INTO	#Temp
FROM	table1
LEFT OUTER JOIN table2 ON table2.id = table1.table2id
LEFT OUTER JOIN table3 ON table3.id = table1.table3id
WHERE	table1.userid = @userid
ORDER BY table1.datestamp

SELECT	DISTINCT TOP 5 t1ID, t2Name, t3Name
FROM	#Temp

DROP TABLE #Temp

Open in new window

Experts Exchange has (a) saved my job multiple times, (b) saved me hours, days, and even weeks of work, and often (c) makes me look like a superhero! This place is MAGIC!
Walt Forbes
Rajkumar Gs

Have a try.

~Raj
SELECT TOP 5
table1.id, table2.name, table3.name
FROM table1
LEFT OUTER JOIN table2 ON table2.id = table1.table2id
LEFT OUTER JOIN table3 ON table3.id = table1.table3id
WHERE table1.userid = @userid
group by table1.id, table2.name, table3.name
ORDER BY table1.id, table2.name, table3.name

Open in new window

knobbylowboy

ASKER
vinurajr: Sorry but I didn't understand your suggested solution, perhaps a sample might help?

RajkumarGS: Problem is that I need the results sorted by the 'datestamp' field.

rushShah: This solution should work but I was looking for a more efficient method than using temporary tables.
rushShah

just try that query..it should be efficient..
Match the Excecution plans..
Get an unlimited membership to EE for less than $4 a week.
Unlimited question asking, solutions, articles and more.
Sharath S

This will do what you are looking for.

SELECT TOP 5 table1.id, table2.name, table3.name
FROM table1
LEFT OUTER JOIN table2 ON table2.id = table1.table2id
LEFT OUTER JOIN table3 ON table3.id = table1.table3id
WHERE table1.userid = @userid
GROUP BY table1.id, table2.name, table3.name
ORDER BY min(table1.datestamp)

Open in new window

Raja Jegan R

This should be efficient:
SELECT id, t2name, t3name
FROM (
SELECT table1.id, table2.name t2name, table3.name t3name, ROW_NUMBER() over ( PARTITION by table1.id, table2.name, table3.name ORDER BY table1.datestamp) rnum
FROM table1
LEFT OUTER JOIN table2 ON table2.id = table1.table2id
LEFT OUTER JOIN table3 ON table3.id = table1.table3id
WHERE table1.userid = @userid) temp
WHERE rnum <=5

Open in new window

vinurajr

you please try this code:

SELECT TOP 5
table1.id, table2.name, table3.name
FROM table1
LEFT OUTER JOIN table2 ON table2.id = table1.table2id
LEFT OUTER JOIN table3 ON table3.id = table1.table3id
WHERE table1.userid = @userid
group by table1.id, table2.name, table3.name
ORDER BY table1.datestamp
I started with Experts Exchange in 2004 and it's been a mainstay of my professional computing life since. It helped me launch a career as a programmer / Oracle data analyst
William Peck
knobbylowboy

ASKER
rushShah: I tried this method but as soon as I add DISTINCT I lose the sort order, eg:
SELECT      AgencyID
INTO    #Temp
FROM    tblRecent
WHERE   AdminUserID = @AdminUserID
ORDER BY DateTimeStamp DESC
SELECT  DISTINCT TOP 5 AgencyID
FROM    #Temp
DROP TABLE #Temp

Sharath_123/vinurajr: I tried your method but the sort order is also not applied, eg (changed to MAX as wanted newest to oldest, but MIN doesn't work either):
SELECT TOP 5 AgencyID
FROM tblRecent
WHERE AdminUserID = @AdminUserID
GROUP BY AgencyID
ORDER BY MAX(DateTimeStamp)

rrjegan17: I also tried your method but the sort order is not being applied, eg:
SELECT AgencyID
FROM (
SELECT AgencyID, ROW_NUMBER() over ( PARTITION by AgencyID ORDER BY DateTimeStamp DESC) rnum
FROM tblRecent
WHERE AdminUserID = @AdminUserID) temp
WHERE rnum <=5

I have simplified my code and table for ease of testing. tblRecent contains four columns: RecentID, DateTimeStamp, AdminUserID and AgencyID.
My recordset contains four rows:
1      16/02/2010 12:27:03 PM      2      1
2      16/02/2010 12:27:08 PM      2      24
3      16/02/2010 12:27:12 PM      2      25
4      19/02/2010 2:44:03 PM      2      22

I would expect the order to be newest to oldest so would return AgencyID in order 22, 25, 24, 1, but all of the above test code returns 1, 22, 24, 25 regardless of min/max or asc/desc.
rushShah

hi..try this..


SELECT  AgencyID, Rank() over ( PARTITION by AdminUserID ORDER BY DateTimeStamp DESC) rank
INTO    #Temp
FROM    @t
WHERE   AdminUserID = @AdminUserID
ORDER BY DateTimeStamp DESC

SELECT	* FROM #Temp

SELECT  DISTINCT TOP 5 AgencyID,  Rank() over ( ORDER BY rank DESC) num
FROM    #Temp
Order by num DESC

DROP TABLE #Temp

Open in new window

knobbylowboy

ASKER
rushShah: the sort order is correct but once rows are added with the same AgencyID then duplicate rows are still included in the recordset as DISTINCT does not remove the duplicate rows because of the unique Rank().
Get an unlimited membership to EE for less than $4 a week.
Unlimited question asking, solutions, articles and more.
rushShah

ok..

try this..
I don't know whether it is Efficient or not..
There should be some other way to do this, but this one gives required result..
SELECT  AgencyID, Rank() over ( PARTITION by AdminUserID ORDER BY DateTimeStamp DESC) rank
INTO    #Temp
FROM    tblRecent
WHERE   AdminUserID = @AdminUserID
ORDER BY DateTimeStamp DESC

SELECT	DISTINCT TOP 5 AgencyID, sortOrder
FROM	(
SELECT  AgencyID, Rank() over ( ORDER BY rank DESC) sortOrder, Rank() over (PARTITION by AgencyID ORDER BY rank DESC) num
FROM    #Temp
) t
WHERE	num=1
Order by sortOrder DESC

DROP TABLE #Temp

Open in new window

Rajkumar Gs

Hi knobbylowboy,

I am not sure whether I understood your question and I am doing right.

Anyway check this script.

Attached the table structure, sample data, full data view and the suggested result.

Check the last query, whether it is right ?

~Raj
create table table1
(
	id	int,
	table2id	int,
	table3id	int,
	userid		int,
	datestamp	datetime
)

create table table2
(
	id	int,
	name	varchar(20),
)

create table table3
(
	id	int,
	name	varchar(20),
)

-- SAMPLE DATA
insert into table2 values(1, 'autoriksha')
insert into table2 values(2, 'boat')
insert into table2 values(3, 'plane')
insert into table2 values(4, 'cycle')
insert into table2 values(5, 'bus')
insert into table2 values(6, 'car')

insert into table3 values(1, 'acdsee')
insert into table3 values(2, 'avant')
insert into table3 values(3, 'bitdefender')
insert into table3 values(4, 'norton')
insert into table3 values(5, 'windows xp')
insert into table3 values(6, 'vista')

insert into table1 values(1, 1, 1, 1, '1/3/2010 6:44 AM')
insert into table1 values(1, 1, 1, 1, '1/3/2010 7:00 AM')
insert into table1 values(1, 2, 3, 1, '1/3/2010 7:46 AM')
insert into table1 values(1, 3, 2, 1, '1/3/2010 8:14 AM')
insert into table1 values(1, 1, 1, 1, '1/3/2010 9:33 AM')
insert into table1 values(1, 3, 2, 1, '1/4/2010 9:01 AM')
insert into table1 values(1, 2, 3, 1, '1/4/2010 10:01 AM')
insert into table1 values(1, 1, 2, 1, '1/4/2010 10:01 AM')
insert into table1 values(1, 1, 2, 1, '1/4/2010 10:41 AM')
insert into table1 values(1, 2, 2, 1, '1/4/2010 10:01 AM')
insert into table1 values(1, 2, 2, 1, '1/5/2010 9:13 AM')
insert into table1 values(1, 3, 3, 1, '1/6/2010 11:11 AM')
insert into table1 values(1, 3, 3, 1, '1/7/2010 12:41 AM')

-- COMPLETE DATA
SELECT DISTINCT
table1.datestamp, table1.id, table2.name, table3.name
FROM table1
LEFT OUTER JOIN table2 ON table2.id = table1.table2id
LEFT OUTER JOIN table3 ON table3.id = table1.table3id
WHERE table1.userid = 1
ORDER BY table1.id, table2.name, table3.name

-- FILTERED VIEW - TOP 5 ONLY
SELECT 
table1.id, table2.name, table3.name, MAX(table1.datestamp)
FROM table1
LEFT OUTER JOIN table2 ON table2.id = table1.table2id
LEFT OUTER JOIN table3 ON table3.id = table1.table3id
WHERE table1.userid = 1
GROUP BY table1.id, table2.name, table3.name
ORDER BY MAX(table1.datestamp)

Open in new window

Rajkumar Gs

For Top 5 records only, I missed it.
~Raj
SELECT TOP 5
table1.id, table2.name, table3.name, MAX(table1.datestamp)
FROM table1
LEFT OUTER JOIN table2 ON table2.id = table1.table2id
LEFT OUTER JOIN table3 ON table3.id = table1.table3id
WHERE table1.userid = 1
GROUP BY table1.id, table2.name, table3.name
ORDER BY MAX(table1.datestamp)

Open in new window

Experts Exchange is like having an extremely knowledgeable team sitting and waiting for your call. Couldn't do my job half as well as I do without it!
James Murphy
Raja Jegan R

>> I also tried your method but the sort order is not being applied, eg:

This would do the sorting.
SELECT AgencyID
FROM ( 
SELECT AgencyID, ROW_NUMBER() over ( PARTITION by AgencyID ORDER BY DateTimeStamp DESC) rnum 
FROM tblRecent 
WHERE AdminUserID = @AdminUserID) temp 
WHERE rnum <=5
ORDER BY AgencyID

Open in new window

Sharath S

>> Sharath_123/vinurajr: I tried your method but the sort order is also not applied, eg (changed to MAX as wanted newest to oldest, but MIN doesn't work either):

What do you mean by it does not work? I tested it with sample data before posting. Better provide some sample data and the expected result.
knobbylowboy

ASKER
Ok here is a simplified sample table and data to test with:

create table testtable
(
id int,
datestamp datetime
)
insert into testtable values(1, '2010-02-11')
insert into testtable values(2, '2010-02-12')
insert into testtable values(3, '2010-02-13')
insert into testtable values(4, '2010-02-14')
insert into testtable values(5, '2010-02-15')
insert into testtable values(5, '2010-02-16')
insert into testtable values(4, '2010-02-17')

So the intention is to return a recordset of a maximum of 5 records (id only), with no duplicates, ordered by newest to oldest. The desired result is 4, 5, 3, 2, 1.

Ideally the query would be "select distinct top 5 id from testtable order by datestamp desc" but that doesn't work of course.

I tried all of the above suggested solutions and none produced the desired result.
Get an unlimited membership to EE for less than $4 a week.
Unlimited question asking, solutions, articles and more.
Sharath S

try this.

select id from (
select id,max(datestamp) as datestamp
  from testable
 group by id) as t1
 order by datestamp
Raja Jegan R

Try this:
SELECT AgencyID 
FROM (  
SELECT AgencyID, DateTimeStamp, ROW_NUMBER() over ( PARTITION by AgencyID ORDER BY DateTimeStamp DESC) rnum  
FROM tblRecent  
WHERE AdminUserID = @AdminUserID) temp  
WHERE rnum <=5 
ORDER BY DateTimeStamp desc

-- For your modified table run the one below 

SELECT ID 
FROM (  
SELECT ID, datestamp, ROW_NUMBER() over ( PARTITION by ID ORDER BY datestamp DESC) rnum  
FROM testtable ) temp  
WHERE rnum <=5 
ORDER BY datestamp desc

Open in new window

Rajkumar Gs

Just to confirm... Did you check my sample table & data with the query I provided. (my previous two posts)

Here is the query for you sample data. I tested it against your table and it returns the suggested result.

~Raj
SELECT ID, DATESTAMP FROM
(SELECT ID, MAX(datestamp) AS DATESTAMP
	FROM testtable
		GROUP BY ID) AS A
ORDER BY DATESTAMP DESC

Open in new window

All of life is about relationships, and EE has made a viirtual community a real community. It lifts everyone's boat
William Peck
Rajkumar Gs

If we select the ID alone (4, 5, 3, 2, 1), it will lose the order (1, 2, 3, 4, 5) which was selected inside the subquery. So better to select datetime field also with ID. (Previous Query)

~Raj
ASKER CERTIFIED SOLUTION
Rajkumar Gs

Log in or sign up to see answer
Become an EE member today7-DAY FREE TRIAL
Members can start a 7-Day Free trial then enjoy unlimited access to the platform
Sign up - Free for 7 days
or
Learn why we charge membership fees
We get it - no one likes a content blocker. Take one extra minute and find out why we block content.
Not exactly the question you had in mind?
Sign up for an EE membership and get your own personalized solution. With an EE membership, you can ask unlimited troubleshooting, research, or opinion questions.
ask a question
knobbylowboy

ASKER
Sharath_123: your solution returned 1, 2, 3, 4, 5 - wrong sort order.
rrjegan17: your solution returned 4, 5, 5, 4, 3, 2, 1 - duplication and more than 5 records.
RajkumarGS: your solution works, thank you.
knobbylowboy

ASKER
Needed to add "TOP 5" to limit recordset to 5 rows.
Get an unlimited membership to EE for less than $4 a week.
Unlimited question asking, solutions, articles and more.
Sharath S

yes, I missed DESC.
Raja Jegan R

Small Mistake:

SELECT ID
FROM (  
SELECT ID, datestamp, ROW_NUMBER() over ( PARTITION by ID ORDER BY datestamp DESC) rnum  
FROM testtable ) temp  
WHERE rnum =1
ORDER BY datestamp desc