asked on

MS SQL - Remove Duplicate Values in a Column & Display All Columns (Not DISTINCT)

Dear Experts,

In my query result I need to show results from a SQL alarms history table for 5 columns where Col1 = Date&Time, Col2 = Label, Col3 = Priority, Col4 = Status, Col5 = Cycle. I am using the following syntax which shows everything.

SELECT TOP 50 App_DateTime, Label, Priority, Enum_Status, Cycle FROM History

However because an alarm is recorded every time it cycles (ON/OFF) it is recorded as a new entry into the database with a unique Date&Time. This produces duplicate column values for Label, Priority and Status. I want to display only the maximum Cycle count for each alarm entry and not all the Cycle values below that for the same alarm. Therefore I want to eliminate duplicate entries in the Label column.

The problem I have in using 'GROUP BY Label' is that when I apply aggregates to all the other SELECT fields to allow the GROUP BY to function to work then I only get a single row result. I am expecting to see a list of different alarms together with their date, time, label, priority and status and maximum cycle value.

I am new to SQL so may well be approaching this the wrong way. I have searched around and tried using SELECT DISTINCT but that will eliminate a row only if all its fields are the same. Clearly this wont work for me since I am trying to eliminate duplicate values from a column and not duplicate rows.

Thanks, Gary

ASKER CERTIFIED SOLUTION

Scott Pletcher

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

Elisys

ASKER

Hi Scott, thanks for your answer. You wouldn't believe how long I have been trying to put this together. A brief explanation of how the query works would be very much appreciated for us SQL novices. G

Scott Pletcher

I actually do believe it :-). It was a real pain to do certain things like this before ROW_NUMBER() came along.

Basically ROW_NUMBER() provides an inline ordering that resets to 1 on each PARTITION break.

For example, say you had a list of states and cities. You could PARTITION BY state ORDER BY city, and you'd get a result like:
1 CA Sacramento
2 CA San Diego
3 CA San Francisco
1 TN Memphis --note that the state change causes the number to reset
2 TN Nashville --
1 WY ...

Now, let's go back to:
PARTITION BY Label ORDER BY Cycle DESC
We can see that ROW_NUMBER/row_num 1 will always be the highest cycle number for each label, which is what we want to list, so we restrict the output to:
row_num = 1

This is wonderfully useful/flexible. For example, to get the top 3, you could use:
row_num BETWEEN 1 AND 3

How sweet is that!

MS SQL - Remove Duplicate Values in a Column &amp; Display All Columns (Not DISTINCT)

MS SQL - Remove Duplicate Values in a Column & Display All Columns (Not DISTINCT)