We help IT Professionals succeed at work.

MS SQL - Remove Duplicate Values in a Column & Display All Columns (Not DISTINCT)

Elisys
Elisys asked
on
400 Views
Last Modified: 2014-11-27
Dear Experts,

In my query result I need to show results from a SQL alarms history table for 5 columns where Col1 = Date&Time, Col2 = Label, Col3 = Priority, Col4 = Status, Col5 = Cycle. I am using the following syntax which shows everything.

SELECT TOP 50 App_DateTime, Label, Priority, Enum_Status, Cycle FROM History

However because an alarm is recorded every time it cycles (ON/OFF) it is recorded as a new entry into the database with a unique Date&Time. This produces duplicate column values for Label, Priority and Status. I want to display only the maximum Cycle count for each alarm entry and not all the Cycle values below that for the same alarm. Therefore I want to eliminate duplicate entries in the Label column.

The problem I have in using 'GROUP BY Label' is that when I apply aggregates to all the other SELECT fields to allow the GROUP BY to function to work then I only get a single row result. I am expecting to see a list of different alarms together with their date, time, label, priority and status and maximum cycle value.

I am new to SQL so may well be approaching this the wrong way. I have searched around and tried using SELECT DISTINCT but that will eliminate a row only if all its fields are the same. Clearly this wont work for me since I am trying to eliminate duplicate values from a column and not duplicate rows.

Thanks, Gary
Comment
Watch Question

Senior DBA
CERTIFIED EXPERT
Most Valuable Expert 2018
Distinguished Expert 2019
Commented:
This one is on us!
(Get your first solution completely free - no credit card required)
UNLOCK SOLUTION

Author

Commented:
Hi Scott, thanks for your answer. You wouldn't believe how long I have been trying to put this together. A brief explanation of how the query works would be very much appreciated for us SQL novices. G
Scott PletcherSenior DBA
CERTIFIED EXPERT
Most Valuable Expert 2018
Distinguished Expert 2019

Commented:
I actually do believe it :-).  It was a real pain to do certain things like this before ROW_NUMBER() came along.

Basically ROW_NUMBER() provides an inline ordering that resets to 1 on each PARTITION break.

For example, say you had a list of states and cities.  You could PARTITION BY state ORDER BY city, and you'd get a result like:
1 CA Sacramento
2 CA San Diego
3 CA San Francisco
1 TN Memphis --note that the state change causes the number to reset
2 TN Nashville --
1 WY ...


Now, let's go back to:
PARTITION BY Label ORDER BY Cycle DESC
We can see that ROW_NUMBER/row_num 1 will always be the highest cycle number for each label, which is what we want to list, so we restrict the output to:
row_num = 1

This is wonderfully useful/flexible.  For example, to get the top 3, you could use:
row_num BETWEEN 1 AND 3

How sweet is that!
Unlock the solution to this question.
Join our community and discover your potential

Experts Exchange is the only place where you can interact directly with leading experts in the technology field. Become a member today and access the collective knowledge of thousands of technology experts.

*This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

OR

Please enter a first name

Please enter a last name

8+ characters (letters, numbers, and a symbol)

By clicking, you agree to the Terms of Use and Privacy Policy.