Solved

# need unique rows

Posted on 2004-09-27
433 Views
Last Modified: 2008-02-01
i have a table:
order               char (8)
seq                  tinyint
hst_dt              long
hst_tm             long
promise_dt       long
status              char (1)
other columns
id column         identity

This query: SELECT order,seq,hst_dt,hst_tm,promise_dt,status FROM table1 WHERE item = 'item_no'
ORDER BY hst_dt DESC, hst_tm DESC,order,seq

returns:
order  seq  hst_dt          hst_tm  promise_dt  status
1     3        1     20040927     84308   20041011     N
2     2        1     20040918     90909  20040925     P
3     2        1     20040916     100000 20040925     P
4     2        1     20040912     111111 20040925     P
5     2        1     20040831     63148  20040925      P
6     2        1     20040821     85702   20040925     P
7     1        1     20040108     123456 20040228     P

How can I change my query such that only rows 1, 2, and 7 are returned.  I need only the most recent row per order + seq combination.

Thank you!
0
Question by:gary_j
• 9
• 5
• 3
• +2
21 Comments

LVL 15

Expert Comment

ID: 12163710
Hi gary_j,

Perhaps something like this?:

SELECT order,seq,MAX(hst_dt),MAX(hst_tm),promise_dt,status
FROM table1
WHERE item = 'item_no'
GROUP BY order,seq,promise_dt,status
ORDER BY order,seq
0

LVL 15

Expert Comment

ID: 12163725
Try this:

SELECT t1.order, t1.seq, t1.hst_dt, hst_tm, promise_dt, status
FROM table1 t1 INNER JOIN (
SELECT order, seq, Max(hst_dt) as hst_dt
FROM table1
WHERE item = 'item_no'
GROUP BY order, seq
) sub ON t1.order = sub.order AND t1.seq = sub.seq AND t1.hst_dt = sub.hst_dt
ORDER BY t1.hst_dt DESC, hst_tm DESC, order, seq
0

LVL 4

Expert Comment

ID: 12163782
Try this:

SELECT order,seq,hst_dt,hst_tm,promise_dt,status FROM table1 t1
WHERE item = 'item_no'
and not exists (select 1 from table1 t2 where t1.order = t2.order and t1.seq = t2.seq
and item = 'item_no' and t1.hst_dt <= t2.hst_dt and t1.hst_tm < t2.hst_tm)
ORDER BY hst_dt DESC, hst_tm DESC,order,seq
0

LVL 15

Expert Comment

ID: 12163855
>SELECT order,seq,MAX(hst_dt),MAX(hst_tm),promise_dt,status

In SQL Server, this may return values for hst_dt and hst_tm from different rows.
0

LVL 15

Expert Comment

ID: 12163907
It didn't in my tests, can you show me an example when it does?
0

LVL 15

Expert Comment

ID: 12163969
Sure. TableA(pk, ColA, ColB, ColC)

pk, ColA, ColB, ColC
1, A, 10, 20
2, A, 20, 10

SELECT ColA, Max(ColB), Max(ColC) FROM TableA GROUP BY ColA

Results:
A, 20, 20
0

LVL 5

Author Comment

ID: 12164033
mcmonap and ATAHAC answers did not return unique rows, although in both cases I saw no more than 2 rows per order+seq combination

I couldn't get jdlambert1 answer to run at all:
Server: Msg 209, Level 16, State 1, Line 1
Ambiguous column name 'order'.
Server: Msg 209, Level 16, State 1, Line 1
Ambiguous column name 'seq'.
0

LVL 15

Expert Comment

ID: 12164141
Hi jdlambert1,

Thanks, I wasn't testing very well!
0

LVL 15

Expert Comment

ID: 12164265
Ooops. Left the table aliases off order & seq in the ORDER BY:

SELECT t1.order, t1.seq, t1.hst_dt, hst_tm, promise_dt, status
FROM table1 t1 INNER JOIN (
SELECT order, seq, Max(hst_dt) as hst_dt
FROM table1
WHERE item = 'item_no'
GROUP BY order, seq
) sub ON t1.order = sub.order AND t1.seq = sub.seq AND t1.hst_dt = sub.hst_dt
ORDER BY t1.hst_dt DESC, hst_tm DESC, t1.order, t1.seq
0

LVL 5

Author Comment

ID: 12164747
Thanks, jdlambert1, it works now, but there is one problem

If there are two rows with the same history date, it returns both of them.

Do I just add MAX(hst_tm) to the subquery?

Thank you
0

LVL 15

Expert Comment

ID: 12165142
This means the combination of order, seq, and hst_dt don't uniquely identify a record. Just adding Max(hst_tm) to the subquery is extremely likely to provide misleading results.

There are several ways to handle this, most of them pretty ugly. Since you want the most recent record per order & seq, here's the best way, if ID is the table's primary key (and since it's defined with the Identity property):

SELECT t1.order, t1.seq, t1.hst_dt, hst_tm, promise_dt, status
FROM table1 t1 INNER JOIN (
SELECT order, seq, Max(id) as id
FROM table1
WHERE item = 'item_no'
GROUP BY order, seq
) sub ON t1.id = sub.id
ORDER BY t1.hst_dt DESC, hst_tm DESC, t1.order, t1.seq

This relies on the identity property, which ensures that the highest id is also the most recent, so it can substitute for the date column. It also relies on id being a primary key, which uniquely identifies each record.
0

LVL 9

Expert Comment

ID: 12165820
If you have sql2k, or higher, Than I would use functions.....they simplify a lot your job...
Something like this:

SELECT
a1.order,a1.seq,
getFirst_hst_dt(order,seq), getFirst_hst_tm(order,seq), getFirst_promise_dt(order,seq),getFirst_status(order,seq)
FROM
(select distinct order, seqfrom table1) a1
WHERE
item = 'item_no'
ORDER BY hst_dt DESC, hst_tm DESC,order,seq

Now a sample function:

CREATE FUNCTION getFirst_hst_dt(@order int, @seq int)
RETURNS int
AS select top 1 hst_dt from table1 where order = @order and seq = @seq

---the same the other functions

Of course if you don't need the top 0, you can change the rule, like the top 1 but ordered in a certain order or so on....
EG:

CREATE FUNCTION getFirst_hst_dt(@order int, @seq int)
RETURNS int
AS select top 1 hst_dt from table1 where order = @order and seq = @seq order by order,seq

this seems to me the clearer one...because you can change easily teh rule by changing the functions content...but you don't have to affect the content of the other storep procedures that are using those functions....

good luck,
xenon
0

LVL 15

Expert Comment

ID: 12165911
xenon, gary needs the most recent record for each order & seq, so every value needs to come from the same row. While a single function could be created that could be applied here, I think it's likely to hurt performance compared to a sub-select, and you'd have to be very careful to ensure all values for each order & seq come from the same row.
0

LVL 9

Expert Comment

ID: 12166002
As you may see in the function if you use the same conditions, the top 1 will return the values from the same row...so this is not an issue...

Regarding the speed...test and you will see that the top 1 will return very good speed,  much more better than using joins, especialy if you have index on the condition you use (in this case order and seq)....

I almost bet that the version with function and top 1 is much faster than the complex joins....

regards,
xenon

PS: the problem is that the code is much larger, because you have more functions to implement...but these functions have the quality to be very eficient, because their content is a top 1  on an index...
0

LVL 9

Expert Comment

ID: 12166130
CORRECTION:

my function was not ok..just the ideea...rewritten is:

CREATE FUNCTION getFirst_hst_dt(@order int, @seq int)
RETURNS int
AS
BEGIN
declare @res int
select top 1 @res = hst_dt from table1 where order = @order and seq = @seq order by order,seq
return(@res)
END

....the rest is the same
0

LVL 5

Author Comment

ID: 12169803
Because I'm a VB programmer and don't know much about SQL functions, and probably won't be able to institute them from VB code, I was trying to solve the remaining issue by concatenating date and time.

This statement works:
select  Max(convert(varchar,hst_dt) + right('000000' + convert(varchar,hst_tm),6)) from table1

but when I put it in the query:

SELECT t1.order, t1.seq, t1.hst_dt, hst_tm, promise_dt, status
FROM table1 t1 INNER JOIN (
SELECT order, seq,  Max(convert(varchar,hst_dt) + right('000000' + convert(varchar,hst_tm),6))  as hst_dt
FROM table1
WHERE item = 'item_no'
GROUP BY order, seq
) sub ON t1.order = sub.order AND t1.seq = sub.seq AND t1.hst_dt = sub.hst_dt
ORDER BY t1.hst_dt DESC, hst_tm DESC, t1.order, t1.seq

I get an error:  The conversion of the varchar value '20020911165750' overflowed an int column. Maximum integer value exceeded

Does anybody know why?  (Yes, both fields, date and time, are stored as integers)

0

LVL 15

Expert Comment

ID: 12169929
Possibly due to not specifying a length for varchar. From BOL: "When n is not specified in a data definition or variable declaration statement, the default length is 1. When n is not specified with the CAST function, the default length is 30." This doesn't specify the CONVERT function, and I would have expected it to use the same default as CAST, which should be long enough, but in case it's different, try substituting this:

SELECT order, seq,  Max(convert(varchar(100),hst_dt) + right('000000' + convert(varchar(100),hst_tm),6))  as hst_dt
0

LVL 5

Author Comment

ID: 12170008
I specified 8 (8 digit date) and 6 (6 digit time) as a result of your suggestion, so the concatenated string is 20020911165750 as it was before, and got the same error.

Then I tried 100 for each and got the same string and the same error

I don't understand why it's acceptable as a "stand-alone" query but not as part of the sub-query.

0

LVL 15

Accepted Solution

jdlambert1 earned 500 total points
ID: 12170273
Ah, you're converting hst_dt in the subquery, but not in the join (and the 30 character limit must apply to Convert as well). Try substituting this:

) sub ON t1.order = sub.order AND t1.seq = sub.seq AND t1.hst_dt convert(varchar,t1.hst_dt) + right('000000' + convert(varchar,t1.hst_tm),6) = sub.hst_dt
0

LVL 5

Author Comment

ID: 12170397
duh!  i should have caught that last little bit myself!

Thanks for all your help ...
0

LVL 15

Expert Comment

ID: 12170464
I once spent all day troubleshooting code to ultimately find that a period was out of place. That still hurts.

That was before the days of EE, where it's easy to get a bunch of other eyes to give it a fresh look.
0

## Featured Post

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

### Suggested Solutions

Title # Comments Views Activity
convert in derived column 7 30
sql server concatenate fields 10 35
Can > be used for a Text field 6 44
SQL Count issue 24 16
Ever needed a SQL 2008 Database replicated/mirrored/log shipped on another server but you can't take the downtime inflicted by initial snapshot or disconnect while T-logs are restored or mirror applied? You can use SQL Server Initialize from Backupâ€¦
I have a large data set and a SSIS package. How can I load this file in multi threading?
Via a live example combined with referencing Books Online, show some of the information that can be extracted from the Catalog Views in SQL Server.
Viewers will learn how to use the INSERT statement to insert data into their tables. It will also introduce the NULL statement, to show them what happens when no value is giving for any given column.

#### 773 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.