gary_j
asked on
need unique rows
i have a table:
order char (8)
seq tinyint
hst_dt long
hst_tm long
promise_dt long
status char (1)
other columns
id column identity
This query: SELECT order,seq,hst_dt,hst_tm,pr omise_dt,s tatus FROM table1 WHERE item = 'item_no'
ORDER BY hst_dt DESC, hst_tm DESC,order,seq
returns:
order seq hst_dt hst_tm promise_dt status
1 3 1 20040927 84308 20041011 N
2 2 1 20040918 90909 20040925 P
3 2 1 20040916 100000 20040925 P
4 2 1 20040912 111111 20040925 P
5 2 1 20040831 63148 20040925 P
6 2 1 20040821 85702 20040925 P
7 1 1 20040108 123456 20040228 P
How can I change my query such that only rows 1, 2, and 7 are returned. I need only the most recent row per order + seq combination.
Thank you!
order char (8)
seq tinyint
hst_dt long
hst_tm long
promise_dt long
status char (1)
other columns
id column identity
This query: SELECT order,seq,hst_dt,hst_tm,pr
ORDER BY hst_dt DESC, hst_tm DESC,order,seq
returns:
order seq hst_dt hst_tm promise_dt status
1 3 1 20040927 84308 20041011 N
2 2 1 20040918 90909 20040925 P
3 2 1 20040916 100000 20040925 P
4 2 1 20040912 111111 20040925 P
5 2 1 20040831 63148 20040925 P
6 2 1 20040821 85702 20040925 P
7 1 1 20040108 123456 20040228 P
How can I change my query such that only rows 1, 2, and 7 are returned. I need only the most recent row per order + seq combination.
Thank you!
Try this:
SELECT t1.order, t1.seq, t1.hst_dt, hst_tm, promise_dt, status
FROM table1 t1 INNER JOIN (
SELECT order, seq, Max(hst_dt) as hst_dt
FROM table1
WHERE item = 'item_no'
GROUP BY order, seq
) sub ON t1.order = sub.order AND t1.seq = sub.seq AND t1.hst_dt = sub.hst_dt
ORDER BY t1.hst_dt DESC, hst_tm DESC, order, seq
SELECT t1.order, t1.seq, t1.hst_dt, hst_tm, promise_dt, status
FROM table1 t1 INNER JOIN (
SELECT order, seq, Max(hst_dt) as hst_dt
FROM table1
WHERE item = 'item_no'
GROUP BY order, seq
) sub ON t1.order = sub.order AND t1.seq = sub.seq AND t1.hst_dt = sub.hst_dt
ORDER BY t1.hst_dt DESC, hst_tm DESC, order, seq
Try this:
SELECT order,seq,hst_dt,hst_tm,pr omise_dt,s tatus FROM table1 t1
WHERE item = 'item_no'
and not exists (select 1 from table1 t2 where t1.order = t2.order and t1.seq = t2.seq
and item = 'item_no' and t1.hst_dt <= t2.hst_dt and t1.hst_tm < t2.hst_tm)
ORDER BY hst_dt DESC, hst_tm DESC,order,seq
SELECT order,seq,hst_dt,hst_tm,pr
WHERE item = 'item_no'
and not exists (select 1 from table1 t2 where t1.order = t2.order and t1.seq = t2.seq
and item = 'item_no' and t1.hst_dt <= t2.hst_dt and t1.hst_tm < t2.hst_tm)
ORDER BY hst_dt DESC, hst_tm DESC,order,seq
>SELECT order,seq,MAX(hst_dt),MAX( hst_tm),pr omise_dt,s tatus
In SQL Server, this may return values for hst_dt and hst_tm from different rows.
In SQL Server, this may return values for hst_dt and hst_tm from different rows.
It didn't in my tests, can you show me an example when it does?
Sure. TableA(pk, ColA, ColB, ColC)
pk, ColA, ColB, ColC
1, A, 10, 20
2, A, 20, 10
SELECT ColA, Max(ColB), Max(ColC) FROM TableA GROUP BY ColA
Results:
A, 20, 20
pk, ColA, ColB, ColC
1, A, 10, 20
2, A, 20, 10
SELECT ColA, Max(ColB), Max(ColC) FROM TableA GROUP BY ColA
Results:
A, 20, 20
ASKER
mcmonap and ATAHAC answers did not return unique rows, although in both cases I saw no more than 2 rows per order+seq combination
I couldn't get jdlambert1 answer to run at all:
Server: Msg 209, Level 16, State 1, Line 1
Ambiguous column name 'order'.
Server: Msg 209, Level 16, State 1, Line 1
Ambiguous column name 'seq'.
I couldn't get jdlambert1 answer to run at all:
Server: Msg 209, Level 16, State 1, Line 1
Ambiguous column name 'order'.
Server: Msg 209, Level 16, State 1, Line 1
Ambiguous column name 'seq'.
Hi jdlambert1,
Thanks, I wasn't testing very well!
Thanks, I wasn't testing very well!
Ooops. Left the table aliases off order & seq in the ORDER BY:
SELECT t1.order, t1.seq, t1.hst_dt, hst_tm, promise_dt, status
FROM table1 t1 INNER JOIN (
SELECT order, seq, Max(hst_dt) as hst_dt
FROM table1
WHERE item = 'item_no'
GROUP BY order, seq
) sub ON t1.order = sub.order AND t1.seq = sub.seq AND t1.hst_dt = sub.hst_dt
ORDER BY t1.hst_dt DESC, hst_tm DESC, t1.order, t1.seq
SELECT t1.order, t1.seq, t1.hst_dt, hst_tm, promise_dt, status
FROM table1 t1 INNER JOIN (
SELECT order, seq, Max(hst_dt) as hst_dt
FROM table1
WHERE item = 'item_no'
GROUP BY order, seq
) sub ON t1.order = sub.order AND t1.seq = sub.seq AND t1.hst_dt = sub.hst_dt
ORDER BY t1.hst_dt DESC, hst_tm DESC, t1.order, t1.seq
ASKER
Thanks, jdlambert1, it works now, but there is one problem
If there are two rows with the same history date, it returns both of them.
Do I just add MAX(hst_tm) to the subquery?
Thank you
If there are two rows with the same history date, it returns both of them.
Do I just add MAX(hst_tm) to the subquery?
Thank you
This means the combination of order, seq, and hst_dt don't uniquely identify a record. Just adding Max(hst_tm) to the subquery is extremely likely to provide misleading results.
There are several ways to handle this, most of them pretty ugly. Since you want the most recent record per order & seq, here's the best way, if ID is the table's primary key (and since it's defined with the Identity property):
SELECT t1.order, t1.seq, t1.hst_dt, hst_tm, promise_dt, status
FROM table1 t1 INNER JOIN (
SELECT order, seq, Max(id) as id
FROM table1
WHERE item = 'item_no'
GROUP BY order, seq
) sub ON t1.id = sub.id
ORDER BY t1.hst_dt DESC, hst_tm DESC, t1.order, t1.seq
This relies on the identity property, which ensures that the highest id is also the most recent, so it can substitute for the date column. It also relies on id being a primary key, which uniquely identifies each record.
There are several ways to handle this, most of them pretty ugly. Since you want the most recent record per order & seq, here's the best way, if ID is the table's primary key (and since it's defined with the Identity property):
SELECT t1.order, t1.seq, t1.hst_dt, hst_tm, promise_dt, status
FROM table1 t1 INNER JOIN (
SELECT order, seq, Max(id) as id
FROM table1
WHERE item = 'item_no'
GROUP BY order, seq
) sub ON t1.id = sub.id
ORDER BY t1.hst_dt DESC, hst_tm DESC, t1.order, t1.seq
This relies on the identity property, which ensures that the highest id is also the most recent, so it can substitute for the date column. It also relies on id being a primary key, which uniquely identifies each record.
If you have sql2k, or higher, Than I would use functions.....they simplify a lot your job...
Something like this:
SELECT
a1.order,a1.seq,
getFirst_hst_dt(order,seq) , getFirst_hst_tm(order,seq) , getFirst_promise_dt(order, seq),getFi rst_status (order,seq )
FROM
(select distinct order, seqfrom table1) a1
WHERE
item = 'item_no'
ORDER BY hst_dt DESC, hst_tm DESC,order,seq
Now a sample function:
CREATE FUNCTION getFirst_hst_dt(@order int, @seq int)
RETURNS int
AS select top 1 hst_dt from table1 where order = @order and seq = @seq
---the same the other functions
Of course if you don't need the top 0, you can change the rule, like the top 1 but ordered in a certain order or so on....
EG:
CREATE FUNCTION getFirst_hst_dt(@order int, @seq int)
RETURNS int
AS select top 1 hst_dt from table1 where order = @order and seq = @seq order by order,seq
this seems to me the clearer one...because you can change easily teh rule by changing the functions content...but you don't have to affect the content of the other storep procedures that are using those functions....
good luck,
xenon
Something like this:
SELECT
a1.order,a1.seq,
getFirst_hst_dt(order,seq)
FROM
(select distinct order, seqfrom table1) a1
WHERE
item = 'item_no'
ORDER BY hst_dt DESC, hst_tm DESC,order,seq
Now a sample function:
CREATE FUNCTION getFirst_hst_dt(@order int, @seq int)
RETURNS int
AS select top 1 hst_dt from table1 where order = @order and seq = @seq
---the same the other functions
Of course if you don't need the top 0, you can change the rule, like the top 1 but ordered in a certain order or so on....
EG:
CREATE FUNCTION getFirst_hst_dt(@order int, @seq int)
RETURNS int
AS select top 1 hst_dt from table1 where order = @order and seq = @seq order by order,seq
this seems to me the clearer one...because you can change easily teh rule by changing the functions content...but you don't have to affect the content of the other storep procedures that are using those functions....
good luck,
xenon
xenon, gary needs the most recent record for each order & seq, so every value needs to come from the same row. While a single function could be created that could be applied here, I think it's likely to hurt performance compared to a sub-select, and you'd have to be very careful to ensure all values for each order & seq come from the same row.
As you may see in the function if you use the same conditions, the top 1 will return the values from the same row...so this is not an issue...
Regarding the speed...test and you will see that the top 1 will return very good speed, much more better than using joins, especialy if you have index on the condition you use (in this case order and seq)....
I almost bet that the version with function and top 1 is much faster than the complex joins....
regards,
xenon
PS: the problem is that the code is much larger, because you have more functions to implement...but these functions have the quality to be very eficient, because their content is a top 1 on an index...
Regarding the speed...test and you will see that the top 1 will return very good speed, much more better than using joins, especialy if you have index on the condition you use (in this case order and seq)....
I almost bet that the version with function and top 1 is much faster than the complex joins....
regards,
xenon
PS: the problem is that the code is much larger, because you have more functions to implement...but these functions have the quality to be very eficient, because their content is a top 1 on an index...
CORRECTION:
my function was not ok..just the ideea...rewritten is:
CREATE FUNCTION getFirst_hst_dt(@order int, @seq int)
RETURNS int
AS
BEGIN
declare @res int
select top 1 @res = hst_dt from table1 where order = @order and seq = @seq order by order,seq
return(@res)
END
....the rest is the same
my function was not ok..just the ideea...rewritten is:
CREATE FUNCTION getFirst_hst_dt(@order int, @seq int)
RETURNS int
AS
BEGIN
declare @res int
select top 1 @res = hst_dt from table1 where order = @order and seq = @seq order by order,seq
return(@res)
END
....the rest is the same
ASKER
Because I'm a VB programmer and don't know much about SQL functions, and probably won't be able to institute them from VB code, I was trying to solve the remaining issue by concatenating date and time.
This statement works:
select Max(convert(varchar,hst_dt ) + right('000000' + convert(varchar,hst_tm),6) ) from table1
but when I put it in the query:
SELECT t1.order, t1.seq, t1.hst_dt, hst_tm, promise_dt, status
FROM table1 t1 INNER JOIN (
SELECT order, seq, Max(convert(varchar,hst_dt ) + right('000000' + convert(varchar,hst_tm),6) ) as hst_dt
FROM table1
WHERE item = 'item_no'
GROUP BY order, seq
) sub ON t1.order = sub.order AND t1.seq = sub.seq AND t1.hst_dt = sub.hst_dt
ORDER BY t1.hst_dt DESC, hst_tm DESC, t1.order, t1.seq
I get an error: The conversion of the varchar value '20020911165750' overflowed an int column. Maximum integer value exceeded
Does anybody know why? (Yes, both fields, date and time, are stored as integers)
This statement works:
select Max(convert(varchar,hst_dt
but when I put it in the query:
SELECT t1.order, t1.seq, t1.hst_dt, hst_tm, promise_dt, status
FROM table1 t1 INNER JOIN (
SELECT order, seq, Max(convert(varchar,hst_dt
FROM table1
WHERE item = 'item_no'
GROUP BY order, seq
) sub ON t1.order = sub.order AND t1.seq = sub.seq AND t1.hst_dt = sub.hst_dt
ORDER BY t1.hst_dt DESC, hst_tm DESC, t1.order, t1.seq
I get an error: The conversion of the varchar value '20020911165750' overflowed an int column. Maximum integer value exceeded
Does anybody know why? (Yes, both fields, date and time, are stored as integers)
Possibly due to not specifying a length for varchar. From BOL: "When n is not specified in a data definition or variable declaration statement, the default length is 1. When n is not specified with the CAST function, the default length is 30." This doesn't specify the CONVERT function, and I would have expected it to use the same default as CAST, which should be long enough, but in case it's different, try substituting this:
SELECT order, seq, Max(convert(varchar(100),h st_dt) + right('000000' + convert(varchar(100),hst_t m),6)) as hst_dt
SELECT order, seq, Max(convert(varchar(100),h
ASKER
I specified 8 (8 digit date) and 6 (6 digit time) as a result of your suggestion, so the concatenated string is 20020911165750 as it was before, and got the same error.
Then I tried 100 for each and got the same string and the same error
I don't understand why it's acceptable as a "stand-alone" query but not as part of the sub-query.
Then I tried 100 for each and got the same string and the same error
I don't understand why it's acceptable as a "stand-alone" query but not as part of the sub-query.
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
duh! i should have caught that last little bit myself!
Thanks for all your help ...
Thanks for all your help ...
I once spent all day troubleshooting code to ultimately find that a period was out of place. That still hurts.
That was before the days of EE, where it's easy to get a bunch of other eyes to give it a fresh look.
That was before the days of EE, where it's easy to get a bunch of other eyes to give it a fresh look.
Perhaps something like this?:
SELECT order,seq,MAX(hst_dt),MAX(
FROM table1
WHERE item = 'item_no'
GROUP BY order,seq,promise_dt,statu
ORDER BY order,seq