Link to home
Create AccountLog in
Avatar of MrVault
MrVault

asked on

SQL join question

Novice SQL user here. Need help.

I have two tables with the same ID in them that I need to join. The second table has multiple instances of that ID with different timestamps for events. I need to join the two on that ID and return the most recent timestamp. Example:

T1:

ID     Name
1      John
2      Steve

T2:

ID     EventTime
1      2011-02-14
1      2011-02-13
2      2011-02-12
2      2011-02-14

RESULTS:

ID     Name             EventTime
1      John              2011-02-14
2      Steve            2011-02-14

The timestamps are BIGINT. Not sure if that matters.
Avatar of dwkor
dwkor
Flag of United States of America image

Something like that:

;with CTE(ID, EventTime)
as
(
	select ID, MAX(EventTime) as EventTime
	from T2
	group by ID
)
select t1.ID, t1.Name, CTE.EventTime
from T1 join CTE on 
	T1.ID = CTE.ID

Open in new window

SOLUTION
Avatar of Sharath S
Sharath S
Flag of United States of America image

Link to home
membership
Create an account to see this answer
Signing up is free. No credit card required.
Create Account
Avatar of MrVault
MrVault

ASKER

Sorry, I already have a select statement I'm working with. I'm trying to get the "max" value inside it.

My statement is essentially:

select a, b, c, d, e, f, g
from table1 t1
  inner join table2 t2 on t1.a = t2.a
  inner join table3 t3 on t2.h = t3.h
where a = 'someString'

So I'd rather not mix in a "with" or "as", etc.

this table has billions of rows, so the fastest method the better
Avatar of MrVault

ASKER

Sharath can you try to incorporate that version into the type of statement I already have above?
Try row_number. check my query.
which is your EventTime column in the above query?
Avatar of MrVault

ASKER

I'm confused. Are you suggesting the order in which the rows appear determines the most recent? or the ID? I don't think either would necessarily be the case due to updates to the rows.
If you don't want to use CTE, you can move it to the join
select t1.ID, t1.Name, CTE.EventTime
from 
	T1 join (
		select ID, MAX(EventTime) as EventTime
		from T2
		group by ID
	) CTE on T1.ID = CTE.ID

Open in new window

Avatar of MrVault

ASKER

dwkor what is cte?

I was looking at windows functions in sql 2008. anyone have experience with those? things like rank().
I already proposed row_number function (window function). Did you try this. To incorporate that in your actual query, I should know the column EventTime and which table has the column.
ASKER CERTIFIED SOLUTION
Link to home
membership
Create an account to see this answer
Signing up is free. No credit card required.
Create Account
Avatar of MrVault

ASKER

Ztinel, I'm getting the following error:

An expression of non-boolean type specified in a context where a condition is expected
Avatar of MrVault

ASKER

Ztinel, that statement worked great. Thank you so much. Kept the query simple and inline. Sharath it was pretty close to yours, just used rank instead of row number.
CTE stands for "Common Table Expression".  SQL Server didn't have CTE's until 2005.  CTE's are fabulous.  I have stopped using inline views.  As far as I know, there is no performance penalty for using an CTE vs an inline view.  

The CTE syntax seems a little intimidating at first (at least it did to me), but after you use it a bit, it's easy.

Basic syntax:

WITH <CTE_1_name> AS
(
SELECT . . .
)
,
<CTE_2_name> AS
(
SELECT . . .
)
SELECT
<column list>
FROM
<CTE_1_name>  AS x
 INNER JOIN
<CTE_2_name>  AS y
 ON x.<column> = y.<column> (etc)
WHERE . . .

The CTE has two big advantages over the inline view:

1. It makes your code much easier to debug and maintain.  You define all the CTE's (you can have more than one) first, then finish with a normal looking query that just references the CTE's.  This means the "data returning" part of the query isn't all junked up with subqueries in the FROM clause.  If you want to tweak parts of your query during development, it's a lot easuer with a CTE.   I also find it easier to do maintenance on an existing query that uses CTE's, because I am not visually overwhelmed with all the additional SELECTs of the subqueries.

2.  If you have to do a self join, you can use the same CTE twice.  With Inline Views, you have to redefine the view for each instance of the join.  

A little tip:  The "WITH" keyword that introduces the CTE must be the first statement in query batch.  So if you are declaring variables or something before the CTE, just make sure to finish the line right before the CTE with a semi-colon.

Avatar of MrVault

ASKER

Thanks Bill for the tips. I appreciate it.