Solved

Inner Join or CTE  using a lookup table

Posted on 2013-06-18
8
425 Views
Last Modified: 2013-07-08
I want to return a result set that compares the rows in a table to the very same same that are offset by a day. I want to do this using a Lookup table rather than the DateDiff function.

I have one table named UPTO. It contains 4 columns, IDNo, Name, Amt and TimePoint. The second table is a lookup table, DateLookup. It contains two Columns, TimePoint and IntDate.

UPTo.TimePoint is a Date Column. I populate it with the Date and Hour. Minutes and seconds are always 0.
Sample Upto Table data:
1, Loc1, 5,01/01/2013 01:00:00
1, Loc1, 8,01/02/2013 01:00:00
1, Loc1, 10,01/03/2013 01:00:00

In the DateLookup table the IntDate column is just an integer that increases by 1 for each new day.
e.g. Typical
01/01/2013 01:00:00, 100
01/02/2013 01:00:00, 101
01/03/2013 01:00:00, 102

What I would like to accomplish is to compare the records in the UpTo table to the Day before records in the UpTo table. Sorry for the fuzzy explanation.

Here is an example.

I want to compare the Amt on 01/01/2013 01:00:00 to the Amt on 01/02/2013. The results set  would be:
 01/02/2013, 3.

Then if I did the next combination 01/02/2013 compared to 01/30/2013, that result set would be:
01/03/2013, 2

I want to use the lookup table and subtract 1 from the IntDate column. I do not want to use a DateDiff directly on the TimePoint column. The actual tables have 100's of millions of rows and I want to avoid the DateDiff overhead.

I think this requires a CTE but I am not sure.

Would someone please propose a possible query to get the desired result set?

Thanks,
pat
0
Comment
Question by:mpdillon
  • 4
  • 3
8 Comments
 
LVL 23

Expert Comment

by:Racim BOUDJAKDJI
ID: 39258065
The actual tables have 100's of millions of rows and I want to avoid the DateDiff overhead.

I think this requires a CTE but I am not sure.
You could do this using a CTE but it won't save much resources if you are using two tables because the main overhead you'd have to face is the JOIN. My first guess is that you'd save more performance by either adding a persisted table valued function column on UPTO then using a to minimize IO's.  Try to test it to see.
0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 39258165
I'm a little confused, by this:
>>I want to return a result set that compares the rows in a table to the very same same that are offset by a day
are you looking to do a "self join" like this?

select *
from UPTO as u1
|inner|left| join upto as u2 on dateadd(day,-1,u1.timepoint) = u2.timepoint
                                            and ...

but wanting to avoid a date function in that join?

also: What version of SQL Server is this for?
0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 39258172
by the way, are the there any instances where you want "the previous record" but that previous record is more than 1 day different (e.g. over a week-end)?

ThisRecord  PreviousRecord
Monday       FridayOfWeekBefore
0
 

Author Comment

by:mpdillon
ID: 39258236
Racimo,
I have never used a “persisted table valued function column”. I googled it and it seems to be adding an extra column to the table which is calculated by a function. If that is correct, I am concerned it will cause storage problems. Currently this table and indexes are just under 1 TB. I would need to add  5 calculated columns, Year, Month, Day, Hour and IntDay.
I have used the lookup table to do some calculation already. My testing indicated a 10 to 20 fold increase in performance over using the DateDiff and DatePart functions. So I would like to stay with the lookup table for now.

PortletPaul,
Yes the self-join is what I am looking for. I have already done the self-join in the format you have shown. But I want to eliminate the DateAdd function and replace it with the Lookup table using subtraction.
To your second question, the days are always continuous.
0
Control application downtime with dependency maps

Visualize the interdependencies between application components better with Applications Manager's automated application discovery and dependency mapping feature. Resolve performance issues faster by quickly isolating problematic components.

 
LVL 48

Expert Comment

by:PortletPaul
ID: 39258248
Thanks. I don't disagree with your desire to avoid the function in the join - just trying to clarify.

you didn't answer: what sql server version?
0
 

Author Comment

by:mpdillon
ID: 39258253
PortletPaul,
Sorry. both SQL 2012 and 2008 R2.
pat
0
 
LVL 48

Accepted Solution

by:
PortletPaul earned 500 total points
ID: 39258283
Well in SQL 2012 you have the lag() function
Accesses data from a previous row in the same result set without the use of a self-join
which you might consider as an alternative. It could be faster than the lookup table (needs testing of course).

Given the scale of what you are after I'm not sure a CTE would be the best approach. If your lookup table contained the date offset - and these columns are indexed - I believe a table would be a better approach. e.g.

tblLkpDateOffset
id, DateThisRecord* datetime, DatePrevRecord** datetime

*indexed
** indexed and one day earlier than DateThisRecord

PLUS only populate this table with the date range you are concerned with, then something along these lines?

select
*
from tblLkpDateOffset as D
inner join UPTO as ThisRec on D.DateThisRecord = ThisRec.TimePoint
left   join UPTO as PrevRec on D.DatePrevRecord = PrevRec.TimePoint
                                         and ThisRec.IDNo = PrevRec.IDNo -- at a guess i.e. ????

-- use of left join I'm not sure about but with a date offset you might have non-matching records

is this what you are seeking? (given that you have done similar things in the past I'm still a bit hazy on what you really need from us.)
0
 

Author Closing Comment

by:mpdillon
ID: 39307511
I never could get the Inner Join to work. I used a CTE

with CTEData(Name, Amt, TimePoint, INtDate)
as
(
Select U.Name, U.Amt, U.TImePoint, D.IntDate
)
Select A.Name, A.Amt-B.Amt, A.TimePoint, A.IntDate
From CTEData A
Inner join CTEData B
on A.TimePoint = B.TimePoint - 1
0

Featured Post

PRTG Network Monitor: Intuitive Network Monitoring

Network Monitoring is essential to ensure that computer systems and network devices are running. Use PRTG to monitor LANs, servers, websites, applications and devices, bandwidth, virtual environments, remote systems, IoT, and many more. PRTG is easy to set up & use.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Let's review the features of new SQL Server 2012 (Denali CTP3). It listed as below: PERCENT_RANK(): PERCENT_RANK() function will returns the percentage value of rank of the values among its group. PERCENT_RANK() function value always in be…
Load balancing is the method of dividing the total amount of work performed by one computer between two or more computers. Its aim is to get more work done in the same amount of time, ensuring that all the users get served faster.
Via a live example, show how to backup a database, simulate a failure backup the tail of the database transaction log and perform the restore.
Viewers will learn how to use the SELECT statement in SQL to return specific rows and columns, with various degrees of sorting and limits in place.

862 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

20 Experts available now in Live!

Get 1:1 Help Now