pm620wh
asked on
Oracle 11g SQL LAG, PARTITION and RANK issue
Hi
I have a query that I'm trying to figure out and I've got my knickers in a real twist.
If you examine the attached spreadsheet, it has test data and the desired results.
I have an original report that was created programatically and I'm trying to recreate it using SQL. The report joins two existing views together using the PK UNIT ID. It RANKS and PARTITIONS the data and then attempts to peform some calculations on the resulting rowset using the LAG function.
There are 4 tabs in the spreadsheet:
Traction View - This is the main test data
Latest_Unit_Config_View - This is a static set of cars that is joined to Traction View
SQL And Original Report - This shows my SQL attempt and sample output from the old report
Desired Results - This shows how I would like the resulting output to loook
I have put notes in each tab to explain what I am trying to do
My SQL attempt is yielding rubbish and I'm hoping someone can help me figure out what to do so I can learn from this.
In summary:
Each UNIT checks-in multiple times per day and uploads the distance travelled since the last check-in
I need to find the final entry for each day for each UNIT and display them in order of UNIT then DATE
I then want to use the LAG function to manipulate the current rows values based on data in the previous row.
Thank you for any help you can provide.
Data.xls
I have a query that I'm trying to figure out and I've got my knickers in a real twist.
If you examine the attached spreadsheet, it has test data and the desired results.
I have an original report that was created programatically and I'm trying to recreate it using SQL. The report joins two existing views together using the PK UNIT ID. It RANKS and PARTITIONS the data and then attempts to peform some calculations on the resulting rowset using the LAG function.
There are 4 tabs in the spreadsheet:
Traction View - This is the main test data
Latest_Unit_Config_View - This is a static set of cars that is joined to Traction View
SQL And Original Report - This shows my SQL attempt and sample output from the old report
Desired Results - This shows how I would like the resulting output to loook
I have put notes in each tab to explain what I am trying to do
My SQL attempt is yielding rubbish and I'm hoping someone can help me figure out what to do so I can learn from this.
In summary:
Each UNIT checks-in multiple times per day and uploads the distance travelled since the last check-in
I need to find the final entry for each day for each UNIT and display them in order of UNIT then DATE
I then want to use the LAG function to manipulate the current rows values based on data in the previous row.
Thank you for any help you can provide.
Data.xls
ASKER
Hi PortletPaul
Thank you for taking the time to look at the report - I like your output :-)
Unfortunately, I have to replicate the old report faithfully with a single date column (this is to support a number of engineers who won't be retrained).
I do appreiciate your time, if you may help look at how I might achieve the desired results by joining the two views together I would be grateful.
Thank you again.
Thank you for taking the time to look at the report - I like your output :-)
Unfortunately, I have to replicate the old report faithfully with a single date column (this is to support a number of engineers who won't be retrained).
I do appreiciate your time, if you may help look at how I might achieve the desired results by joining the two views together I would be grateful.
Thank you again.
it's a work in progress, the second date column is optional... more to come
but Australia just won into the World Cup - have been distracted
but Australia just won into the World Cup - have been distracted
I believe I have it.
results:
http://sqlfiddle.com/#!4/4d1c0/34
UNITID DATE KILOMETERS DIFFERENCE KM PER DAY
46 2013-01-15 17:03:34 2274712 743 371.5 -- << 371.5
46 2013-01-17 04:02:56 2275455 2529 281.0 -- << 281
46
94 2013-01-17 06:54:37 2312275 558 558.0
94 2013-01-18 03:21:20 2312833 542 542.0
94 2013-01-19 15:15:48 2313375 206 206.0
94 2013-01-20 03:49:12 2313581 1980 495.0 --<< 495
94 2013-01-24 23:01:48 2315561 0 .0
94
106 2013-01-15 17:03:48 2279898 744 372.0
106 2013-01-17 04:03:10 2280642 2529 281.0
106
117 2013-01-17 06:54:12 81466 566 566.0
117 2013-01-18 03:20:53 82032 552 552.0
117 2013-01-19 15:15:22 82584 209 209.0
117 2013-01-20 03:48:46 82793 2007 501.8
117 2013-01-24 23:01:22 84800 0 .0
117
151 2013-01-15 17:03:22 2040699 742 371.0
151 2013-01-17 04:02:44 2041441 2518 279.8
151
-- code
with
sqf as (
select
unitID
, dateof
, kilometers
, row_number() over (partition by UnitID order by dateof ASC) as row_ref
from derived
)
, sqf2 as (
select
d.unitID
, d.dateof later_dt
, n.dateof earlier_dt
, n.kilometers km_1
, d.kilometers km_2
, d.kilometers - n.kilometers km_diff
, sum(d.kilometers - n.kilometers) over (partition by d.UnitID
, trunc(n.dateof)) km_period
, (trunc(d.dateof) - trunc(n.dateof)) day_diff
, d.row_ref
, n.row_ref as x
from sqf d
left join sqf n on d.unitid = n.unitid
and (d.row_ref-1) = n.row_ref
)
select
unitID
, to_char(earlier_dt,'YYYY-MM-DD HH24:mi:ss') "Date"
, km_1 Kilometers
, km_diff Difference
, to_char(km_period / greatest(day_diff,1),'999999.9') "KM Per Day"
from sqf2
order by
unitid
, earlier_dt
now, although I have got this far if you look at the query you will see there is still more to do (i.e. how to get to the derived table that one sees on the 'expected results' tabhttp://sqlfiddle.com/#!4/4d1c0/34
ASKER
This is awesome stuff - thank you for your help so far :-)
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
Hi PortletPaul
This is so close I can almost taste it :-)
Attached is the resulting output from live data - if you look at the coloured rows you will see that there are issues when a UNITID grouping ends. We have repeating null values in two rows for some reason (not sure if this is related to the missing filter also - see below)
You will also see that we have some poorly displayed values coming up as ########### My guess is bacause the number is too large. This I believe is because a filter is missing (see next).
One thing I've noticed and I don't know where to put this in new SQL - we need to filter the returned rows (as the original query did) so that we only get rows back where the TP_DESCRIPTION = 'Distance Travelled'.
If we don't filter, we get all kinds of other kilometer values for other components.
Woud you mind taking a look at the attached and letting me know your views please?
Thank you so much for your patience and help.
output.xls
This is so close I can almost taste it :-)
Attached is the resulting output from live data - if you look at the coloured rows you will see that there are issues when a UNITID grouping ends. We have repeating null values in two rows for some reason (not sure if this is related to the missing filter also - see below)
You will also see that we have some poorly displayed values coming up as ########### My guess is bacause the number is too large. This I believe is because a filter is missing (see next).
One thing I've noticed and I don't know where to put this in new SQL - we need to filter the returned rows (as the original query did) so that we only get rows back where the TP_DESCRIPTION = 'Distance Travelled'.
If we don't filter, we get all kinds of other kilometer values for other components.
Woud you mind taking a look at the attached and letting me know your views please?
Thank you so much for your patience and help.
output.xls
'A.' some groupings of data commence with nulls
'B.' some groupings commence and also finish with nulls
after b. groupings, you will get those highlighted double-null rows
see unitid 46 (it's an 'A.')
see unitids 94 & 106 (both 'B.'s)
which line need to be suppressed? is it the last line of 'B.'?
'B.' some groupings commence and also finish with nulls
after b. groupings, you will get those highlighted double-null rows
see unitid 46 (it's an 'A.')
see unitids 94 & 106 (both 'B.'s)
which line need to be suppressed? is it the last line of 'B.'?
You will also see that we have some poorly displayed values coming up as ########### My guess is because the number is too large.
line 10 of code last provided, try adding 3 more 9's
THEN to_char(km_per_day,'999999 999.9')
regarding filtering, immediately after the 'from traction_view', put in a where clause of your choosing
see line 50 of last provided code - immediately after that.
where traction_view.TP_DESCRIPTI ON = 'Distance Travelled'
line 10 of code last provided, try adding 3 more 9's
THEN to_char(km_per_day,'999999
regarding filtering, immediately after the 'from traction_view', put in a where clause of your choosing
see line 50 of last provided code - immediately after that.
where traction_view.TP_DESCRIPTI
ASKER
Hi Portlet Paul
Fantastic!! Now I've filterd out the unwanted rows - it works a treat!! I can see now where the duplicate rows occur as the kilometer value for the day is the same as the previous.
I am so grateful to you for your help.
Thank you once again for sticking with this.
Fantastic!! Now I've filterd out the unwanted rows - it works a treat!! I can see now where the duplicate rows occur as the kilometer value for the day is the same as the previous.
I am so grateful to you for your help.
Thank you once again for sticking with this.
ASKER
Fantastic - thank you
Not even an assist :-(
@pm620wh it would be great if you could award awking00 the accepted solution please.
>>v.nice awking00 ! (I just added a presentation layer and "in case" a greatest() on a divisor)
whilst I did do some pre-work and post-work - it wasn't my query in the end that solved the issues
you may request attention and ask for the question to be re-opened for re-assignment of points.
>>v.nice awking00 ! (I just added a presentation layer and "in case" a greatest() on a divisor)
whilst I did do some pre-work and post-work - it wasn't my query in the end that solved the issues
you may request attention and ask for the question to be re-opened for re-assignment of points.
ASKER
What is an assist please?
ASKER
Ahh - ok, please accept my sincerest apologies.
Please understand what went wrong here. I have been gliding down to the very bottom of the solution in haste and did not see awking00's input.
I am truly sorry. A lesson learnt for me to be alert to the content in future.
I feel truly bad now as awking00 has helped me in the past with other questions. I did not mean to ignore.
I will action this now.
Please understand what went wrong here. I have been gliding down to the very bottom of the solution in haste and did not see awking00's input.
I am truly sorry. A lesson learnt for me to be alert to the content in future.
I feel truly bad now as awking00 has helped me in the past with other questions. I did not mean to ignore.
I will action this now.
as you mark the answer, you may choose one as the accepted answer - others that you award points to become an 'assisted answer'
here awking00 worked some magic and that query is the core of the solution
you may award all points to that - or share points - that part is up to you.
here awking00 worked some magic and that query is the core of the solution
you may award all points to that - or share points - that part is up to you.
ASKER
Thank you to awking00 for providing the resolution and to portletpaul for finalising the presentation. EE at its best.
You're welcome and thank you, PortletPaul, for your supporting comments.
Open in new window
I may not complete this (but will try). Here I'm using row_number() to produce a an offsetting key that will allow 'next rows' to align.