asked on

Finding corresponding column of a row that has a maximum value

I have 2 tables: a daily table and an hourly table. The daily table has the following columns:
avg_30day float;
daily_metric;
max_hour_value float;
max_hour date;
timestamp; /* timestamp is always = trunc(timestamp) */

while the hourly table has:
metric_value float;
timestamp date; /* timestamp is always = trunc(timestamp, 'hh') */

I'm trying to create an update statement that will update the daily table as follows:
avg_30day = average of previous 30 days' daily_metric
max_hour_value = maximum metric_value in hour table for that day
max_hour = corresponding timestamp of the row that has maximum metric_value

The problem with my statement below is that I can't find the correct SQL that will return the corresponding timestamp from the daily table that has the maximum metric_value for that day. I get the ORA-00937: not a single-group group function. If I remove the busy_hour column and h1.timestamp, it works.

update daily d2
set (avg_30day, busy_hour_value, busy_hour) =
(select avg(d1.daily_metric), max(h1.metric_value), h1.timestamp
from daily d1 left outer join hourly h1 on d1.id = h1.id and d1.timestamp = trunc (h1.timestamp)
where d1.id = d2.id and d1.timestamp > (d2.timestamp - 29) and d1.timestamp <= d2.timestamp)

Open in new window

bull_rider

Just run the select statement if it retrieves the correct values. I am suspecting the group by clause not being included in the select statement. It should contain group by h1.timestamp as avg() and max() are group functions.

Hope this helps. Let me know if you have any more questions.

ASKER CERTIFIED SOLUTION

gatorvip

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

Andytw

Try the statement below. It uses the FIRST function which allows you to get a value from the first row of a sorted group, where the value is not what you're sorting on. See ora docs for more info ...

Oracle® Database SQL Reference
http://download.oracle.com/docs/cd/B19306_01/server.102/b14200/functions056.htm#i1000901

update daily d2
set (avg_30day, max_hour_value, max_hour) =
( select avg(d1.daily_metric), 
         max(h1.metric_value),
	 max(h1.date_time) keep (dense_rank first order by h1.metric_value desc)
  from daily d1 left outer join hourly h1 on (d1.id = h1.id and d1.date_time = trunc (h1.date_time) )
  where d1.id = d2.id 
    and d1.date_time > (d2.date_time - 29) 
    and d1.date_time <= d2.date_time)

Open in new window

jdymahal

ASKER

Thank you! I've never heard of analytical functions before (hence my slow response since I spent the last day studying it), but it was exactly what i needed to solve my problem. I had to split the solution into two update statements for performance reasons. here's my final solution:

update daily d2 set (avg_30day) =
(select avg(d1.daily_metric) over (partition by id)
from daily d1
where d1.id = d2.id and d1.timestamp > (d2.timestamp - 29) and d1.timestamp <= d2.timestamp)

the second update is similar to what you specified in your solution.

gatorvip

Have you tried Andytw's solution? I believe it might actually help solve your problem moreso than my answer did. I would suggest splitting the points.

jdymahal

ASKER

Yes, I did, but using row_number() and then picking out the row that has row_number() = 1 in the outer select as in your solution was better. Andytw's solution using "keep first" did find the correct timestamp, but it returned an "ora-01427: single-row subquery returns more than one row". In other words, I would have needed to add a row_number() column in the inner select and pick out where row_number() = 1 in an outer select to make it work in the update... making the "keep first" clause superfluous.

I hope I explained myself clear here.. Thanks!