How to Create User-Defined Aggregates in Oracle
Before we begin creating these things, what are user-defined aggregates? They are a feature introduced in Oracle 9i that allows a developer to create his or her own functions like "SUM", "AVG", and "COUNT". Previously the only way to achieve this kind of functionality across a set of data would be to write a pl/sql routine to iterate through a cursor.
The aggregates are defined in multiple parts, using Oracle's Object Oriented functionality. In structure it's somewhat similar to creating an externally defined function; but instead of writing a c routine as a member of an external library, first you create an object type and body using a predefined API from the Data Cartridge (ODCI) library then, like the externals, you create a function using that type to define it rather than implementing the code directly within the function.
Data Cartridge API for Aggregates
The following four functions must be implemented in your object type and body:
- odciaggregateinitialize(ctx IN OUT your_aggregate_object_type) RETURN NUMBER
- odciaggregateiterate(self IN OUT your_aggregate_object_type, your_parameter_to_aggregate IN some_data_type) RETURN NUMBER
- odciaggregatemerge(self IN OUT your_aggregate_object_type, ctx2 IN your_aggregate_object_type) RETURN NUMBER
- odciaggregateterminate(self IN your_aggregate_object_type, returnvalue OUT some_data_type, flags IN NUMBER) RETURN NUMBER
Starting simply, a NULL aggregate
Let's start with a minimal aggregate. This function doesn't really "do" anything, it simply returns NULL for any group; but it illustrates all of the required pieces.
-----------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------
---- NULL Aggregate - minimal aggregate syntax, returns NULL
-----------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------
CREATE OR REPLACE TYPE null_agg_type
AS OBJECT
(
v_dummy char(1), -- every Oracle object type must include at least one attribute
STATIC FUNCTION odciaggregateinitialize(ctx IN OUT null_agg_type)
RETURN NUMBER,
MEMBER FUNCTION odciaggregateiterate(self IN OUT null_agg_type, p_notused IN VARCHAR2)
RETURN NUMBER,
MEMBER FUNCTION odciaggregatemerge(self IN OUT null_agg_type, ctx2 IN null_agg_type)
RETURN NUMBER,
MEMBER FUNCTION odciaggregateterminate(self IN null_agg_type,
returnvalue OUT VARCHAR2,
flags IN NUMBER
)
RETURN NUMBER
);
/
CREATE OR REPLACE TYPE BODY null_agg_type
IS
STATIC FUNCTION odciaggregateinitialize(ctx IN OUT null_agg_type)
RETURN NUMBER
IS
BEGIN
-- instantiate our type, NULL the dummy attribute
ctx := null_agg_type(null);
RETURN odciconst.success;
END odciaggregateinitialize;
MEMBER FUNCTION odciaggregateiterate(self IN OUT null_agg_type, p_notused IN VARCHAR2)
RETURN NUMBER
IS
BEGIN
-- In normal functions you would do something with the input
-- add it, count it, square it, etc. but not in this null-aggregate
RETURN odciconst.success;
END odciaggregateiterate;
MEMBER FUNCTION odciaggregatemerge(self IN OUT null_agg_type, ctx2 IN null_agg_type)
RETURN NUMBER
IS
BEGIN
-- If you have 2 results sets to consolidate, for instance parallel processing of a table
-- you will resolve the combination here.
RETURN odciconst.success;
END odciaggregatemerge;
MEMBER FUNCTION odciaggregateterminate(self IN null_agg_type,
returnvalue OUT VARCHAR2,
flags IN NUMBER
)
RETURN NUMBER
IS
BEGIN
-- Set your return value, usually this would involve your attributes and possibly other
-- functions. For these simple null-aggregates we simply assign NULL and exit.
returnvalue := NULL;
RETURN odciconst.success;
END odciaggregateterminate;
END;
/
1:
2:
3:
4:
5:
6:
7:
8:
9:
10:
11:
12:
13:
14:
15:
16:
17:
18:
19:
20:
21:
22:
23:
24:
25:
26:
27:
28:
29:
30:
31:
32:
33:
34:
35:
36:
37:
38:
39:
40:
41:
42:
43:
44:
45:
46:
47:
48:
49:
50:
51:
52:
53:
54:
55:
56:
57:
58:
59:
60:
61:
62:
63:
64:
65:
66:
67:
68:
69:
70:
71:
Select allOpen in new window
Now create a function to implement the type created above
CREATE OR REPLACE FUNCTION nullagg(p_string VARCHAR2)
RETURN VARCHAR2
PARALLEL_ENABLE
AGGREGATE USING null_agg_type;
1:
2:
3:
4:
Select allOpen in new window
And finally, we can use the new function just as we would other aggregates; but, in this case, it's rather anticlimactic.
Ok, that's simple enough, but not useful, so let's explore some more complex functionality by replicating some of the built-in aggregates. I don't recommend these for production use, but they are helpful for illustration to explore how the functionality works when compared with familiar routines.
Writing our own counting aggregate
-----------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------
---- Counting Aggregate
-----------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------
CREATE OR REPLACE TYPE count_agg_type
AS OBJECT
(
v_result INTEGER, -- We need a place to store our count as we iterate through the group sets
STATIC FUNCTION odciaggregateinitialize(ctx IN OUT count_agg_type)
RETURN NUMBER,
-- As with the null aggregate, we're not really using the input parameter, we're simply counting rows.
MEMBER FUNCTION odciaggregateiterate(self IN OUT count_agg_type, p_notused IN VARCHAR2)
RETURN NUMBER,
MEMBER FUNCTION odciaggregatemerge(self IN OUT count_agg_type, ctx2 IN count_agg_type)
RETURN NUMBER,
MEMBER FUNCTION odciaggregateterminate(self IN count_agg_type,
returnvalue OUT VARCHAR2,
flags IN NUMBER
)
RETURN NUMBER
);
/
CREATE OR REPLACE TYPE BODY count_agg_type
IS
STATIC FUNCTION odciaggregateinitialize(ctx IN OUT count_agg_type)
RETURN NUMBER
IS
BEGIN
-- At the start of our count we will, of course, have 0.
ctx := count_agg_type(0);
RETURN odciconst.success;
END odciaggregateinitialize;
MEMBER FUNCTION odciaggregateiterate(self IN OUT count_agg_type, p_notused IN VARCHAR2)
RETURN NUMBER
IS
BEGIN
-- As we iterate through each value, increment our counter
self.v_result := self.v_result + 1;
RETURN odciconst.success;
END odciaggregateiterate;
MEMBER FUNCTION odciaggregatemerge(self IN OUT count_agg_type, ctx2 IN count_agg_type)
RETURN NUMBER
IS
BEGIN
-- If we have two results, the merged result is simply the sum of both
self.v_result := self.v_result + ctx2.v_result;
RETURN odciconst.success;
END odciaggregatemerge;
MEMBER FUNCTION odciaggregateterminate(self IN count_agg_type,
returnvalue OUT VARCHAR2,
flags IN NUMBER
)
RETURN NUMBER
IS
BEGIN
-- No finalizing is required when done, the count is complete, so return and exit.
returnvalue := self.v_result;
RETURN odciconst.success;
END odciaggregateterminate;
END;
/
1:
2:
3:
4:
5:
6:
7:
8:
9:
10:
11:
12:
13:
14:
15:
16:
17:
18:
19:
20:
21:
22:
23:
24:
25:
26:
27:
28:
29:
30:
31:
32:
33:
34:
35:
36:
37:
38:
39:
40:
41:
42:
43:
44:
45:
46:
47:
48:
49:
50:
51:
52:
53:
54:
55:
56:
57:
58:
59:
60:
61:
62:
63:
64:
65:
66:
67:
68:
69:
70:
71:
Select allOpen in new window
And, finally, create a function to use the type.
CREATE OR REPLACE FUNCTION countagg(p_string VARCHAR2)
RETURN VARCHAR2
PARALLEL_ENABLE
AGGREGATE USING count_agg_type;
1:
2:
3:
4:
Select allOpen in new window
Example usage:
Important to note, unlike Oracle's built-in function, we can't use the "*" input. That syntax is only exposed for the built-in. To get around this limit, simply pass in a non-null value (such as rowid) to count every row. On the other hand, the "*" syntax only applies to counting and no other aggregates. Since there's no good reason to rewrite COUNT, this is not a feature you should miss.
Writing a MAX aggregate for numbers
Next, we'll step up the complexity slightly by creating our own MAX function for numbers. Unfortunately we can't really do overloading as the Oracle built-ins can. User-defined aggregate types can only be assigned to standalone functions, not functions within a package. You could create several standalone functions then create a package with overloaded functions that called the appropriate standalone but that's a lot of overhead that simply creates the illusion of consolidated code. As mentioned above, this aggregate is solely for illustration of the API so you can learn from the example. In practice, you'll want to use the built-in function for efficiency.
-----------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------
---- Maximum Aggregate for numbers
-----------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------
CREATE OR REPLACE TYPE max_num_agg_type
AS OBJECT
(
v_result NUMBER, -- We need a place to store our maximum as we iterate through the group sets
STATIC FUNCTION odciaggregateinitialize(ctx IN OUT max_num_agg_type)
RETURN NUMBER,
MEMBER FUNCTION odciaggregateiterate(self IN OUT max_num_agg_type, p_number IN NUMBER)
RETURN NUMBER,
MEMBER FUNCTION odciaggregatemerge(self IN OUT max_num_agg_type, ctx2 IN max_num_agg_type)
RETURN NUMBER,
MEMBER FUNCTION odciaggregateterminate(self IN max_num_agg_type,
returnvalue OUT NUMBER,
flags IN NUMBER
)
RETURN NUMBER
);
/
CREATE OR REPLACE TYPE BODY max_num_agg_type
IS
STATIC FUNCTION odciaggregateinitialize(ctx IN OUT max_num_agg_type)
RETURN NUMBER
IS
BEGIN
ctx := max_num_agg_type(NULL); -- initialize our max value to null
RETURN odciconst.success;
END odciaggregateinitialize;
MEMBER FUNCTION odciaggregateiterate(self IN OUT max_num_agg_type, p_number IN NUMBER)
RETURN NUMBER
IS
BEGIN
IF self.v_result IS NULL
THEN
self.v_result := p_number; -- the first number is automatically the maximum
ELSE
self.v_result := GREATEST(self.v_result, p_number); -- compare new number to old, keep the bigger
END IF;
RETURN odciconst.success;
END odciaggregateiterate;
MEMBER FUNCTION odciaggregatemerge(self IN OUT max_num_agg_type, ctx2 IN max_num_agg_type)
RETURN NUMBER
IS
BEGIN
IF self.v_result IS NULL
THEN
self.v_result := ctx2.v_result; -- if merging two sets, if the current is null, use the new one
ELSIF ctx2.v_result IS NOT NULL
THEN
self.v_result := GREATEST(self.v_result, ctx2.v_result); -- if both are populated then use the bigger
END IF;
RETURN odciconst.success;
END odciaggregatemerge;
MEMBER FUNCTION odciaggregateterminate(self IN max_num_agg_type,
returnvalue OUT NUMBER,
flags IN NUMBER
)
RETURN NUMBER
IS
BEGIN
returnvalue := self.v_result; -- like the counting aggregate, no finalizing processes are needed
RETURN odciconst.success;
END odciaggregateterminate;
END;
/
1:
2:
3:
4:
5:
6:
7:
8:
9:
10:
11:
12:
13:
14:
15:
16:
17:
18:
19:
20:
21:
22:
23:
24:
25:
26:
27:
28:
29:
30:
31:
32:
33:
34:
35:
36:
37:
38:
39:
40:
41:
42:
43:
44:
45:
46:
47:
48:
49:
50:
51:
52:
53:
54:
55:
56:
57:
58:
59:
60:
61:
62:
63:
64:
65:
66:
67:
68:
69:
70:
71:
72:
73:
74:
75:
76:
77:
78:
79:
Select allOpen in new window
And as always, create a function to implement the object
CREATE OR REPLACE FUNCTION maxnumagg(p_number NUMBER)
RETURN VARCHAR2
PARALLEL_ENABLE
AGGREGATE USING max_num_agg_type;
1:
2:
3:
4:
Select allOpen in new window
Example usage:
You could create a similar aggregate for MIN by switching GREATEST for LEAST. And you could build them for other data types by swapping the data type in the following lines:
- MEMBER FUNCTION odciaggregateiterate(self IN OUT max_num_agg_type, p_number IN NUMBER)
Aggregating Interval types
Next, we'll create a new aggregate that produces functionality not already available through an Oracle built-in function. The method is the same though, we'll mimic the SUM function, but our aggregate will support the INTERVAL type, specifically INTERVAL DAY TO SECOND.
-----------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------
---- Sum Aggregate for Day to Second Intervals
-----------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------
CREATE OR REPLACE TYPE dsinterval_agg_type
AS OBJECT
(
v_result INTERVAL DAY TO SECOND, -- our working result
STATIC FUNCTION odciaggregateinitialize(ctx IN OUT dsinterval_agg_type)
RETURN NUMBER,
MEMBER FUNCTION odciaggregateiterate(
self IN OUT dsinterval_agg_type,
p_interval IN INTERVAL DAY TO SECOND
)
RETURN NUMBER,
MEMBER FUNCTION odciaggregatemerge(
self IN OUT dsinterval_agg_type,
ctx2 IN dsinterval_agg_type
)
RETURN NUMBER,
MEMBER FUNCTION odciaggregateterminate(
self IN dsinterval_agg_type,
returnvalue OUT INTERVAL DAY TO SECOND,
flags IN NUMBER
)
RETURN NUMBER
);
/
CREATE OR REPLACE TYPE BODY dsinterval_agg_type
IS
STATIC FUNCTION odciaggregateinitialize(ctx IN OUT dsinterval_agg_type)
RETURN NUMBER
IS
BEGIN
ctx := dsinterval_agg_type(NULL); -- initialize the sum to NULL
RETURN odciconst.success;
END odciaggregateinitialize;
MEMBER FUNCTION odciaggregateiterate(
self IN OUT dsinterval_agg_type,
p_interval IN INTERVAL DAY TO SECOND
)
RETURN NUMBER
IS
BEGIN
IF self.v_result IS NULL
THEN
self.v_result := p_interval; -- the first interval is the sum
ELSE
self.v_result := self.v_result + p_interval; -- add each new interval to the running total
END IF;
RETURN odciconst.success;
END odciaggregateiterate;
MEMBER FUNCTION odciaggregatemerge(
self IN OUT dsinterval_agg_type,
ctx2 IN dsinterval_agg_type
)
RETURN NUMBER
IS
BEGIN
IF self.v_result IS NULL
THEN
self.v_result := ctx2.v_result; -- If our sum is NULL, use the result of the other set
ELSIF ctx2.v_result IS NOT NULL
THEN
self.v_result := self.v_result + ctx2.v_result; -- If both sets have sums, add them to get the total
END IF;
RETURN odciconst.success;
END odciaggregatemerge;
MEMBER FUNCTION odciaggregateterminate(
self IN dsinterval_agg_type,
returnvalue OUT INTERVAL DAY TO SECOND,
flags IN NUMBER
)
RETURN NUMBER
IS
BEGIN
returnvalue := self.v_result; -- the iterate and merge did all the work, return the result
RETURN odciconst.success;
END odciaggregateterminate;
END;
/
1:
2:
3:
4:
5:
6:
7:
8:
9:
10:
11:
12:
13:
14:
15:
16:
17:
18:
19:
20:
21:
22:
23:
24:
25:
26:
27:
28:
29:
30:
31:
32:
33:
34:
35:
36:
37:
38:
39:
40:
41:
42:
43:
44:
45:
46:
47:
48:
49:
50:
51:
52:
53:
54:
55:
56:
57:
58:
59:
60:
61:
62:
63:
64:
65:
66:
67:
68:
69:
70:
71:
72:
73:
74:
75:
76:
77:
78:
79:
80:
81:
82:
83:
84:
85:
86:
87:
88:
89:
90:
91:
92:
Select allOpen in new window
Nothing special here, create a function using the object above.
CREATE OR REPLACE FUNCTION dsintervalagg(p_interval INTERVAL DAY TO SECOND)
RETURN INTERVAL DAY TO SECOND
PARALLEL_ENABLE
AGGREGATE USING dsinterval_agg_type;
1:
2:
3:
4:
Select allOpen in new window
And, finally, let's look at our results, we'll create 2 small intervals, the result will vary but should be slightly more than +0 00:00:3.42
SELECT dsintervalagg(i)
FROM (SELECT SYSTIMESTAMP - SYSDATE i FROM DUAL
UNION ALL
SELECT NUMTODSINTERVAL(3.42, 'second') FROM DUAL);
1:
2:
3:
4:
Select allOpen in new window
Concatenating Strings with an aggregate
For the next aggregate we'll construct, we'll tackle one of the most common questions asked for Oracle SQL. That is: "How can I combine (concatenate) strings from multiple rows into a single value?" Beginning with 11gR2, the LISTAGG aggregate is available and I recommend using that, but for 9i, 10g and 11gR1 the following function will achieve the results. Since this is a common goal, it's common to see similar functions to this around the internet, the most widely known is "stragg" written by Tom Kyte of Oracle. Others have written similar functions, with the same name (including me) but I think Tom's was the first. As you might expect from the previous examples, the API is largely fixed, so all of these string concatenation functions look and behave pretty much the same.
-----------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------
---- Concatenation Aggregate for strings (a.k.a. stragg)
-----------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------
CREATE OR REPLACE TYPE concat_agg_type
AS OBJECT
(
-- This could be defined to 32767, but since the aggregate will be used in SQL,
-- the 4000 limit is appropriate since that's all that SQL varchar2 strings support
v_result VARCHAR2(4000),
STATIC FUNCTION odciaggregateinitialize(ctx IN OUT concat_agg_type)
RETURN NUMBER,
MEMBER FUNCTION odciaggregateiterate(self IN OUT concat_agg_type, p_string IN VARCHAR2)
RETURN NUMBER,
MEMBER FUNCTION odciaggregatemerge(self IN OUT concat_agg_type, ctx2 IN concat_agg_type)
RETURN NUMBER,
MEMBER FUNCTION odciaggregateterminate(self IN concat_agg_type,
returnvalue OUT VARCHAR2,
flags IN NUMBER
)
RETURN NUMBER
);
/
CREATE OR REPLACE TYPE BODY concat_agg_type
IS
STATIC FUNCTION odciaggregateinitialize(ctx IN OUT concat_agg_type)
RETURN NUMBER
IS
BEGIN
ctx := concat_agg_type(NULL); -- initialize the concatenation to NULL
RETURN odciconst.success;
END odciaggregateinitialize;
MEMBER FUNCTION odciaggregateiterate(self IN OUT concat_agg_type, p_string IN VARCHAR2)
RETURN NUMBER
IS
BEGIN
self.v_result := self.v_result || ',' || p_string; -- Append a delimiter and new value
RETURN odciconst.success;
END odciaggregateiterate;
MEMBER FUNCTION odciaggregatemerge(self IN OUT concat_agg_type, ctx2 IN concat_agg_type)
RETURN NUMBER
IS
BEGIN
-- If merging, simply concatenate them together
-- Since each string will either be NULL or delimiter prefixed, no need to re-delimit
self.v_result := self.v_result || ctx2.v_result;
RETURN odciconst.success;
END odciaggregatemerge;
MEMBER FUNCTION odciaggregateterminate(self IN concat_agg_type,
returnvalue OUT VARCHAR2,
flags IN NUMBER
)
RETURN NUMBER
IS
BEGIN
-- Since we prefix the string initially with a comma, remove the extra here before returning
returnvalue := LTRIM(self.v_result, ',');
RETURN odciconst.success;
END odciaggregateterminate;
END;
/
1:
2:
3:
4:
5:
6:
7:
8:
9:
10:
11:
12:
13:
14:
15:
16:
17:
18:
19:
20:
21:
22:
23:
24:
25:
26:
27:
28:
29:
30:
31:
32:
33:
34:
35:
36:
37:
38:
39:
40:
41:
42:
43:
44:
45:
46:
47:
48:
49:
50:
51:
52:
53:
54:
55:
56:
57:
58:
59:
60:
61:
62:
63:
64:
65:
66:
67:
68:
69:
70:
Select allOpen in new window
Our function could be called "stragg" to mimic other versions found on various websites, I called this one "concatagg" to maintain the convention established in the other aggregates above. My recommendation, choose a name that is self-documenting and matches your corporate naming standards.
CREATE OR REPLACE FUNCTION concatagg(p_string VARCHAR2)
RETURN VARCHAR2 PARALLEL_ENABLE AGGREGATE
USING concat_agg_type;
1:
2:
3:
Select allOpen in new window
Example usage:
Multiplying numbers with a product aggregate
Next, we'll go back to numbers and introduce another new function that isn't available in the built-ins: set multiplication. With built-ins we can add numbers, we can average numbers, we can derive variance, median and standard deviation but we can't multiply them! So, let's create a function to allow that.
-----------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------
---- Product Aggregate for numbers (multiply numbers in set, vs. sum which adds)
-----------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------
CREATE OR REPLACE TYPE product_agg_type
AS OBJECT
(
v_product NUMBER, -- Store our running product
STATIC FUNCTION odciaggregateinitialize(ctx IN OUT product_agg_type)
RETURN NUMBER,
MEMBER FUNCTION odciaggregateiterate(self IN OUT product_agg_type, p_number IN NUMBER)
RETURN NUMBER,
MEMBER FUNCTION odciaggregatemerge(self IN OUT product_agg_type, ctx2 IN product_agg_type)
RETURN NUMBER,
MEMBER FUNCTION odciaggregateterminate(
self IN product_agg_type,
returnvalue OUT NUMBER,
flags IN NUMBER
)
RETURN NUMBER
);
/
CREATE OR REPLACE TYPE BODY product_agg_type
IS
STATIC FUNCTION odciaggregateinitialize(ctx IN OUT product_agg_type)
RETURN NUMBER
IS
BEGIN
ctx := product_agg_type(NULL); -- Initialize the result to NULL
RETURN odciconst.success;
END odciaggregateinitialize;
MEMBER FUNCTION odciaggregateiterate(self IN OUT product_agg_type, p_number IN NUMBER)
RETURN NUMBER
IS
BEGIN
-- Multiply each new number by the running product, if we don't have a product yet, seed it with 1.
self.v_product := NVL(self.v_product, 1) * p_number;
RETURN odciconst.success;
END odciaggregateiterate;
MEMBER FUNCTION odciaggregatemerge(self IN OUT product_agg_type, ctx2 IN product_agg_type)
RETURN NUMBER
IS
BEGIN
-- Multiply the products of 2 sets, using 1 for any NULL's
-- Make sure at least one of the sets has a value though
IF COALESCE(self.v_product, ctx2.v_product) IS NOT NULL
THEN
self.v_product := NVL(self.v_product, 1) * NVL(ctx2.v_product, 1);
END IF;
RETURN odciconst.success;
END odciaggregatemerge;
MEMBER FUNCTION odciaggregateterminate(
self IN product_agg_type,
returnvalue OUT NUMBER,
flags IN NUMBER
)
RETURN NUMBER
IS
BEGIN
returnvalue := self.v_product;
RETURN odciconst.success;
END odciaggregateterminate;
END;
/
1:
2:
3:
4:
5:
6:
7:
8:
9:
10:
11:
12:
13:
14:
15:
16:
17:
18:
19:
20:
21:
22:
23:
24:
25:
26:
27:
28:
29:
30:
31:
32:
33:
34:
35:
36:
37:
38:
39:
40:
41:
42:
43:
44:
45:
46:
47:
48:
49:
50:
51:
52:
53:
54:
55:
56:
57:
58:
59:
60:
61:
62:
63:
64:
65:
66:
67:
68:
69:
70:
71:
72:
73:
Select allOpen in new window
Create a function to use the object.
CREATE OR REPLACE FUNCTION productagg(p_number IN NUMBER)
RETURN NUMBER
PARALLEL_ENABLE
AGGREGATE USING product_agg_type;
1:
2:
3:
4:
Select allOpen in new window
Example usage:
Aggregating Dates to detect event skew
For the next aggregate, we'll get a little more creative (some might say mathematically silly.) We'll create an Average function for DATE values. So, first, you have to ask yourself: "Given a set of dates, what would the 'average' look like? How would it be used? Would the generated date have any meaning?" For this aggregate, I'm going to define the "average date" within a set to be a point in time that best represents all of the times in the set. So, if I'm tracking events, for example: crimes by county, vehicular accidents by state, quadruplet births by nation, or sales perks given away by department; I might want to know when these things "usually" happen. Of course, this is an approximation at best, but if I've a lot of skew to my results, a function like this could be helpful. For instance, if I have 10 crimes in June and 2 crimes in November, I'd expect the result of the function to skew toward the June vales and return an "average" date indicating these particular crimes are more common in the summer. Again, this function is mostly intended for illustration, but for some purposes, you may find the approximations helpful as an indicator for areas that need more sophisticated statistical analysis. Now that we know how we are going to define what the results mean, how do we calculate such a thing? Oracle doesn't store dates as a single number, so it's not just matter of adding them up and dividing by the count as you would sales or batting averages. However, it is possible to generate a meaningful number that represents each date. By using the Julian date number of each day and adding in the fractional day amount for hours, minutes and seconds we can generate a single number that is an offset from the "beginning of time" as far as Oracle is concerned, i.e. Jan 1, 4712 BC 00:00:00. Once we have that, then the date average is just as simple as any numerical average.
-----------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------
---- Average Aggregate for dates
-----------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------
CREATE OR REPLACE TYPE avg_date_agg_type
AS OBJECT
(
v_day_sum NUMBER, -- running total of date offsets
v_day_count INTEGER, -- number of dates we've summed
STATIC FUNCTION odciaggregateinitialize(ctx IN OUT avg_date_agg_type)
RETURN NUMBER,
MEMBER FUNCTION odciaggregateiterate(self IN OUT avg_date_agg_type, p_date IN DATE)
RETURN NUMBER,
MEMBER FUNCTION odciaggregatemerge(self IN OUT avg_date_agg_type, ctx2 IN avg_date_agg_type)
RETURN NUMBER,
MEMBER FUNCTION odciaggregateterminate(
self IN avg_date_agg_type,
returnvalue OUT DATE,
flags IN NUMBER
)
RETURN NUMBER
);
/
CREATE OR REPLACE TYPE BODY avg_date_agg_type
IS
STATIC FUNCTION odciaggregateinitialize(ctx IN OUT avg_date_agg_type)
RETURN NUMBER
IS
BEGIN
ctx := avg_date_agg_type(0, 0); -- initialize the sum and count both to 0
RETURN odciconst.success;
END odciaggregateinitialize;
MEMBER FUNCTION odciaggregateiterate(self IN OUT avg_date_agg_type, p_date IN DATE)
RETURN NUMBER
IS
BEGIN
-- The TO_CHAR function with 'j' format return the Julian value
-- for the day portion only and drops the time
-- this function will add the time as a fractional day to the Julian.
-- Subtract one to create a delta from minimum-supported time.
self.v_day_sum :=
self.v_day_sum + TO_NUMBER(TO_CHAR(p_date, 'j')) + (p_date - TRUNC(p_date)) - 1;
self.v_day_count := self.v_day_count + 1;
RETURN odciconst.success;
END odciaggregateiterate;
MEMBER FUNCTION odciaggregatemerge(self IN OUT avg_date_agg_type, ctx2 IN avg_date_agg_type)
RETURN NUMBER
IS
BEGIN
-- Add the sums and counts to generate an average over both sets
self.v_day_sum := self.v_day_sum + ctx2.v_day_sum;
self.v_day_count := self.v_day_count + ctx2.v_day_sum;
RETURN odciconst.success;
END odciaggregatemerge;
MEMBER FUNCTION odciaggregateterminate(
self IN avg_date_agg_type,
returnvalue OUT DATE,
flags IN NUMBER
)
RETURN NUMBER
IS
BEGIN
-- If we haven't iterated over any values, the count will be 0, so the average is NULL
-- we can't divide by zero anyway. If we do have a count then generate the mean delta
-- and add it to the start of the Oracle calendar.
returnvalue :=
CASE
WHEN self.v_day_count > 0
THEN
-- Add the average delta to min-time to create the "average" date
TO_DATE('1', 'j') + (self.v_day_sum / self.v_day_count)
ELSE
NULL
END;
RETURN odciconst.success;
END odciaggregateterminate;
END;
/
1:
2:
3:
4:
5:
6:
7:
8:
9:
10:
11:
12:
13:
14:
15:
16:
17:
18:
19:
20:
21:
22:
23:
24:
25:
26:
27:
28:
29:
30:
31:
32:
33:
34:
35:
36:
37:
38:
39:
40:
41:
42:
43:
44:
45:
46:
47:
48:
49:
50:
51:
52:
53:
54:
55:
56:
57:
58:
59:
60:
61:
62:
63:
64:
65:
66:
67:
68:
69:
70:
71:
72:
73:
74:
75:
76:
77:
78:
79:
80:
81:
82:
83:
84:
85:
86:
87:
88:
89:
90:
Select allOpen in new window
Create a function to use the object
CREATE OR REPLACE FUNCTION avgdateagg(p_date DATE)
RETURN DATE
PARALLEL_ENABLE
AGGREGATE USING avg_date_agg_type;
1:
2:
3:
4:
Select allOpen in new window
Example usage: (on average, when did we analyze our tables?)
Using an aggregate to parse strings
The next function is from a real world example asked here on Experts-Exchange. That is, given a set of strings, what are the distinct characters used in those strings? For example, given ('cadbb','dcfba','zxaf') the result would be 'abcdfxz'. I'll admit, this is an unusual function. Where this might be used? Perhaps an inventory management system with quality codes assigned to each item as a string to indicate various defect types. So you want to check within each inventory location, what are all of the defects that need to be managed; so you can assign the appropriate staff and equipment?
This aggregate has some extra complexity and requires an extra member function to determine the unique strings, this function will be used in the iterate and merge steps to find unique characters as new strings are introduced.
-----------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------
--- Distinct characters
-----------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------
CREATE OR REPLACE TYPE distinct_char_agg_type
AS OBJECT
(
v_result VARCHAR2(4000), -- as with the string concatenation function above, we can only return 4000 characters in SQL
MEMBER FUNCTION distinct_chars(p_unique IN VARCHAR2, p_new IN VARCHAR2)
RETURN VARCHAR2
DETERMINISTIC,
STATIC FUNCTION odciaggregateinitialize(p_ctx IN OUT distinct_char_agg_type)
RETURN NUMBER,
MEMBER FUNCTION odciaggregateiterate(self IN OUT distinct_char_agg_type, p_string IN VARCHAR2)
RETURN NUMBER,
MEMBER FUNCTION odciaggregatemerge(
self IN OUT distinct_char_agg_type,
p_ctx2 IN distinct_char_agg_type
)
RETURN NUMBER,
MEMBER FUNCTION odciaggregateterminate(
self IN distinct_char_agg_type,
p_returnvalue OUT VARCHAR2,
p_flags IN NUMBER
)
RETURN NUMBER
);
/
CREATE OR REPLACE TYPE BODY distinct_char_agg_type
IS
-- Given a string of unique characters and a new string, check if the new string
-- introduces any new characters that need to be added to the unique set.
-- This function assumes the "unique" string has already been validated to be unique
-- The usage of this function in the rest of the object's functions ensures this is true.
MEMBER FUNCTION distinct_chars(p_unique IN VARCHAR2, p_new IN VARCHAR2)
RETURN VARCHAR2
DETERMINISTIC
IS
v_temp VARCHAR2(32767) := NULL;
v_index INTEGER;
v_char CHAR(1);
BEGIN
v_temp := p_unique;
-- Look at each character of the new string
-- if it's unique then add it via simple sorted insertion
FOR i IN 1 .. NVL(LENGTH(p_new), 0)
LOOP
v_char := SUBSTR(p_new, i, 1);
-- Is the new character already in our unique string?
IF NVL(INSTR(v_temp, v_char), 0) = 0
THEN
v_index := 1;
-- The character is new, iterate through the unique string
-- to determine where to put the new character, this creates a sorted output.
WHILE v_index <= NVL(LENGTH(v_temp), 0)
LOOP
IF v_char < SUBSTR(v_temp, v_index, 1)
THEN
v_temp :=
SUBSTR(v_temp, 1, v_index - 1) || v_char || SUBSTR(v_temp, v_index);
EXIT;
END IF;
v_index := v_index + 1;
END LOOP;
IF v_index > NVL(LENGTH(v_temp), 0)
THEN
v_temp := v_temp || v_char;
END IF;
END IF;
END LOOP;
RETURN v_temp;
END distinct_chars;
STATIC FUNCTION odciaggregateinitialize(p_ctx IN OUT distinct_char_agg_type)
RETURN NUMBER
IS
BEGIN
p_ctx := distinct_char_agg_type(NULL); -- Initialize our unique set to be NULL
RETURN odciconst.success;
END odciaggregateinitialize;
MEMBER FUNCTION odciaggregateiterate(self IN OUT distinct_char_agg_type, p_string IN VARCHAR2)
RETURN NUMBER
IS
BEGIN
-- Use or new function to add distinct characters of each new string
self.v_result := distinct_chars(self.v_result, p_string);
RETURN odciconst.success;
END odciaggregateiterate;
MEMBER FUNCTION odciaggregatemerge(
self IN OUT distinct_char_agg_type,
p_ctx2 IN distinct_char_agg_type
)
RETURN NUMBER
IS
BEGIN
-- As with iterate, simply add new characters from the other set.
self.v_result := distinct_chars(self.v_result, p_ctx2.v_result);
RETURN odciconst.success;
END odciaggregatemerge;
MEMBER FUNCTION odciaggregateterminate(
self IN distinct_char_agg_type,
p_returnvalue OUT VARCHAR2,
p_flags IN NUMBER
)
RETURN NUMBER
IS
BEGIN
p_returnvalue := self.v_result; -- Iterate and Merge do all the work, return the result
RETURN odciconst.success;
END odciaggregateterminate;
END;
/
1:
2:
3:
4:
5:
6:
7:
8:
9:
10:
11:
12:
13:
14:
15:
16:
17:
18:
19:
20:
21:
22:
23:
24:
25:
26:
27:
28:
29:
30:
31:
32:
33:
34:
35:
36:
37:
38:
39:
40:
41:
42:
43:
44:
45:
46:
47:
48:
49:
50:
51:
52:
53:
54:
55:
56:
57:
58:
59:
60:
61:
62:
63:
64:
65:
66:
67:
68:
69:
70:
71:
72:
73:
74:
75:
76:
77:
78:
79:
80:
81:
82:
83:
84:
85:
86:
87:
88:
89:
90:
91:
92:
93:
94:
95:
96:
97:
98:
99:
100:
101:
102:
103:
104:
105:
106:
107:
108:
109:
110:
111:
112:
113:
114:
115:
116:
117:
118:
119:
120:
121:
122:
123:
124:
125:
126:
127:
Select allOpen in new window
Create a function to use our object
CREATE OR REPLACE FUNCTION distinctagg(p_input VARCHAR2)
RETURN VARCHAR2
PARALLEL_ENABLE
AGGREGATE USING distinct_char_agg_type;
1:
2:
3:
4:
Select allOpen in new window
Example usage:
Aggregating Dates to find common days
The next aggregate is another example of event sampling. Given a set of dates, what day of the week is the most common? This function could be useful for determining sales promotions within a week based on customer activity, or scheduling audits based on system outages. The other reason I've included this aggregate is to show one of the limitations of the Oracle object model and a possible workaround. Conceptually, this function is just a glorified COUNT aggregate, except, instead of incrementing a single counter, it increments 7 of them, one counter for each day of the week. The simplest method to implement this would be an array indexed by the day of the week, either a numeric index 0-6 or a string index based on the names of the days of the week. However, it's not legal to declare our own collection types within an object. Nor can we declare a variable based on the collection types defined within the built-in package DBMS_SQL. Instead, we have to declare one variable for each counter. For days of the week, this isn't too bad. We only need 7 of them. If we were counting hundreds of items we might want to build a package to hold and manipulate the array. Also of note is tie-breaking. An aggregate can only produce one value for a set. But, if you have more than one day of the week with the same count, which do you return? There is no universal right answer, let your business requirements determine tie-breaker rules. For my function, the weeks start on Sunday and in the event of a tie, the earliest day of the week will be returned.
-----------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------
---- Most common day of the week aggregate for dates
-----------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------
CREATE OR REPLACE TYPE common_dow_agg_type
AS OBJECT
(
v_sun_count INTEGER,
v_mon_count INTEGER,
v_tue_count INTEGER,
v_wed_count INTEGER,
v_thu_count INTEGER,
v_fri_count INTEGER,
v_sat_count INTEGER,
STATIC FUNCTION odciaggregateinitialize(ctx IN OUT common_dow_agg_type)
RETURN NUMBER,
MEMBER FUNCTION odciaggregateiterate(self IN OUT common_dow_agg_type, p_date IN DATE)
RETURN NUMBER,
MEMBER FUNCTION odciaggregatemerge(
self IN OUT common_dow_agg_type,
ctx2 IN common_dow_agg_type
)
RETURN NUMBER,
MEMBER FUNCTION odciaggregateterminate(
self IN common_dow_agg_type,
returnvalue OUT VARCHAR2,
flags IN NUMBER
)
RETURN NUMBER
);
/
CREATE OR REPLACE TYPE BODY common_dow_agg_type
IS
STATIC FUNCTION odciaggregateinitialize(ctx IN OUT common_dow_agg_type)
RETURN NUMBER
IS
BEGIN
-- initialize all of the counters to zero
ctx :=
common_dow_agg_type(
0,
0,
0,
0,
0,
0,
0
);
RETURN odciconst.success;
END odciaggregateinitialize;
MEMBER FUNCTION odciaggregateiterate(self IN OUT common_dow_agg_type, p_date IN DATE)
RETURN NUMBER
IS
BEGIN
-- increment the appropriate counter for each day of the week
CASE TO_CHAR(p_date, 'Dy')
WHEN 'Sun'
THEN
self.v_sun_count := self.v_sun_count + 1;
WHEN 'Mon'
THEN
self.v_mon_count := self.v_mon_count + 1;
WHEN 'Tue'
THEN
self.v_tue_count := self.v_tue_count + 1;
WHEN 'Wed'
THEN
self.v_wed_count := self.v_wed_count + 1;
WHEN 'Thu'
THEN
self.v_thu_count := self.v_thu_count + 1;
WHEN 'Fri'
THEN
self.v_fri_count := self.v_fri_count + 1;
WHEN 'Sat'
THEN
self.v_sat_count := self.v_sat_count + 1;
END CASE;
RETURN odciconst.success;
END odciaggregateiterate;
MEMBER FUNCTION odciaggregatemerge(
self IN OUT common_dow_agg_type,
ctx2 IN common_dow_agg_type
)
RETURN NUMBER
IS
BEGIN
-- sum the counts from both sets
self.v_sun_count := self.v_sun_count + ctx2.v_sun_count;
self.v_mon_count := self.v_mon_count + ctx2.v_mon_count;
self.v_tue_count := self.v_tue_count + ctx2.v_tue_count;
self.v_wed_count := self.v_wed_count + ctx2.v_wed_count;
self.v_thu_count := self.v_thu_count + ctx2.v_thu_count;
self.v_fri_count := self.v_fri_count + ctx2.v_fri_count;
self.v_sat_count := self.v_sat_count + ctx2.v_sat_count;
RETURN odciconst.success;
END odciaggregatemerge;
MEMBER FUNCTION odciaggregateterminate(
self IN common_dow_agg_type,
returnvalue OUT VARCHAR2,
flags IN NUMBER
)
RETURN NUMBER
IS
v_max INTEGER
:= GREATEST(
self.v_sun_count,
self.v_mon_count,
self.v_tue_count,
self.v_wed_count,
self.v_thu_count,
self.v_fri_count,
self.v_sat_count
);
BEGIN
-- Return whichever day has the highest count
-- If there is a tie, then earliest in the week will be returned
returnvalue :=
CASE
WHEN self.v_sun_count = v_max THEN 'Sunday'
WHEN self.v_mon_count = v_max THEN 'Monday'
WHEN self.v_tue_count = v_max THEN 'Tuesday'
WHEN self.v_wed_count = v_max THEN 'Wednesday'
WHEN self.v_thu_count = v_max THEN 'Thursday'
WHEN self.v_fri_count = v_max THEN 'Friday'
ELSE 'Saturday'
END;
RETURN odciconst.success;
END odciaggregateterminate;
END;
/
1:
2:
3:
4:
5:
6:
7:
8:
9:
10:
11:
12:
13:
14:
15:
16:
17:
18:
19:
20:
21:
22:
23:
24:
25:
26:
27:
28:
29:
30:
31:
32:
33:
34:
35:
36:
37:
38:
39:
40:
41:
42:
43:
44:
45:
46:
47:
48:
49:
50:
51:
52:
53:
54:
55:
56:
57:
58:
59:
60:
61:
62:
63:
64:
65:
66:
67:
68:
69:
70:
71:
72:
73:
74:
75:
76:
77:
78:
79:
80:
81:
82:
83:
84:
85:
86:
87:
88:
89:
90:
91:
92:
93:
94:
95:
96:
97:
98:
99:
100:
101:
102:
103:
104:
105:
106:
107:
108:
109:
110:
111:
112:
113:
114:
115:
116:
117:
118:
119:
120:
121:
122:
123:
124:
125:
126:
127:
128:
129:
130:
131:
132:
133:
134:
135:
136:
137:
138:
139:
140:
141:
142:
Select allOpen in new window
Create a function to use the object
CREATE OR REPLACE FUNCTION day_of_weekagg(p_date DATE)
RETURN VARCHAR2
PARALLEL_ENABLE
AGGREGATE USING common_dow_agg_type;
1:
2:
3:
4:
Select allOpen in new window
Example of usage:
Selecting a random value from a set with an aggregate
The next two aggregates are also real requests I've received, either in EE or in my real work. Each of them returns a random value from within a set. The first is a simple version but statistically skewed. Basically it just flips a coin using the DBMS_RANDOM.RANDOM function to determine whether to keep a value or use the next one in the iteration. This skews the results in that the first value iterated will become less likely to be used and the last value will become most likely. The second is more complicated but ensures the random selection is more uniform across the set. That is, every element in the set has equal chance of being selected.
Note, by default DBMS_RANDOM does not have public EXECUTE privilege by default. If you want to use these aggregates, you'll either need to get privilege to execute it or use another random number generator.
-----------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------
---- Return random value from a set (simple version)
-----------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------
CREATE OR REPLACE TYPE random_agg_type
AS OBJECT
(
v_string VARCHAR2(4000),
STATIC FUNCTION odciaggregateinitialize(ctx IN OUT random_agg_type)
RETURN NUMBER,
MEMBER FUNCTION odciaggregateiterate(self IN OUT random_agg_type, p_string IN VARCHAR2)
RETURN NUMBER,
MEMBER FUNCTION odciaggregatemerge(self IN OUT random_agg_type, ctx2 IN random_agg_type)
RETURN NUMBER,
MEMBER FUNCTION odciaggregateterminate(
self IN random_agg_type,
returnvalue OUT VARCHAR2,
flags IN NUMBER
)
RETURN NUMBER
);
CREATE OR REPLACE TYPE BODY random_agg_type
IS
STATIC FUNCTION odciaggregateinitialize(ctx IN OUT random_agg_type)
RETURN NUMBER
IS
BEGIN
ctx := random_agg_type(NULL); -- initialize our random selection to NULL
RETURN odciconst.success;
END odciaggregateinitialize;
MEMBER FUNCTION odciaggregateiterate(self IN OUT random_agg_type, p_string IN VARCHAR2)
RETURN NUMBER
IS
BEGIN
-- If we haven't set a value yet, then use the first value passed in
-- if we have a value "flip a coin" to decide if we use the new value
-- or keep the old one.
IF v_string IS NULL
THEN
v_string := p_string;
ELSIF DBMS_RANDOM.random < 0
THEN
v_string := p_string;
END IF;
RETURN odciconst.success;
END odciaggregateiterate;
MEMBER FUNCTION odciaggregatemerge(self IN OUT random_agg_type, ctx2 IN random_agg_type)
RETURN NUMBER
IS
BEGIN
-- "flip a coin" to decide which value to keep
IF DBMS_RANDOM.random < 0
THEN
v_string := ctx2.v_string;
END IF;
RETURN odciconst.success;
END odciaggregatemerge;
MEMBER FUNCTION odciaggregateterminate(
self IN random_agg_type,
returnvalue OUT VARCHAR2,
flags IN NUMBER
)
RETURN NUMBER
IS
BEGIN
returnvalue := v_string;
RETURN odciconst.success;
END odciaggregateterminate;
END;
/
1:
2:
3:
4:
5:
6:
7:
8:
9:
10:
11:
12:
13:
14:
15:
16:
17:
18:
19:
20:
21:
22:
23:
24:
25:
26:
27:
28:
29:
30:
31:
32:
33:
34:
35:
36:
37:
38:
39:
40:
41:
42:
43:
44:
45:
46:
47:
48:
49:
50:
51:
52:
53:
54:
55:
56:
57:
58:
59:
60:
61:
62:
63:
64:
65:
66:
67:
68:
69:
70:
71:
72:
73:
74:
75:
76:
77:
78:
79:
80:
81:
82:
Select allOpen in new window
Create a function to use the object
CREATE OR REPLACE FUNCTION random_value_agg(p_string VARCHAR2)
RETURN VARCHAR2
PARALLEL_ENABLE
AGGREGATE USING random_agg_type;
1:
2:
3:
4:
Select allOpen in new window
Example usage:
In the next one we still want to select a random value from our set but we want each value to be equally weighted. The simplest method would be to just put the values in a list and select randomly 1-N and take whichever value it is. Unfortunately, we won't know how many values there are until we are done and we'd have to store all of the values as we go. Instead, we'll adjust the odds of each subsequent value in the iteration down by the total number of elements seen. So, the first value will have a 100% chance of being used. The second value will have a 1/2 chance of winning over the currently selected value. The third value will have a 1/3 chance of being selected. The fourth value will have a 1/4 chance and so on.
-----------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------
---- Return random value from a set (less skew version)
-----------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------
CREATE OR REPLACE TYPE random_agg_type
AS OBJECT
(
v_string VARCHAR2(4000), -- our current random selection
v_count NUMBER, -- how many values have we iterated through
MEMBER FUNCTION random_0n(p_n IN INTEGER)
RETURN INTEGER,
STATIC FUNCTION odciaggregateinitialize(ctx IN OUT random_agg_type)
RETURN NUMBER,
MEMBER FUNCTION odciaggregateiterate(self IN OUT random_agg_type, p_string IN VARCHAR2)
RETURN NUMBER,
MEMBER FUNCTION odciaggregatemerge(self IN OUT random_agg_type, ctx2 IN random_agg_type)
RETURN NUMBER,
MEMBER FUNCTION odciaggregateterminate(
self IN random_agg_type,
returnvalue OUT VARCHAR2,
flags IN NUMBER
)
RETURN NUMBER
);
CREATE OR REPLACE TYPE BODY random_agg_type
IS
-- We only want to select random values 0-N (actually N-1), but no such function exists in DBMS_RANDOM package
-- so we'll create one and adjust for modulo bias to ensure even distribution.
MEMBER FUNCTION random_0n(p_n IN INTEGER)
RETURN INTEGER
IS
v_biasmax INTEGER;
v_result INTEGER;
BEGIN
IF p_n <= 0 OR p_n > 2 ** 32 - 1
THEN
RAISE VALUE_ERROR;
END IF;
-- To prevent modulo bias limit the random range
-- to the largest number evenly divisible by N
v_biasmax := (2 ** 32 - 1) - MOD(2 ** 32 - 1, p_n);
-- If random number is greater than the bias limit
-- then pick a new number.
-- This "could" theoretically run forever but in practice it'll terminate
-- within the first or "maybe" second cycle.
-- DBMS_RANDOM.RANDOM is obsolete, but included here simply
-- for purposes of example and is compatible with all versions from 9i forward.
-- Due to range limitation of dbms_random.random the result will only
-- be from the first 4,294,967,296 values within the aggregation group.
-- all others will be ignored.
LOOP
v_result := DBMS_RANDOM.random + (2 ** 31);
EXIT WHEN v_result <= v_biasmax;
END LOOP;
RETURN MOD(v_result, p_n);
END random_0n;
STATIC FUNCTION odciaggregateinitialize(ctx IN OUT random_agg_type)
RETURN NUMBER
IS
BEGIN
ctx := random_agg_type(NULL, 0);
RETURN odciconst.success;
END odciaggregateinitialize;
MEMBER FUNCTION odciaggregateiterate(self IN OUT random_agg_type, p_string IN VARCHAR2)
RETURN NUMBER
IS
BEGIN
-- Adjust the count for the new element
v_count := v_count + 1;
-- Generate a random number 0 to N-1, if it's 0,
-- then the new value becomes the current selection
IF random_0n(v_count) = 0
THEN
v_string := p_string;
END IF;
RETURN odciconst.success;
END odciaggregateiterate;
MEMBER FUNCTION odciaggregatemerge(self IN OUT random_agg_type, ctx2 IN random_agg_type)
RETURN NUMBER
IS
BEGIN
-- Weight the selection of random value from the contexts by the relative sizes
-- For example:
-- self.v_count = 10
-- p_ctx2.v_count = 7
-- If random number from 0-16 < 7 then select the p_ctx2 value
-- otherwise keep the value currently in use
-- Note, this does have a 2^32-1 limit before there is an error
-- due to inability to prevent modulo bias
IF random_0n(v_count + ctx2.v_count) < ctx2.v_count
THEN
v_string := ctx2.v_string;
END IF;
v_count := v_count + ctx2.v_count;
RETURN odciconst.success;
END odciaggregatemerge;
MEMBER FUNCTION odciaggregateterminate(
self IN random_agg_type,
returnvalue OUT VARCHAR2,
flags IN NUMBER
)
RETURN NUMBER
IS
BEGIN
returnvalue := v_string;
RETURN odciconst.success;
END odciaggregateterminate;
END;
/
1:
2:
3:
4:
5:
6:
7:
8:
9:
10:
11:
12:
13:
14:
15:
16:
17:
18:
19:
20:
21:
22:
23:
24:
25:
26:
27:
28:
29:
30:
31:
32:
33:
34:
35:
36:
37:
38:
39:
40:
41:
42:
43:
44:
45:
46:
47:
48:
49:
50:
51:
52:
53:
54:
55:
56:
57:
58:
59:
60:
61:
62:
63:
64:
65:
66:
67:
68:
69:
70:
71:
72:
73:
74:
75:
76:
77:
78:
79:
80:
81:
82:
83:
84:
85:
86:
87:
88:
89:
90:
91:
92:
93:
94:
95:
96:
97:
98:
99:
100:
101:
102:
103:
104:
105:
106:
107:
108:
109:
110:
111:
112:
113:
114:
115:
116:
117:
118:
119:
120:
121:
122:
123:
124:
125:
126:
127:
Select allOpen in new window
Create a function to use the object
CREATE OR REPLACE FUNCTION random_value_agg(p_string VARCHAR2)
RETURN VARCHAR2
PARALLEL_ENABLE
AGGREGATE USING random_agg_type;
1:
2:
3:
4:
Select allOpen in new window
Example usage:
Aggregating a correlated value
The last function I'll show can be used to solve another problem I see frequently on EE. That is, given a set of values, when does the maximum occur? So, it's sort of like MAX, except we don't really care what the max value is, instead, we want to know "when" that value happened. If you do want to know the actual value, you can simply query MAX along with this aggregate.
User-defined aggregates can only accept a single parameter. We'll get around this limitation by creating our own object type that has both value and date fields and declare our parameter to be of that type.
Now that we have our type to hold value/time pairs, we can build our aggregate. If the maximum occurs more than one time, we'll return the date of the last occurrence.
-----------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------
---- Correlated Time for Maximum Number Aggregate
-----------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------
CREATE OR REPLACE TYPE cortime_max_num_agg_type
AS OBJECT
(
v_temp NUMBER, -- What is the max?
v_time DATE, -- When did it happen?
STATIC FUNCTION odciaggregateinitialize(ctx IN OUT cortime_max_num_agg_type)
RETURN NUMBER,
MEMBER FUNCTION odciaggregateiterate(self IN OUT cortime_max_num_agg_type, p_pair IN number_date_type)
RETURN NUMBER,
MEMBER FUNCTION odciaggregatemerge(self IN OUT cortime_max_num_agg_type, ctx2 IN cortime_max_num_agg_type)
RETURN NUMBER,
MEMBER FUNCTION odciaggregateterminate(self IN cortime_max_num_agg_type,
returnvalue OUT DATE,
flags IN NUMBER
)
RETURN NUMBER
);
/
CREATE OR REPLACE TYPE BODY cortime_max_num_agg_type
IS
STATIC FUNCTION odciaggregateinitialize(ctx IN OUT cortime_max_num_agg_type)
RETURN NUMBER
IS
BEGIN
ctx := cortime_max_num_agg_type(NULL, NULL); -- initialize max and time to NULL
RETURN odciconst.success;
END odciaggregateinitialize;
MEMBER FUNCTION odciaggregateiterate(
self IN OUT cortime_max_num_agg_type,
p_pair IN number_date_type
)
RETURN NUMBER
IS
BEGIN
-- The first pass through the value and correlated time are automatically
-- the max. After that, check if the new value is bigger than the old
-- if so, use the new pair. If the old and new are the same value, then
-- keep the time that is latest.
IF v_temp IS NULL
THEN
v_temp := p_pair.VALUE;
v_time := p_pair.correlated_time;
ELSIF v_temp < p_pair.VALUE
THEN
v_temp := p_pair.VALUE;
v_time := p_pair.correlated_time;
ELSIF v_temp = p_pair.VALUE
THEN
v_time := GREATEST(v_time, p_pair.correlated_time);
END IF;
RETURN odciconst.success;
END odciaggregateiterate;
MEMBER FUNCTION odciaggregatemerge(
self IN OUT cortime_max_num_agg_type,
ctx2 IN cortime_max_num_agg_type
)
RETURN NUMBER
IS
BEGIN
-- If the current value is null, use the other one
-- If the new set is greater than our current one, use the new one
-- If the new and current are the same, then take the latest time
IF v_temp IS NULL
THEN
v_temp := ctx2.v_temp;
v_time := ctx2.v_time;
ELSIF ctx2.v_temp > v_temp
THEN
v_temp := ctx2.v_temp;
v_time := ctx2.v_time;
ELSIF ctx2.v_temp = v_temp
THEN
v_time := GREATEST(v_time, ctx2.v_time);
END IF;
RETURN odciconst.success;
END odciaggregatemerge;
MEMBER FUNCTION odciaggregateterminate(
self IN cortime_max_num_agg_type,
returnvalue OUT DATE,
flags IN NUMBER
)
RETURN NUMBER
IS
BEGIN
returnvalue := self.v_time;
RETURN odciconst.success;
END odciaggregateterminate;
END;
/
1:
2:
3:
4:
5:
6:
7:
8:
9:
10:
11:
12:
13:
14:
15:
16:
17:
18:
19:
20:
21:
22:
23:
24:
25:
26:
27:
28:
29:
30:
31:
32:
33:
34:
35:
36:
37:
38:
39:
40:
41:
42:
43:
44:
45:
46:
47:
48:
49:
50:
51:
52:
53:
54:
55:
56:
57:
58:
59:
60:
61:
62:
63:
64:
65:
66:
67:
68:
69:
70:
71:
72:
73:
74:
75:
76:
77:
78:
79:
80:
81:
82:
83:
84:
85:
86:
87:
88:
89:
90:
91:
92:
93:
94:
95:
96:
97:
98:
99:
100:
101:
102:
103:
104:
105:
Select allOpen in new window
Create a function to use our object
CREATE OR REPLACE FUNCTION correlated_timeagg(p_pair number_date_type)
RETURN DATE
PARALLEL_ENABLE
AGGREGATE USING cortime_max_num_agg_type;
1:
2:
3:
4:
Select allOpen in new window
Example usage (find the biggest row count and when that count was established)
You may note, sometimes I reference the current values of the object by referencing them directly (v_result) and sometimes I dereference them from "self" (self.v_result). This is intentional to show both are legal and you should use whichever method suits you and your corporate coding standards.
You may also note the parameter naming may not appear consistent. This was a style choice of mine. When I look at the odciaggregateiterate API, I see it as one part Oracle's (self) and one part mine (p_string, p_date, p_pair, etc.) The API forces me to declare the member functions a certain way. The parts that I don't really care about I left unprefixed and gave them the names found in the Oracle Data Cartridge Developer's Guide. The part that is "mine" I prefixed with "p_" as my own coding style for parameters. This is largely cosmetic though, just my personal preference.
For more information about the Data Cartridge and Object facilities in Oracle:
Data Cartridge Developer's Guide: http://docs.oracle.com/cd/E11882_01/appdev.112/e10765/toc.htm
Object-Relational Developer's Guide: http://docs.oracle.com/cd/E11882_01/appdev.112/e11822/toc.htm
Thank you for reading and I hope you found it helpful.
Questions and comments welcome.
by: wasimibm on 2012-02-03 at 04:53:22ID: 41726
excellent..!!!!
i never knew this before...!!!
out.txt which has a log of all activities, similar to this, if we had a consolidated code(code.sql file), it would have been better..
Thank you sdstuber..!!!