Go Premium for a chance to win a PS4. Enter to Win

x
  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 903
  • Last Modified:

Collect function returns rows in wrong order for string aggregation

Hi,
 I'm using the custom type and function in conjunction with the collect function to do string aggregation as seen in many places (heres an example

http://www.oracle-base.com/articles/10g/StringAggregationTechniques.php#collect

)

this works fine, in that it concatenates the strings together, but the rows are in the wrong order. see the example below. can someone tell me what the problem is, or a better way to do string aggregration in 10g (and not using straggr which is old and slow)
select  table_to_string(cast(collect(column1) as t_varchar2_tab)) from 
    (
      select cast('a' as varchar2(10)) as column1, 'chris' as groupby from dual
      union all
      select cast('b' as varchar2(10)) as column1, 'chris' as groupby from dual
      union all
      select cast('c' as varchar2(10)) as column1, 'chris' as groupby from dual
      union all
      select cast('d' as varchar2(10)) as column1, 'chris' as groupby from dual
      union all
      select cast('e' as varchar2(10)) as column1, 'chris' as groupby from dual
       union all
      select cast('f' as varchar2(10)) as column1, 'chris' as groupby from dual
       union all
      select cast('g' as varchar2(10)) as column1, 'chris' as groupby from dual
       union all
      select cast('h' as varchar2(10)) as column1, 'chris' as groupby from dual
    )
  group by groupby

Open in new window

0
chrismarx
Asked:
chrismarx
  • 13
  • 4
  • 2
1 Solution
 
sdstuberCommented:
stragg old and slow?  hmm, stragg is generally faster and consumes less resources than other means I've seen, including the above procedure (about 1/4 the time on my db), hierarchical queries and xml aggregation.

However,  if you're dead set on not using it,  I like xml aggregation, it requires no additional objects and is relatively straightforward


  SELECT RTRIM(
             EXTRACT(
                 XMLAGG(XMLELEMENT("s", column1 || ',') ORDER BY column1),
                 '/s/text()'
             ).getstringval(),
             ','
         )
    FROM (  SELECT CAST('a' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL
          UNION ALL
            SELECT CAST('b' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL
          UNION ALL
            SELECT CAST('c' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL
          UNION ALL
            SELECT CAST('d' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL
          UNION ALL
            SELECT CAST('e' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL
          UNION ALL
            SELECT CAST('f' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL
          UNION ALL
            SELECT CAST('g' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL
          UNION ALL
            SELECT CAST('h' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL)
GROUP BY groupby

Open in new window

0
 
sdstuberCommented:
if you do want to give stragg another chance...
SELECT DISTINCT
       stragg(column1)
           OVER (ORDER BY column1 ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
  FROM (SELECT CAST('a' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL
        UNION ALL
        SELECT CAST('b' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL
        UNION ALL
        SELECT CAST('c' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL
        UNION ALL
        SELECT CAST('d' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL
        UNION ALL
        SELECT CAST('e' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL
        UNION ALL
        SELECT CAST('f' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL
        UNION ALL
        SELECT CAST('g' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL
        UNION ALL
        SELECT CAST('h' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL)

Open in new window

0
 
sdstuberCommented:
and here's the hierarchical version
but it's the worst performer and heaviest resource consumer of all of them...
    SELECT MAX(SUBSTR(SYS_CONNECT_BY_PATH(column1, ','), 2)) x
      FROM (SELECT column1,
                   groupby,
                   ROW_NUMBER() OVER (PARTITION BY groupby ORDER BY column1) curr,
                   ROW_NUMBER() OVER (PARTITION BY groupby ORDER BY column1) - 1 prev
              FROM (SELECT CAST('a' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL
                    UNION ALL
                    SELECT CAST('b' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL
                    UNION ALL
                    SELECT CAST('c' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL
                    UNION ALL
                    SELECT CAST('d' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL
                    UNION ALL
                    SELECT CAST('e' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL
                    UNION ALL
                    SELECT CAST('f' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL
                    UNION ALL
                    SELECT CAST('g' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL
                    UNION ALL
                    SELECT CAST('h' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL))
START WITH curr = 1
CONNECT BY PRIOR curr = prev AND groupby = PRIOR groupby
  GROUP BY groupby

Open in new window

0
Industry Leaders: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 
chrismarxAuthor Commented:
your examples are good, but they assume that i want the rows in some alphabetical order, that is not the case, i was only using that to illustrate my problem. i want the order of the string to reflect the order of the rows in the subquery.

both of your examples use an order by to achieve the correct results, but thats not going to work. for instance, see the code below, it should produce the following string "a,c,b,e,d,g,f,h"  -- exactly as they are listed in the subquery

  SELECT 
       straggr(column1)
  FROM (SELECT CAST('a' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL
        UNION ALL
        SELECT CAST('c' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL
        UNION ALL
        SELECT CAST('b' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL
        UNION ALL
        SELECT CAST('e' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL
        UNION ALL
        SELECT CAST('d' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL
        UNION ALL
        SELECT CAST('g' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL
        UNION ALL
        SELECT CAST('f' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL
        UNION ALL
        SELECT CAST('h' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL)
group by groupby

Open in new window

0
 
sdstuberCommented:
I'll adjust some of my claims above...

I just retested with some bigger samples, and I got comparable speeds between xml and stragg and both were only marginally faster than the collect method, but they did return sorted output so marginally faster and accurate is better than marginally slower and not accurate.

The hierarchical method was and remains awful.
Doing about 15 times the work (measured in gets) with a time about  9 time slower.
It also did about 50 times the sorting
0
 
sdstuberCommented:
ah, if you are artificially creating your own ordering, then you will need something from your subquery to indicate what your ordering should be

easiest and guaranteed way is to include a number on your subquery
I'll use the xmlaggregation as an example, since we might not have equivalent implementations of stragg (if nothing else, we at least spell it differently)

with a union all type subquery you can simply wrap it in an inline view and add rownum
  SELECT RTRIM(
             EXTRACT(XMLAGG(XMLELEMENT("s", column1 || ',') ORDER BY rn), '/s/text()').getstringval(),
             ','
         )
    FROM (SELECT ROWNUM rn, column1, groupby
            FROM (SELECT CAST('a' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL
                  UNION ALL
                  SELECT CAST('c' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL
                  UNION ALL
                  SELECT CAST('b' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL
                  UNION ALL
                  SELECT CAST('e' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL
                  UNION ALL
                  SELECT CAST('d' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL
                  UNION ALL
                  SELECT CAST('g' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL
                  UNION ALL
                  SELECT CAST('f' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL
                  UNION ALL
                  SELECT CAST('h' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL))
GROUP BY groupby

Open in new window

0
 
sdstuberCommented:
if your inner query might include some sorting, either implicitly or explicitly then you'll need to manually include your artificial sorting criteria


This is probably better practice anyway since it makes the query self-documenting as to your intended sorting criteria
  SELECT RTRIM(
             EXTRACT(XMLAGG(XMLELEMENT("s", column1 || ',') ORDER BY rn), '/s/text()').getstringval(),
             ','
         )
    FROM (SELECT 1 rn, CAST('a' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL
                  UNION ALL
                  SELECT 2,CAST('c' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL
                  UNION ALL
                  SELECT 3,CAST('b' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL
                  UNION ALL
                  SELECT 4,CAST('e' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL
                  UNION ALL
                  SELECT 5,CAST('d' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL
                  UNION ALL
                  SELECT 6,CAST('g' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL
                  UNION ALL
                  SELECT 7,CAST('f' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL
                  UNION ALL
                  SELECT 8,CAST('h' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL)
GROUP BY groupby

Open in new window

0
 
gatorvipCommented:
Never rely on "group by" to perform the ordering for you. If you need your output in any given order, you need to do so explicitly.
0
 
sdstuberCommented:
all of the ordering IS taking place via an order by

but we're not ordering the rows, of the grouped output
we're ordering the data within one column
0
 
gatorvipCommented:
sdstuber: if you look at the code for tab_to_string (see OP's link) you will see that there's no explicit ordering there, either
0
 
sdstuberCommented:
yes, but that function is what this question is all about.

How to get sorting because using that function doesn't do it.
0
 
sdstuberCommented:
where sorting is being done, it's being done by explicit order by.

or maybe I'm not understanding your objection
0
 
sdstuberCommented:
for more options,  sqlsnippets.com has a great comparison between the methods above and others,
including examples that show stragg being significantly heavier than the collect method.

In my own testing I haven't duplicated their results quite as convincingly and I still support version 9 db's  so stragg is portable from 9 to 10 to 11.
also, those examples don't include sorting.  They can, but they didn't.  And you'd have to write a new version of the tab_to_string function to accomplish it.

It is an interesting article though and well presented
0
 
chrismarxAuthor Commented:
hi,
 ok, yes the defined ordering by rownum works well, i like the xml example, it is fast, and it lets me continue using the group by function. this solves my problem,  but doesnt answer the original quesiton, and if you have another moment, i would like to know 2 things

1. how would use the stragg function in conjunction with a group by operation, when using the over(order)

2. how might one do the rownum ordering and still using the original collect function i first posted?

thanks!
0
 
sdstuberCommented:
0 - the original question isn't answerable "as is",  your function doesn't support ordering,  so you MUST do something else if you want your data ordered

1 - you don't, you use partition by clause

       stragg(column1) OVER (PARTITION BY groupby ORDER BY column1 ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)

2 - your original collection function won't work.  that function doesn't support ordering
0
 
chrismarxAuthor Commented:
ah yes, i tried partition, but it returns 9 rows rather than 1, im assuming since its not a true aggregate function. anyways, the xml answer works, and thanks for explaining about the original function!
0
 
sdstuberCommented:
you need to do distinct,  check out my full example.

using OVER clause means you're not using stragg as an aggregate, but as an analytic.
 It "should" behave differently.  Just as if you were using MAX/MIN or other more traditional aggregates.


another option if you did want to use the collection function is to rewrite it so it will support ordering.
Typically that's done when the aggregated columns have their own sorting criteria (numeric, alphabetic, chronological)
but if you are sorting by something other than the data you are aggregating then it might not be practical.
0
 
sdstuberCommented:
glad I could help though!
0
 
chrismarxAuthor Commented:
im a little rusty on my custom types in oracle, r u saying that i could do something like below?
--change this
create or replace TYPE t_varchar2_tab AS TABLE OF VARCHAR2(4000);
 
--to something like this
create or replace TYPE t_varchar2_tab AS TABLE OF VARCHAR2(4000) --and a rownum?;
 
--and then change the function
create or replace
FUNCTION table_to_string (p_varchar2_tab  IN  t_varchar2_tab,
                                          p_delimiter     IN  VARCHAR2 DEFAULT ',') RETURN VARCHAR2 IS
  l_string     VARCHAR2(32767);
BEGIN
  --to include a different kind of loop here?
  FOR i IN p_varchar2_tab.FIRST .. p_varchar2_tab.LAST LOOP
    IF i != p_varchar2_tab.FIRST THEN
      l_string := l_string || p_delimiter;
    END IF;  
    l_string := l_string || p_varchar2_tab(i);
  END LOOP;
  RETURN l_string;
 
 
--and then i could call the function, something like this?
 
select table_to_string(cast(collect(column1),collect(rwnum) as t_varchar2_tab)) from ---etc?
from
END table_to_string;

Open in new window

0

Featured Post

Free Tool: Subnet Calculator

The subnet calculator helps you design networks by taking an IP address and network mask and returning information such as network, broadcast address, and host range.

One of a set of tools we're offering as a way of saying thank you for being a part of the community.

  • 13
  • 4
  • 2
Tackle projects and never again get stuck behind a technical roadblock.
Join Now