Solved

Collect function returns rows in wrong order for string aggregation

Posted on 2009-04-03
19
876 Views
Last Modified: 2013-12-18
Hi,
 I'm using the custom type and function in conjunction with the collect function to do string aggregation as seen in many places (heres an example

http://www.oracle-base.com/articles/10g/StringAggregationTechniques.php#collect

)

this works fine, in that it concatenates the strings together, but the rows are in the wrong order. see the example below. can someone tell me what the problem is, or a better way to do string aggregration in 10g (and not using straggr which is old and slow)
select  table_to_string(cast(collect(column1) as t_varchar2_tab)) from 

    (

      select cast('a' as varchar2(10)) as column1, 'chris' as groupby from dual

      union all

      select cast('b' as varchar2(10)) as column1, 'chris' as groupby from dual

      union all

      select cast('c' as varchar2(10)) as column1, 'chris' as groupby from dual

      union all

      select cast('d' as varchar2(10)) as column1, 'chris' as groupby from dual

      union all

      select cast('e' as varchar2(10)) as column1, 'chris' as groupby from dual

       union all

      select cast('f' as varchar2(10)) as column1, 'chris' as groupby from dual

       union all

      select cast('g' as varchar2(10)) as column1, 'chris' as groupby from dual

       union all

      select cast('h' as varchar2(10)) as column1, 'chris' as groupby from dual

    )

  group by groupby

Open in new window

0
Comment
Question by:chrismarx
  • 13
  • 4
  • 2
19 Comments
 
LVL 73

Expert Comment

by:sdstuber
ID: 24062555
stragg old and slow?  hmm, stragg is generally faster and consumes less resources than other means I've seen, including the above procedure (about 1/4 the time on my db), hierarchical queries and xml aggregation.

However,  if you're dead set on not using it,  I like xml aggregation, it requires no additional objects and is relatively straightforward


  SELECT RTRIM(

             EXTRACT(

                 XMLAGG(XMLELEMENT("s", column1 || ',') ORDER BY column1),

                 '/s/text()'

             ).getstringval(),

             ','

         )

    FROM (  SELECT CAST('a' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL

          UNION ALL

            SELECT CAST('b' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL

          UNION ALL

            SELECT CAST('c' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL

          UNION ALL

            SELECT CAST('d' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL

          UNION ALL

            SELECT CAST('e' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL

          UNION ALL

            SELECT CAST('f' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL

          UNION ALL

            SELECT CAST('g' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL

          UNION ALL

            SELECT CAST('h' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL)

GROUP BY groupby

Open in new window

0
 
LVL 73

Expert Comment

by:sdstuber
ID: 24062568
if you do want to give stragg another chance...
SELECT DISTINCT

       stragg(column1)

           OVER (ORDER BY column1 ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)

  FROM (SELECT CAST('a' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL

        UNION ALL

        SELECT CAST('b' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL

        UNION ALL

        SELECT CAST('c' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL

        UNION ALL

        SELECT CAST('d' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL

        UNION ALL

        SELECT CAST('e' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL

        UNION ALL

        SELECT CAST('f' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL

        UNION ALL

        SELECT CAST('g' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL

        UNION ALL

        SELECT CAST('h' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL)

Open in new window

0
 
LVL 73

Expert Comment

by:sdstuber
ID: 24062589
and here's the hierarchical version
but it's the worst performer and heaviest resource consumer of all of them...
    SELECT MAX(SUBSTR(SYS_CONNECT_BY_PATH(column1, ','), 2)) x

      FROM (SELECT column1,

                   groupby,

                   ROW_NUMBER() OVER (PARTITION BY groupby ORDER BY column1) curr,

                   ROW_NUMBER() OVER (PARTITION BY groupby ORDER BY column1) - 1 prev

              FROM (SELECT CAST('a' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL

                    UNION ALL

                    SELECT CAST('b' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL

                    UNION ALL

                    SELECT CAST('c' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL

                    UNION ALL

                    SELECT CAST('d' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL

                    UNION ALL

                    SELECT CAST('e' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL

                    UNION ALL

                    SELECT CAST('f' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL

                    UNION ALL

                    SELECT CAST('g' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL

                    UNION ALL

                    SELECT CAST('h' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL))

START WITH curr = 1

CONNECT BY PRIOR curr = prev AND groupby = PRIOR groupby

  GROUP BY groupby

Open in new window

0
 

Author Comment

by:chrismarx
ID: 24062800
your examples are good, but they assume that i want the rows in some alphabetical order, that is not the case, i was only using that to illustrate my problem. i want the order of the string to reflect the order of the rows in the subquery.

both of your examples use an order by to achieve the correct results, but thats not going to work. for instance, see the code below, it should produce the following string "a,c,b,e,d,g,f,h"  -- exactly as they are listed in the subquery


  SELECT 

       straggr(column1)

  FROM (SELECT CAST('a' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL

        UNION ALL

        SELECT CAST('c' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL

        UNION ALL

        SELECT CAST('b' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL

        UNION ALL

        SELECT CAST('e' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL

        UNION ALL

        SELECT CAST('d' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL

        UNION ALL

        SELECT CAST('g' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL

        UNION ALL

        SELECT CAST('f' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL

        UNION ALL

        SELECT CAST('h' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL)

group by groupby

Open in new window

0
 
LVL 73

Expert Comment

by:sdstuber
ID: 24062824
I'll adjust some of my claims above...

I just retested with some bigger samples, and I got comparable speeds between xml and stragg and both were only marginally faster than the collect method, but they did return sorted output so marginally faster and accurate is better than marginally slower and not accurate.

The hierarchical method was and remains awful.
Doing about 15 times the work (measured in gets) with a time about  9 time slower.
It also did about 50 times the sorting
0
 
LVL 73

Accepted Solution

by:
sdstuber earned 500 total points
ID: 24062888
ah, if you are artificially creating your own ordering, then you will need something from your subquery to indicate what your ordering should be

easiest and guaranteed way is to include a number on your subquery
I'll use the xmlaggregation as an example, since we might not have equivalent implementations of stragg (if nothing else, we at least spell it differently)

with a union all type subquery you can simply wrap it in an inline view and add rownum
  SELECT RTRIM(

             EXTRACT(XMLAGG(XMLELEMENT("s", column1 || ',') ORDER BY rn), '/s/text()').getstringval(),

             ','

         )

    FROM (SELECT ROWNUM rn, column1, groupby

            FROM (SELECT CAST('a' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL

                  UNION ALL

                  SELECT CAST('c' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL

                  UNION ALL

                  SELECT CAST('b' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL

                  UNION ALL

                  SELECT CAST('e' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL

                  UNION ALL

                  SELECT CAST('d' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL

                  UNION ALL

                  SELECT CAST('g' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL

                  UNION ALL

                  SELECT CAST('f' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL

                  UNION ALL

                  SELECT CAST('h' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL))

GROUP BY groupby

Open in new window

0
 
LVL 73

Expert Comment

by:sdstuber
ID: 24062910
if your inner query might include some sorting, either implicitly or explicitly then you'll need to manually include your artificial sorting criteria


This is probably better practice anyway since it makes the query self-documenting as to your intended sorting criteria
  SELECT RTRIM(

             EXTRACT(XMLAGG(XMLELEMENT("s", column1 || ',') ORDER BY rn), '/s/text()').getstringval(),

             ','

         )

    FROM (SELECT 1 rn, CAST('a' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL

                  UNION ALL

                  SELECT 2,CAST('c' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL

                  UNION ALL

                  SELECT 3,CAST('b' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL

                  UNION ALL

                  SELECT 4,CAST('e' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL

                  UNION ALL

                  SELECT 5,CAST('d' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL

                  UNION ALL

                  SELECT 6,CAST('g' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL

                  UNION ALL

                  SELECT 7,CAST('f' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL

                  UNION ALL

                  SELECT 8,CAST('h' AS VARCHAR2(10)) AS column1, 'chris' AS groupby FROM DUAL)

GROUP BY groupby

Open in new window

0
 
LVL 20

Expert Comment

by:gatorvip
ID: 24063440
Never rely on "group by" to perform the ordering for you. If you need your output in any given order, you need to do so explicitly.
0
 
LVL 73

Expert Comment

by:sdstuber
ID: 24063470
all of the ordering IS taking place via an order by

but we're not ordering the rows, of the grouped output
we're ordering the data within one column
0
PRTG Network Monitor: Intuitive Network Monitoring

Network Monitoring is essential to ensure that computer systems and network devices are running. Use PRTG to monitor LANs, servers, websites, applications and devices, bandwidth, virtual environments, remote systems, IoT, and many more. PRTG is easy to set up & use.

 
LVL 20

Expert Comment

by:gatorvip
ID: 24063508
sdstuber: if you look at the code for tab_to_string (see OP's link) you will see that there's no explicit ordering there, either
0
 
LVL 73

Expert Comment

by:sdstuber
ID: 24064492
yes, but that function is what this question is all about.

How to get sorting because using that function doesn't do it.
0
 
LVL 73

Expert Comment

by:sdstuber
ID: 24064496
where sorting is being done, it's being done by explicit order by.

or maybe I'm not understanding your objection
0
 
LVL 73

Expert Comment

by:sdstuber
ID: 24064612
for more options,  sqlsnippets.com has a great comparison between the methods above and others,
including examples that show stragg being significantly heavier than the collect method.

In my own testing I haven't duplicated their results quite as convincingly and I still support version 9 db's  so stragg is portable from 9 to 10 to 11.
also, those examples don't include sorting.  They can, but they didn't.  And you'd have to write a new version of the tab_to_string function to accomplish it.

It is an interesting article though and well presented
0
 

Author Comment

by:chrismarx
ID: 24065246
hi,
 ok, yes the defined ordering by rownum works well, i like the xml example, it is fast, and it lets me continue using the group by function. this solves my problem,  but doesnt answer the original quesiton, and if you have another moment, i would like to know 2 things

1. how would use the stragg function in conjunction with a group by operation, when using the over(order)

2. how might one do the rownum ordering and still using the original collect function i first posted?

thanks!
0
 
LVL 73

Expert Comment

by:sdstuber
ID: 24065283
0 - the original question isn't answerable "as is",  your function doesn't support ordering,  so you MUST do something else if you want your data ordered

1 - you don't, you use partition by clause

       stragg(column1) OVER (PARTITION BY groupby ORDER BY column1 ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)

2 - your original collection function won't work.  that function doesn't support ordering
0
 

Author Comment

by:chrismarx
ID: 24065348
ah yes, i tried partition, but it returns 9 rows rather than 1, im assuming since its not a true aggregate function. anyways, the xml answer works, and thanks for explaining about the original function!
0
 
LVL 73

Expert Comment

by:sdstuber
ID: 24065371
you need to do distinct,  check out my full example.

using OVER clause means you're not using stragg as an aggregate, but as an analytic.
 It "should" behave differently.  Just as if you were using MAX/MIN or other more traditional aggregates.


another option if you did want to use the collection function is to rewrite it so it will support ordering.
Typically that's done when the aggregated columns have their own sorting criteria (numeric, alphabetic, chronological)
but if you are sorting by something other than the data you are aggregating then it might not be practical.
0
 
LVL 73

Expert Comment

by:sdstuber
ID: 24065373
glad I could help though!
0
 

Author Comment

by:chrismarx
ID: 24065434
im a little rusty on my custom types in oracle, r u saying that i could do something like below?
--change this

create or replace TYPE t_varchar2_tab AS TABLE OF VARCHAR2(4000);
 

--to something like this

create or replace TYPE t_varchar2_tab AS TABLE OF VARCHAR2(4000) --and a rownum?;
 

--and then change the function

create or replace

FUNCTION table_to_string (p_varchar2_tab  IN  t_varchar2_tab,

                                          p_delimiter     IN  VARCHAR2 DEFAULT ',') RETURN VARCHAR2 IS

  l_string     VARCHAR2(32767);

BEGIN

  --to include a different kind of loop here?

  FOR i IN p_varchar2_tab.FIRST .. p_varchar2_tab.LAST LOOP

    IF i != p_varchar2_tab.FIRST THEN

      l_string := l_string || p_delimiter;

    END IF;  

    l_string := l_string || p_varchar2_tab(i);

  END LOOP;

  RETURN l_string;
 
 

--and then i could call the function, something like this?
 

select table_to_string(cast(collect(column1),collect(rwnum) as t_varchar2_tab)) from ---etc?

from

END table_to_string;

Open in new window

0

Featured Post

PRTG Network Monitor: Intuitive Network Monitoring

Network Monitoring is essential to ensure that computer systems and network devices are running. Use PRTG to monitor LANs, servers, websites, applications and devices, bandwidth, virtual environments, remote systems, IoT, and many more. PRTG is easy to set up & use.

Join & Write a Comment

Article by: Swadhin
From the Oracle SQL Reference (http://download.oracle.com/docs/cd/B19306_01/server.102/b14200/queries006.htm) we are told that a join is a query that combines rows from two or more tables, views, or materialized views. This article provides a glimps…
Note: this article covers simple compression. Oracle introduced in version 11g release 2 a new feature called Advanced Compression which is not covered here. General principle of Oracle compression Oracle compression is a way of reducing the d…
This video explains at a high level with the mandatory Oracle Memory processes are as well as touching on some of the more common optional ones.
This video shows how to copy an entire tablespace from one database to another database using Transportable Tablespace functionality.

708 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

17 Experts available now in Live!

Get 1:1 Help Now