
Solved

Function to concatenate CLOB

Posted on 2014-08-25
33
Medium Priority
1,349 Views
Last Modified: 2014-09-03
Hello,
I have an Oracle SQL query that joins more than four tables to get some data. One of them is a sub-query that concatenates rows into a single column ("Column 1"); the value returned is a CLOB, built with the XMLAGG function.
The sub-query is currently filtered on VT_TEXT_TYPE. I have other columns (Column 2 through Column 14) that also concatenate VT_TEXT, each with its own VT_TEXT_TYPE filter (logic as attached in the spreadsheet).
Is there a way to put these in one function and pass VT_SER_NUM and VT_TEXT_TYPE as parameters to get the concatenated VT_TEXT string as the result?
Since there are about a million rows in the table, performance is really slow even to bring back one column.
Can someone please assist?
Thank you very much
SELECT AM.AM_SER_NUM,
       VT.VT_TEXT AS COLUMN_1
  FROM AM@dbrd5 AM,
       (SELECT og_ser_num
          FROM OG@dbrd5
         GROUP BY og_ser_num) OG,
       (SELECT iu_ser_num FROM IU@dbrd5) IU,
       (SELECT ri_ser_num FROM RI@dbrd5) RI,
       (SELECT vt_ser_num,
               EXTRACT(XMLAGG(XMLELEMENT("s", vt_text) ORDER BY vt_ent_num), '/s/text()').getclobval() VT_TEXT
          FROM VT@dbrd5 VT
         WHERE REGEXP_LIKE(SUBSTR(vt_text_type, 3, 2), '^[[:digit:]]+$')
           AND vt_text_type LIKE 'AT%'
         GROUP BY vt_ser_num) VT
 WHERE AM_SER_NUM = OG_SER_NUM (+)
   AND AM_SER_NUM = RI_SER_NUM (+)
   AND AM_SER_NUM = IU_SER_NUM (+)
   AND AM_SER_NUM = VT_SER_NUM (+)



Sub-query
SELECT vt_ser_num,
       EXTRACT(XMLAGG(XMLELEMENT("s", vt_text) ORDER BY vt_ent_num), '/s/text()').getclobval() VT_TEXT
  FROM VT@dbrd5 VT
 WHERE REGEXP_LIKE(SUBSTR(vt_text_type, 3, 2), '^[[:digit:]]+$')
   AND vt_text_type LIKE 'AT%'
 GROUP BY vt_ser_num


Logic.xlsx
Question by:angel7170
33 Comments
 
LVL 78

Expert Comment

by:slightwv (Netminder)
ID: 40284277
Just like the last question:
Please post sample data and expected results.

It might be better to create your own function.

Then you can use a simple cursor loop and dbms_lob.writeappend.

It may be faster than the XML trick from the last question.

Docs for dbms_lob.writeappend:
http://docs.oracle.com/cd/E11882_01/appdev.112/e40758/d_lob.htm#ARPLS600

Have you checked where the performance issue is?

If you are trying to pull 1 million rows across the link, it may just be the network.

You might need to create the function in the remote database.
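
A minimal sketch of the suggested cursor-loop approach, assuming the VT table and columns from the question (the function name and temporary-LOB handling are illustrative, not a definitive implementation):

```sql
-- Sketch only: concatenate the VT_TEXT rows for one serial number and
-- text-type prefix using DBMS_LOB.WRITEAPPEND. Adjust the table name
-- (e.g. VT@dbrd5) to wherever the function is created.
CREATE OR REPLACE FUNCTION concat_vt_text(p_ser_num IN NUMBER,
                                          p_type    IN VARCHAR2)
    RETURN CLOB
IS
    l_result CLOB;
BEGIN
    DBMS_LOB.CREATETEMPORARY(l_result, TRUE);
    FOR r IN (SELECT vt_text
                FROM vt
               WHERE vt_ser_num = p_ser_num
                 AND vt_text_type LIKE p_type || '%'
               ORDER BY vt_ent_num)
    LOOP
        IF r.vt_text IS NOT NULL THEN
            DBMS_LOB.WRITEAPPEND(l_result, LENGTH(r.vt_text), r.vt_text);
        END IF;
    END LOOP;
    RETURN l_result;
END;
/
```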
 
LVL 16

Expert Comment

by:Wasim Akram Shaik
ID: 40285156
I agree with Steve; your issue may be attributable to data transmission over the network.

Why not run the SQL on the host database itself?

If that is not possible because the host is a production database, you could replicate the same set of data into your own database, run the same query there, and see whether the results improve.
 

Author Comment

by:angel7170
ID: 40285730
Thank you slightwv and Wasim.

Here are the record counts from each of the remote database tables.

VT = 95,511,263
AM = 10,708,390
OG = 104,980
RI = 150,158
IU = 2,150,696

As you said, it could be the network. I don't have the ability to create a function in the remote database, and I am not sure what the performance would be if I had to create a table in the host database with this many rows.

To answer slightwv's question ("Please post sample data and expected results"):
Attached are the sample data and expected results, in two different sheets.
Sample.xls
 
LVL 78

Expert Comment

by:slightwv (Netminder)
ID: 40285752
Are you looking to dynamically create columns based on the first two characters of VT_TEXT_TYPE?

If so, you can sort of stop now.  You need to know the maximum number of columns when the SQL is parsed.  You really can't dynamically create columns in a result set.

With that much data I don't think you can get around performance unless you bring the data local.

Does the data have to be 'real time'?
If not, I would look at bringing it ALL local, as suggested above, using materialized views:
http://docs.oracle.com/cd/B28359_01/server.111/b28324/tdpii_reppit.htm
 

Author Comment

by:angel7170
ID: 40285773
No, they are not dynamically created columns. They are standard.

If I can create a local table before running the scripts, then I should be good. It doesn't have to be real time...
 
LVL 78

Expert Comment

by:slightwv (Netminder)
ID: 40285855
>>It is not dynamically created columns.  It is standard.

OK,
I'll work on a test case for what you posted.  I'll post it when I get it done.

>>If I can create a local table before running the scripts then I should be good. It doesn't have to be real real time...

Read up on materialized views in the link I posted. They will let the local version refresh during slow hours, giving you local copies of the tables that are only as old as the last refresh.
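
Materialized-view DDL along those lines might look like this (the view name, refresh schedule, and column list are assumptions for illustration):

```sql
-- Illustrative only: a local, nightly-refreshed copy of the remote VT table.
CREATE MATERIALIZED VIEW vt_local
    BUILD IMMEDIATE
    REFRESH COMPLETE
    START WITH TRUNC(SYSDATE) + 1 + 2/24   -- first refresh at 02:00 tomorrow
    NEXT TRUNC(SYSDATE) + 1 + 2/24         -- then nightly at 02:00
AS
SELECT vt_ser_num, vt_ent_num, vt_text_type, vt_text
  FROM vt@dbrd5;
```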
 

Author Comment

by:angel7170
ID: 40285858
Great! Thank you.
 
LVL 78

Assisted Solution

by:slightwv (Netminder)
slightwv (Netminder) earned 1000 total points
ID: 40286473
Below is the test case I came up with.

It has two solutions.  One uses a function and one is all native SQL with a lot of XML tricks.

Given the amount of data you are dealing with, I really don't think either one is going to be quick.

If I had to guess:  The XML solution will probably be the worst.

The downside to the function approach is that you hit the table again for every 'type', as well as once to get the ser_nums.

drop table tab1 purge;
create table tab1(VT_ENT_NUM number, VT_SER_NUM number, VT_TEXT_TYPE varchar2(10), VT_TEXT varchar2(256)) nologging;

insert into tab1 values(1,123,'AT0000','Hello ');
insert into tab1 values(2,123,'AT0000','World');

insert into tab1 values(1,77551538,'AT0000','aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa');
insert into tab1 values(2,77551538,'AT0000','bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb');
insert into tab1 values(1,77551538,'CS0001','cccccccccccccccccccccccccccccccccccccccc');
insert into tab1 values(2,77551538,'CS0002','dddddddddddddddddddddddddddddddddddddddd');
insert into tab1 values(3,77551538,'CS0003','eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee');
insert into tab1 values(4,77551538,'CS0004','ffffffffffffffffffffffffffffffffffffffff');
insert into tab1 values(1,77551538,'DM0000','gggggggggggggggggggggggggggggggggggggggg');
insert into tab1 values(2,77551538,'DM0001','hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh');
insert into tab1 values(3,77551538,'DM0002','hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh');
insert into tab1 values(1,77551538,'DR0000','iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii');
insert into tab1 values(2,77551538,'DR0001','jjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjj');
insert into tab1 values(3,77551538,'DR0002','kkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkk');
commit;

create or replace function myfunc(p_ser_num in number, p_type in varchar2) return clob is
	retClob clob;
begin

	for i in (select vt_text from tab1 where vt_ser_num=p_ser_num and substr(vt_text_type,1,2) = p_type order by vt_ent_num) loop
		retClob := retClob || i.vt_text;
	end loop;
	return retClob;
end;
/

show errors

select vt_ser_num,
	myFunc(vt_ser_num,'AT') COLUMN_1,
	myFunc(vt_ser_num,'CS') COLUMN_2,
	myFunc(vt_ser_num,'DM') COLUMN_3,
	myFunc(vt_ser_num,'DR') COLUMN_4
from
(
	select distinct vt_ser_num from tab1
)
/


with mydata as (
select xmlelement("myXML",
	xmlelement("row",
			xmlattributes(vt_ser_num),
		xmlagg(
			xmlelement("type_row",
			xmlattributes(mytype),
			myclob
			)
		)
	)
) myxml
from
(
SELECT  vt_ser_num, substr(vt_text_type,1,2) mytype,
	EXTRACT(XMLAGG(XMLELEMENT("s", vt_text) order by vt_ent_num), '/s/text()').getclobval() myclob
FROM tab1
group by vt_ser_num, substr(vt_text_type,1,2)
)
group by vt_ser_num
)
select vt_ser_num, column_1, column_2, column_3, column_4
from mydata, xmltable('/myXML' passing myxml
		columns
			vt_ser_num number path '/myXML/row/@VT_SER_NUM',
			column_1 clob path '/myXML/row/type_row[@MYTYPE="AT"]',
			column_2 clob path '/myXML/row/type_row[@MYTYPE="CS"]',
			column_3 clob path '/myXML/row/type_row[@MYTYPE="DM"]',
			column_4 clob path '/myXML/row/type_row[@MYTYPE="DR"]'
)
/

 
LVL 78

Expert Comment

by:slightwv (Netminder)
ID: 40286515
Another one very similar to the function version.

The difference here is it uses the CLOBAGG function that is found all over the web.

My original function did a simple string concatenation.  CLOBAGG should be a little faster.

You'll just need to test them and see which one works best.

create or replace type clobagg_type as object(
  text clob,
  static function ODCIAggregateInitialize(sctx in out clobagg_type
                                         ) return number,
  member function ODCIAggregateIterate(self in out clobagg_type,
                                       value in clob
                                      ) return number,
  member function ODCIAggregateTerminate(self in clobagg_type,
                                         returnvalue out clob,
                                         flags in number
                                        ) return number,
  member function ODCIAggregateMerge(self in out clobagg_type,
                                     ctx2 in clobagg_type
                                    ) return number
);
/

show errors 

create or replace type body clobagg_type is
  static function ODCIAggregateInitialize(sctx in out clobagg_type
                                         ) return number is
  begin
    sctx := clobagg_type(null);
    return ODCIConst.Success;
  end;
  member function ODCIAggregateIterate(self  in out clobagg_type,
                                       value in     clob
                                      ) return number is
  begin
    self.text := self.text || value;
    return ODCIConst.Success;
  end;
  member function ODCIAggregateTerminate(self in clobagg_type,
                                         returnvalue out clob,
                                         flags in number
                                        ) return number is
  begin
    returnValue := self.text;
    return ODCIConst.Success;
  end;
  member function ODCIAggregateMerge(self in out clobagg_type,
                                     ctx2 in clobagg_type
                                    )return number is
  begin
    self.text := self.text || ctx2.text;
    return ODCIConst.Success;
  end;
end;
/

show errors 

create or replace function clobagg(input clob) return clob
  deterministic
  parallel_enable
  aggregate using clobagg_type;
/

show errors 

create or replace function myfunc2(p_ser_num in number, p_type in varchar2) return clob is
	retClob clob;
begin

	select clobagg(vt_text) into retClob from (select vt_text from tab1 where vt_ser_num=p_ser_num and substr(vt_text_type,1,2) = p_type order by vt_ent_num);
	return retClob;
end;
/

show errors

select vt_ser_num,
	myFunc2(vt_ser_num,'AT') COLUMN_1,
	myFunc2(vt_ser_num,'CS') COLUMN_2,
	myFunc2(vt_ser_num,'DM') COLUMN_3,
	myFunc2(vt_ser_num,'DR') COLUMN_4
from
(
	select distinct vt_ser_num from tab1
)
/

 
LVL 74

Expert Comment

by:sdstuber
ID: 40286570
Similar idea to what slightwv posted, except I changed the aggregate to accept VARCHAR2 inputs but return a CLOB.

Then I use a standard pivot technique to generate the output.


CREATE OR REPLACE TYPE vcagg_to_clob_type
    AS OBJECT
(
    v_result CLOB,
    STATIC FUNCTION odciaggregateinitialize(sctx IN OUT vcagg_to_clob_type)
        RETURN NUMBER,
    MEMBER FUNCTION odciaggregateiterate(self IN OUT vcagg_to_clob_type, p_string IN VARCHAR2)
        RETURN NUMBER,
    MEMBER FUNCTION odciaggregateterminate(
        self          IN     vcagg_to_clob_type,
        returnvalue      OUT CLOB,
        flags         IN     NUMBER
    )
        RETURN NUMBER,
    MEMBER FUNCTION odciaggregatemerge(self IN OUT vcagg_to_clob_type, ctx2 IN vcagg_to_clob_type)
        RETURN NUMBER
);
/

SHOW ERRORS

CREATE OR REPLACE TYPE BODY vcagg_to_clob_type
IS
    STATIC FUNCTION odciaggregateinitialize(sctx IN OUT vcagg_to_clob_type)
        RETURN NUMBER
    IS
    BEGIN
        sctx := vcagg_to_clob_type(NULL);
        RETURN odciconst.success;
    END;

    MEMBER FUNCTION odciaggregateiterate(self IN OUT vcagg_to_clob_type, p_string IN VARCHAR2)
        RETURN NUMBER
    IS
    BEGIN
        self.v_result := self.v_result || p_string;
        RETURN odciconst.success;
    END;

    MEMBER FUNCTION odciaggregateterminate(
        self          IN     vcagg_to_clob_type,
        returnvalue      OUT CLOB,
        flags         IN     NUMBER
    )
        RETURN NUMBER
    IS
    BEGIN
        returnvalue := self.v_result;
        RETURN odciconst.success;
    END;

    MEMBER FUNCTION odciaggregatemerge(self IN OUT vcagg_to_clob_type, ctx2 IN vcagg_to_clob_type)
        RETURN NUMBER
    IS
    BEGIN
        self.v_result := self.v_result || ctx2.v_result;
        RETURN odciconst.success;
    END;
END;
/

SHOW ERRORS

CREATE OR REPLACE FUNCTION clobagg(p_string VARCHAR2)
    RETURN CLOB
    DETERMINISTIC
    PARALLEL_ENABLE
    AGGREGATE USING vcagg_to_clob_type;
/



  SELECT vt_ser_num,
         clobagg(CASE WHEN vt_text_type LIKE 'AT%' THEN vt_text END) column_1,
         clobagg(CASE WHEN vt_text_type LIKE 'CS%' THEN vt_text END) column_2,
         clobagg(CASE WHEN vt_text_type LIKE 'DM%' THEN vt_text END) column_3,
         clobagg(CASE WHEN vt_text_type LIKE 'DR%' THEN vt_text END) column_4
    FROM tab1
GROUP BY vt_ser_num
/

 
LVL 74

Accepted Solution

by:
sdstuber earned 1000 total points
ID: 40286589
I missed the ordering criteria for the vt_text aggregation.
It's an easy fix, but cumbersome syntax.

SELECT vt_ser_num,
       column_1,
       column_2,
       column_3,
       column_4
  FROM (SELECT vt_ser_num,
               clobagg(
                   CASE WHEN vt_text_type LIKE 'AT%' THEN vt_text END
               )
               OVER(
                   PARTITION BY vt_ser_num
                   ORDER BY vt_ent_num
                   ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
               )
                   column_1,
               clobagg(
                   CASE WHEN vt_text_type LIKE 'CS%' THEN vt_text END
               )
               OVER(
                   PARTITION BY vt_ser_num
                   ORDER BY vt_ent_num
                   ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
               )
                   column_2,
               clobagg(
                   CASE WHEN vt_text_type LIKE 'DM%' THEN vt_text END
               )
               OVER(
                   PARTITION BY vt_ser_num
                   ORDER BY vt_ent_num
                   ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
               )
                   column_3,
               clobagg(
                   CASE WHEN vt_text_type LIKE 'DR%' THEN vt_text END
               )
               OVER(
                   PARTITION BY vt_ser_num
                   ORDER BY vt_ent_num
                   ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
               )
                   column_4,
               ROW_NUMBER() OVER(PARTITION BY vt_ser_num ORDER BY ROWNUM) rn
          FROM tab1)
 WHERE rn = 1

 
LVL 74

Expert Comment

by:sdstuber
ID: 40286602
I don't think  myfunc2 in  http:#a40286515  is reliable for generating a sorted list.  

The aggregation has no requirement to process the subquery results in the order presented.  

That's part of the reason analytic functions have an ORDER BY clause: to let the function results be ordered regardless of the order of the input data.
 
LVL 78

Expert Comment

by:slightwv (Netminder)
ID: 40286607
Agreed.  I was thinking it was questionable but played with the data changing the  'insert order' and it appeared to work.

Might have just been dumb luck.
 
LVL 74

Expert Comment

by:sdstuber
ID: 40286619
If the query runs singly-threaded (i.e. non parallel) then it will probably work most (all?) of the time; but if the query goes parallel then all bets are off.

for example:  subquery returns abcdefg

thread one gets abcd
thread two gets efg

two finishes first, and clob aggregates as:  efgabcd


or, the aggregate sorts on a hash thereby negating the alphabetic sort, regardless of how many threads are involved
 
LVL 78

Expert Comment

by:slightwv (Netminder)
ID: 40286624
Makes sense.  Agreed.
 
LVL 74

Expert Comment

by:sdstuber
ID: 40287108
Here's another, more sophisticated aggregate that allows you to pass in an ordering sequence along with the values.


CREATE OR REPLACE TYPE string_order_type AS OBJECT
(
    str VARCHAR2(32767),
    seq INTEGER
);
/

CREATE OR REPLACE TYPE string_order_tab AS TABLE OF string_order_type;
/

CREATE OR REPLACE TYPE vcagg_to_clob_sort_type
    AS OBJECT
(
    v_strings string_order_tab,
    STATIC FUNCTION odciaggregateinitialize(sctx IN OUT vcagg_to_clob_sort_type)
        RETURN NUMBER,
    MEMBER FUNCTION odciaggregateiterate(
        self    IN OUT vcagg_to_clob_sort_type,
        p_obj   IN     string_order_type
    )
        RETURN NUMBER,
    MEMBER FUNCTION odciaggregateterminate(
        self          IN     vcagg_to_clob_sort_type,
        returnvalue      OUT CLOB,
        flags         IN     NUMBER
    )
        RETURN NUMBER,
    MEMBER FUNCTION odciaggregatemerge(
        self   IN OUT vcagg_to_clob_sort_type,
        ctx2   IN     vcagg_to_clob_sort_type
    )
        RETURN NUMBER
);
/

CREATE OR REPLACE TYPE BODY vcagg_to_clob_sort_type
IS
    STATIC FUNCTION odciaggregateinitialize(sctx IN OUT vcagg_to_clob_sort_type)
        RETURN NUMBER
    IS
    BEGIN
        sctx := vcagg_to_clob_sort_type(string_order_tab());
        RETURN odciconst.success;
    END;

    MEMBER FUNCTION odciaggregateiterate(
        self    IN OUT vcagg_to_clob_sort_type,
        p_obj   IN     string_order_type
    )
        RETURN NUMBER
    IS
    BEGIN
        self.v_strings.EXTEND;
        self.v_strings(self.v_strings.COUNT) := p_obj;
        RETURN odciconst.success;
    END;

    MEMBER FUNCTION odciaggregateterminate(
        self          IN     vcagg_to_clob_sort_type,
        returnvalue      OUT CLOB,
        flags         IN     NUMBER
    )
        RETURN NUMBER
    IS
        v_result CLOB := '';
    BEGIN
        FOR x IN (SELECT str
                    FROM TABLE(v_strings)
                  ORDER BY seq,str)
        LOOP
            v_result := v_result || x.str;
        END LOOP;

        returnvalue := v_result;
        RETURN odciconst.success;
    END;

    MEMBER FUNCTION odciaggregatemerge(
        self   IN OUT vcagg_to_clob_sort_type,
        ctx2   IN     vcagg_to_clob_sort_type
    )
        RETURN NUMBER
    IS
    BEGIN
        FOR i IN 1 .. ctx2.v_strings.COUNT
        LOOP
            self.v_strings.EXTEND;
            self.v_strings(self.v_strings.COUNT).str := ctx2.v_strings(i).str;
            self.v_strings(self.v_strings.COUNT).seq := ctx2.v_strings(i).seq;
        END LOOP;

        RETURN odciconst.success;
    END;
END;
/

SHOW ERRORS

-- The CLOBSORTAGG aggregate function used below must also be created,
-- along these lines:
CREATE OR REPLACE FUNCTION clobsortagg(p_obj string_order_type)
    RETURN CLOB
    PARALLEL_ENABLE
    AGGREGATE USING vcagg_to_clob_sort_type;
/



Then to use it, it's similar to what I posted before except you include the ordering criteria

SELECT vt_ser_num,
       clobsortagg(
           CASE WHEN vt_text_type LIKE 'AT%' THEN string_order_type(vt_text, vt_ent_num) END
       )
           column_1,
       clobsortagg(
           CASE WHEN vt_text_type LIKE 'CS%' THEN string_order_type(vt_text, vt_ent_num) END
       )
           column_2,
       clobsortagg(
           CASE WHEN vt_text_type LIKE 'DM%' THEN string_order_type(vt_text, vt_ent_num) END
       )
           column_3,
       clobsortagg(
           CASE WHEN vt_text_type LIKE 'DR%' THEN string_order_type(vt_text, vt_ent_num) END
       )
           column_4
  FROM tab1
GROUP BY vt_ser_num

 

Author Comment

by:angel7170
ID: 40287887
Thank you both, slightwv and sdstuber. I am trying to run both of your solutions, but no luck returning any results; the query has been running for more than 16 hours. I even brought the remote database tables over locally, and it is still slow. I hope it will work today.

Again, thanks to both of you. Without your help I wouldn't have gotten even this far.
 
LVL 74

Expert Comment

by:sdstuber
ID: 40287905
what does your query look like with the functions added in?
 

Author Comment

by:angel7170
ID: 40287923
I am running the query that uses the CLOBAGG function. I have 14 similar columns, but I only included 4 of them, as below, to run first and see how the results look. No luck :(


SELECT vt_ser_num,
       column_1,
       column_2,
       column_3,
       column_4
  FROM (SELECT vt_ser_num,
               clobagg(
                   CASE WHEN vt_text_type LIKE 'AT%' THEN vt_text END
               )
               OVER(
                   PARTITION BY vt_ser_num
                   ORDER BY vt_ent_num
                   ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
               )
                   column_1,
               clobagg(
                   CASE WHEN vt_text_type LIKE 'CS%' THEN vt_text END
               )
               OVER(
                   PARTITION BY vt_ser_num
                   ORDER BY vt_ent_num
                   ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
               )
                   column_2,
               clobagg(
                   CASE WHEN vt_text_type LIKE 'DM%' THEN vt_text END
               )
               OVER(
                   PARTITION BY vt_ser_num
                   ORDER BY vt_ent_num
                   ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
               )
                   column_3,
               clobagg(
                   CASE WHEN vt_text_type LIKE 'DR%' THEN vt_text END
               )
               OVER(
                   PARTITION BY vt_ser_num
                   ORDER BY vt_ent_num
                   ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
               )
                   column_4,
               ROW_NUMBER() OVER(PARTITION BY vt_ser_num ORDER BY ROWNUM) rn
          FROM DQ_VT_TBL)
 WHERE rn = 1

 
LVL 78

Expert Comment

by:slightwv (Netminder)
ID: 40287941
Assuming DQ_VT_TBL is the same VT table whose counts you listed above:
Are you wanting to concatenate all 95 million rows at once?
 

Author Comment

by:angel7170
ID: 40287968
No, DQ_VT_TBL is limited to just the 14 text types (AT%, DR%, etc.) I mentioned. The total count of this table is now 14,637,556 rows. I am calling the function to concatenate the VT_TEXT column for each of these text types.
 

Author Comment

by:angel7170
ID: 40287975
DQ_VT_TBL is the table I created locally to limit the rows/columns, so the query doesn't have to cross the network.
 
LVL 78

Expert Comment

by:slightwv (Netminder)
ID: 40287991
Trying to concatenate 14 million rows is going to take a while.

Do you really need all 14 million at once?

I cannot imagine an application that needs 14 million rows appearing all at once.
 

Author Comment

by:angel7170
ID: 40288017
I don't want it to concatenate all 14 million rows at once. The logic is: for each serial number there are about 50-100 rows of each text type (AT%, DR%, DM%, etc.). I want those 50-100 VT_TEXT rows concatenated per VT_TEXT_TYPE, per serial number (VT_SER_NUM), ordered by the sequence number (VT_ENT_NUM).

I am attaching the logic document again. I hope I am not confusing things :)
Logic.xlsx
 
LVL 78

Expert Comment

by:slightwv (Netminder)
ID: 40288032
>>I don't want it to concatenate all 14 million rows at once.

There is no where clause on the inner select:
...
     column_4,
               ROW_NUMBER() OVER(PARTITION BY vt_ser_num ORDER BY ROWNUM) rn
          FROM DQ_VT_TBL)
...

So, you are running that select against ALL rows in DQ_VT_TBL.
 

Author Comment

by:angel7170
ID: 40288042
OK, I thought the CASE statement looks for the text types that start with those prefixes and then concatenates them. Does this code need to be changed?
 
LVL 74

Expert Comment

by:sdstuber
ID: 40288064
The CASE ensures only text of a single type is concatenated into each string.

But you still probably want a WHERE clause so you only pull the rows you're interested in:

SELECT vt_ser_num,
       column_1,
       column_2,
       column_3,
       column_4
  FROM (SELECT vt_ser_num,
               clobagg(
                   CASE WHEN vt_text_type LIKE 'AT%' THEN vt_text END
               )
               OVER(
                   PARTITION BY vt_ser_num
                   ORDER BY vt_ent_num
                   ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
               )
                   column_1,
               clobagg(
                   CASE WHEN vt_text_type LIKE 'CS%' THEN vt_text END
               )
               OVER(
                   PARTITION BY vt_ser_num
                   ORDER BY vt_ent_num
                   ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
               )
                   column_2,
               clobagg(
                   CASE WHEN vt_text_type LIKE 'DM%' THEN vt_text END
               )
               OVER(
                   PARTITION BY vt_ser_num
                   ORDER BY vt_ent_num
                   ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
               )
                   column_3,
               clobagg(
                   CASE WHEN vt_text_type LIKE 'DR%' THEN vt_text END
               )
               OVER(
                   PARTITION BY vt_ser_num
                   ORDER BY vt_ent_num
                   ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
               )
                   column_4,
               ROW_NUMBER() OVER(PARTITION BY vt_ser_num ORDER BY ROWNUM) rn
          FROM DQ_VT_TBL
WHERE vt_text_type like 'AT%'
or vt_text_type like 'DR%'
or vt_text_type like 'DM%'
or vt_text_type like 'CS%'
)
 WHERE rn = 1
                                          

 
LVL 78

Expert Comment

by:slightwv (Netminder)
ID: 40288066
The case looks at the first two characters of every row it is provided.

The where clause on a select limits the rows returned.

A case statement cannot reduce the number of rows returned by the select.
 
LVL 78

Expert Comment

by:slightwv (Netminder)
ID: 40288077
WHERE vt_text_type like 'AT%'
or vt_text_type like 'DR%'
or vt_text_type like 'DM%'
or vt_text_type like 'CS%'



It needs to go farther than just that.  When they add in ALL the necessary types, it will still probably pull in ALL 14 million rows.

I would expect the WHERE clause to limit based on vt_ser_num, a list of vt_ser_num's or some date range, etc...
 

Author Comment

by:angel7170
ID: 40288168
Ok, thank you.

Now I am running it for just one column ("column 1") with the 'AT%' filter applied. After more than 10 minutes it is still running... see screenshot.

The total record count from the select statement is 4,958,809.


SELECT vt_ser_num,
       column_1
  FROM (SELECT vt_ser_num,
               clobagg(
                   CASE WHEN vt_text_type LIKE 'AT%' THEN vt_text END
               )
               OVER(
                   PARTITION BY vt_ser_num
                   ORDER BY vt_ent_num
                   ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
               )
                   column_1,
                    ROW_NUMBER() OVER(PARTITION BY vt_ser_num ORDER BY ROWNUM) rn
          FROM DQ_VT_TBL
WHERE vt_text_type like 'AT%'
)
 WHERE rn = 1
                 


screenshot.PNG
 

Author Comment

by:angel7170
ID: 40288183
>>>>>I would expect the WHERE clause to limit based on vt_ser_num, a list of vt_ser_num's or some date range, etc...

There is no date range or any other filter I could apply; it would be much easier if there were. This is a data migration project, and it is complicated to bring data from different data sources and move it into the Oracle database.
 
LVL 74

Expert Comment

by:sdstuber
ID: 40288186
If you have 4.9 million rows, then CLOBAGG will be called 4.9 million times, which means 4.9 million LOB operations as well as 4.9 million context switches.

Also, what is the execution plan of that query? Is it doing 4.9 million index lookups? Those will take a long time, but so will scanning the full table and filtering it down to only the 4.9 million rows you're really interested in.
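
The plan can be checked with EXPLAIN PLAN and DBMS_XPLAN; for example, for the single-'AT%' case above (column list is illustrative):

```sql
-- Load the plan for the slow statement into PLAN_TABLE, then display it.
EXPLAIN PLAN FOR
SELECT vt_ser_num, vt_ent_num, vt_text
  FROM dq_vt_tbl
 WHERE vt_text_type LIKE 'AT%';

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```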
 
LVL 78

Expert Comment

by:slightwv (Netminder)
ID: 40288248
>>This is a data migration project

So there will eventually be some INSERT associated with this?

Then you'll have the issue of HUGE rollback/undo to deal with.

Create a temp table of ALL vt_ser_num's to migrate.

Create a pl/sql block with the logic like this:

loop:
grab a vt_ser_num from the temp table
exit when temp table is empty
concatenate ALL the rows for that one entry
put the migrated data where it goes
if everything checks out, delete that vt_ser_num from the temp table
commit every 10000-50000 rows based off a loop counter(depends on what works best for you)
next
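
The steps above might be sketched like this (all table names, the CLOBAGG call, and the commit interval are placeholders to adapt; the ordering caveats discussed earlier in the thread still apply to the aggregation):

```sql
-- Illustrative batch loop: migrate one vt_ser_num at a time and commit
-- periodically so the undo/rollback footprint stays small.
DECLARE
    l_count PLS_INTEGER := 0;
BEGIN
    FOR r IN (SELECT vt_ser_num FROM migrate_list) LOOP
        -- concatenate all the rows for this one serial number
        INSERT INTO target_tbl (vt_ser_num, column_1)
        SELECT r.vt_ser_num, clobagg(vt_text)
          FROM dq_vt_tbl
         WHERE vt_ser_num = r.vt_ser_num
           AND vt_text_type LIKE 'AT%';

        -- if everything checks out, remove it from the work list
        DELETE FROM migrate_list WHERE vt_ser_num = r.vt_ser_num;

        l_count := l_count + 1;
        IF MOD(l_count, 10000) = 0 THEN
            COMMIT;   -- commit every 10,000 serial numbers
        END IF;
    END LOOP;
    COMMIT;
END;
/
```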

