asked on

xquery PLSQL slow (Is there a better way)

I have a slow solution to the problem and we are going to production with it because it is "good enough". The point being that this is not urgent. There is the problem ... given an XML document that looks like this

<batch>
    <testdata>
        <AffiliateKey>21047193</AffiliateKey>
        <ServiceCode>DRYWALL_SHEETROCK</ServiceCode>
        <RateName>DRYWALL_SHEETROCK-Install-1</RateName>
        <StartDate>6/8/15</StartDate>
        <EndDate>1/1/25</EndDate>
        <Rate>20</Rate>
        <Attribute1>DrywallProjectScope</Attribute1>>
        <Operator1>=</Operator1>
        <Value1>'Install'</Value1>
        <Attribute2>RateLevel</Attribute2>
        <Operator2>=</Operator2>
        <Value2>1</Value2>
    </testdata>
    <testdata>
        <AffiliateKey>21047205</AffiliateKey>
        <ServiceCode>DRYWALL_SHEETROCK</ServiceCode>
        <RateName>DRYWALL_SHEETROCK-Install-1</RateName>
        <StartDate>6/8/15</StartDate>
        <EndDate>1/1/25</EndDate>
        <Rate>25</Rate>
        <Attribute1>DrywallProjectScope</Attribute1>>
        <Operator1>=</Operator1>
    </testdata>
</batch>

Open in new window

That is to say, for a given <testdata> there will be one
<AffiliateKey>
<ServiceCode>
<RateName>
<StartDate>
<EndDate>
<Rate>20</Rate>
and then 1 to 5 triplets of :
<Attribute#>
<Operator#>
<Value#>
where # is a number from 1 to 5.

In my PLSQL cursor which I am about to show you, the result set is multiple rows of:
AffiliateKey,ServiceCode,RateName,StartDate,EndDate,Rate, NESTED TABLE of [Attribute,Operator,Value]

Here is my cursor that performs that transformation:

   CURSOR c (
      pPathI     IN VARCHAR2,
      pXMLDocI   IN XMLTYPE) IS
      WITH xform AS
              (         SELECT seq,
                               affiliatekey,
                               servicecode,
                               startdate,
                               enddate,
                               ratename,
                               rate,
                               XMLELEMENT (
                                  "criteria",
                                  XMLELEMENT (
                                     "grp",
                                     XMLFOREST (x.attribute1 AS attribute,
                                                x.operator1 AS operator,
                                                x.value1 AS VALUE)),
                                  XMLELEMENT (
                                     "grp",
                                     XMLFOREST (x.attribute2 AS attribute,
                                                x.operator2 AS operator,
                                                x.value2 AS VALUE)),
                                  XMLELEMENT (
                                     "grp",
                                     XMLFOREST (x.attribute3 AS attribute,
                                                x.operator3 AS operator,
                                                x.value3 AS VALUE)),
                                  XMLELEMENT (
                                     "grp",
                                     XMLFOREST (x.attribute4 AS attribute,
                                                x.operator4 AS operator,
                                                x.value4 AS VALUE)),
                                  XMLELEMENT (
                                     "grp",
                                     XMLFOREST (x.attribute5 AS attribute,
                                                x.operator5 AS operator,
                                                x.value5 AS VALUE)))
                                  AS critlist
                          FROM XMLTABLE (pPathI  
                                         PASSING pXMLDocI
                                         COLUMNS seq            FOR ORDINALITY,
                                                 affiliatekey   VARCHAR2 (20) PATH 'AffiliateKey',
                                                 servicecode    VARCHAR2 (250) PATH 'ServiceCode',
                                                 startdate      VARCHAR2 (20) PATH 'StartDate',
                                                 enddate        VARCHAR2 (20) PATH 'EndDate',
                                                 ratename       VARCHAR2 (200) PATH 'RateName',
                                                 rate           VARCHAR2 (200) PATH 'Rate',
                                                 attribute1     VARCHAR2 (200) PATH 'Attribute1',
                                                 operator1      VARCHAR2 (200) PATH 'Operator1',
                                                 value1         VARCHAR2 (2000) PATH 'Value1',
                                                 attribute2     VARCHAR2 (200) PATH 'Attribute2',
                                                 operator2      VARCHAR2 (200) PATH 'Operator2',
                                                 value2         VARCHAR2 (2000) PATH 'Value2',
                                                 attribute3     VARCHAR2 (200) PATH 'Attribute3',
                                                 operator3      VARCHAR2 (200) PATH 'Operator3',
                                                 value3         VARCHAR2 (2000) PATH 'Value3',
                                                 attribute4     VARCHAR2 (200) PATH 'Attribute4',
                                                 operator4      VARCHAR2 (200) PATH 'Operator4',
                                                 value4         VARCHAR2 (2000) PATH 'Value4',
                                                 attribute5     VARCHAR2 (200) PATH 'Attribute5',
                                                 operator5      VARCHAR2 (200) PATH 'Operator5',
                                                 value5         VARCHAR2 (2000) PATH 'Value5') x)
            SELECT z.seq,
                   z.affiliatekey,
                   z.servicecode,
                   z.startdate,
                   z.enddate,
                   z.ratename,
                   z.rate,
                   CAST (
                      COLLECT (ATTR_OPER_VALU_OBJECT (yy.attribute, yy.operator, yy.VALUE)) AS ATTR_OPER_VALU_OBJECT_TABLE)
                      crits
              FROM xform z,
                   XMLTABLE (
                      '/criteria/grp'
                      PASSING z.critlist
                      COLUMNS attribute   VARCHAR2 (200) PATH 'ATTRIBUTE',
                              operator    VARCHAR2 (200) PATH 'OPERATOR',
                              VALUE       VARCHAR2 (200) PATH 'VALUE') yy
          GROUP BY z.seq,
                   z.affiliatekey,
                   z.servicecode,
                   z.startdate,
                   z.enddate,
                   z.ratename,
                   z.rate;

Open in new window

I suspect that everything I've done in my cursor could have been done with a single XQUERY statement which I suspect would run a lot faster.
1) Could this be done with a single xquery or at least something much simplified.
2) What would you have done differently?
3) Any other thoughts or questions you might have?

Thanks, Mike

slightwv (䄆 Netminder)

Can you post an example of the cursor parameters and the results from that?

To provide a copy/paste solution we also need the type of ATTR_OPER_VALU_OBJECT_TABLE.

From a quick scan it looks like you might be able to simplify things but I'm not sure about a performance increase.

slightwv (䄆 Netminder)

If I guess at your types I get this:

CREATE TYPE ATTR_OPER_VALU_OBJECT AS OBJECT (
   attribute	varchar2(20),
   operator	varchar2(20),
   value	varchar2(20)
);  
/

CREATE TYPE ATTR_OPER_VALU_OBJECT_TABLE is table of ATTR_OPER_VALU_OBJECT;
/

Open in new window

Then I run your SQL (again guessed at the cursor parameters) and I get this as the collection for the first set:

ATTR_OPER_VALU_OBJECT_TABLE(
ATTR_OPER_VALU_OBJECT('DrywallProjectScope', '=', '''Install'''), 
ATTR_OPER_VALU_OBJECT(NULL, NULL, NULL),
ATTR_OPER_VALU_OBJECT(NULL, NULL, NULL),
ATTR_OPER_VALU_OBJECT(NULL, NULL, NULL),
ATTR_OPER_VALU_OBJECT('RateLevel', '=', '1')
)

Open in new window

I'm concerned I have something not quite right given the three null objects.

What am I missing?

ASKER CERTIFIED SOLUTION

slightwv (䄆 Netminder)

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

mmoore

ASKER

Yes, you surmised the the TYPES correctly and the input parameters are simply the xpath, like 'batch/testdata' and the document itself. I need a while to examine your 'solution' , so I'll get back to you later today hopefully. Thanks very much for taking a look at this.

mmoore

ASKER

I love the way you handled loading ATTR_OPER_VALU_OBJECT_TABLE! This gets rid of the COLLECT and TABLE and GROUPBY and a bunch of other crap. I'm going to run profiler on it and see how it performs. You may be interested to know that this is for a process whereby the user drops an EXCEL document onto a web page. The javascript on that web pages converts the EXCEL doc to JSON and then, using AJAX, sends that JSON to the client-side Java process which converts the JSON to XML and calls my PLSQL procedure. PLSQL does the edits and if all is good, loads the database and sends the results back to the user's screen. About 80% of the time from when the user drops the EXCEL to when he sees the results was take up by the cursor you addressed. Hopefully I can sneak your new cursor in before this goes to production. Again , huge thanks.

slightwv (䄆 Netminder)

I hope it runs faster for you!

That seems like a lot of work to get an uploaded Excel file into Oracle...

Since it appears you are in control of the XML, I would really look at changing it. It really isn't what I would consider 'valid'. OK, based on syntax it is valid but it isn't really well-formed.
Repeating nodes should be encapsulated so they maintain their hierarchy.

I would make the XML look like this:

<batch>
    <testdata>
        <AffiliateKey>21047193</AffiliateKey>
        <ServiceCode>DRYWALL_SHEETROCK</ServiceCode>
        <RateName>DRYWALL_SHEETROCK-Install-1</RateName>
        <StartDate>6/8/15</StartDate>
        <EndDate>1/1/25</EndDate>
        <Rate>20</Rate>
	<operatorInfo>
		<operator>
        		<Attribute>DrywallProjectScope</Attribute>
        		<Operator>=</Operator>
        		<Value>''Install''</Value>
		</operator>
		<operator>
        		<Attribute>RateLevel</Attribute>
        		<Operator>=</Operator>
        		<Value>1</Value>
		</operator>
	</operatorInfo>
    </testdata>
    <testdata>
        <AffiliateKey>21047205</AffiliateKey>
        <ServiceCode>DRYWALL_SHEETROCK</ServiceCode>
        <RateName>DRYWALL_SHEETROCK-Install-1</RateName>
        <StartDate>6/8/15</StartDate>
        <EndDate>1/1/25</EndDate>
        <Rate>25</Rate>
	<operatorInfo>
		<operator>
        		<Attribute>DrywallProjectScope</Attribute>
        		<Operator>=</Operator>
		</operator>
	</operatorInfo>
    </testdata>
</batch>

Open in new window

It would make parsing so much easier.

You might also be able to do away with the TYPEs.

I'm guessing the final result is to get the data into a table, then displayed back out to the end user?

You can probably get rid of the TYPES now but properly formed XML would likely make things easier. It definitely would if they add say 10 more repeating fields. None of the code would have to change.

I dabble in .Net and know very little JSON/AJAX but it still seems like a lot of translating/transforming.

FYI:
Oracle 12c has some JSON parsing built into it.

mmoore

ASKER

Hi, I am using js-xlsx (demo here http://oss.sheetjs.com/js-xlsx/ ) to convert the excel to JSON, so I don't really have the opportunity to generate the XML as you suggest. I DO totally agree with your point about the structural decencies in my XML and with the solution you proposed.

Just as an aside:

An XML document with correct syntax is called "Well Formed". Like, every OPEN tag has a corresponding CLOSE tag, stuff like that.
An XML document which is compliant to a DTD or Schema is "Valid" with regards to that specific DTD or schema.
An XML document that is Valid must, by definition, also be "Well Formed"

My XML document is "Well Formed". However, since there is no DTD (or schema) to validate it against, we have no way to know if it is "Valid".

There might be a different official term to describe decencies you pointed out, if not, there should be. Maybe we could call it, "not fully grouped".

Sadly, we are still on Oracle 11g. If we had 12c, I would have foregone the XML entirely. Thanks again! Mike

slightwv (䄆 Netminder)

>>Just as an aside:

*sigh*
I knew I should have looked up those definitions before I just threw the terms out there...

I could get behind the adoption of "not fully grouped"!!!

mmoore

ASKER

Just following up, here is the profiler analysis of original cursor vs expert's cursor respectively. Clearly it's quite an improvement in execution time.

Run Comment (Run Object)			Total Execution Time (secs)	Avg. Execution Time (secs)	Min Time	Max Time	Passes	Run Date
		Line# 2		0.0000	0.0000	0.0000	0.0000	1	   CURSOR c (
		Line# 5		0.2430	0.2430	0.2430	0.2430	1	      WITH xform AS
		Line# 91	0.0000	0.0000	0.0000	0.0000	1	   PROCEDURE load_criteria_rates (pRateTransactionsI          IN     CLOB,
		Line# 96	0.0000	0.0000	0.0000	0.0000	1	   BEGIN
		Line# 97	0.0001	0.0001	0.0001	0.0001	1	      dbms_output.put_line('Length of input is '||to_char(length(pRateTransactionsI)));
		Line# 98	0.0000	0.0000	0.0000	0.0000	1	      vRateTransactionsXML := xmltype (pRateTransactionsI);
		Line# 99	0.0000	0.0000	0.0000	0.0000	1	      vPath := '/batch/noload';
		Line# 101	40.9826	0.0409	0.0000	40.3294	1,002	      FOR cur1 IN c (pPathI => vPath, pXMLDocI => vRateTransactionsXML) LOOP
		Line# 103	0.1955	0.0000	0.0000	0.0003	4,000	         FOR cur1a
		Line# 107	0.0000	0.0000	0.0000	0.0000	2,000	            NULL;
		Line# 108	0.0000	0.0000	0.0000	0.0000	1,000	         END LOOP;
		Line# 109	0.0000	0.0000	0.0000	0.0000	1	      END LOOP;
		Line# 110	0.0000	0.0000	0.0000	0.0000	1	   END load_criteria_rates;

Run Comment (Run Object)			Total Execution Time (secs)	Avg. Execution Time (secs)	Min Time	Max Time	Passes	Run Date
		Line# 2		0.0000	0.0000	0.0000	0.0000	1	   CURSOR c (
		Line# 5		0.2392	0.2392	0.2392	0.2392	1	               SELECT seq,
		Line# 45	0.0000	0.0000	0.0000	0.0000	1	   PROCEDURE load_criteria_rates (pRateTransactionsI          IN     CLOB,
		Line# 50	0.0000	0.0000	0.0000	0.0000	1	   BEGIN
		Line# 51	0.0001	0.0001	0.0001	0.0001	1	      dbms_output.put_line('Length of input is '||to_char(length(pRateTransactionsI)));
		Line# 52	0.0000	0.0000	0.0000	0.0000	1	      vRateTransactionsXML := xmltype (pRateTransactionsI);
		Line# 53	0.0000	0.0000	0.0000	0.0000	1	      vPath := '/batch/noload';
		Line# 55	21.6043	0.0216	0.0000	0.1300	1,002	      FOR cur1 IN c (pPathI => vPath, pXMLDocI => vRateTransactionsXML) LOOP
		Line# 57	0.3606	0.0001	0.0000	0.0126	7,000	         FOR cur1a
		Line# 61	0.0000	0.0000	0.0000	0.0000	5,000	            NULL;
		Line# 62	0.0000	0.0000	0.0000	0.0000	1,000	         END LOOP;
		Line# 63	0.0000	0.0000	0.0000	0.0000	1	      END LOOP;
		Line# 64	0.0000	0.0000	0.0000	0.0000	1	   END load_criteria_rates;

Open in new window

slightwv (䄆 Netminder)

Thanks or the update!

Not sure how to read that output but just make sure the runs are 'accurate'.

If you run a query Oracle will likely have everything it needs cached in memory for a slightly different second run.

mmoore

ASKER

Yeah, the text formatting was kind of wonky. Maybe this will help. It's really strange how, on line #101 the max time of a single iteration of the loop was almost as long as the total execution time for that same line. I can only suspect that this is a bug in the Profiler. Bottom line is that your cursor runs almost twice as fast as mine.

slightwv (䄆 Netminder)

Cool.