mmoore
asked on
xquery PLSQL slow (Is there a better way)
I have a slow solution to the problem and we are going to production with it because it is "good enough". The point being that this is not urgent. There is the problem ... given an XML document that looks like this
<AffiliateKey>
<ServiceCode>
<RateName>
<StartDate>
<EndDate>
<Rate>20</Rate>
and then 1 to 5 triplets of :
<Attribute#>
<Operator#>
<Value#>
where # is a number from 1 to 5.
In my PLSQL cursor which I am about to show you, the result set is multiple rows of:
AffiliateKey,ServiceCode,R ateName,St artDate,En dDate,Rate , NESTED TABLE of [Attribute,Operator,Value]
Here is my cursor that performs that transformation:
I suspect that everything I've done in my cursor could have been done with a single XQUERY statement which I suspect would run a lot faster.
1) Could this be done with a single xquery or at least something much simplified.
2) What would you have done differently?
3) Any other thoughts or questions you might have?
Thanks, Mike
<batch>
<testdata>
<AffiliateKey>21047193</AffiliateKey>
<ServiceCode>DRYWALL_SHEETROCK</ServiceCode>
<RateName>DRYWALL_SHEETROCK-Install-1</RateName>
<StartDate>6/8/15</StartDate>
<EndDate>1/1/25</EndDate>
<Rate>20</Rate>
<Attribute1>DrywallProjectScope</Attribute1>>
<Operator1>=</Operator1>
<Value1>'Install'</Value1>
<Attribute2>RateLevel</Attribute2>
<Operator2>=</Operator2>
<Value2>1</Value2>
</testdata>
<testdata>
<AffiliateKey>21047205</AffiliateKey>
<ServiceCode>DRYWALL_SHEETROCK</ServiceCode>
<RateName>DRYWALL_SHEETROCK-Install-1</RateName>
<StartDate>6/8/15</StartDate>
<EndDate>1/1/25</EndDate>
<Rate>25</Rate>
<Attribute1>DrywallProjectScope</Attribute1>>
<Operator1>=</Operator1>
</testdata>
</batch>
That is to say, for a given <testdata> there will be one<AffiliateKey>
<ServiceCode>
<RateName>
<StartDate>
<EndDate>
<Rate>20</Rate>
and then 1 to 5 triplets of :
<Attribute#>
<Operator#>
<Value#>
where # is a number from 1 to 5.
In my PLSQL cursor which I am about to show you, the result set is multiple rows of:
AffiliateKey,ServiceCode,R
Here is my cursor that performs that transformation:
CURSOR c (
pPathI IN VARCHAR2,
pXMLDocI IN XMLTYPE) IS
WITH xform AS
( SELECT seq,
affiliatekey,
servicecode,
startdate,
enddate,
ratename,
rate,
XMLELEMENT (
"criteria",
XMLELEMENT (
"grp",
XMLFOREST (x.attribute1 AS attribute,
x.operator1 AS operator,
x.value1 AS VALUE)),
XMLELEMENT (
"grp",
XMLFOREST (x.attribute2 AS attribute,
x.operator2 AS operator,
x.value2 AS VALUE)),
XMLELEMENT (
"grp",
XMLFOREST (x.attribute3 AS attribute,
x.operator3 AS operator,
x.value3 AS VALUE)),
XMLELEMENT (
"grp",
XMLFOREST (x.attribute4 AS attribute,
x.operator4 AS operator,
x.value4 AS VALUE)),
XMLELEMENT (
"grp",
XMLFOREST (x.attribute5 AS attribute,
x.operator5 AS operator,
x.value5 AS VALUE)))
AS critlist
FROM XMLTABLE (pPathI
PASSING pXMLDocI
COLUMNS seq FOR ORDINALITY,
affiliatekey VARCHAR2 (20) PATH 'AffiliateKey',
servicecode VARCHAR2 (250) PATH 'ServiceCode',
startdate VARCHAR2 (20) PATH 'StartDate',
enddate VARCHAR2 (20) PATH 'EndDate',
ratename VARCHAR2 (200) PATH 'RateName',
rate VARCHAR2 (200) PATH 'Rate',
attribute1 VARCHAR2 (200) PATH 'Attribute1',
operator1 VARCHAR2 (200) PATH 'Operator1',
value1 VARCHAR2 (2000) PATH 'Value1',
attribute2 VARCHAR2 (200) PATH 'Attribute2',
operator2 VARCHAR2 (200) PATH 'Operator2',
value2 VARCHAR2 (2000) PATH 'Value2',
attribute3 VARCHAR2 (200) PATH 'Attribute3',
operator3 VARCHAR2 (200) PATH 'Operator3',
value3 VARCHAR2 (2000) PATH 'Value3',
attribute4 VARCHAR2 (200) PATH 'Attribute4',
operator4 VARCHAR2 (200) PATH 'Operator4',
value4 VARCHAR2 (2000) PATH 'Value4',
attribute5 VARCHAR2 (200) PATH 'Attribute5',
operator5 VARCHAR2 (200) PATH 'Operator5',
value5 VARCHAR2 (2000) PATH 'Value5') x)
SELECT z.seq,
z.affiliatekey,
z.servicecode,
z.startdate,
z.enddate,
z.ratename,
z.rate,
CAST (
COLLECT (ATTR_OPER_VALU_OBJECT (yy.attribute, yy.operator, yy.VALUE)) AS ATTR_OPER_VALU_OBJECT_TABLE)
crits
FROM xform z,
XMLTABLE (
'/criteria/grp'
PASSING z.critlist
COLUMNS attribute VARCHAR2 (200) PATH 'ATTRIBUTE',
operator VARCHAR2 (200) PATH 'OPERATOR',
VALUE VARCHAR2 (200) PATH 'VALUE') yy
GROUP BY z.seq,
z.affiliatekey,
z.servicecode,
z.startdate,
z.enddate,
z.ratename,
z.rate;
I suspect that everything I've done in my cursor could have been done with a single XQUERY statement which I suspect would run a lot faster.
1) Could this be done with a single xquery or at least something much simplified.
2) What would you have done differently?
3) Any other thoughts or questions you might have?
Thanks, Mike
If I guess at your types I get this:
Then I run your SQL (again guessed at the cursor parameters) and I get this as the collection for the first set:
I'm concerned I have something not quite right given the three null objects.
What am I missing?
CREATE TYPE ATTR_OPER_VALU_OBJECT AS OBJECT (
attribute varchar2(20),
operator varchar2(20),
value varchar2(20)
);
/
CREATE TYPE ATTR_OPER_VALU_OBJECT_TABLE is table of ATTR_OPER_VALU_OBJECT;
/
Then I run your SQL (again guessed at the cursor parameters) and I get this as the collection for the first set:
ATTR_OPER_VALU_OBJECT_TABLE(
ATTR_OPER_VALU_OBJECT('DrywallProjectScope', '=', '''Install'''),
ATTR_OPER_VALU_OBJECT(NULL, NULL, NULL),
ATTR_OPER_VALU_OBJECT(NULL, NULL, NULL),
ATTR_OPER_VALU_OBJECT(NULL, NULL, NULL),
ATTR_OPER_VALU_OBJECT('RateLevel', '=', '1')
)
I'm concerned I have something not quite right given the three null objects.
What am I missing?
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
Yes, you surmised the the TYPES correctly and the input parameters are simply the xpath, like 'batch/testdata' and the document itself. I need a while to examine your 'solution' , so I'll get back to you later today hopefully. Thanks very much for taking a look at this.
ASKER
I love the way you handled loading ATTR_OPER_VALU_OBJECT_TABL E! This gets rid of the COLLECT and TABLE and GROUPBY and a bunch of other crap. I'm going to run profiler on it and see how it performs. You may be interested to know that this is for a process whereby the user drops an EXCEL document onto a web page. The javascript on that web pages converts the EXCEL doc to JSON and then, using AJAX, sends that JSON to the client-side Java process which converts the JSON to XML and calls my PLSQL procedure. PLSQL does the edits and if all is good, loads the database and sends the results back to the user's screen. About 80% of the time from when the user drops the EXCEL to when he sees the results was take up by the cursor you addressed. Hopefully I can sneak your new cursor in before this goes to production. Again , huge thanks.
I hope it runs faster for you!
That seems like a lot of work to get an uploaded Excel file into Oracle...
Since it appears you are in control of the XML, I would really look at changing it. It really isn't what I would consider 'valid'. OK, based on syntax it is valid but it isn't really well-formed.
Repeating nodes should be encapsulated so they maintain their hierarchy.
I would make the XML look like this:
It would make parsing so much easier.
You might also be able to do away with the TYPEs.
I'm guessing the final result is to get the data into a table, then displayed back out to the end user?
You can probably get rid of the TYPES now but properly formed XML would likely make things easier. It definitely would if they add say 10 more repeating fields. None of the code would have to change.
I dabble in .Net and know very little JSON/AJAX but it still seems like a lot of translating/transforming.
FYI:
Oracle 12c has some JSON parsing built into it.
That seems like a lot of work to get an uploaded Excel file into Oracle...
Since it appears you are in control of the XML, I would really look at changing it. It really isn't what I would consider 'valid'. OK, based on syntax it is valid but it isn't really well-formed.
Repeating nodes should be encapsulated so they maintain their hierarchy.
I would make the XML look like this:
<batch>
<testdata>
<AffiliateKey>21047193</AffiliateKey>
<ServiceCode>DRYWALL_SHEETROCK</ServiceCode>
<RateName>DRYWALL_SHEETROCK-Install-1</RateName>
<StartDate>6/8/15</StartDate>
<EndDate>1/1/25</EndDate>
<Rate>20</Rate>
<operatorInfo>
<operator>
<Attribute>DrywallProjectScope</Attribute>
<Operator>=</Operator>
<Value>''Install''</Value>
</operator>
<operator>
<Attribute>RateLevel</Attribute>
<Operator>=</Operator>
<Value>1</Value>
</operator>
</operatorInfo>
</testdata>
<testdata>
<AffiliateKey>21047205</AffiliateKey>
<ServiceCode>DRYWALL_SHEETROCK</ServiceCode>
<RateName>DRYWALL_SHEETROCK-Install-1</RateName>
<StartDate>6/8/15</StartDate>
<EndDate>1/1/25</EndDate>
<Rate>25</Rate>
<operatorInfo>
<operator>
<Attribute>DrywallProjectScope</Attribute>
<Operator>=</Operator>
</operator>
</operatorInfo>
</testdata>
</batch>
It would make parsing so much easier.
You might also be able to do away with the TYPEs.
I'm guessing the final result is to get the data into a table, then displayed back out to the end user?
You can probably get rid of the TYPES now but properly formed XML would likely make things easier. It definitely would if they add say 10 more repeating fields. None of the code would have to change.
I dabble in .Net and know very little JSON/AJAX but it still seems like a lot of translating/transforming.
FYI:
Oracle 12c has some JSON parsing built into it.
ASKER
Hi, I am using js-xlsx (demo here http://oss.sheetjs.com/js-xlsx/ ) to convert the excel to JSON, so I don't really have the opportunity to generate the XML as you suggest. I DO totally agree with your point about the structural decencies in my XML and with the solution you proposed.
Just as an aside:
An XML document with correct syntax is called "Well Formed". Like, every OPEN tag has a corresponding CLOSE tag, stuff like that.
An XML document which is compliant to a DTD or Schema is "Valid" with regards to that specific DTD or schema.
An XML document that is Valid must, by definition, also be "Well Formed"
My XML document is "Well Formed". However, since there is no DTD (or schema) to validate it against, we have no way to know if it is "Valid".
There might be a different official term to describe decencies you pointed out, if not, there should be. Maybe we could call it, "not fully grouped".
Sadly, we are still on Oracle 11g. If we had 12c, I would have foregone the XML entirely. Thanks again! Mike
Just as an aside:
An XML document with correct syntax is called "Well Formed". Like, every OPEN tag has a corresponding CLOSE tag, stuff like that.
An XML document which is compliant to a DTD or Schema is "Valid" with regards to that specific DTD or schema.
An XML document that is Valid must, by definition, also be "Well Formed"
My XML document is "Well Formed". However, since there is no DTD (or schema) to validate it against, we have no way to know if it is "Valid".
There might be a different official term to describe decencies you pointed out, if not, there should be. Maybe we could call it, "not fully grouped".
Sadly, we are still on Oracle 11g. If we had 12c, I would have foregone the XML entirely. Thanks again! Mike
>>Just as an aside:
*sigh*
I knew I should have looked up those definitions before I just threw the terms out there...
I could get behind the adoption of "not fully grouped"!!!
*sigh*
I knew I should have looked up those definitions before I just threw the terms out there...
I could get behind the adoption of "not fully grouped"!!!
ASKER
Just following up, here is the profiler analysis of original cursor vs expert's cursor respectively. Clearly it's quite an improvement in execution time.
Run Comment (Run Object) Total Execution Time (secs) Avg. Execution Time (secs) Min Time Max Time Passes Run Date
Line# 2 0.0000 0.0000 0.0000 0.0000 1 CURSOR c (
Line# 5 0.2430 0.2430 0.2430 0.2430 1 WITH xform AS
Line# 91 0.0000 0.0000 0.0000 0.0000 1 PROCEDURE load_criteria_rates (pRateTransactionsI IN CLOB,
Line# 96 0.0000 0.0000 0.0000 0.0000 1 BEGIN
Line# 97 0.0001 0.0001 0.0001 0.0001 1 dbms_output.put_line('Length of input is '||to_char(length(pRateTransactionsI)));
Line# 98 0.0000 0.0000 0.0000 0.0000 1 vRateTransactionsXML := xmltype (pRateTransactionsI);
Line# 99 0.0000 0.0000 0.0000 0.0000 1 vPath := '/batch/noload';
Line# 101 40.9826 0.0409 0.0000 40.3294 1,002 FOR cur1 IN c (pPathI => vPath, pXMLDocI => vRateTransactionsXML) LOOP
Line# 103 0.1955 0.0000 0.0000 0.0003 4,000 FOR cur1a
Line# 107 0.0000 0.0000 0.0000 0.0000 2,000 NULL;
Line# 108 0.0000 0.0000 0.0000 0.0000 1,000 END LOOP;
Line# 109 0.0000 0.0000 0.0000 0.0000 1 END LOOP;
Line# 110 0.0000 0.0000 0.0000 0.0000 1 END load_criteria_rates;
Run Comment (Run Object) Total Execution Time (secs) Avg. Execution Time (secs) Min Time Max Time Passes Run Date
Line# 2 0.0000 0.0000 0.0000 0.0000 1 CURSOR c (
Line# 5 0.2392 0.2392 0.2392 0.2392 1 SELECT seq,
Line# 45 0.0000 0.0000 0.0000 0.0000 1 PROCEDURE load_criteria_rates (pRateTransactionsI IN CLOB,
Line# 50 0.0000 0.0000 0.0000 0.0000 1 BEGIN
Line# 51 0.0001 0.0001 0.0001 0.0001 1 dbms_output.put_line('Length of input is '||to_char(length(pRateTransactionsI)));
Line# 52 0.0000 0.0000 0.0000 0.0000 1 vRateTransactionsXML := xmltype (pRateTransactionsI);
Line# 53 0.0000 0.0000 0.0000 0.0000 1 vPath := '/batch/noload';
Line# 55 21.6043 0.0216 0.0000 0.1300 1,002 FOR cur1 IN c (pPathI => vPath, pXMLDocI => vRateTransactionsXML) LOOP
Line# 57 0.3606 0.0001 0.0000 0.0126 7,000 FOR cur1a
Line# 61 0.0000 0.0000 0.0000 0.0000 5,000 NULL;
Line# 62 0.0000 0.0000 0.0000 0.0000 1,000 END LOOP;
Line# 63 0.0000 0.0000 0.0000 0.0000 1 END LOOP;
Line# 64 0.0000 0.0000 0.0000 0.0000 1 END load_criteria_rates;
Thanks or the update!
Not sure how to read that output but just make sure the runs are 'accurate'.
If you run a query Oracle will likely have everything it needs cached in memory for a slightly different second run.
Not sure how to read that output but just make sure the runs are 'accurate'.
If you run a query Oracle will likely have everything it needs cached in memory for a slightly different second run.
ASKER
Yeah, the text formatting was kind of wonky. Maybe this will help. It's really strange how, on line #101 the max time of a single iteration of the loop was almost as long as the total execution time for that same line. I can only suspect that this is a bug in the Profiler. Bottom line is that your cursor runs almost twice as fast as mine.
Cool.
To provide a copy/paste solution we also need the type of ATTR_OPER_VALU_OBJECT_TABL
From a quick scan it looks like you might be able to simplify things but I'm not sure about a performance increase.