asked on

Improving performance of EXTRACTVALUE XML DB

I have posted many times here in the last few weeks. The main problem I am have is this select statement.
If I have a count of 5000 as above then I have to crate a loop from 1 to 500 where I pass in i . The bellow statement takes more then a second.

select EXTRACTVALUE(SYS_NC_ROWINFO$,'/PDATA/EVH[i]') "EVH"
from XMLTABLE
WHERE EXISTSNODE (SYS_NC_ROWINFO$, '/PDATA/EVH') = 1;

I will have to process 640 million recs in 8 hours possible? Files will arrive in no greater then 1.5 MB.

if the above takes 2 seconds.
What I retrieve is <EVH>1~2~3~4~5~6~7~8~9~10||1~2~3~4~5~6~7~8~9~10||n times </EVH>

This has to be parse using || delim which represents 1 rec.
This has to be parsed 1~2~3~4~5~6~7~8~9~10 using ~ then inserted or update a table.

So far the EXTRACTVALUE is the bottle neck I really need to improve this. I just posted another question on how to register xsd to see if that will speed thins up.

I am also directly loading xml file into a clob data type.

I basically have two approaches load to xmltype table or direct from xml.

Because the xml file is fairly simlpe 3 tags 1 of which has embeded || and ~ does it make more sesne to build my own parse using instr to sepreate out hdr , evh and evd tags.

I am storing the parsed values in a nested table, is this the most efficient, maybe varray?
Could the temp values be stored in memory to avoid disk reads.
Can I use bulk loading techniques?

Also I am working on a database that was setup by someone else.
There there are only 4 tablespaces
system
sysaux
undo
temp
the system and sysaux are 99% full. This will slow processing down for sure.
The table and index are also use system table space, I can create new table space for table data and for index.
We do not have tablespaces on different disks. That could be an improvement placing tablespaces on diff disks.

Does the XDB schema default to system tablespace? Should that have its own tablespace?

All the code is on www.ihos-t.com/code

Lots of questions, in crisis mode at this time. I don't think it can be done but I want to get to a stage where at least its not dog slow.
Thanks in advance.

dplinnane

ASKER

hERE ARE SOME state from v$sqlarea this is limiting the loop to 50 i= 1 to 50

DISK_READS      BUFFER_GETS      PARSE_CALLS      USER_IO_WAIT_TIME      ROWS_PROCESSED

11968        31173      3      4236831      5
144704        376860      3      67324519      66
144704        376860      3      68225328      66

for HDR, EVH and EVD
select EXTRACTVALUE(SYS_NC_ROWINFO$,'/PDATA/EVH[i]') "EVH"
from XMLTABLE
WHERE EXISTSNODE (SYS_NC_ROWINFO$, '/PDATA/EVH') = 1;

sathyagiri

Instead of using extractvalue , did you try using XSL style sheet to transform your data ?

dplinnane

ASKER

I'm new to dealing xml so I don't know all the options how would I go about that?

sathyagiri

One other suggestion ,try creating the following indexes and see if you see any improvements

CREATE INDEX XML_fn_index ON XMLTABLE e
(EXTRACTVALUE(VALUE(e),'/PDATA/EVH'));

CREATE INDEX xml_index1 ON xmltable(SYS_NC_ROWINFO$) indextype is ctxsys.CTXXPATH;

dplinnane

ASKER

FOr the first indes I get
ORA-19025: EXTRACTVALUE returns value of only one node

The second index I get
ORA-19102: XQuery string literal expected

sathyagiri

Check this link for the create index
http://www.utexas.edu/its/unix/reference/oracledocs/v92/B10501_01/text.920/a96518/csql.htm

ASKER CERTIFIED SOLUTION

MohanKNair

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

dplinnane

ASKER

I'm not a C programmer or Pro C* I'm looking for improvements in what I alreay have.

sathyagiri

Did you try creating the index? You can create CTXSYS.CONTEXT index and see if there's any change in your query performance

dplinnane

ASKER

Had to write my own parser in the end , xml functionalit was way to slow. I will always be dealing with 4 tags. The performance gains were huge.