Link to home
Start Free TrialLog in
Avatar of r3r
r3rFlag for United States of America

asked on

Parse XML and load into Oracle 10g

Hi,

I need to import/convert the provided XML file into Oracle 10g.  I am looking for the most efficient means to accomplish this.  Note I had to change the extension to .txt for upload purposes.    For this particular question,  I'm not looking for a coded answer.  I would like someone to provide the best practice of achieving this goal, explaining clearly why I should go their suggested route.  Links would be helpful as well.  I have some people on my end that I need to convince ;)

Here are the suggestions that have been presented to me so far...

1.  transform the xml to csv and use oracle sqldr to import the csv into oracle
2.  use java to directly parse the xml, connect to oracle and populate the database.
3.  use perl to parse the xml, connect to oracle and populate the database
4.  use talend to create to java code for your integration purposes.

Some things to consider:

1.  When this goes in to production, this will be run as a daily job.  The entire process of reading the xml and populating the database  needs to be completely automated.
2.  The sample XML file is much smaller than the actual XML file.  The real XML file will be 5-10 MB

I am open to any suggestion whatsoever as long as it is the BEST and most efficient means.  Best answer gets 500 pts.  Also, after receiving the best answer I will start another question thread seeking more granular response on how to implement the best solution provided here.  That question will be for 500 pts, as well.

Thanks in advance for your assistance...

Ryan

unf-XML.xml.txt
unf-xml.dtd.txt
Avatar of Sean Stuber
Sean Stuber

what kind of load are you trying to do?

Do you want to load the xml and save it?  Is that ok?  Thereby elminiating the parsing step.  That's obviously fasterst.  Use a bfile to read the xml file in its entirety and store as a clob.  Very fast.

Do you want to parse out individual fields?  Use an external table and query the contents and parse the data out into individual columns as you read it.

We can't answer what will be the fastest.  your perl vs java vs oracle parsing,  we can't tell you which might be fastest on your system.  All three can be pretty quick.
ASKER CERTIFIED SOLUTION
Avatar of DrSQL - Scott Anderson
DrSQL - Scott Anderson
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of r3r

ASKER

Thanks for the quick response guys...good advice from both.  I'm going to leave this question open for about 24 hours to see how the suggestions compare. I promise not to let this drag out too long.

To further elaborate:

When putting the data into the database, each element of the xml will correspond to a column in the db.  So no, the raw XML cannot be just loaded as a clob into the database.  It must be transformed.

As far as table structure goes, the database only needs two tables:

1) actions
2) payments

No heavy normalization is required, as once loaded into Oracle, someone after me will be manipulating it further.  Not sure how they will be using it, but I am sure that loading it in just two tables is sufficient for my purposes.

Avatar of r3r

ASKER

Thanks DrSQL...This was very helpful