I need to import/convert the provided XML file into Oracle 10g. I am looking for the most efficient means to accomplish this. Note I had to change the extension to .txt for upload purposes. For this particular question, I'm not looking for a coded answer. I would like someone to provide the best practice of achieving this goal, explaining clearly why I should go their suggested route. Links would be helpful as well. I have some people on my end that I need to convince ;)
Here are the suggestions that have been presented to me so far...
1. transform the xml to csv and use oracle sqldr to import the csv into oracle
2. use java to directly parse the xml, connect to oracle and populate the database.
3. use perl to parse the xml, connect to oracle and populate the database
4. use talend to create to java code for your integration purposes.
Some things to consider:
1. When this goes in to production, this will be run as a daily job. The entire process of reading the xml and populating the database needs to be completely automated.
2. The sample XML file is much smaller than the actual XML file. The real XML file will be 5-10 MB
I am open to any suggestion whatsoever as long as it is the BEST and most efficient means. Best answer gets 500 pts. Also, after receiving the best answer I will start another question thread seeking more granular response on how to implement the best solution provided here. That question will be for 500 pts, as well.
Do you want to load the xml and save it? Is that ok? Thereby elminiating the parsing step. That's obviously fasterst. Use a bfile to read the xml file in its entirety and store as a clob. Very fast.
Do you want to parse out individual fields? Use an external table and query the contents and parse the data out into individual columns as you read it.
We can't answer what will be the fastest. your perl vs java vs oracle parsing, we can't tell you which might be fastest on your system. All three can be pretty quick.
Thanks for the quick response guys...good advice from both. I'm going to leave this question open for about 24 hours to see how the suggestions compare. I promise not to let this drag out too long.
To further elaborate:
When putting the data into the database, each element of the xml will correspond to a column in the db. So no, the raw XML cannot be just loaded as a clob into the database. It must be transformed.
As far as table structure goes, the database only needs two tables:
1) actions
2) payments
No heavy normalization is required, as once loaded into Oracle, someone after me will be manipulating it further. Not sure how they will be using it, but I am sure that loading it in just two tables is sufficient for my purposes.
Java is a platform-independent, object-oriented programming language and run-time environment, designed to have as few implementation dependencies as possible such that developers can write one set of code across all platforms using libraries. Most devices will not run Java natively, and require a run-time component to be installed in order to execute a Java program.
Do you want to load the xml and save it? Is that ok? Thereby elminiating the parsing step. That's obviously fasterst. Use a bfile to read the xml file in its entirety and store as a clob. Very fast.
Do you want to parse out individual fields? Use an external table and query the contents and parse the data out into individual columns as you read it.
We can't answer what will be the fastest. your perl vs java vs oracle parsing, we can't tell you which might be fastest on your system. All three can be pretty quick.