OpenXML a Memory Hog?


I have read that OpenXML reserves 1/8 of the total memory of the system it is on to perform its processes. I would like to use XML to simplify many processes, such as storing structured data in a table row.

I find it very easy to use, but I am concerned with ending up with a dead server because all the memory has been drained. I was trying to use the Chilkat Soft XML Parser for SQL Server, but it does not seem to be working and even though I bought their bundle, I cant get support.

Its important that I do the processing of the XML in SQL Server.
Who is Participating?
mrjoltcolaConnect With a Mentor Commented:
>>Chilkat Soft XML Parser for SQL Server, but it does not seem to be working and even though I bought their bundle, I cant get support.

You don't say. I also license their stuff and "similar" experience. Some good APIs but not so good customer service.

As to your question regarding XML.

1) I don't know anything about OpenXML specifically, so you may stop reading here and skip my post
2) XML in general is a memory hog, any way you slice it. Parsing XML is one of the least efficient things you can do in any language or platform. If you stick with SAX/event based parsers, you can skip most of the memory overhead, but often DOM is needed and that surely hogs memory.
3) In my experience, XML does not simply much of anything, given the alternatives. If it is structured data in a database, store it as such. You are absolutely right in your worry about performance, parsing XML out of a column is a very inefficient use of your powerful SQL Server engine. I've seen real world cases of huge documents causing even Oracle to run out of JVM memory with its builtin XML support.

My rule of thumb:

If the XML is meant to be parsed by an external tool, storing it as a LOB/text field in the database is perfect. You are only serving the document up to the tool.
If the XML is actually structured data that I need to access and/or correlate to my database tables, I do not store it as XML, I create an actual schema model for it so it is stored in an efficient form for retrieval.

Hope that helps.
Racim BOUDJAKDJIDatabase Architect - Dba - Data ScientistCommented:
<<. I would like to use XML to simplify many processes, such as storing structured data in a table row.>>
The further away you stay from XML the better you will be in a database system.  

XML is a bad solution looking for a problem to solve.  Not only XML does not solve anything that would not be solved better without it but it also creates problems that would never exists if XML was simply ignored.

<<Its important that I do the processing of the XML in SQL Server.>>
As any hierarchic model inspired data representation, XML has no data manipulative capabilities and imposes prohibitive resource cost at runtime.  I'd suggest you simply stick to store its content on the database (as FILESTREAM for instance) and let an external parser do what you are looking.

My two cents...
dilithiumtoysAuthor Commented:
I sure hope there are more posts. I have heard from the nays. what about the yeas?

For what I would do with XML, it does not make sense to a) serve it up to an outside process b) parse it elsewhere c) bring it back into SQL server so that I can use it.

One app I am building is for generating on the fly reports. The report schema is in XML (which needs to be parsed by "something") and the results have to be feed into the SQL server and the result has to be returned in XML.

The report has to be somewhat self-contained so that I am not constantly updating code to generate a new report. The XML schema contains the data to construct a variable map table. The variable map is a table variable containing dynamic data assigned to a name pair.

I use the variables to filter the results of the base tables that I attempting to get data from. The entire result is returned as XML so that style sheets can be applied.

I really need this to run on the SQL Server because imagine if I have 50 entries in my xml that need to be variables. I would need to execute 50 database calls over the wire. However if I process the XML on the server where the data is, then number of calls falls to 1.

I guess my main thing is not "IS XML BAD?" because for my purposes it is not. I want to know if OpenXML to parse the data is going to cause serious issues. Or is there another XML parser that will fit the bill.

Build your data science skills into a career

Are you ready to take your data science career to the next step, or break into data science? With Springboard’s Data Science Career Track, you’ll master data science topics, have personalized career guidance, weekly calls with a data science expert, and a job guarantee.

Racim BOUDJAKDJIConnect With a Mentor Database Architect - Dba - Data ScientistCommented:
dilithiumtoysAuthor Commented:
The answers I received would be great if I had SQL 2005 and above. I have SQL 2000, and our budget does not allow for upgrade. The question was not answered very well. I wanted to know about the select * from OpenXML memory issues.
Hi dilithiumtoys,

I can understand if you feel the question was not answered well. I did note in my response that it was not specific to OpenXML. You are under no obligation to close the question so soon and/or to accept any answers.

One problem is this is July 4 weekend in the US, so many people are not here to see the question. We can either re-open the question or you can ask a related question while being very specific in the wording what you want to see. And you can note "Please, only answers from people with OpenXML experience in SQL 2000"

I hope that helps.

There are many articles and links on the net about OpenXML performance issues, which I'm sure you've Googled already, and this supports my findings with other XML implementations such as Oracle. It is intrinsic to XML parsing itself, which is why I discouraged using it _if_ possible. Sometimes XML is the solution. We found it worked well for dynamic forms and reports. But it has to be tested. You cannot really go on someome else's advice here before trying it yourself. My XML is different than yours. You will certainly find the CPU load to be higher in a production app than if regular table structures are used, since an XML doc has to be re-parsed every time. The parse overhead is one of the problems. With regular tables, there is no parse.

I wish you good luck,


Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.