MSSQL Architecture & Design: Improving Report Performance XML Columns

Posted on 2011-04-19
Last Modified: 2012-05-11
Techies, Gurus and Wizards--
One of my colleagues has been working with a 3rd party product which stores  xml data in an xml defined column. As this 3rd party product continues to accrue more and more data, the reports which run off this data have gone from 10 minutes to nearly 2 hours.  I've agreed to look at this with him this week. I'm not sure what I'll find, but I suspect whatever the search criteria is for the where clauses filtering these reports will be dependent upon isolating string values found against specific nodes stored within the xml docs in the xml column.

For those of you who've encountered similar issues, how did you approach isolating the problem? If I do discover that values within the xml doc are used to filter data for the reports what approach do I take?

In reading the MS literature on XML data types, I see that there are primary and secondary indexes available. I'm a little confused on the clustered XML primary. Does this mean that if a table already has something like a primary, clustered integer identifier that it is still possible to create an XML primary, clustered key in addition or not?

In any case, I will be grateful for your suggestions on how to apply best practices to resolving poor performance issues against XML data stored in SQL Server 2005!:-)
Question by:ditallop
    1 Comment
    LVL 8

    Accepted Solution

    To answer your question: Yes. XML indexes are not directly related to 'classic' indexes (however a classic clustered is a prerequisite). This is infact quite logical since XML fields are kind of 'subrecordsets' embedded in a table.

    Just add the right XML indexes and it will work fine.

    Featured Post

    Do You Know the 4 Main Threat Actor Types?

    Do you know the main threat actor types? Most attackers fall into one of four categories, each with their own favored tactics, techniques, and procedures.

    Join & Write a Comment

    Slowly Changing Dimension Transformation component in data task flow is very useful for us to manage and control how data changes in SSIS.
    Load balancing is the method of dividing the total amount of work performed by one computer between two or more computers. Its aim is to get more work done in the same amount of time, ensuring that all the users get served faster.
    This video shows, step by step, how to configure Oracle Heterogeneous Services via the Generic Gateway Agent in order to make a connection from an Oracle session and access a remote SQL Server database table.
    Viewers will learn how to use the UPDATE and DELETE statements to change or remove existing data from their tables. Make a table: Update a specific column given a specific row using the UPDATE statement: Remove a set of values using the DELETE s…

    755 members asked questions and received personalized solutions in the past 7 days.

    Join the community of 500,000 technology professionals and ask your questions.

    Join & Ask a Question

    Need Help in Real-Time?

    Connect with top rated Experts

    17 Experts available now in Live!

    Get 1:1 Help Now