Pros and Cons of Composite Clustered Indexes

Posted on 2011-04-27
Last Modified: 2012-06-22
We have a data warehouse table with the following primary key and associated clustered index:

ALTER TABLE [dbo].[PosSaleTransactionDerivedCategory] ADD  CONSTRAINT [PK_PosSaleTransactionDerivedCategory] PRIMARY KEY CLUSTERED
(
      [PK_PosSaleTransactionDerivedCategory] ASC,
      [OutletID] ASC,
      [CompanyID] ASC,
      [DateID] ASC,
      [DistrictID] ASC,
      [Shop1SupplierID] ASC,
      [Shop2SupplierID] ASC,
      [DesignID] ASC,
      [TypeID] ASC
)

1) Given that PK_PosSaleTransactionDerivedCategory is an IDENTITY field that auto-increments by 1, is there any point in having all of these fields as part of the primary key?

2) Given that PK_PosSaleTransactionDerivedCategory is an IDENTITY field that auto-increments by 1, is there any point in having all of these fields as part of the clustered index?

3) Does the sequence of the fields in the clustered index matter from a performance point of view? (Note that there is a significant difference in the cardinality of each of the fields above, with DateID having a very high cardinality and TypeID having fewer than 10 distinct values.)

Question by: batbertram

    Author Comment

    Quick remarks:

    The table has 20 million rows.

    DateID currently has the highest cardinality (500 distinct values), but OutletID is likely to end up with a higher cardinality.

    Queries often filter on a DateID range; they never filter on a range over any of the other fields.

    Accepted Solution

    There is a good article on MSDN about clustered indexes:

    The important parts based on your question are:
    Clustered indexes are not a good choice for:
        * Columns that undergo frequent changes - This results in the entire row moving (because SQL Server must keep the data values of a row in physical order). This is an important consideration in high-volume transaction processing systems where data tends to be volatile.
        * Wide keys - The key values from the clustered index are used by all nonclustered indexes as lookup keys and therefore are stored in each nonclustered index leaf entry.
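    One way to see what the wide key costs is to compare the space used by each index on the table. The query below is a rough sketch, not part of the quoted article: it assumes the table from the question lives in the current database and that you have VIEW DATABASE STATE permission. It sums used pages per index from sys.dm_db_partition_stats (8 KB per page).

    ```sql
    -- Space used by each index on the table, largest first.
    -- The CLUSTERED row is the table data itself; NONCLUSTERED rows are extra copies
    -- that each carry the full clustering key in their leaf entries.
    SELECT i.name                                AS index_name,
           i.type_desc                           AS index_type,
           SUM(ps.used_page_count) * 8 / 1024    AS used_mb   -- pages are 8 KB
    FROM sys.indexes AS i
    JOIN sys.dm_db_partition_stats AS ps
      ON ps.object_id = i.object_id
     AND ps.index_id  = i.index_id
    WHERE i.object_id = OBJECT_ID(N'dbo.PosSaleTransactionDerivedCategory')
    GROUP BY i.name, i.type_desc
    ORDER BY used_mb DESC;
    ```

    If the nonclustered rows together come close to or exceed the clustered row, the clustering key width is a likely contributor.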

    I personally would look seriously at building the clustered index on the DateID field alone. Such a wide clustered index is going to make inserts very expensive, and because every nonclustered index carries the full clustering key, you may have more space used by indexes than by the data itself.
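    A minimal sketch of that change follows. It assumes no foreign keys reference the existing primary key and that the rebuild can run in a maintenance window, since dropping and recreating a clustered index on a 20 million row table rewrites the table and any nonclustered indexes. The index name CIX_PosSaleTransactionDerivedCategory_DateID is made up for illustration.

    ```sql
    -- 1. Drop the existing nine-column clustered primary key (the table becomes a heap).
    ALTER TABLE [dbo].[PosSaleTransactionDerivedCategory]
        DROP CONSTRAINT [PK_PosSaleTransactionDerivedCategory];

    -- 2. Cluster on DateID alone so date-range queries read contiguous pages.
    --    (Hypothetical index name.)
    CREATE CLUSTERED INDEX [CIX_PosSaleTransactionDerivedCategory_DateID]
        ON [dbo].[PosSaleTransactionDerivedCategory] ([DateID] ASC);

    -- 3. Keep row uniqueness on the IDENTITY column with a narrow nonclustered primary key.
    ALTER TABLE [dbo].[PosSaleTransactionDerivedCategory]
        ADD CONSTRAINT [PK_PosSaleTransactionDerivedCategory]
        PRIMARY KEY NONCLUSTERED ([PK_PosSaleTransactionDerivedCategory] ASC);
    ```

    Because DateID is not unique, SQL Server adds a hidden 4-byte uniquifier to duplicate key values. Clustering on (DateID, PK_PosSaleTransactionDerivedCategory) as a unique index is an alternative worth testing against your own workload.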

