Solved

SQL 2008 Indexing issues on a large table with many searchable fields

Posted on 2013-11-22
5
219 Views
Last Modified: 2013-12-06
I am working on a project where there is a very large table (28,827,861 rows w/ 32,160.711 MB data space)

90% of the 132 fields in this table are queriable (unfortunately, please do not ask why).  I spend last 3 days trying to get the most out of indexing.  I realize I can't capture verey scenario, but if I can get most I will be happy.  I am not sure how to go about asking for help here as I do not want to put the layout of my table.  And I do not want to rename 132 fields to display here.  But basically I been using non-clustered indexing.  Clustered did not help and takes forever to index.  If I create an optimal index per specific query it's great instant run.  However I can't do that for every query.  I tried creating a non-clustered index where I placed 16 (max allowed) fields to ON clause, and added about 100 fields to INCLUDE clause of the indexing clause.  This took about an hour to create index.  and helped some queries, but not all my test case queries.  In any case, I am wondering from our DB guru's is this an impossible task, or what is recommended for large table with many selectable fields?  If you can respond with plain easy to understand explanations I would appreciate it, or I would be spending a day deciphering your answers as I am not a DB expert, just sufficient knowledge to get by.  Thanks in advance for your help.
0
Comment
Question by:CPOINT2000
5 Comments
 
LVL 39

Expert Comment

by:lcohan
Comment Utility
In my opinion the best is to have the "key" of the row in an index - no more than 2/3 columns and unfortunately in SQL (unlike ORACLE) case the order of them matters and INCLUDE all rest of the columns. Only the first 1 and 2 (maybe 3rd) columns makes the difference in an index and if you don't have a clustered index on the table this may work for you.

I am faced with similar issue on the SQL backend hit by LLBGEN generated code where clients can search on any column in a particular set of tables (products, orders) and is impossible as you said to cover all possible scenarios. What helped me tremendously and I warmly recommend it is for you to use SQL Own Performance Dashboard reports and that one has a Missing Index report sortable by different criteria and you can quickly see what indexes are missing that can give you the biggest performance improvement.  


http://blogs.technet.com/b/rob/archive/2009/02/18/performance-dashboard-reports-for-sql-server-2008.aspx
http://blogs.msdn.com/b/sqlserverfaq/archive/2010/05/27/sql-server-performance-dashboard-reports-in-ssms-introduction-install-datediff-error-amp-modified-for-sql-2008.aspx
http://blogs.msdn.com/b/arvindsh/archive/2010/06/25/performance-dashboard-reports-in-sql-server-2008.aspx
0
 
LVL 50

Expert Comment

by:Lowfatspread
Comment Utility
to assist we have to understand more of the background to the table.

how is it updated
   batch...
   online...

 frequency of inserts/updates/deletes

what do you mean by queriable?

are the columns code values ... and queries just look for individual values,or ranges of values, or lists of values, or searched using LIKE or via functions  (e.g... substring(x,1,10)='xxx'

in most cases it is preferable to have a clustering index defined on a table

what types of queries are being run...

return a few rows, return many rows,  
return counts, sums, averages , across sets of data...

how many simultaneous users are there...

what is the server configuration

is the table partitioned?
are the columns sparse ?
what in general are the data types for the columns?

have you considered indexed views?
0
 

Author Comment

by:CPOINT2000
Comment Utility
updated in batch
weekly updates
majority of the fields can be selected and be in where clause
some fields are range checks, some use wildcard, some exact match, some with "IN" clause
adhoc queries run, can be any combination
returns few to many.  when return over 500 lines, extraction comes back as zipped csv file
currently few simultaneous users 2-5, but later 25-50
Windows server 2008 R2 standard, 12GB ram, 64bit OS, intel Xeon CPU X560 @ 2.67GHz (2 processors)
Table is not partitioned.  I tried but did not get good results and due to time sensitivity did not have excessive time to test it thoroughly
Data types are mixed string,numeric,date.  mostly string and numeric, some of numeric are large format currency
0
 
LVL 69

Accepted Solution

by:
ScottPletcher earned 500 total points
Comment Utility
You must have a clustered index; SQL handles heaps poorly, so they can perform poorly no matter what you do to them.  The best clustered index is the critical part of tuning queries anyway.  Otherwise, as you've discovered, you must have covering indexes for virtually every query, and that's almost never practical.


Please post the results of these queries, including headings, exactly as SQL returns them:


DECLARE @table_name sysname
SET @table_name = '<very_large_table_name_goes_here>' --<<-- chg to your table name

--missing indexes, according to SQL
    SELECT
        GETDATE() AS capture_date,
        DB_NAME(mid.database_id) AS Db_Name,
        OBJECT_NAME(mid.object_id /*, mid.database_id*/) AS Table_Name,
        mid.equality_columns, mid.inequality_columns, mid.included_columns,
        migs.*,
        mid.statement, mid.object_id, mid.index_handle
    FROM sys.dm_db_missing_index_details mid WITH (NOLOCK)
    LEFT OUTER JOIN sys.dm_db_missing_index_groups mig WITH (NOLOCK) ON
        mig.index_handle = mid.index_handle
    LEFT OUTER JOIN sys.dm_db_missing_index_group_stats migs WITH (NOLOCK) ON
        migs.group_handle = mig.index_group_handle
    --order by
        --DB_NAME, Table_Name, equality_columns
    WHERE
        OBJECT_NAME(mid.object_id) = @table_name


-- existing index usage
SELECT
    ius2.row_num, DB_NAME() AS db_name, OBJECT_NAME(i.object_id) AS table_name, i.name AS index_name,
    FILEGROUP_NAME(i.data_space_id) AS filegroup_name,
    (SELECT DATEDIFF(DAY, create_date, GETDATE()) FROM sys.databases WHERE name = 'tempdb') AS sql_up_days,
    dps.row_count,
    fk.Reference_Count AS fk_ref_count,
    ius.index_id, --ius.user_seeks + ius.user_scans AS total_reads,
    ius.user_seeks, ius.user_scans, ius.user_lookups, ius.user_updates,
    ius.last_user_seek, ius.last_user_scan, ius.last_user_lookup, ius.last_user_update,
    ius.system_seeks, ius.system_scans, ius.system_lookups, ius.system_updates,
    ius.last_system_seek, ius.last_system_scan, ius.last_system_lookup, ius.last_system_update
FROM sys.indexes i WITH (NOLOCK)
INNER JOIN sys.objects o WITH (NOLOCK) ON
    o.object_id = i.object_id
LEFT OUTER JOIN sys.dm_db_partition_stats dps WITH (NOLOCK) ON
    dps.object_id = i.object_id AND
    dps.index_id = i.index_id
LEFT OUTER JOIN sys.dm_db_index_usage_stats ius WITH (NOLOCK) ON
    ius.database_id = DB_ID() AND
    ius.object_id = i.object_id AND
    ius.index_id = i.index_id
LEFT OUTER JOIN (
    SELECT
        database_id, object_id, MAX(user_scans) AS user_scans,
        ROW_NUMBER() OVER (ORDER BY MAX(user_scans) DESC) AS row_num --user_scans|user_seeks+user_scans
    FROM sys.dm_db_index_usage_stats WITH (NOLOCK)
    WHERE
        database_id = DB_ID()
        --AND index_id > 0
    GROUP BY
        database_id, object_id
) AS ius2 ON
    ius2.database_id = DB_ID() AND
    ius2.object_id = i.object_id
LEFT OUTER JOIN (
    SELECT
        referenced_object_id, COUNT(*) AS Reference_Count
    FROM sys.foreign_keys
    WHERE
        is_disabled = 0
    GROUP BY
        referenced_object_id
) AS fk ON
    fk.referenced_object_id = i.object_id
WHERE
    OBJECT_NAME(i.object_id) = @table_name
0
 

Author Comment

by:CPOINT2000
Comment Utility
Thanks I will check these out and let you know
0

Featured Post

What Should I Do With This Threat Intelligence?

Are you wondering if you actually need threat intelligence? The answer is yes. We explain the basics for creating useful threat intelligence.

Join & Write a Comment

If you have heard of RFC822 date formats, they can be quite a challenge in SQL Server. RFC822 is an Internet standard format for email message headers, including all dates within those headers. The RFC822 protocols are available in detail at:   ht…
SQL Server engine let you use a Windows account or a SQL Server account to connect to a SQL Server instance. This can be configured immediatly during the SQL Server installation or after in the Server Authentication section in the Server properties …
Sending a Secure fax is easy with eFax Corporate (http://www.enterprise.efax.com). First, Just open a new email message.  In the To field, type your recipient's fax number @efaxsend.com. You can even send a secure international fax — just include t…
Excel styles will make formatting consistent and let you apply and change formatting faster. In this tutorial, you'll learn how to use Excel's built-in styles, how to modify styles, and how to create your own. You'll also learn how to use your custo…

744 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

12 Experts available now in Live!

Get 1:1 Help Now