SQL 2008 Indexing issues on a large table with many searchable fields

I am working on a project where there is a very large table (28,827,861 rows w/ 32,160.711 MB data space)

90% of the 132 fields in this table are queriable (unfortunately, please do not ask why).  I spend last 3 days trying to get the most out of indexing.  I realize I can't capture verey scenario, but if I can get most I will be happy.  I am not sure how to go about asking for help here as I do not want to put the layout of my table.  And I do not want to rename 132 fields to display here.  But basically I been using non-clustered indexing.  Clustered did not help and takes forever to index.  If I create an optimal index per specific query it's great instant run.  However I can't do that for every query.  I tried creating a non-clustered index where I placed 16 (max allowed) fields to ON clause, and added about 100 fields to INCLUDE clause of the indexing clause.  This took about an hour to create index.  and helped some queries, but not all my test case queries.  In any case, I am wondering from our DB guru's is this an impossible task, or what is recommended for large table with many selectable fields?  If you can respond with plain easy to understand explanations I would appreciate it, or I would be spending a day deciphering your answers as I am not a DB expert, just sufficient knowledge to get by.  Thanks in advance for your help.
CPOINT2000Asked:
Who is Participating?

[Webinar] Streamline your web hosting managementRegister Today

x
 
Scott PletcherConnect With a Mentor Senior DBACommented:
You must have a clustered index; SQL handles heaps poorly, so they can perform poorly no matter what you do to them.  The best clustered index is the critical part of tuning queries anyway.  Otherwise, as you've discovered, you must have covering indexes for virtually every query, and that's almost never practical.


Please post the results of these queries, including headings, exactly as SQL returns them:


DECLARE @table_name sysname
SET @table_name = '<very_large_table_name_goes_here>' --<<-- chg to your table name

--missing indexes, according to SQL
    SELECT
        GETDATE() AS capture_date,
        DB_NAME(mid.database_id) AS Db_Name,
        OBJECT_NAME(mid.object_id /*, mid.database_id*/) AS Table_Name,
        mid.equality_columns, mid.inequality_columns, mid.included_columns,
        migs.*,
        mid.statement, mid.object_id, mid.index_handle
    FROM sys.dm_db_missing_index_details mid WITH (NOLOCK)
    LEFT OUTER JOIN sys.dm_db_missing_index_groups mig WITH (NOLOCK) ON
        mig.index_handle = mid.index_handle
    LEFT OUTER JOIN sys.dm_db_missing_index_group_stats migs WITH (NOLOCK) ON
        migs.group_handle = mig.index_group_handle
    --order by
        --DB_NAME, Table_Name, equality_columns
    WHERE
        OBJECT_NAME(mid.object_id) = @table_name


-- existing index usage
SELECT
    ius2.row_num, DB_NAME() AS db_name, OBJECT_NAME(i.object_id) AS table_name, i.name AS index_name,
    FILEGROUP_NAME(i.data_space_id) AS filegroup_name,
    (SELECT DATEDIFF(DAY, create_date, GETDATE()) FROM sys.databases WHERE name = 'tempdb') AS sql_up_days,
    dps.row_count,
    fk.Reference_Count AS fk_ref_count,
    ius.index_id, --ius.user_seeks + ius.user_scans AS total_reads,
    ius.user_seeks, ius.user_scans, ius.user_lookups, ius.user_updates,
    ius.last_user_seek, ius.last_user_scan, ius.last_user_lookup, ius.last_user_update,
    ius.system_seeks, ius.system_scans, ius.system_lookups, ius.system_updates,
    ius.last_system_seek, ius.last_system_scan, ius.last_system_lookup, ius.last_system_update
FROM sys.indexes i WITH (NOLOCK)
INNER JOIN sys.objects o WITH (NOLOCK) ON
    o.object_id = i.object_id
LEFT OUTER JOIN sys.dm_db_partition_stats dps WITH (NOLOCK) ON
    dps.object_id = i.object_id AND
    dps.index_id = i.index_id
LEFT OUTER JOIN sys.dm_db_index_usage_stats ius WITH (NOLOCK) ON
    ius.database_id = DB_ID() AND
    ius.object_id = i.object_id AND
    ius.index_id = i.index_id
LEFT OUTER JOIN (
    SELECT
        database_id, object_id, MAX(user_scans) AS user_scans,
        ROW_NUMBER() OVER (ORDER BY MAX(user_scans) DESC) AS row_num --user_scans|user_seeks+user_scans
    FROM sys.dm_db_index_usage_stats WITH (NOLOCK)
    WHERE
        database_id = DB_ID()
        --AND index_id > 0
    GROUP BY
        database_id, object_id
) AS ius2 ON
    ius2.database_id = DB_ID() AND
    ius2.object_id = i.object_id
LEFT OUTER JOIN (
    SELECT
        referenced_object_id, COUNT(*) AS Reference_Count
    FROM sys.foreign_keys
    WHERE
        is_disabled = 0
    GROUP BY
        referenced_object_id
) AS fk ON
    fk.referenced_object_id = i.object_id
WHERE
    OBJECT_NAME(i.object_id) = @table_name
0
 
lcohanDatabase AnalystCommented:
In my opinion the best is to have the "key" of the row in an index - no more than 2/3 columns and unfortunately in SQL (unlike ORACLE) case the order of them matters and INCLUDE all rest of the columns. Only the first 1 and 2 (maybe 3rd) columns makes the difference in an index and if you don't have a clustered index on the table this may work for you.

I am faced with similar issue on the SQL backend hit by LLBGEN generated code where clients can search on any column in a particular set of tables (products, orders) and is impossible as you said to cover all possible scenarios. What helped me tremendously and I warmly recommend it is for you to use SQL Own Performance Dashboard reports and that one has a Missing Index report sortable by different criteria and you can quickly see what indexes are missing that can give you the biggest performance improvement.  


http://blogs.technet.com/b/rob/archive/2009/02/18/performance-dashboard-reports-for-sql-server-2008.aspx
http://blogs.msdn.com/b/sqlserverfaq/archive/2010/05/27/sql-server-performance-dashboard-reports-in-ssms-introduction-install-datediff-error-amp-modified-for-sql-2008.aspx
http://blogs.msdn.com/b/arvindsh/archive/2010/06/25/performance-dashboard-reports-in-sql-server-2008.aspx
0
 
LowfatspreadCommented:
to assist we have to understand more of the background to the table.

how is it updated
   batch...
   online...

 frequency of inserts/updates/deletes

what do you mean by queriable?

are the columns code values ... and queries just look for individual values,or ranges of values, or lists of values, or searched using LIKE or via functions  (e.g... substring(x,1,10)='xxx'

in most cases it is preferable to have a clustering index defined on a table

what types of queries are being run...

return a few rows, return many rows,  
return counts, sums, averages , across sets of data...

how many simultaneous users are there...

what is the server configuration

is the table partitioned?
are the columns sparse ?
what in general are the data types for the columns?

have you considered indexed views?
0
 
CPOINT2000Author Commented:
updated in batch
weekly updates
majority of the fields can be selected and be in where clause
some fields are range checks, some use wildcard, some exact match, some with "IN" clause
adhoc queries run, can be any combination
returns few to many.  when return over 500 lines, extraction comes back as zipped csv file
currently few simultaneous users 2-5, but later 25-50
Windows server 2008 R2 standard, 12GB ram, 64bit OS, intel Xeon CPU X560 @ 2.67GHz (2 processors)
Table is not partitioned.  I tried but did not get good results and due to time sensitivity did not have excessive time to test it thoroughly
Data types are mixed string,numeric,date.  mostly string and numeric, some of numeric are large format currency
0
 
CPOINT2000Author Commented:
Thanks I will check these out and let you know
0
All Courses

From novice to tech pro — start learning today.