kkamm asked:

SQL CHECKSUM as a way to index a large text-oriented table with frequent INSERTS and occasional SELECTS?

I have a SQL Server 2008 database table that accepts auditing information. This information is primarily text-based but has a numeric field that acts as a foreign key for SELECT operations. Most of the activity on this table is INSERT-based (no UPDATES though) with an occasional [SELECT WHERE] query to generate reports.

Currently there is a numeric IDENTITY PK autonumbering column with a clustered index on it. The table is constantly growing, nearing 1 million rows, and my concern is that INSERT performance will degrade over time due to index maintenance.

My question is twofold:

1) Is there any benefit to even having an index for this table given the nature of the activity it receives?

2) If an index IS used then would a CHECKSUM column based on the other fields be a candidate for a non-clustered index without adversely affecting performance?

ASKER CERTIFIED SOLUTION
Mark Wills
(This solution is only available to members of Experts Exchange.)

SOLUTION
(This solution is only available to members of Experts Exchange.)

kkamm (ASKER)

The occasional WHERE clause filters on a single INT field and usually returns a small number of rows (50-100). The text fields are returned in the result but are not used in the WHERE clause.

It looks like an index on a CHECKSUM field would be beneficial if I had lengthy text data that could be reproduced exactly on the query side (e.g., DNA sequences, boilerplate legal language) and that was subject to frequent SELECT WHERE or UPDATE operations. The text fields I have are all free-form and would only be queried for keywords, if at all.
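
For readers who have not seen the CHECKSUM-index pattern discussed here, the following is a minimal sketch of how it is typically set up for exact-match lookups on long text. The table `dbo.AuditLog` and its columns `AuditId`, `EntityId`, and `Details` are hypothetical names chosen for illustration, not the asker's actual schema.

```sql
-- Hypothetical table and column names; adjust to the real schema.
-- Add a computed column that holds the checksum of a long text column.
ALTER TABLE dbo.AuditLog
    ADD DetailsChecksum AS CHECKSUM(Details);
GO

-- Index the 4-byte checksum instead of the wide text column.
CREATE NONCLUSTERED INDEX IX_AuditLog_DetailsChecksum
    ON dbo.AuditLog (DetailsChecksum);
GO

-- Exact-match lookup: seek on the checksum, then compare the full text,
-- because different strings can produce the same checksum value.
DECLARE @needle NVARCHAR(4000) = N'exact text to find';

SELECT AuditId, EntityId, Details
FROM dbo.AuditLog
WHERE DetailsChecksum = CHECKSUM(@needle)
  AND Details = @needle;
```

The pattern only pays off when the query can supply the complete text to match. It cannot help with keyword or partial-match searches, since a checksum of a fragment says nothing about the checksum of the whole value.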
In that case (keyword searches), CHECKSUM would not help you. Full-text search is probably the better option.
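
As a rough sketch of what the full-text route could look like, assuming the Full-Text Search feature is installed and using hypothetical names (`dbo.AuditLog`, a text column `Details`, a primary key index named `PK_AuditLog`, and a catalog named `AuditCatalog`):

```sql
-- All object names below are hypothetical placeholders.
-- A full-text catalog is the container for full-text indexes.
CREATE FULLTEXT CATALOG AuditCatalog;
GO

-- The KEY INDEX must be a unique, single-column, non-nullable index;
-- the primary key on the identity column normally qualifies.
CREATE FULLTEXT INDEX ON dbo.AuditLog (Details)
    KEY INDEX PK_AuditLog
    ON AuditCatalog;
GO

-- Keyword search against the full-text index.
SELECT AuditId, EntityId, Details
FROM dbo.AuditLog
WHERE CONTAINS(Details, N'timeout');
```

The full-text index is populated in the background, so newly inserted rows may not be searchable immediately. It also adds its own maintenance work on an insert-heavy table, so it is worth measuring before adopting it.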
Well, given the new bits of information, I still stand by my original post.

I would keep the identity as the clustered key. Given that you are selecting about 100 rows out of 1,000,000, I would be tempted to add a second index on that INT column; otherwise you will likely be doing table scans every time.

There is no need (nor use) for CHECKSUM as far as I can ascertain, and certainly no performance gain over the current setup.
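
A minimal sketch of that second index, using hypothetical names (`dbo.AuditLog` for the table, `EntityId` for the INT column used in the WHERE clause):

```sql
-- Hypothetical table and column names; adjust to the real schema.
CREATE NONCLUSTERED INDEX IX_AuditLog_EntityId
    ON dbo.AuditLog (EntityId);
GO

-- The report query can now seek on EntityId instead of scanning every row.
DECLARE @EntityId INT = 42;

SELECT AuditId, EntityId, Details
FROM dbo.AuditLog
WHERE EntityId = @EntityId;
```

Because the text columns are not part of this index, each matching row is fetched through the clustered key. For result sets of around 100 rows that lookup cost is usually small compared with a full scan.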

Just make sure your database is well sized and you keep control of any excessive fragmentation.
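
One way to keep an eye on fragmentation is the `sys.dm_db_index_physical_stats` function. The sketch below assumes a hypothetical table `dbo.AuditLog` and a hypothetical nonclustered index `IX_AuditLog_EntityId`.

```sql
-- Hypothetical object names; run in the database that owns the table.
DECLARE @db  INT = DB_ID();
DECLARE @obj INT = OBJECT_ID(N'dbo.AuditLog');

-- Report fragmentation for every index on the table.
SELECT i.name AS index_name,
       ps.index_type_desc,
       ps.avg_fragmentation_in_percent,
       ps.page_count
FROM sys.dm_db_index_physical_stats(@db, @obj, NULL, NULL, 'LIMITED') AS ps
JOIN sys.indexes AS i
  ON i.object_id = ps.object_id
 AND i.index_id  = ps.index_id;
GO

-- Lighter-weight defragmentation that can be interrupted safely.
ALTER INDEX IX_AuditLog_EntityId ON dbo.AuditLog REORGANIZE;

-- Heavier option: rebuilds the index from scratch.
-- ALTER INDEX IX_AuditLog_EntityId ON dbo.AuditLog REBUILD;
```

With an ever-increasing identity as the clustered key, new rows are appended at the end of the clustered index, so it tends to fragment slowly. A nonclustered index on a column whose values arrive in no particular order is the more likely place to see fragmentation grow.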
kkamm (ASKER)

Yeah - the CHECKSUM seems like the wrong tool here.

I enabled an NCI on the INT field I mentioned, and the SELECT WHERE was predictably faster, whereas a single-row INSERT was slightly slower. Given that the majority of activity is INSERT statements, I may just keep the PK index for now and enable the NCI if I see more SELECT activity in the future.

Thanks for the help.
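
For anyone following the same approach of keeping the NCI switched off until it is needed, here is a sketch of how an existing nonclustered index can be disabled and later re-enabled. The names `dbo.AuditLog` and `IX_AuditLog_EntityId` are hypothetical.

```sql
-- Hypothetical object names; adjust to the real schema.
-- Disable the nonclustered index: its definition is kept, but it is no
-- longer maintained on INSERT and cannot be used by queries.
ALTER INDEX IX_AuditLog_EntityId ON dbo.AuditLog DISABLE;
GO

-- Re-enable it later; a rebuild is required and reads the whole table.
ALTER INDEX IX_AuditLog_EntityId ON dbo.AuditLog REBUILD;
```

Only do this to the nonclustered index. Disabling the clustered index makes the table's data inaccessible until that index is rebuilt.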