Strategy to solve a large table problem
Posted on 2011-10-05
I have a table which is a many to many child of two tables in my data model. It is very large, 21 million rows and occupies about 14GB. 7GB is index, 7GB is data roughly.
The table could be split into 14 tables representing 14 types of data stored in it. They all have the same data model but splitting this way would allow me to drop one int field and a lot of teh indexing and hence query plans would be similar. If I did this split it would cost be about one week of work, and the biggest resulting table would have 8 million records. The indexing would be much smaller because each of the 14 resulting tables only needs a subset of the indexing currently used.
I've tried running it through the database tuning wizard. I ended up with 12GB of indexes, about 40 separate Create Stats statements, and some queries were still slow.
So then I resorted to my own query optimisation work. Being an old hand at query optimisation I can many any query run in a few seconds with the right index, but when I've done that across all queries, I end up with 8GB of indexes and Sql gets confused about which indexes to use and will often choose the wrong one. It even seems necessary to repeat the same indexes with a different column order - hard to understand why sql can't adapt its query plans to use an existing index.
The queries are generated dynamically by the product so I don't want to get into only allowing certain "ways in"
One thing I've not tried yet is clustered views so I would be interested to know if any one has had success with this.
Any general thoughts on other strategies, approaches, experience in dealing with problems with large tables like this would be appreciated