I have a table with 500 million rows. It is not practical to split the table, because a number of queries need to extract data that relies on the complete data set being present.
There are two defining parameters; call one Identity and the other Time. For each Identity (there are possibly 20,000 unique Identities) there are multiple uniquely time-stamped rows (i.e. each Identity has a sub-array with unique time-stamps).
When I add data to the table I need to check that the Identity-Time combination does not already exist; if it does, I do not insert the row. (If a row is submitted for an Identity-Time pair that is already present, it is discarded, because the logic of the system says a new row for a pair can never differ from the one already posted.)
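The check can be expressed along these lines. This is only an illustrative sketch, not the actual production statement: the table name dbo.Readings, the columns IdentityId, ReadingTime and Value, their data types, and the sample values are all placeholders, and it assumes the table already exists.

```sql
-- Hypothetical names and types throughout: dbo.Readings, IdentityId, ReadingTime, Value.
DECLARE @IdentityId  int          = 1234567;
DECLARE @ReadingTime datetime2(0) = '2015-06-01T12:00:00';
DECLARE @Value       float        = 42.0;

-- Insert the row only if no row exists yet for this Identity-Time pair.
-- If the pair is already present, the SELECT returns no rows and nothing is inserted.
INSERT INTO dbo.Readings (IdentityId, ReadingTime, Value)
SELECT @IdentityId, @ReadingTime, @Value
WHERE NOT EXISTS (
    SELECT 1
    FROM dbo.Readings
    WHERE IdentityId  = @IdentityId
      AND ReadingTime = @ReadingTime
);
```

The existence check is an equality lookup on both columns, which is why an index covering the Identity-Time pair matters so much for insert speed.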
I currently have a Clustered index on Time (Descending), and Identity.
The problem is that this means the table is physically sorted with Identity (000001) and all its Time data, then Identity (000002) and all its Time data, and so forth. So when I check whether a pair exists, the result is quick. But every insert of a new Time for Identity (1234567) requires a re-order because of the clustered index.
Yes, in practice SQL Server handles the paging better than that, but the principle stands: I have huge IO as a result of inserting new data. And we are adding data at a rate of 100,000 rows a day!
I understand there is such a thing as a horizontally partitioned index. I have never used one, but I am hoping it is possible to partition on Identity and only the last day of Time data, so that inserts of new data happen near the start (or end) of the table and index, and not throughout the large table wherever the physical order of that index happens to place them.
I have tried using a normal (non-clustered) index, but found that the clustered index vastly improves the performance of other queries. So I hope a partitioned index will give me the best of both worlds. The problem, though, is that I don't understand the construction of a partitioned index.
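From the documentation, my understanding so far is that it takes three pieces: a partition function, a partition scheme, and an index created on that scheme. The following is an untested sketch of that reading, not something I have running. The names (pfReadingTime, psReadingTime, dbo.Readings, IdentityId, ReadingTime), the data type, the boundary dates, the key column order, and the choice to put every partition on the PRIMARY filegroup are all assumptions made up for illustration.

```sql
-- 1. Partition function: maps a ReadingTime value to a partition number.
--    RANGE RIGHT means each boundary value belongs to the partition on its right.
--    Boundary dates are placeholders.
CREATE PARTITION FUNCTION pfReadingTime (datetime2(0))
AS RANGE RIGHT FOR VALUES ('2015-01-01', '2015-02-01', '2015-03-01');

-- 2. Partition scheme: maps each partition to a filegroup.
--    Here every partition goes to PRIMARY (an assumption, for simplicity).
CREATE PARTITION SCHEME psReadingTime
AS PARTITION pfReadingTime ALL TO ([PRIMARY]);

-- 3. Clustered index built on the scheme, partitioned by ReadingTime.
--    Rows are ordered by (IdentityId, ReadingTime DESC) within each partition.
CREATE CLUSTERED INDEX CIX_Readings_Identity_Time
ON dbo.Readings (IdentityId, ReadingTime DESC)
ON psReadingTime (ReadingTime);
```

If this reading is right, new rows would land only in the partition covering the most recent time range, which is the effect I am after. What I cannot tell is whether this is the correct construction for my case, or how the boundaries should be chosen and maintained as new data arrives.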
I have also tried using the database (and query) tuning tools included in SQL Server 2014. I first removed every existing index, and then created a few insert statements representative of normal operation. The problem is that the tuning wizards are unable to recommend even a single index. I don't know if this is a bug in SQL Server 2014 or a result of the huge database, but after a number of attempts I still have no recommended indexes.
If I run the queries with "Show Execution Plan" ON, then there are recommended indexes to include, but none are partitioned; they are all standard non-clustered indexes specific to the variables being checked and updated.