continuously extending a table in SQL Server

We have a large data warehouse database where we continuously get new rows inserted in 5 different tables, at the left-hand side of the b-tree (=at end of the table)

This cause a lot of page splits. Paul Randall calls these "good" page splits as in they do not cost much when inserting the data.

But when we query sys.dm_db_index_physical_stats we see an avg_fragment_size_in_pages just under 8, namely the 8 8kB pages that can fit into one 64kB extend. A table with 50000 pages has 6300 fragments. Not very "good" page splits anymore!

The only way I know we can bring these extends together (in order to get better read-ahead), is to REBUILD the table (or table partition).

Are there other options?
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

hspoulsenAuthor Commented:
One solution might be to create a file group for each large table, and only to have the clustered key in file group, not any non-clustered indexes.

But the hard disk might get a bit fragmented here, when each file group has to grow.

I would need to build a Restore script and keep it at hand for disaster recovery purposes.
Rebuild indexes would have to go into another new file group, otherwise 50% of disk space would be sitting there unused.
Partition the table so that only the last, most current partition gets inserted. This way the split will affect only the last partition and any rebuild will be less resource consuming.

Also make sure that you have a cluster index that always grows in value with every insert.

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
hspoulsenAuthor Commented:
I say left, but I mean right-hand side of the index.
The table is clustered on an Bigint Identity(1,1), and it is an ascending index.

We have considered partitioning in batches of 2 million rows.
Just not implemented it yet.

But that will also require frequent REBUILDs.
Those are not as bad as the current situation, but it's still a lot of I/Os.

Is there any way I can design the system such that the table is not built with inherrent fragmentation?
10 Tips to Protect Your Business from Ransomware

Did you know that ransomware is the most widespread, destructive malware in the world today? It accounts for 39% of all security breaches, with ransomware gangsters projected to make $11.5B in profits from online extortion by 2019.

Fragmentation is caused by updates and not by inserts. If the cluster always grows then the rows will always be appended to the pages and files. One other reason is if the row size is large, fat table. You rebuild the index ONLY if is fragmented.

However fragmentation will most likely occur on an "active" table so I recommend you to use Olla Halegren's maintenents scripts that deal with the index optimization. It will apply the correct operation based on the index fragmentation:

<10% - ignore
between 10 and 30 reorganize
over 30 - rebuild

It deals perfectly with partition scenario as well and will optimize indexes only on the partitions where they are fragmented. This approach reduces drastically the actual rebuild/reorganize process.

It is coupled with statistics updates, which also will be done only if the columns/indexes were changed.For instance there is no need for statistics update after a rebuild index

SQL Maintenance plans don't come even close with this kind of flexibility and control level and it will apply the rebuild/reorganize regardless if needed or not.

Here is the link:

Read the documentation and create a job with the stored procedure that will execute daily over night. For the first time it will run longer time but after that it will take minutes.

We deal with a giant table with 200M rows and 200GB in size, partitioned by year(between 10-35M rows per partition) and the index optimization takes minutes every day.
Qlemo"Batchelor", Developer and EE Topic AdvisorCommented:
What's the purpose of a clustered index on an increasing value never changing? None. It's not clear whether you should use such a clustered index at all - some say yes, some no. If you aren't selecting most of the table anyway, a non-clustered index might be a better idea, as maintaining that is less costly than doing index splits in data tables (and hence having to change non-clustered indexes, if existing).
David ToddSenior DBACommented:

I second the vote for Ola's script.

hspoulsenAuthor Commented:
I posted a comment on Zberteoc and Qlemo postings yesterday, but now it has disappeared? I'm working hard at trying to conquer this problem, so I do follow this thread.

@Zberteoc: The table is only seeing inserts. The table has about 6300 fragments and about 50000 pages in total. Don't you call this fragmented? Yes, each fragment is 64 KB (an Extend), but SQL Server will not be able to do much read-ahead!

@Qlemo: The index is not bad, as it supports most of the daily/weekly work on the table.
But the fragmentation is bad, and statistics is difficult to keep updated. Partitioning does help (we've tried it on some of the tables), but it does not reduce the problem, not eliminate it.

We do use Ole Hallengrens procedures, and have been since April 2009.
Yes, it works well.

Best regards,
Qlemo"Batchelor", Developer and EE Topic AdvisorCommented:
The question is: does it affect performance if the table gets fragemented to some extend, assuming the index reorg/rebuild is done regularly?

You can keep the table unfragmented only if you do not use a clustered index (and only add rows). A corresponding non-clustered index will be fragmented, but that is "less harmful" - however, I can't recall if a non-clustered index will help for read-ahead.
hspoulsenAuthor Commented:
@Qlemo: Well, all the REBUILD jobs do take up lots of I/Os.
We will focus on getting the partitioning finished, to reduce the I/Os needed to do the REBUILDs.

Our biggest fact table is currently at 4 billion rows, and we add 80000 rows every hour on average. We run the insert statement as frequent as possible, preferable multiple time every hour.
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Microsoft SQL Server

From novice to tech pro — start learning today.