Change identity column from int to bigint with least downtime

websss

Asked:
I have a large table with an ID identity column (int) as the primary key.

It's currently at 1.8 billion rows and growing rapidly. When it hits about 2.1 billion rows it will reach the max int value (2,147,483,647), so I need to change the column to bigint with minimal downtime.

I've read a LOT of articles online about doing this.
I have 3 indexes on this table that reference the ID, so I know these will need to be dropped and recreated.

My current theory is this:
1. Create a new column, NewId (bigint).
2. Copy the primary key int values into this new column so it matches.
3. Drop all indexes and remove the primary key.
4. Drop the old ID column.
5. Rename NewId to ID.
6. Make the new ID column the primary key and an identity column (even though the rows already contain IDs?).
7. Re-create the indexes.

Is this solution plausible? What are the issues with the above? A rough sketch of what I have in mind follows.

SQL Server 2017 Enterprise.
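A minimal sketch of the proposed steps, assuming an illustrative table dbo.BigTable and constraint name PK_BigTable (the real names differ):

-- Illustrative names only.
ALTER TABLE dbo.BigTable ADD NewId bigint NULL;

-- Backfill (in practice this would be batched, not one statement).
UPDATE dbo.BigTable SET NewId = Id;

-- Drop dependent indexes (not shown) and the PK, then swap columns.
ALTER TABLE dbo.BigTable DROP CONSTRAINT PK_BigTable;
ALTER TABLE dbo.BigTable DROP COLUMN Id;
EXEC sp_rename 'dbo.BigTable.NewId', 'Id', 'COLUMN';
ALTER TABLE dbo.BigTable ADD CONSTRAINT PK_BigTable PRIMARY KEY (Id);

-- Open issue with step 6: SQL Server cannot turn an existing column
-- into an IDENTITY column in place.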
Commented:
That's not a very good solution. When I did it with that many rows, we did the following:

1. Created a new table with a bigint column instead of int.
2. Enabled Change Tracking on the table.
3. Copied all the data into the new table with IDENTITY_INSERT ON.
4. BEGIN TRAN
5. Checked Change Tracking for any rows inserted, deleted, or updated in the meantime, and applied those changes to the new table.
6. Renamed the current table to _old and renamed the side table to the original name.
7. COMMIT TRAN

The problem with changing or adding fixed-width columns, even in Enterprise edition, is that the change is applied to the table row by row, so it takes a very long time with that many rows; an alternative approach is needed.

The drawback to the above is that you keep two copies of the rows until you drop the _old table once everything checks out.

There's no really easy way to change the data type of a fixed-width column without blocking, except with a strategy like the one above.
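A rough sketch of that sequence, assuming illustrative names dbo.BigTable and dbo.BigTable_new (the full column list and change-version bookkeeping are omitted):

-- Enable Change Tracking at database and table level.
ALTER DATABASE CURRENT SET CHANGE_TRACKING = ON (CHANGE_RETENTION = 2 DAYS, AUTO_CLEANUP = ON);
ALTER TABLE dbo.BigTable ENABLE CHANGE_TRACKING;

-- Bulk copy, preserving existing identity values.
SET IDENTITY_INSERT dbo.BigTable_new ON;
INSERT INTO dbo.BigTable_new (Id, DeviceId, GpsDateTime /* , ... */)
SELECT Id, DeviceId, GpsDateTime /* , ... */ FROM dbo.BigTable;
SET IDENTITY_INSERT dbo.BigTable_new OFF;

BEGIN TRAN;
-- Apply rows changed since the copy, read via CHANGETABLE(CHANGES ...),
-- e.g. SELECT ct.Id, ct.SYS_CHANGE_OPERATION
--      FROM CHANGETABLE(CHANGES dbo.BigTable, @last_sync_version) AS ct;
EXEC sp_rename 'dbo.BigTable', 'BigTable_old';
EXEC sp_rename 'dbo.BigTable_new', 'BigTable';
COMMIT TRAN;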
Scott Pletcher, Senior DBA
Most Valuable Expert 2018
Top Expert 2014
Commented:
I'll assume the PK is also your clustering key. You didn't state it, but that's the usual setup.

If so:

1) Create a new table, let's call it dbo.your_table_NEW, with the only index being the PK, as bigint. Specify an appropriate compression setting, if available (at least ROW; PAGE if SQL shows it will be beneficial, which it usually is, often very much so), and a good fill factor, such as 98 or 99. Don't create any nonclustered indexes yet.

2) If the db is in SIMPLE mode, then:
A) Load the new table in batches. How big a batch should be depends on your system's I/O capacity. Let's say you decide on 10M rows at a time. Before you insert data into the new table, review your current log size, and pre-allocate additional log space in stages up to the size you'll need to handle that batch size.
B) Set up a loop to load the existing rows into the new table 10M at a time, in clustered key order. Specify WITH (TABLOCK) on the table being loaded in the INSERT statement.
C) Once those rows are loaded, you'll use a transaction and do one more load of all rows with new ids created since you started loading the table. Once that's done, you can rename the old table (say, dbo.your_table_OLD) and rename the _NEW table to the current table name, dbo.your_table.

3) If the db is in FULL/BULK_LOGGED mode, you'll also have to take log backups between each batch load.

If you'd like to take this approach, let me know and we can flesh it out some more for you.
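A minimal sketch of that batch loop, assuming the names above and an identity Id column in the new table (batch sizing and log management simplified):

-- Copy 10M rows per iteration in clustered key order.
DECLARE @start bigint = 1, @batch bigint = 10000000, @max bigint;
SELECT @max = MAX(Id) FROM dbo.your_table;

SET IDENTITY_INSERT dbo.your_table_NEW ON;
WHILE @start <= @max
BEGIN
    INSERT INTO dbo.your_table_NEW WITH (TABLOCK)
           (Id, DeviceId, GpsDateTime /* , other columns */)
    SELECT Id, DeviceId, GpsDateTime /* , other columns */
    FROM dbo.your_table
    WHERE Id >= @start AND Id < @start + @batch
    ORDER BY Id, DeviceId, GpsDateTime;  -- clustered key order

    SET @start += @batch;
    -- In FULL/BULK_LOGGED recovery, take a log backup here (step 3).
END;
SET IDENTITY_INSERT dbo.your_table_NEW OFF;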
websss, CEO

Author

Commented:
Thanks Scott

Actually, the clustered index is one you helped me with a long, long time ago.

Currently I have 3 indexes
Id, DeviceId, GpsDateTime - Clustered Index, Unique
Id (int, identity) - Primary Key, Unique, Non-Clustered
Id, DeviceId, EventId - non-Unique, Non-Clustered

The DB is currently in SIMPLE recovery mode

I'm not sure if that affects your solution?

Scott Pletcher, Senior DBA
Most Valuable Expert 2018
Top Expert 2014

Commented:
It would, to the extent that you create only the clustered index on the new table before its initial load, i.e. the (Id, DeviceId, GpsDateTime) one. You'd create the other, nonclustered, indexes only after the new table, the one with the bigint Id column, is fully loaded.

I agree that a new table is the best way to go.  Trying to update that size of table in place would be a nightmare.
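A sketch of that initial DDL, with the non-key column types assumed (only the clustered index exists before the load):

-- Illustrative: clustered index only; nonclustered indexes come later.
CREATE TABLE dbo.your_table_NEW (
    Id          bigint IDENTITY(1, 1) NOT NULL,
    DeviceId    int       NOT NULL,  -- assumed type
    GpsDateTime datetime2 NOT NULL,  -- assumed type
    EventId     int       NOT NULL   -- assumed type
    /* , remaining columns ... */
);

CREATE UNIQUE CLUSTERED INDEX CIX_your_table
    ON dbo.your_table_NEW (Id, DeviceId, GpsDateTime)
    WITH (DATA_COMPRESSION = PAGE, FILLFACTOR = 99);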
websss, CEO

Author

Commented:
Thanks, I'll run some tests on this


So your original comment at the end of point 1 said: "Don't create any non-clustered indexes yet."

I assume I create the clustered index at this point, before I insert any data?
Scott Pletcher, Senior DBA
Most Valuable Expert 2018
Top Expert 2014

Commented:
Create the clustered index before loading any data to the table.  Then load all the data.  Finally create all the nonclustered indexes.

You're on Enterprise Edition, so you can create the nonclustered indexes using WITH (ONLINE = ON), and you should, so that the table is not locked during that time.
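For example (constraint and index names are illustrative, using the key columns listed earlier):

-- Nonclustered PK on Id (now bigint), created online.
ALTER TABLE dbo.your_table_NEW
    ADD CONSTRAINT PK_your_table PRIMARY KEY NONCLUSTERED (Id)
    WITH (ONLINE = ON);

-- The non-unique nonclustered index from the original table.
CREATE NONCLUSTERED INDEX IX_your_table_Id_DeviceId_EventId
    ON dbo.your_table_NEW (Id, DeviceId, EventId)
    WITH (ONLINE = ON, DATA_COMPRESSION = PAGE);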
websss, CEO

Author

Commented:
Regarding this comment from Scott
1) Create a new table, let's call that one dbo.your_table_NEW, with the only index = PK as bigint.  
Specify: an appropriate compression factor, if available (at least ROW, PAGE if SQL shows it will be beneficial, it usually is, often very beneficial); a good fill factor, such as 98 or 99.  

I've generated a CREATE script from the old table, but it doesn't mention anything about compression, only this:
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 99) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]

I've not used compression before, so I'm wondering what it involves and how to do it?
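For reference, compression is an index option, so a scripted table won't show it unless it's already enabled. A sketch of how it could be estimated and applied, with object names assumed from this thread:

-- Estimate the benefit first (built-in procedure).
EXEC sys.sp_estimate_data_compression_savings
    @schema_name = 'dbo', @object_name = 'your_table',
    @index_id = NULL, @partition_number = NULL,
    @data_compression = 'PAGE';

-- Apply it to an existing index by rebuilding it:
ALTER INDEX CIX_your_table ON dbo.your_table_NEW
    REBUILD WITH (DATA_COMPRESSION = PAGE);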
websss, CEO

Author

Commented:
Also, another question regarding reading 10M rows at a time. It's taking 6 minutes to read these rows when I try this:
SELECT * FROM [table] WHERE [id] BETWEEN 980000000 AND 990000000

Since the table only receives inserts (no updates), can I use some sort of NOLOCK hint on the SELECT so the system is still usable while the data is being copied over?
Scott Pletcher, Senior DBA
Most Valuable Expert 2018
Top Expert 2014

Commented:
Are you doing this in SSMS and returning the results? Data transfer and display in SSMS take a long time. As a test, go to Tools > Options > Query Results > Results to Grid, check "Discard results after execution", then open a new query window and re-try the query.
websss, CEO

Author

Commented:
Yes, I'm running the query directly in SSMS; I'll test the above.
