Avatar of websss
websssFlag for Kenya

asked on 

Convert existing table to partitioned table

I have a large busy database where 99.9% of data lives in 1 table
IOT devices report their time series data into this table
This table is on 7 billion rows and around 3.5TB
The Database is running out of space on its current drive.
I have another drive available that I would like to start saving new data for this one table

The challenge here is doing all this with no, or very little down time (max 30 mins)
I need a strategy to save this new data.

Here is idea 1:
1. create new table called IotSensorData_New which is same as the main table, but has partitioning turned on and lives on new drive (the ID identity seed is a concern here)
2. rename current large table to "IotSensorData_Old", then rename empty "IotSensorData_New" to "IotSensorData"
3. turn on data importers so data starts ingesting into this new table (again ID is concern here as it will start from 1)
4. convert the old the "IotSensorData_Old" data into the current table - i'm unsure exactly how to do this bit


Here is idea 2:
1. turn off data importers
2. convert the current table to partitioning (I understand indexes need to be rebuilt if I do this which would probably take forever.
3. turn on importers


I'm open to ideas and suggestion for the steps to take to achieve this

sql server 2017 ent
Microsoft SQL Server

Avatar of undefined
Last Comment
lcohan
Avatar of lcohan
lcohan
Flag of Canada image

I would partition the existing database table/storage plus additional new filegroup(s) sitting on new physical file on new storage and the essential part would be to find the partitioning column (yes it is only 1(one) column that you can use) as this partitioning column will decide in which "bucket" your data will be placed. This is critical because you don't want any data movement while you add the new partition for your downtime to be minimal right? Meaning the table existing data will ideally sit where it is now and any new inserted (or update) will go into the same table, same table name but on the new partition which sits on the new filegroup which sits on the new physical file/storage.

https://www.sqlservercentral.com/articles/creating-partition-on-existing-tables-and-rolling-partitions
Avatar of arnold
arnold
Flag of United States of America image

First, I would suggest you address the duration of how long you need to keep old data
A staggered design that stages/ages.data
You could use the initial entry table as immidiate, limited to a short duration while a trigger on insert replicates the same sata in a longer duration storage.

Your partition choice is based on time when entered, or device, date

If time is of the essence, I would age out oldest data before getting them out of that table, delete.
This will involve zero downtime.

It would however, unlike the partitioning option,would require that you modify your reporting, data extraction code to access the old data in a new table.

While partitioning will span multiple file based onthe criteria, not sure it can be located indifferent locations as part of the partitioning mechanism.
Avatar of lcohan
lcohan
Flag of Canada image

Simply put what Arnold's is saying is similar to my suggestion to your previous question https://www.experts-exchange.com/questions/29175635/advice-and-help-moving-to-partitioning.html where I suggested you to split the active portion of data from inactive by using an "archive" like table(s) but then indeed you would have to either query 2 or more tables now or add a view on the top of them..
Avatar of websss
websss
Flag of Kenya image

ASKER

Thanks, but this is a massive amount of work to do, i'm really looking for a solution without touching webapp.

One thing i've read online is that i can tell sql server about the new drive, and it will start writing data to this new place, which sounds a lot easier and quicker?
Avatar of arnold
arnold
Flag of United States of America image

Please post where and what you read to see if that applies to your situation.
ASKER CERTIFIED SOLUTION
Avatar of lcohan
lcohan
Flag of Canada image

Blurred text
THIS SOLUTION IS ONLY AVAILABLE TO MEMBERS.
View this solution by signing up for a free trial.
Members can start a 7-Day free trial and enjoy unlimited access to the platform.
See Pricing Options
Start Free Trial
Add a file to the existing filegroup containing that table.  
Limit the size of the existing data files to their current size, which will force all data to the new file.

what is the clustering key of the existing table?  If it's ( insert_datetime, $IDENTITY ) [which it should be] or ( $IDENTITY ) [which it should not, but will work here] you can create the new table and copy the existing rows up to the last current in clustered key order while still using the existing table.  It may take a long time, but that's not really an issue, it will have the absolute max up time.
Avatar of websss
websss
Flag of Kenya image

ASKER

thanks
The primary key is PkId (identity seed)
The clustered index is unique (SensorId, DateTime, PkId)

If I copy the existing rows to the new location, i'm in the same position with a full drive as most data exists on one table
Avatar of arnold
arnold
Flag of United States of America image

Given your situation and the amount of data flowing in you are the same person who posted on a load distribution between two servers.

A revisit of your receiver and data storage is in order.

Much depends on what the data is and what the data conveys, how the data is used, how much of the data is needed, in what girl etc.

Whether data can be crunched, analyzed reports generated, data dumped after a period is met.

The moving of the data using ..

Working ranked live receiving data, trigger inserts the same data into a DB/table then you have a warehouse DB where this data cleared extraneous, unneeded columns, retaining only info of importance.

Potentially if you have a period lit beyond which the data need to be cleared, does this process exist, aging old entries by deleting them.?

Adding anothe file as a fine group could defer but could result in compilation.

If you have a test environment, test add another file as a file group, restrict the current MDB. Pump data into see how it will be handled

Then see if the front end works as expected.....

How are you currently backing up your data?
,
Avatar of websss
websss
Flag of Kenya image

ASKER

thanks
We actually plan to move to a time series database (timescaleDB) which includes all the compression etc.
We have done extensive testing on this platform and its the right choice
However to get to this point, its going to take time, so we are just looking for a short term fix for a running out of space issue.

Currently the .bak and .diff files are going onto a cloud drive
Avatar of arnold
arnold
Flag of United States of America image

2.5TB?

IMHO, copying put the oldest data that is outside the retention window and then removing it from the production would be my approach to cover a just in case the duration was too small.

A perl script that establishes two connections to each DB, run a select on one, the iterate through onteoter.

You can setup a n SP that quests data from one table and inserting it into the other db's table that are stored..

The issue, complexity if you have dependencies. The device list is fixed so it can be synced daily, weekly.
SOLUTION
Avatar of lcohan
lcohan
Flag of Canada image

Blurred text
THIS SOLUTION IS ONLY AVAILABLE TO MEMBERS.
View this solution by signing up for a free trial.
Members can start a 7-Day free trial and enjoy unlimited access to the platform.
Microsoft SQL Server
Microsoft SQL Server

Microsoft SQL Server is a suite of relational database management system (RDBMS) products providing multi-user database access functionality.SQL Server is available in multiple versions, typically identified by release year, and versions are subdivided into editions to distinguish between product functionality. Component services include integration (SSIS), reporting (SSRS), analysis (SSAS), data quality, master data, T-SQL and performance tuning.

171K
Questions
--
Followers
--
Top Experts
Get a personalized solution from industry experts
Ask the experts
Read over 600 more reviews

TRUSTED BY

IBM logoIntel logoMicrosoft logoUbisoft logoSAP logo
Qualcomm logoCitrix Systems logoWorkday logoErnst & Young logo
High performer badgeUsers love us badge
LinkedIn logoFacebook logoX logoInstagram logoTikTok logoYouTube logo