I have a time series table with the following columns:
id BIGINT, -- PRIMARY KEY
imei_number BIGINT NOT NULL,
gps_datetime datetime NOT NULL,
raw_data varchar NULL
There are currently 4 billion rows in this one table, growing by 1 billion a month. I need to do a couple of things:
1) I would like to partition/split the data onto different drives, perhaps keeping "now to 3 months" on the current drive and moving data older than 3 months to a different drive (a sketch of what I have in mind follows this list).
2) I would like to compress the older data (older than 3 months, for example).
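For (1), this is a minimal sketch of the layout I'm picturing; the database name, filegroup names, drive letters, and boundary dates are all placeholders:

ALTER DATABASE MyTelemetryDb ADD FILEGROUP FG_HOT;
ALTER DATABASE MyTelemetryDb ADD FILE
    (NAME = 'hot_data', FILENAME = 'D:\Data\hot_data.ndf')
    TO FILEGROUP FG_HOT;
ALTER DATABASE MyTelemetryDb ADD FILEGROUP FG_COLD;
ALTER DATABASE MyTelemetryDb ADD FILE
    (NAME = 'cold_data', FILENAME = 'E:\Data\cold_data.ndf')
    TO FILEGROUP FG_COLD;

-- Monthly boundaries; a sliding-window job would SPLIT/MERGE ranges over time.
CREATE PARTITION FUNCTION pf_gps_monthly (datetime)
    AS RANGE RIGHT FOR VALUES ('2024-01-01', '2024-02-01', '2024-03-01');

-- RANGE RIGHT with 3 boundaries gives 4 partitions: everything before the
-- first boundary goes to the cold drive, the 3 newest months to the hot drive.
CREATE PARTITION SCHEME ps_gps_monthly
    AS PARTITION pf_gps_monthly TO (FG_COLD, FG_HOT, FG_HOT, FG_HOT);

I realise moving a partition from FG_HOT to FG_COLD as it ages isn't automatic; presumably that means a scheduled rebuild/switch job.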
On the compression point, all the queries I run are virtually the same. For example, a typical query filters like this:
WHERE imei_number = 1234
  AND gps_datetime BETWEEN 'date1' AND 'date2'
  AND event_id IN (1, 5, 14) -- this clause is sometimes not used
My clustered index is on the following columns (in order):
I also have a non-clustered index on
and a non-clustered index (unique primary key) on id.
I don't have much experience with compression, but I've been trying out Postgres with TimescaleDB, which has a columnstore compression option where you define a SEGMENT BY key. I used imei_number as the SEGMENT BY key (so it stores everything in the b-tree/on disc by imei_number first), and the compression is outstanding (180x) with query performance the same as the normal row store.
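For reference, my TimescaleDB setup is roughly the following (the table name readings is a placeholder; syntax per TimescaleDB 2.x):

-- Enable native compression on the hypertable, segmented by device.
ALTER TABLE readings SET (
    timescaledb.compress,
    timescaledb.compress_segmentby = 'imei_number',
    timescaledb.compress_orderby   = 'gps_datetime DESC'
);

-- Compress chunks once they are older than 3 months.
SELECT add_compression_policy('readings', INTERVAL '3 months');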
On my SQL Server I've run some stats on page/row compression, and the savings are not groundbreaking (about 20-30%, based on a stored procedure that estimates them).
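For context, the kind of estimate I ran looks like this (using sp_estimate_data_compression_savings; the schema/table names are placeholders). On 2017 it only estimates ROW and PAGE; columnstore estimation was only added to this proc in SQL Server 2019:

EXEC sys.sp_estimate_data_compression_savings
    @schema_name      = 'dbo',
    @object_name      = 'telemetry',  -- placeholder table name
    @index_id         = NULL,         -- all indexes
    @partition_number = NULL,         -- all partitions
    @data_compression = 'PAGE';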
Therefore I'd like to try columnstore compression, but I'm not sure whether it has something similar to SEGMENT BY, where everything is ordered by imei_number in the b-tree/on disc.
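From what I've read, there is no direct SEGMENT BY equivalent, but one technique I'm considering is pre-sorting the data before building the columnstore, so that rowgroup segments line up with imei_number ranges and segment elimination can kick in. A sketch, assuming the existing clustered index is named cx_telemetry and reusing the placeholder partition scheme from above:

-- Step 1: rebuild the clustered (rowstore) index so rows are physically
-- ordered by device, then time.
CREATE CLUSTERED INDEX cx_telemetry
    ON dbo.telemetry (imei_number, gps_datetime)
    WITH (DROP_EXISTING = ON)
    ON ps_gps_monthly (gps_datetime);

-- Step 2: convert it to a clustered columnstore index. DROP_EXISTING keeps
-- the rows roughly in the sorted order; MAXDOP = 1 avoids parallel threads
-- shuffling that order. COLUMNSTORE_ARCHIVE trades CPU for extra compression,
-- which seems acceptable for rarely-queried cold data.
CREATE CLUSTERED COLUMNSTORE INDEX cx_telemetry
    ON dbo.telemetry
    WITH (DROP_EXISTING = ON, MAXDOP = 1,
          DATA_COMPRESSION = COLUMNSTORE_ARCHIVE)
    ON ps_gps_monthly (gps_datetime);

(I gather the MAXDOP = 1 trick preserves sort order in practice but isn't guaranteed.)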
Can anyone suggest some direction and options for me to try, please? I need a much higher compression rate without killing performance.
Luckily, data older than 3 months is rarely queried. It's on SQL Server 2017 Enterprise.