Link to home
Start Free TrialLog in
Avatar of a. cloudcare
a. cloudcareFlag for Singapore

asked on

I am looking for an expert in data warehousing in sql server. I am designing the partition strategy. Need help in that

I am an DBA required to design a partition strategy of a new data warehouse for my concern.  Since the development team is interested in partitioning by country code. I want to know if thats a good idea. Also i would like to know the factors to be considered for involving partition for a three country application along with the backup strategy for the same.

Expected data growth of the db in 7 years is 7 TB
Avatar of lcohan
lcohan
Flag of Canada image

In my opinion you need to determine first if partitioning is the best option to move forward or to split the active portion of data from inactive portion as essentially this would be the main reason for SQL Server data partitioning in my opinion.

Why do you think this is a problem?
What kind/type of data is it?
How does active data differ from inactive data?
How big are a those tables to be partitioned by number of rows and data size(row length in bytes)? - my guess is that 7TB is the whole db size.
Is your SQL server maintenance (reindex, backups, dbcc, etc...) too long/heavy on the SQL Server?
Do you have constantly large volumes of Imports or Deletes?
side question - Do you have already SQL Enterprise Edition as partitioning is available in Enterprise only?

Partitioning is primarily used to help with maintenance/archiving/managing large volumes of data where you can speed up loading and archiving of data by using switch in/out functions, where you can perform maintenance operations on individual partitions instead of the whole table. As side effect and not primary reason for partitioning  you may be able to improve query performance to some level.
I hope you will find good info and advise at links below to help you decide which way to go - maybe clustered key on the country code could be enough? Maybe row compression could be enough combined with spitting the database tables on multiple physical files/filegroups?


https://www.brentozar.com/archive/2012/03/how-decide-if-should-use-table-partitioning/
More useful info to help deciding which way to go may be found here: https://www.mssqltips.com/sqlservertip/1914/sql-server-database-partitioning-myths-and-truths/
Partitioning a 7TB database on a key with 3 values doesn't feel right.  If we assume that most of your transactional data is in the partitioned table, what's the intent with separating only the transactional data by country?

How is the data used?  Is the OLTP also used for reporting or does it feed an OLAP (data warehouse) that is the primary reporting database?

If the OLTP is also the reporting database it may make more sense to partition the database by year, not country.  If the partition key is the posting year, you'll have about 1T per partition, a lot less overhead in index maintenance, and reports that focus on current year data would be more efficient.

That may not be optimal, but it may be better.  Without knowing what your database, data, and usage requirements are all we can offer is speculation.  Database design in the dark is at best an educated guess.
This question needs an answer!
Become an EE member today
7 DAY FREE TRIAL
Members can start a 7-Day Free trial then enjoy unlimited access to the platform.
View membership options
or
Learn why we charge membership fees
We get it - no one likes a content blocker. Take one extra minute and find out why we block content.