Solved

SQL Server: Table partitioning with big data

Posted on 2014-02-26
7
897 Views
Last Modified: 2016-03-23
Hello,

I've a big data table which contains 4500 million records.
So, I would like to have table partitioning.

Could you please share your thoughts, what shall be the optimum number of records per partition?

Or do you have other thoughts?

Best Regards,
Mohit Pandit
0
Comment
Question by:MohitPandit
  • 2
  • 2
  • 2
  • +1
7 Comments
 
LVL 13

Assisted Solution

by:sameer2010
sameer2010 earned 100 total points
ID: 39891187
0
 
LVL 4

Expert Comment

by:michaelalphi
ID: 39891188
When size of the tables become very large, partitioning of table is a significant way to improve their performance.
To accomplish this task, there are two different approaches.
Either create a brand new partitioned table and then simply copy the data from your existing table into the new table and do a table rename. Otherwise, you can partition the table in place simply by rebuilding or creating a clustered index on the table.
You can check these links to get further assistance :
http://technet.microsoft.com/en-us/library/ms188730.aspx
Partitioning large tables and indexes
How to partition an existing SQL server table
A good resource you can also explore here if you wish to go with automated solution : Table partitioning with Big data
0
 
LVL 5

Author Comment

by:MohitPandit
ID: 39891199
Thanks for valuable feedback.

But my question was what is optimum number of records per partition.

Should I've 450 partition and each partition shall be having 10 million records/index (if create index)?

Best Regards
0
What is SQL Server and how does it work?

The purpose of this paper is to provide you background on SQL Server. It’s your self-study guide for learning fundamentals. It includes both the history of SQL and its technical basics. Concepts and definitions will form the solid foundation of your future DBA expertise.

 
LVL 4

Assisted Solution

by:michaelalphi
michaelalphi earned 150 total points
ID: 39891246
0
 
LVL 13

Expert Comment

by:sameer2010
ID: 39891947
It actually depends on your use. PArtitions may help in certain scenarios and may not so much in other. Above blogs by Michael definitely are a good source for this information.
0
 
LVL 24

Accepted Solution

by:
DBAduck - Ben Miller earned 250 total points
ID: 39893468
After you read the articles or blogs above, you can start by asking yourself questions.

How selective is the data you have in the table.  Are there millions of rows that have no real partitioning key that would group them in blocks of rows?

Do you know which column would be a good partitioning key?

Will that column that you selected be used in the queries? (if it is not then you have successfully partitioned a table, but you may not get as much benefit out of it since it cannot partition eliminate without using the key in the query)

Will data be inserted or updated in ALL partitions (based on key) or will it be a Date type key that will render all past partitions read only or at least non updated or inserted or deleted?

These should lead you to answers for your data.  Because we don't know your data, in some cases 1 Million rows would be fine per partition, but remember that if this is a growing table and you partition by date, the more recent partitions will most likely be bigger than the previous partitions because your are preferably growing the company or data not shrinking.

In some cases you could not tolerate more than 100,000 rows in the tables because of the way they are queried.

Also keep in mind that on an x64 box (which I imagine you have with that many rows) you can have 15,000 partitions max.  So if you put 4.5 Billion rows in 15,000 partitions with no room for another partition, you get the minimum rows you can have is 300,000 rows.  So that may help you determine how many you should be looking at.
0
 
LVL 5

Author Closing Comment

by:MohitPandit
ID: 39921105
thanks
0

Featured Post

U.S. Department of Agriculture and Acronis Access

With the new era of mobile computing, smartphones and tablets, wireless communications and cloud services, the USDA sought to take advantage of a mobilized workforce and the blurring lines between personal and corporate computing resources.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Use this article to create a batch file to backup a Microsoft SQL Server database to a Windows folder.  The folder can be on the local hard drive or on a network share.  This batch file will query the SQL server to get the current date & time and wi…
Big data transfers via information superhighways require special attention and protection. Learn more about the IT-regulations of the country where your server is located. Analyze cloud providers and their encryption systems for safe data transit. S…
This video shows, step by step, how to configure Oracle Heterogeneous Services via the Generic Gateway Agent in order to make a connection from an Oracle session and access a remote SQL Server database table.
Using examples as well as descriptions, and references to Books Online, show the documentation available for date manipulation functions and by using a select few of these functions, show how date based data can be manipulated with these functions.

803 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question