Solved

SQL Server: Table partitioning with big data

Posted on 2014-02-26
7
911 Views
Last Modified: 2016-03-23
Hello,

I've a big data table which contains 4500 million records.
So, I would like to have table partitioning.

Could you please share your thoughts, what shall be the optimum number of records per partition?

Or do you have other thoughts?

Best Regards,
Mohit Pandit
0
Comment
Question by:MohitPandit
  • 2
  • 2
  • 2
  • +1
7 Comments
 
LVL 13

Assisted Solution

by:sameer2010
sameer2010 earned 100 total points
ID: 39891187
0
 
LVL 4

Expert Comment

by:michaelalphi
ID: 39891188
When size of the tables become very large, partitioning of table is a significant way to improve their performance.
To accomplish this task, there are two different approaches.
Either create a brand new partitioned table and then simply copy the data from your existing table into the new table and do a table rename. Otherwise, you can partition the table in place simply by rebuilding or creating a clustered index on the table.
You can check these links to get further assistance :
http://technet.microsoft.com/en-us/library/ms188730.aspx
Partitioning large tables and indexes
How to partition an existing SQL server table
A good resource you can also explore here if you wish to go with automated solution : Table partitioning with Big data
0
 
LVL 5

Author Comment

by:MohitPandit
ID: 39891199
Thanks for valuable feedback.

But my question was what is optimum number of records per partition.

Should I've 450 partition and each partition shall be having 10 million records/index (if create index)?

Best Regards
0
Visualize your virtual and backup environments

Create well-organized and polished visualizations of your virtual and backup environments when planning VMware vSphere, Microsoft Hyper-V or Veeam deployments. It helps you to gain better visibility and valuable business insights.

 
LVL 4

Assisted Solution

by:michaelalphi
michaelalphi earned 150 total points
ID: 39891246
0
 
LVL 13

Expert Comment

by:sameer2010
ID: 39891947
It actually depends on your use. PArtitions may help in certain scenarios and may not so much in other. Above blogs by Michael definitely are a good source for this information.
0
 
LVL 24

Accepted Solution

by:
DBAduck - Ben Miller earned 250 total points
ID: 39893468
After you read the articles or blogs above, you can start by asking yourself questions.

How selective is the data you have in the table.  Are there millions of rows that have no real partitioning key that would group them in blocks of rows?

Do you know which column would be a good partitioning key?

Will that column that you selected be used in the queries? (if it is not then you have successfully partitioned a table, but you may not get as much benefit out of it since it cannot partition eliminate without using the key in the query)

Will data be inserted or updated in ALL partitions (based on key) or will it be a Date type key that will render all past partitions read only or at least non updated or inserted or deleted?

These should lead you to answers for your data.  Because we don't know your data, in some cases 1 Million rows would be fine per partition, but remember that if this is a growing table and you partition by date, the more recent partitions will most likely be bigger than the previous partitions because your are preferably growing the company or data not shrinking.

In some cases you could not tolerate more than 100,000 rows in the tables because of the way they are queried.

Also keep in mind that on an x64 box (which I imagine you have with that many rows) you can have 15,000 partitions max.  So if you put 4.5 Billion rows in 15,000 partitions with no room for another partition, you get the minimum rows you can have is 300,000 rows.  So that may help you determine how many you should be looking at.
0
 
LVL 5

Author Closing Comment

by:MohitPandit
ID: 39921105
thanks
0

Featured Post

Windows Server 2016: All you need to know

Learn about Hyper-V features that increase functionality and usability of Microsoft Windows Server 2016. Also, throughout this eBook, you’ll find some basic PowerShell examples that will help you leverage the scripts in your environments!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Load balancing is the method of dividing the total amount of work performed by one computer between two or more computers. Its aim is to get more work done in the same amount of time, ensuring that all the users get served faster.
For both online and offline retail, the cross-channel business is the most recent pattern in the B2C trade space.
Via a live example, show how to extract insert data into a SQL Server database table using the Import/Export option and Bulk Insert.
Viewers will learn how to use the UPDATE and DELETE statements to change or remove existing data from their tables. Make a table: Update a specific column given a specific row using the UPDATE statement: Remove a set of values using the DELETE s…

679 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question