Solved

Handling large portions of data in SQL Server

Posted on 2011-02-13
4
336 Views
Last Modified: 2012-05-11
Hi,

we are trying to implement an archiving system that keeps up to 15 years of data. The data will be imported daily through files, each of them is 2 GB.

-      I am here concerned with the performance issues that may result during retrieving this data
-      This data is going to be transformed from files to relational DB in order to facilitate the reporting and the retrieval functionalities. each file will include 1 million - 3 million of records
-      15 users will connect to this data, in a parallel manner, in order to perform various search queries
-      No summary reports (statistical) is requested

Is it proper to use SQL Server to handle this? What about the Hardware specifications that should be considered? What is the best techniques that should be considered when structuring this data?

Thanks
0
Comment
Question by:asim252
  • 2
4 Comments
 
LVL 51

Accepted Solution

by:
tedbilly earned 250 total points
ID: 34885240
Well if you can normalize the data it will help tremendously.  I took a metrics database that had over 100 million records and was about 100 GB. and by normalizing I got it down to less then 5 GB with no data loss.  Of course at a smaller size there is less disk I/O to retrieve records and it's less expensive to add more indexes so all the queries were substantially faster.  At that size I could snapshot the database to give directly to some people to analyze the data on their laptop.

If you are aggregating daily sums then create a simply data mart with daily totals but don't denormalize all the detailed data for because then you counter-act the benefit of the normalization.

For example, I had a data mart that stored daily metrics for bug counts (I used to work at a company with thousands of developers) and the datamart tables, had the day, product ID, number of new bugs, fixed bugs, closed bugs (and 2 or three others) but no other detail.  Then if a drill down occurred we could drill into other datamart tables (bugs per team) or even the raw data itself.

Without knowing more we can't predict the type of hardware required.
0
 
LVL 38

Assisted Solution

by:Jim P.
Jim P. earned 250 total points
ID: 34885371
Another question is how much has to be online?

We only needed the last three months online and the rest could be stored offline. So we moved the last years data to a secondary DB in March, backed it up and then from there dumped it to CD/DVD. Three copies are stored -- one onsite, the other at our DR offsite, the third in Iron Mountain.

Another thought -- would partitioning help?
0
 
LVL 22

Expert Comment

by:8080_Diver
ID: 34889080
I believe that partitioning would be of tremendous help.  You could partition based upon the years (CurrentYear, CurrYrMinus1, CurrYrMinus2, etc.)  there are some scripts on line that could be adapted to automatically riple the partitions (removing the CurrYrMinus14).  

I also agree that, to the extent you can normalize the data, you probably should.  Normallizing the data should provide space and speed benefits, especially if you are going to provide some sort of web (intra- or inter-net) accessibility.
0
 
LVL 38

Expert Comment

by:Jim P.
ID: 34889248
You could partition based upon the years (CurrentYear, CurrYrMinus1, CurrYrMinus2, etc.)

Depending on the amount of data -- I would suggest doing it quarterly or monthly. It shouldn't significantly increase overhead significantly.
0

Featured Post

Threat Intelligence Starter Resources

Integrating threat intelligence can be challenging, and not all companies are ready. These resources can help you build awareness and prepare for defense.

Join & Write a Comment

Recently, when I was asked to create a new SQL 2005 cluster, Microsoft released a new service pack for MS SQL 2005 what is Service Pack 3. When I finished the installation of MS SQL 2005 I found myself troubled why the installation of SP3 failed …
by Mark Wills Attending one of Rob Farley's seminars the other day, I heard the phrase "The Accidental DBA" and fell in love with it. It got me thinking about the plight of the newcomer to SQL Server...  So if you are the accidental DBA, or, simp…
Sending a Secure fax is easy with eFax Corporate (http://www.enterprise.efax.com). First, Just open a new email message.  In the To field, type your recipient's fax number @efaxsend.com. You can even send a secure international fax — just include t…
You have products, that come in variants and want to set different prices for them? Watch this micro tutorial that describes how to configure prices for Magento super attributes. Assigning simple products to configurable: We assigned simple products…

760 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

18 Experts available now in Live!

Get 1:1 Help Now