Solved

Handling large portions of data in SQL Server

Posted on 2011-02-13
4
358 Views
Last Modified: 2012-05-11
Hi,

we are trying to implement an archiving system that keeps up to 15 years of data. The data will be imported daily through files, each of them is 2 GB.

-      I am here concerned with the performance issues that may result during retrieving this data
-      This data is going to be transformed from files to relational DB in order to facilitate the reporting and the retrieval functionalities. each file will include 1 million - 3 million of records
-      15 users will connect to this data, in a parallel manner, in order to perform various search queries
-      No summary reports (statistical) is requested

Is it proper to use SQL Server to handle this? What about the Hardware specifications that should be considered? What is the best techniques that should be considered when structuring this data?

Thanks
0
Comment
Question by:asim252
  • 2
4 Comments
 
LVL 51

Accepted Solution

by:
Ted Bouskill earned 250 total points
ID: 34885240
Well if you can normalize the data it will help tremendously.  I took a metrics database that had over 100 million records and was about 100 GB. and by normalizing I got it down to less then 5 GB with no data loss.  Of course at a smaller size there is less disk I/O to retrieve records and it's less expensive to add more indexes so all the queries were substantially faster.  At that size I could snapshot the database to give directly to some people to analyze the data on their laptop.

If you are aggregating daily sums then create a simply data mart with daily totals but don't denormalize all the detailed data for because then you counter-act the benefit of the normalization.

For example, I had a data mart that stored daily metrics for bug counts (I used to work at a company with thousands of developers) and the datamart tables, had the day, product ID, number of new bugs, fixed bugs, closed bugs (and 2 or three others) but no other detail.  Then if a drill down occurred we could drill into other datamart tables (bugs per team) or even the raw data itself.

Without knowing more we can't predict the type of hardware required.
0
 
LVL 38

Assisted Solution

by:Jim P.
Jim P. earned 250 total points
ID: 34885371
Another question is how much has to be online?

We only needed the last three months online and the rest could be stored offline. So we moved the last years data to a secondary DB in March, backed it up and then from there dumped it to CD/DVD. Three copies are stored -- one onsite, the other at our DR offsite, the third in Iron Mountain.

Another thought -- would partitioning help?
0
 
LVL 22

Expert Comment

by:8080_Diver
ID: 34889080
I believe that partitioning would be of tremendous help.  You could partition based upon the years (CurrentYear, CurrYrMinus1, CurrYrMinus2, etc.)  there are some scripts on line that could be adapted to automatically riple the partitions (removing the CurrYrMinus14).  

I also agree that, to the extent you can normalize the data, you probably should.  Normallizing the data should provide space and speed benefits, especially if you are going to provide some sort of web (intra- or inter-net) accessibility.
0
 
LVL 38

Expert Comment

by:Jim P.
ID: 34889248
You could partition based upon the years (CurrentYear, CurrYrMinus1, CurrYrMinus2, etc.)

Depending on the amount of data -- I would suggest doing it quarterly or monthly. It shouldn't significantly increase overhead significantly.
0

Featured Post

PRTG Network Monitor: Intuitive Network Monitoring

Network Monitoring is essential to ensure that computer systems and network devices are running. Use PRTG to monitor LANs, servers, websites, applications and devices, bandwidth, virtual environments, remote systems, IoT, and many more. PRTG is easy to set up & use.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Title # Comments Views Activity
SQL Query 2 61
SHOWPLAN permission denied in database 'AdventureWorks'. 13 99
Delete from table 6 47
Email Notifications for SQL 2005 9 27
This article will describe one method to parse a delimited string into a table of data.   Why would I do that you ask?  Let's say that you need to pass multiple parameters into a stored procedure to search for.  For our sake, we'll say that we wa…
In this article we will get to know that how can we recover deleted data if it happens accidently. We really can recover deleted rows if we know the time when data is deleted by using the transaction log.
This Micro Tutorial will give you a basic overview how to record your screen with Microsoft Expression Encoder. This program is still free and open for the public to download. This will be demonstrated using Microsoft Expression Encoder 4.
Established in 1997, Technology Architects has become one of the most reputable technology solutions companies in the country. TA have been providing businesses with cost effective state-of-the-art solutions and unparalleled service that is designed…

786 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question