Solved

Handling Huge Data insert  and retrieval from MS SQL Server 2005

Posted on 2014-04-03
6
1,310 Views
Last Modified: 2014-04-11
Hi,
I have developed a software application that inserts large volumes of data into a SQL Server 2005 table. I am facing an issue when extracting data from the table using queries: it takes hours to retrieve the data. My table definition is:
CREATE TABLE [history_data](
      [INT_DATETIME] [bigint] NOT NULL,
      [INT_POINTID] [int] NOT NULL,
      [REAL_VALUE] [float] NOT NULL,
      [INT_QUALITY] [smallint] NULL,
 CONSTRAINT [PK_history_data] PRIMARY KEY CLUSTERED
(
      [INT_DATETIME] ASC,
      [INT_POINTID] ASC
)
)

I have 15,000 points changing value every second, so I am inserting 15,000 records every second. That means more than 1 GB of data per hour, and more than 20 GB per day in the table. Is there any issue in storing such huge data in one table?
Please help me retrieve data quickly from this table. I tried the BCP command, which seems faster than a direct SELECT query, but that also takes hours to get the values. What should be the way to fetch data in seconds?
Question by:Rooh
6 Comments
 

Expert Comment

by:Harish Varghese
ID: 39974607
Please provide more details. How many records are there in your table now? What is(are) your SELECT query/queries?
 

Author Comment

by:Rooh
ID: 39974627
Hi,
15,000 records are inserted every second. That means 1,296,000,000 records in a day. And we need to store data for 90 days :(

My query is : select REAL_VALUE, INT_QUALITY from HISTORY_DATA where INT_POINTID=9999 and INT_DATETIME>20140330145022968 and INT_DATETIME<20140403145022968
 

Accepted Solution

by:
ScottPletcher earned 200 total points
ID: 39975316
The query looks perfect overall.  You probably want to add "WITH (NOLOCK)" if you are dealing only with historical data to reduce overhead from the SELECT.

Also, make sure you have a large auto-grow amount on the data files. (Make sure you have IFI (instant file initialization) turned on, although that affects only INSERT speed.)

If possible, make sure the log never needs to dynamically grow but has already been pre-allocated.
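The file settings above can be sketched as below. This is a hypothetical example: the database and logical file names (HistoryDB, HistoryDB_Data, HistoryDB_Log) and the sizes are assumptions; adjust them to your environment.

```sql
-- Large fixed auto-grow increment on the data file
-- (a fixed MB amount is predictable; percentage growth
-- produces ever-larger, slower grow events):
ALTER DATABASE HistoryDB
MODIFY FILE (NAME = HistoryDB_Data, FILEGROWTH = 1024MB);

-- Pre-allocate the log so it never needs to grow during the inserts:
ALTER DATABASE HistoryDB
MODIFY FILE (NAME = HistoryDB_Log, SIZE = 20480MB);
```

Note that instant file initialization applies only to data files; log growth always zero-initializes, which is one more reason to pre-size the log.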

 

Assisted Solution

by:Harish Varghese
Harish Varghese earned 300 total points
ID: 39977445
1.2 billion records per day.. 116 billion for 90 days.. that's really huge (at least for me!). I have not dealt with data anywhere near that size. Okay, so you have 15,000 points, each inserting one record per second, which means 3600*24 = 86,400 records per point per day, i.e. 7,776,000 records per point for 90 days. And you always query for a single point at a time.
select REAL_VALUE, INT_QUALITY from HISTORY_DATA where INT_POINTID=9999 and INT_DATETIME>20140330145022968 and INT_DATETIME<20140403145022968


So, I would suggest you put the emphasis on the PointID rather than the DATETIME column. Try creating a NON-CLUSTERED index as below, which should INCLUDE the columns you want to SELECT. This means your storage requirement will roughly double, but you may get much faster results. Please try it and post your results.
CREATE NONCLUSTERED INDEX IDX_HISTORY_DATA_POINTID
ON HISTORY_DATA (INT_POINTID, INT_DATETIME)
INCLUDE (REAL_VALUE, INT_QUALITY)


Note that you now have all columns of your table in the index too, which means your INSERTs into the table may take longer than before (maybe even double). And if you already have 90 days of data, I am not sure how long the index creation may take.
And yes, as @ScottPletcher suggested, use "WITH (NOLOCK)" in your FROM clause (FROM HISTORY_DATA WITH (NOLOCK)).
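Putting the two suggestions together, your SELECT would look like this (using the example point ID and timestamp range from your own query above):

```sql
SELECT REAL_VALUE, INT_QUALITY
FROM HISTORY_DATA WITH (NOLOCK)
WHERE INT_POINTID = 9999
  AND INT_DATETIME > 20140330145022968
  AND INT_DATETIME < 20140403145022968;
```

With the non-clustered index leading on INT_POINTID, this becomes an index seek that reads only the rows for that one point, and the INCLUDEd columns mean no lookups back to the clustered index. (NOLOCK reads uncommitted data, which is usually acceptable for append-only history but is a trade-off to be aware of.)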

-Harish
 

Author Closing Comment

by:Rooh
ID: 39993442
Thank you. This has improved the performance. I think this is the best optimization I can do with the existing table design. For a requirement like retrieving data in seconds, or at least in less than 5 minutes, I need to redesign the history table structure. Maybe by storing the data in more than one table in the database, or by using data stores other than an RDBMS!
 

Expert Comment

by:Harish Varghese
ID: 39993512
Yes, using multiple tables may also help. Can you explain what you did and how much the performance improved? I think you could keep one table for the current day, which has data appended every second, and another history table for the past 90 days up to yesterday. As long as you have a good index, you may not see a considerable increase in performance from splitting data across multiple tables or even from table partitioning. But it would help distribute your storage requirement across multiple drives, if that is a concern.
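For reference, table partitioning (SQL Server 2005 Enterprise Edition only) could be sketched roughly as below. This is a hypothetical example: the function/scheme names and boundary values are assumptions, and the boundaries follow the yyyymmddhhmmssmmm format your INT_DATETIME values appear to use.

```sql
-- One partition per day on INT_DATETIME (bigint, yyyymmddhhmmssmmm):
CREATE PARTITION FUNCTION pf_history_day (bigint)
AS RANGE RIGHT FOR VALUES
    (20140401000000000, 20140402000000000, 20140403000000000);

-- Map every partition to the PRIMARY filegroup for simplicity;
-- in practice you could spread partitions across filegroups/drives:
CREATE PARTITION SCHEME ps_history_day
AS PARTITION pf_history_day ALL TO ([PRIMARY]);

-- The table (and its clustered PK) would then be created
-- ON ps_history_day (INT_DATETIME) instead of a filegroup.
```

A daily partition also makes purging old data cheap: switching out or merging the oldest partition avoids a huge DELETE.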

-Harish
