  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 1470

Handling Huge Data insert and retrieval from MS SQL Server 2005

Hi,
I have developed software that inserts a large volume of data into a SQL Server 2005 table. I am facing an issue when extracting data from the table: queries take hours to return. My table definition is:
CREATE TABLE [history_data](
        [INT_DATETIME] [bigint] NOT NULL,
        [INT_POINTID] [int] NOT NULL,
        [REAL_VALUE] [float] NOT NULL,
        [INT_QUALITY] [smallint] NULL,
 CONSTRAINT [PK_history_data] PRIMARY KEY CLUSTERED
(
        [INT_DATETIME] ASC,
        [INT_POINTID] ASC
)
)

I have 15,000 points changing values every second, so I am inserting 15,000 records every second. That adds up to more than 1 GB of data per hour, and more than 20 GB per day in the table. Is there any issue with storing such huge data in one table?
Please help me retrieve data quickly from this table. I tried the BCP command, which seems faster than a direct SELECT query, but that also takes hours to get the values. How can I fetch data in seconds?
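For reference, this is the kind of BCP command I tried (the server and database names below are placeholders for my environment):

bcp "SELECT REAL_VALUE, INT_QUALITY FROM HistoryDB.dbo.HISTORY_DATA WHERE INT_POINTID = 9999" queryout points.dat -c -T -S MYSERVER

(-c exports in character format, -T uses a trusted connection, -S names the server.)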
Asked by: Rooh
2 Solutions
 
Harish Varghese (Project Leader) commented:
Please provide more details. How many records are in your table now, and what are your SELECT queries?
 
Rooh (Author) commented:
Hi,
15,000 records are inserted every second. That means 1,296,000,000 records in a day (15,000 × 86,400 seconds). And we need to store data for 90 days :(

My query is:
SELECT REAL_VALUE, INT_QUALITY FROM HISTORY_DATA WHERE INT_POINTID = 9999 AND INT_DATETIME > 20140330145022968 AND INT_DATETIME < 20140403145022968
 
Scott Pletcher (Senior DBA) commented:
The query looks fine overall. If you are dealing only with historical data, you probably want to add "WITH (NOLOCK)" to reduce locking overhead from the SELECT.
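For example, a minimal sketch of the posted query with the hint applied:

SELECT REAL_VALUE, INT_QUALITY
FROM HISTORY_DATA WITH (NOLOCK)  -- allows dirty reads; acceptable for append-only history
WHERE INT_POINTID = 9999
  AND INT_DATETIME > 20140330145022968
  AND INT_DATETIME < 20140403145022968

Keep in mind NOLOCK permits reading uncommitted rows, so use it only on data that is no longer being modified.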

Also, make sure you have a large auto-grow amount on the data files. (And make sure you have IFI, instant file initialization, turned on, although that affects only INSERT speed.)

If possible, make sure the log never needs to dynamically grow but has already been pre-allocated.
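A sketch of what that could look like, assuming a database named HistoryDB with default logical file names (adjust the names and sizes for your environment):

-- Pre-size the log so it never has to grow during the load,
-- and give the data file a large fixed auto-grow increment.
ALTER DATABASE HistoryDB MODIFY FILE (NAME = HistoryDB_log, SIZE = 50GB);
ALTER DATABASE HistoryDB MODIFY FILE (NAME = HistoryDB, FILEGROWTH = 4GB);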

 
Harish Varghese (Project Leader) commented:
1.2 billion records per day.. 116 billion for 90 days.. that's really huge (at least for me!). I have not dealt with data anywhere near that size. Okay, so you have 15,000 points, each inserting one record per second, which means 3600 × 24 = 86,400 records per point per day, and 7,776,000 records per point for 90 days. And you always query for a single point at a time:
SELECT REAL_VALUE, INT_QUALITY FROM HISTORY_DATA WHERE INT_POINTID = 9999 AND INT_DATETIME > 20140330145022968 AND INT_DATETIME < 20140403145022968


So, I would suggest you put the emphasis on the PointID rather than the DATETIME column. Try creating a NONCLUSTERED index as below, which INCLUDEs the columns you want to SELECT. This means your storage requirement will roughly double, but you should get faster results. Please try it and post your results.
CREATE NONCLUSTERED INDEX IDX_HISTORY_DATA_POINTID
ON HISTORY_DATA (INT_POINTID, INT_DATETIME)
INCLUDE (REAL_VALUE, INT_QUALITY)


Note that you now have all columns of your table in the index too, which means your INSERTs into the table may take longer than before (maybe even twice as long). And if you already have 90 days of data, I am not sure how long the index creation will take.
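If you have to build the index while the table stays in use, here is a sketch of the build options, assuming Enterprise Edition (ONLINE = ON requires it on SQL Server 2005):

-- Build the covering index without blocking writers, sorting in tempdb
-- to reduce contention in the user database.
CREATE NONCLUSTERED INDEX IDX_HISTORY_DATA_POINTID
ON HISTORY_DATA (INT_POINTID, INT_DATETIME)
INCLUDE (REAL_VALUE, INT_QUALITY)
WITH (ONLINE = ON, SORT_IN_TEMPDB = ON)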
And yes, as @ScottPletcher suggested, use "WITH (NOLOCK)" in your FROM clause (FROM HISTORY_DATA WITH (NOLOCK)).

-Harish
 
Rooh (Author) commented:
Thank you. This has improved the performance. I think this is the best optimization I can do with the existing table design. For a requirement like retrieving data in seconds, or at least in under 5 minutes, I will need to redesign the history table structure: maybe storing data in more than one table in the database, or depending on data stores other than an RDBMS!
 
Harish Varghese (Project Leader) commented:
Yes, using multiple tables may also help. Can you explain what you did and how much the performance improved? I think you can keep one table for the current day, to which data is appended every second, and another history table for the past 90 days up to yesterday. As long as you have a good index, you may not see a considerable increase in performance from splitting data across multiple tables or even from table partitioning, but it can help distribute your storage requirement across multiple drives, if that is a concern.
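For illustration, a sketch of daily partitioning, assuming SQL Server 2005 Enterprise Edition (which partitioning requires); the names and boundary values here are hypothetical:

-- Partition by day on INT_DATETIME so old days can be switched out
-- or dropped cheaply instead of being deleted row by row.
CREATE PARTITION FUNCTION pf_history_day (bigint)
AS RANGE RIGHT FOR VALUES (20140401000000000, 20140402000000000, 20140403000000000)
GO
CREATE PARTITION SCHEME ps_history_day
AS PARTITION pf_history_day ALL TO ([PRIMARY])
GO
-- Same structure as history_data, but placed on the partition scheme.
CREATE TABLE [history_data_part](
      [INT_DATETIME] [bigint] NOT NULL,
      [INT_POINTID] [int] NOT NULL,
      [REAL_VALUE] [float] NOT NULL,
      [INT_QUALITY] [smallint] NULL,
 CONSTRAINT [PK_history_data_part] PRIMARY KEY CLUSTERED
      ([INT_DATETIME] ASC, [INT_POINTID] ASC)
) ON ps_history_day ([INT_DATETIME])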

-Harish
