We help IT Professionals succeed at work.

C# Querying Large Amount of data from database

Hi,

I have an acess database that stores historical FX prices for EURUSD. It stores 1 mintue data going back to 2002 so there is approx 3,500,000 records with 11 fields.

The code I'm am writting is C# in .Net (please see below) I am querying this data over a period of time say 3 months & obviously require the data be sorted and then added to a chart which is an add in from http://www.dotnetcharting.com. I should also mention this is an ASP.NET web application.

The issue is the time it takes. Firstly it takes to long. Secondly it times out when query more than one months data. So I would like to know if there is a more efficent way to load the data into the chart or query data from an a database?

Thanks,

using System;
using System.Collections.Generic;
using System.Linq;
using System.Web;
using System.Web.UI;
using System.Web.UI.WebControls;
using System.Data;
using System.Data.OleDb;
using System.Data.SqlClient;
using System.IO;
using System.Drawing;
using System.Drawing.Drawing2D;
using dotnetCHARTING;
using dotnetCHARTING.Mapping;
using System.Text;

string cn = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=";
string dbPath = @"C:\Users\user\Documents\Markets\";

private SeriesCollection GetPriceData(Chart fxChart, DateTime dtFrom, DateTime dtTo, string FXPair)
        {
            OleDbConnection dbConnection = new OleDbConnection(cn + dbPath + "dbFX_1Min_db.mdb");
            dbConnection.Open();

            OleDbCommand SQLQuery = dbConnection.CreateCommand();

            SQLQuery.CommandText = @"SELECT [DTime], Round(([Close_B]+[Close_A])/2,5) AS Close_Mid
                                    FROM " + FXPair +
                                    " WHERE (([DTime])>=#" + dtFrom + @"# 
                                    And ([DTime])<=#" + dtTo + @"#) 
                                    ORDER BY [DTime];";

            OleDbDataReader PriceReader = SQLQuery.ExecuteReader();

            SeriesCollection priceCollection = new SeriesCollection();
            Series priceSeries = new Series();

            if (PriceReader.HasRows)
            {
                while (PriceReader.Read())
                {
                    Element priceElement = new Element();
                    priceElement.YValue = (Double)PriceReader["Close_Mid"];
                    priceElement.XDateTime = (DateTime)PriceReader["DTime"];
                    priceSeries.AddElements(priceElement);
                }
            }

            PriceReader.Close();
            dbConnection.Close();

            priceCollection.Add(priceSeries);

            return priceCollection;

        }

Open in new window

Comment
Watch Question

Commented:
you need to check the status of indexes on the database for the table. What are the indexes on the table and how fragmented they are?

Author

Commented:
Thanks for the reply.

Sorry what do you mean 'status of indexes'?

In the database there is only one table. This table contain 11 fields (DTime, Date, Time, Open Bid, Open Ask, High Bid, High Ask, Low Bid, Low Ask, Close Bid, Close Ask)
AndyAinscowFreelance programmer / Consultant

Commented:
APart from indexs one other point to check is:  Round(([Close_B]+[Close_A])/2,5)

If you replaced that with [Close_B] as Close_Mid what effect does that have on the overall time?  (I know it would be bad design BUT you may have to consider storing the result of the calculation in the table instead of running it each time you want the record.


You have two problems.  The second seems to be filling the SeriesCollection.  Is there sorting disabled (or enabled) on it - you have the data presorted so if sorting is enabled that results in massive overhead for each new item being added.
Commented:
check what is the primary key in the table. Generally primary key are self indexed. The folllowing link will give you more insight.
http://www.tek-tips.com/faqs.cfm?fid=1752

Since you are quering on DTime , try creating an index on DTime.
You can also consider creating partition on the table based on DTime field.

Author

Commented:
Hi Andy,

Thanks for the response. I will try give that ago - having an extra field Close Mid instead of calculating it when I run the query.

On filling the SeriesCollection are you saying that SeriesCollection have a sorting option & that if this is the case that I can disable it?

When loading the chart I only run the query once, would breaking it down ie repeating the query a few times instead make any difference or would this be worse?

Thanks
AndyAinscowFreelance programmer / Consultant

Commented:
>>are you saying that SeriesCollection have a sorting option

I don't know if it does have an inbuilt sorting option - just IF it does then make certain it is switched off.
AndyAinscowFreelance programmer / Consultant

Commented:
>>I will try give that ago - having an extra field Close Mid instead of calculating it when I run the query.

Don't add a new field yet, just test the performance by bringing back an existing field instead of this calculation.  Maybe it won't make much difference to the overall time of the query.

Author

Commented:
Thanks for the replies guys. I'm going into a fun filled meeting for the next two hours. However will implent the ideas suggested here although this will be by tomorrow.

Thanks again
NorieAnalyst Assistant

Commented:
How will you be charting the data?

Will it be in 1-minute intervals, 5-min intervals...?

Have you considered doing some sort of grouping in the query, which could cut down the amount of data.

Another thing you could try is setting up the query in Access and running that instead of writing and executing an SQL statement in the code.

These are the sort of thngs you would do if you were just doing this in Access/Excel and they usually help speed things.

Which part of the code is it that slowing things down, the data retreival part or the charting part?
Most Valuable Expert 2012
Top Expert 2014

Commented:
I think asking Access to store 3.5 million rows of data and then also asking it to sort it and run complex queries against it is cruelty. Can you use SQL Server?