I am working with a MySQL database of stock market transactions. I currently have a table for each day's ticks. There are approximately 14 million ticks per day, hence each table has about 14 million rows.
I have indexes on SerialNumber (bigint unique) and Symbol (varchar 20).
Executing a simple query like
Select datetime FROM ticks WHERE symbol='ibm';
returns about 6000 rows, but can take up to 20 seconds to execute. Running the same query again executes almost instantly, suggesting that the results are buffered.
What can I do to make this faster?
I have considered separating the more active symbols into a separate database, or even using a table per symbol. This would almost certainly make the queries faster, but at the expense of having a huge number of tables: there are, including futures and options, some 300,000 symbols. Most of them have little or no data, but all of them have to be accommodated.
I suppose I could have a table with pointers to which table contains the actual data, only having entries for the more active issues which would be stored in separate tables. If a symbol doesn't have an entry in this index, then its data is stored in the main table.
I guess a question that needs to be answered is how many tables is too many tables?
This, by the way, is running on Windows XP with a RAID 0 array, a Core 2 Duo processor, and 2 GB of RAM. The computer is not doing much other than running this database and collecting data from our service.
These are, by the way, MyISAM tables.
It just seems to me that this should run faster.
Any suggestions? And, while we're at it, would this work better in MS SQL Server?
by: Raynard7 Posted on 2007-09-11 at 17:14:29 ID: 19873069
Hi,
One of the most important things is what kinds of queries you are running against your table, as well as how your hardware is set up.
There is much you can do to optimize a database using your configuration settings to get the most out of it.
When you mention that you have an index on the symbol, is this a composite index or an index on just that field?
Sometimes you can get a better improvement by simply creating a lookup table with one row per symbol and then searching on its numeric key, as integer keys are smaller and cheaper to compare than a varchar(20), so the index works better.
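As a rough sketch of the lookup-table idea, something like the following could work. The table names `symbols` and `ticks_by_id`, the column `symbol_id`, and the index names are all made up for illustration; only `SerialNumber`, `symbol` and `datetime` come from the question, and the column types are assumptions.

```sql
-- Hypothetical lookup table: one row per symbol, keyed by a small integer.
CREATE TABLE symbols (
    symbol_id INT UNSIGNED NOT NULL AUTO_INCREMENT,
    symbol    VARCHAR(20)  NOT NULL,
    PRIMARY KEY (symbol_id),
    UNIQUE KEY uq_symbol (symbol)
) ENGINE=MyISAM;

-- Hypothetical tick table that stores the integer id instead of the text symbol.
CREATE TABLE ticks_by_id (
    SerialNumber BIGINT       NOT NULL,
    symbol_id    INT UNSIGNED NOT NULL,
    `datetime`   DATETIME     NOT NULL,
    UNIQUE KEY uq_serial (SerialNumber),
    KEY idx_symbol_id (symbol_id)
) ENGINE=MyISAM;

-- Resolve the symbol to its id once, then search the ticks on the integer key.
SELECT t.`datetime`
FROM ticks_by_id AS t
JOIN symbols AS s ON s.symbol_id = t.symbol_id
WHERE s.symbol = 'ibm';
```

Whether this actually helps depends on how the data arrives, since the loader then has to map each incoming symbol to its id before inserting.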
Also, how regularly are you optimizing your tables and rebuilding your indexes? If you are constantly adding new rows, you can end up with rows appended to the end of the data file in arrival order and with stale index statistics, so the rows for one symbol are scattered and have to be read one at a time.
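For the maintenance step, a minimal example using the `ticks` table name from the question might look like this. Both statements lock a MyISAM table while they run, so they are assumed to be scheduled outside collection hours.

```sql
-- Defragment the data file and sort/rebuild the index pages.
OPTIMIZE TABLE ticks;

-- Refresh the key distribution statistics used by the query optimizer.
ANALYZE TABLE ticks;
```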
There is no such thing as too many tables as such, but you need to consider what sorts of queries you will be running and how the table layout will serve them. The way you receive your data also matters, to see whether it is feasible to split the writes across many tables as the data comes in.
Are all the fields in the table fixed width? If so, you should see much better performance from your indexes.
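One way to check the fixed-width question, assuming the table is called `ticks` as in the query above:

```sql
-- The Row_format column reports Fixed or Dynamic for a MyISAM table.
-- A VARCHAR column such as symbol normally makes the table Dynamic.
SHOW TABLE STATUS LIKE 'ticks';
```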
You also need to consider whether you are going to be doing date queries, or whether this is simply a lookup by stock symbol. You have not mentioned whether the datetime field is actually indexed. If it is not, then every single IBM record, for example, has to be read from the data file to fetch its date.
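If the datetime column is not indexed, one option worth trying is a composite index covering both columns, so the example query can be answered from the index alone. This is only a sketch: the index name `idx_symbol_datetime` is made up, and building the index on a 14-million-row table will take a while.

```sql
-- Composite index: symbol first (the WHERE column), then datetime (the selected column).
ALTER TABLE ticks ADD INDEX idx_symbol_datetime (symbol, `datetime`);

-- If the index covers the query, the Extra column should include "Using index".
EXPLAIN SELECT `datetime` FROM ticks WHERE symbol = 'ibm';
```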
Finally, depending on your space requirements, I would say that once MySQL 5.1 exits beta, its partitioning would be perfect for your situation. You could have the same data in two tables: one could have a partition for each date and one could have a partition for each stock. The partitions would then in effect be separate tables, but you could query the table as if it were just one. At the moment this is probably the one advantage that MSSQL has over MySQL. You could then receive your data, put the current day's data in one location, and write and arrange your partitions and indexes overnight, meaning everything will be fully optimized for you.
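To give an idea of what the two partitioned tables could look like in MySQL 5.1 syntax, here is a hedged sketch. The table names, index names, partition names, column types and the partition count of 64 are all assumptions. Note two differences from the description above: with some 300,000 symbols, the by-stock table hashes symbols into a fixed number of partitions rather than giving each stock its own, and a unique key on `SerialNumber` alone is not allowed, because every unique key on a partitioned table must include the partitioning columns.

```sql
-- Hypothetical table partitioned by stock: symbols are hashed into 64 buckets.
CREATE TABLE ticks_by_symbol (
    SerialNumber BIGINT      NOT NULL,
    symbol       VARCHAR(20) NOT NULL,
    `datetime`   DATETIME    NOT NULL,
    PRIMARY KEY (SerialNumber, symbol),  -- must include the partitioning column
    KEY idx_symbol_datetime (symbol, `datetime`)
) ENGINE=MyISAM
PARTITION BY KEY (symbol)
PARTITIONS 64;

-- Hypothetical table partitioned by date: one partition per trading day.
CREATE TABLE ticks_by_day (
    SerialNumber BIGINT      NOT NULL,
    symbol       VARCHAR(20) NOT NULL,
    `datetime`   DATETIME    NOT NULL,
    KEY idx_serial (SerialNumber),       -- non-unique, see the note above
    KEY idx_symbol (symbol)
) ENGINE=MyISAM
PARTITION BY RANGE (TO_DAYS(`datetime`)) (
    PARTITION p20070910 VALUES LESS THAN (TO_DAYS('2007-09-11')),
    PARTITION p20070911 VALUES LESS THAN (TO_DAYS('2007-09-12')),
    PARTITION pmax      VALUES LESS THAN MAXVALUE
);
```

A query with `WHERE symbol = 'ibm'` against the first table only needs to touch the one partition that symbol hashes to, which is the point of the exercise.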