Thanks for your response, Glenn.
- Allocating 3 files/day might work; the only problem is blocking all threads while the files are created and the db pointers are swapped. However, it's worth a try, and if nothing else works I will go for this.
- Real RDBMS: wouldn't that be slower, with the SQL layer and the data type checking that an RDBMS has to do (among other things)? I actually tried this before I decided to go with BerkeleyDB and it was horribly slow. 25,000 inserts/sec would be too much for an RDBMS... right?
- Yeah, usability is paramount. A whole bunch of applications/servers are hungry for this realtime tick data. I am storing everything in JSON format, which is small and portable.
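On the concern about blocking all threads during the switch, here is a minimal sketch of one way to keep the pause short: create the next file ahead of time, outside the lock, and hold the lock only while the reference is exchanged. The class and method names are hypothetical, and plain append-mode files stand in for BerkeleyDB handles, so this illustrates the idea rather than the actual setup.

```python
import threading


class SwappableStore:
    """Holds the active output file; a plain file stands in for a DB handle."""

    def __init__(self, path):
        self._lock = threading.Lock()
        self._active = open(path, "a", encoding="utf-8")
        self._standby = None

    def prepare_next(self, path):
        # Slow part (file creation) runs outside the lock, so writers keep going.
        handle = open(path, "a", encoding="utf-8")
        with self._lock:
            self._standby = handle

    def swap(self):
        # Fast part: writers are blocked only while references are exchanged.
        with self._lock:
            if self._standby is None:
                raise RuntimeError("prepare_next() must be called before swap()")
            old, self._active, self._standby = self._active, self._standby, None
        old.close()  # flushed and closed after the lock is released

    def write(self, line):
        with self._lock:
            self._active.write(line + "\n")

    def close(self):
        with self._lock:
            self._active.close()
            if self._standby is not None:
                self._standby.close()
```

A housekeeping thread could call `prepare_next()` a few minutes before each boundary and `swap()` at the boundary itself. Whether a single lock around every write is acceptable at 25,000+ writes/sec would have to be measured.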
I have achieved a significant performance improvement with the same single database by optimizing (and then reoptimizing) my code. However, it's still not enough to handle occasional bursts of 40,000 ticks/sec.
I think I have hit a limit on code optimizations. I am not going to upgrade to a quad-core 3 GHz... that might help... right?
thanks
Nishant
by: Gns, posted on 2008-12-21 at 04:37:58, ID: 23221699
I don't think your analysis of why the writes slow down is quite right... I think it is _much_ simpler than that. Likely the initial seek just takes a huge amount of time to complete, so each write progressively slows... And for each tick, the write will be just a tad slower....
Now, your second problem is that your excessive use of "DB files" simply locks a lot of memory. So that is a no-no.
I'd do one of a few different things:
- Allocate 3 database files/day. Programmatically switch which one to write to depending on the time ... so that you get three equal-sized files over the workday.
- Switch to a real RDBMS. It would need some attention, so that a huge number of writes doesn't make it keel over.... My "quant-users" used to do this for Reuters... They only saved about 50 ticks/second to our Oracle DB, but ... that was enough to put some hefty pressure on the server. The accumulated data, to be used for some time series or similar, wasn't used much (thankfully), since that would've been the real killer;-).
- Focus on usability. If the saved data cannot be used, don't save it;-):-). Massage it as best you can, then save the results.
Cheers
-- Glenn
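
The first suggestion (three files per day, chosen by time) could be sketched roughly as follows. The workday hours, the function names, and the file naming scheme are all assumptions made for illustration. Equal time slices will only give equal-sized files if the tick rate is fairly even, so the boundaries would probably need tuning against real volumes.

```python
from datetime import datetime, time


def segment_index(now, day_start=time(9, 0), day_end=time(18, 0), segments=3):
    """Return which of `segments` equal time slices of the workday `now` falls in."""
    start = datetime.combine(now.date(), day_start)
    end = datetime.combine(now.date(), day_end)
    span = (end - start).total_seconds()
    # Clamp ticks outside the workday into the first or last slice.
    offset = min(max((now - start).total_seconds(), 0.0), span)
    return min(int(offset * segments // span), segments - 1)


def db_filename(now, **kwargs):
    # Hypothetical naming scheme: date plus slice number.
    return f"ticks-{now:%Y%m%d}-{segment_index(now, **kwargs)}.db"


if __name__ == "__main__":
    print(db_filename(datetime(2008, 12, 22, 13, 30)))
```

The writer would compare the name returned for each tick (or once per second) with the file currently open, and switch only when it changes.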