Are any of your joins between the tables in the different DBMS? If so, you will never get good performance with tables 0f 10 lakhs without some clever staging, some good middleware, or an industrial-strength ETL/EAI tool.
You definitely need indexes for tables that size. Can you go into each of the source databases and build the indexes there? This still leaves the problem of cross-dbms joins, which will be very slow if you need them.
The best solution I can think of is to create a staging area on the fastest and biggest dbms platform you have (Oracle would always be my first choice), and either stage all of the necessary source data in an indexed warehouse structure, or stage 'join tables' providing indexed joins between the tables in different dbms that you need to join - then you can query the indexed join tables and generate subset queries to apply to the source tables. Staging brings up the ETL/mirroring problem of keeping the staging area updated with changes in the source tables. How volatile is this source data? If it is volatile, do you need instant updates (mirroring), or batch updates (ETL)?
I always recommend www.datamirror.com for data-mirroring and ETL because their tools do the best job I know, but they are not cheap. You could alternatively write it in PL/SQL, or even java, but you will end up with a sizeable piece of code to maintain, and keep in step with the source meta-data - are the source applications still in development, with metadata that is likely to change?
Main Topics
Browse All Topics





by: KokaPosted on 2002-01-26 at 12:58:16ID: 6758284
Some general thoughts:
1. As I understand you can not reorganization the structure of databases for some reasons, but why can not you add indexes where applicable?
2. I do not think PL/SQL will be of much help with Informix and MS SQL :). Better use C++ or Java (or anything you know well enough to use multithreading) to write the search module, and use at least three separate threads that collect information (and rank if possible) from the three databases simultaneously and insert results into the special table created in any of three (which is more convenient for you). Now that you did collect results in a single table, apply your ranking algorythm if you have not ranked records upon collection.
Of course, the most benefit will be from indexing