Complex SQL query and performance

I am trying to understand how databases like Oracle handle a SQL query that has many WHERE (AND) clauses, especially when the database has thousands of entries.

Does it break it into many smaller queries and then merge the results, or something else?

Example: consider a DB that has 1000 records.
Give me all records where department=account AND age>=50 AND job_level=4.

The simple, ugly way is to compare all 1000 records sequentially, but that would be bad if we had a million entries. Does the database maintain some additional structures for faster processing?
Database designers must analyze queries used and create appropriate indexes. (Some data engines can do it automatically.)

It is not necessary to create indexes for all possible column combinations; query time is significantly reduced even when just part of the WHERE clause is optimized. The rest is then processed sequentially.

So if you have
WHERE x=1 AND y=2 OR z="text" etc. etc.
the engine could use just an index on column x and an index on column y. (No combined index is necessary; two separate indexes are fine.)

To simplify: each index contains the indexed values and pointers to the original data. The SQL engine can count the number of x values equal to 1 and retrieve a set of pointers; it can then count the number of y values equal to 2 and combine the previously obtained pointers with this new set, and so on. If the number of pointers is too large, the engine may decide not to use that index for optimization. If the number is small, the engine can decide to stop further optimization.

How the pointer sets are combined depends on the operators used:
x=1 AND y=2 means you have to do an intersection between the two sets of pointers;
the OR operator means you have to do a union operation.

And doing an intersection with a small number of pointers is very fast. FoxPro does not use pointers but individual bits (a bitmap), which is even better...
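To make the pointer-set idea concrete, here is a minimal sketch, with made-up data, of how a per-column index can map each value to the set of row ids holding it, so that AND becomes set intersection and OR becomes set union. This is an illustration of the principle, not any real engine's internals:

```python
# Hypothetical rows; a real engine stores these on disk pages.
rows = [
    {"id": 1, "x": 1, "y": 2, "z": "text"},
    {"id": 2, "x": 1, "y": 3, "z": "other"},
    {"id": 3, "x": 2, "y": 2, "z": "text"},
    {"id": 4, "x": 1, "y": 2, "z": "other"},
]

def build_index(rows, col):
    """Build one 'index' per column: value -> set of row ids (the pointers)."""
    idx = {}
    for r in rows:
        idx.setdefault(r[col], set()).add(r["id"])
    return idx

x_idx = build_index(rows, "x")
y_idx = build_index(rows, "y")
z_idx = build_index(rows, "z")

# WHERE x=1 AND y=2 OR z='text'
# AND -> intersection of pointer sets, OR -> union.
result = (x_idx.get(1, set()) & y_idx.get(2, set())) | z_idx.get("text", set())
print(sorted(result))  # [1, 3, 4]
```

A bitmap index (like the FoxPro approach mentioned above) replaces each pointer set with one bit per row, so the intersection becomes a bitwise AND over machine words, which is faster still.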
Check all about Optimizer here
Steve Wales (Senior Database Administrator) commented:
Relational Database Management Systems (RDBMS) usually have a component called the Optimizer inside the engine. Tables usually have (or should have) indexes on them. Indexes are effectively access paths to data.

Heavily queried patterns should have indexes on them - that way the optimizer (which knows about the indexes) can find the required rows quickly.

Sometimes, though, there's no choice but to scan the entire table, since you can't index everything.

It's a fairly broad question, but at a very high level that's what happens.

You can do an EXPLAIN PLAN for any query in Oracle to find out what goes on under the covers; it will tell you what the optimizer worked out and how it is going about finding the data.
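For a runnable illustration, SQLite (bundled with Python's standard library) offers EXPLAIN QUERY PLAN, which plays the same role as Oracle's EXPLAIN PLAN: it shows whether the optimizer chose an index or a scan. The emp table and emp_dept index below are made up for the example:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE emp (name TEXT, department TEXT, age INT, job_level INT)")
con.execute("CREATE INDEX emp_dept ON emp (department)")

# Ask the optimizer how it would run the query from the original question.
plan = con.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT * FROM emp WHERE department='account' AND age>=50 AND job_level=4"
).fetchall()
for row in plan:
    print(row)  # the detail column mentions the emp_dept index
```

In Oracle the equivalent would be `EXPLAIN PLAN FOR <query>` followed by querying the plan table; the idea of inspecting the optimizer's chosen access path is the same.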

perlperl (Author) commented:
>>Sometimes there's no choice but to scan the entire table, since you can't access everything though.

So consider a database with a million users (where username is not the primary key), and I query my own record (which, let's say, is not a common lookup since I haven't queried it for a month). Does that mean it has to scan all million entries to match the WHERE clause on my username? That would be a disaster for the DB.
The basic optimization obviously uses indexes: the query is analysed for possible index usage, and if an appropriate index exists, it is used to optimize the query...

For example, job_level=4 may reduce the number of investigated records significantly. The engine looks into the index and retrieves just the records having job_level=4. The result is then combined with the records matching the department=account condition, and so on. There is no need for sequential data comparison, because the index is organised as a tree hierarchy, and finding a certain value in a tree is very fast.
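The tree-lookup idea can be sketched with a sorted list and binary search; real engines use B-trees, but the logarithmic-search principle is the same. The values and row ids below are invented:

```python
import bisect

# A toy 'index': sorted (job_level, row_id) pairs. A B-tree keeps the same
# sorted order but in pages, so lookups stay logarithmic on disk too.
index = sorted([(1, 10), (2, 11), (4, 12), (4, 13), (4, 14), (7, 15)])

def lookup(index, value):
    """Binary-search the first and last positions holding `value`,
    then return the row ids in that range (no sequential scan)."""
    lo = bisect.bisect_left(index, (value,))
    hi = bisect.bisect_right(index, (value, float("inf")))
    return [row_id for _, row_id in index[lo:hi]]

print(lookup(index, 4))  # [12, 13, 14]
```

Two binary searches find the whole run of matching entries, so retrieving all rows with job_level=4 costs O(log n + matches) instead of O(n).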

Of course, optimization does not work in all possible situations. For example, if all your records (or 95% of them) had job_level=4, then the engine could not usefully optimize that part of the WHERE clause.

Further optimization is necessary when you join two or more tables or when you have subqueries. Each data engine has its own optimization algorithm, which is proprietary and, obviously, still under development.

If you need a simple description of optimization, you may look at how Visual FoxPro optimizes queries:
perlperl (Author) commented:
OK, I understand that indexing is one way of doing it.
So let's say a row has 10 columns.
A SQL query can use any of 9 combinations: WHERE X=..., WHERE Y=..., and so on.
One way is to create a key (say, some hash value of these 9 column values).

The problem is that there would be 9! possible keys. Does the database store 9! keys with the corresponding record as the value? That would require so much memory.

Also, in my case the values of these columns are server-specific and predefined, not user-supplied values.
perlperl (Author) commented:
Also, in my case the values of these columns are server-specific and predefined, not user-supplied values,
so the user can do a crazy combination of the 9 WHERE clauses.
If you want to really get into the internals of the optimizer and how it makes its decisions, I suggest reading the blog of its authors. It's not updated frequently, but the information is tremendous.

Another great resource that IS updated regularly is Jonathan Lewis' blog

And his book Cost-Based Oracle Fundamentals is awesome.

These resources will explain how and why statistics and histograms are gathered and used.  They'll explain join types and join decision trees.  Lewis will show how to read parts of a 10053 trace file, as well as how to generate them.
perlperl (Author) commented:
Thank you, guys.

I was trying to use this concept of multi-column queries in my project. I get the whole point: I don't have to use 9! keys; I can use the 2, 3, or maybe 4 most important columns. That gives a maximum of 24 keys, and for the rest I can do a sequential search.
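One reason a handful of indexes goes a long way is that a composite index serves any leftmost prefix of its columns, so you never need one index per permutation of columns. A small sketch using SQLite's EXPLAIN QUERY PLAN (the emp table and emp_dept_lvl index are hypothetical):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE emp (department TEXT, job_level INT, age INT)")
# One composite index serves WHERE department=?
# and also WHERE department=? AND job_level=?  (any leftmost prefix).
con.execute("CREATE INDEX emp_dept_lvl ON emp (department, job_level)")

details = []
for where in ("department='account'",
              "department='account' AND job_level=4"):
    plan = con.execute(
        "EXPLAIN QUERY PLAN SELECT * FROM emp WHERE " + where).fetchall()
    details.append(plan[0][3])
    print(plan[0][3])  # both plans mention emp_dept_lvl
```

Note the prefix restriction: this index does not help a query that filters on job_level alone, which is why the choice of leading columns matters.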
perlperl (Author) commented:
I can think of one more issue. What if I index by X, Y, and Z, but the user sends a request based on another column, say WHERE M=...?

Does this mean I scan all entries? I guess there is no definite solution.
If the user can select the data column for the WHERE clause, then you either have to restrict their choices to indexed columns, or the user must be informed that certain query parameters can be time-consuming due to the sequential data read.

Also remember that 100,000 records is nothing, obviously, so the restriction is suitable for large data sets only.
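The trade-off can be demonstrated directly: a predicate on an indexed column yields an index search, while one on an unindexed column falls back to a full table scan. A small SQLite sketch (hypothetical table t with columns x and m):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (x INT, m INT)")
con.execute("CREATE INDEX t_x ON t (x)")

# Indexed column: the optimizer can SEARCH via the index.
indexed = con.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM t WHERE x=1").fetchall()[0][3]
# Unindexed column: nothing to use, so it falls back to a full SCAN.
unindexed = con.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM t WHERE m=1").fetchall()[0][3]
print(indexed)    # mentions the t_x index
print(unindexed)  # a table scan
```

An application that lets users filter on arbitrary columns could run such a plan check up front and warn (or refuse) when the chosen column would force a scan over a large table.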