Go Premium for a chance to win a PS4. Enter to Win


complex sql query and performace

Posted on 2014-03-12
Medium Priority
Last Modified: 2014-05-07
I am trying to understand how does databases like Oracle handles sql query that has many where (and) clause specially when databases has 1000's of entries.

Does it break it many smaller queris and then merge or something else?

example) consder a DB that has 1000 records
 give me all records if department=account, AND age>=50 and job_level=4.

simple ugly way is to compare sequential all 1000 records but that would be bad if we have million entries. Does the Database maintains some additional structures also for faster processing
Question by:perlperl
LVL 29

Expert Comment

ID: 39924595
Check all about Optimizer here
LVL 23

Expert Comment

by:Steve Wales
ID: 39924604
Relational Database Management Systems (RDBMS) usually have a thing called the Optimizer inside the engine.  Tables usually (or should) have indexes on them.  Indexes are effectively access paths to data.  

Heavily queried patterns should have indexes on them - that way the optimizer (which knows about the indexes) can find the required rows quickly.

Sometimes there's no choice but to scan the entire table, since you can't access everything though.

It's a fairly broad question, but at a very high level that's what happens.

You can do an EXPLAIN PLAN for any query in Oacle to find out what goes on under the covers and it will tell you what the optimizer worked out and how it is going about finding the data.

Author Comment

ID: 39924656
>>Sometimes there's no choice but to scan the entire table, since you can't access everything though.

So consider if a database has million users (which is not primary key) and I query my record (which lets say is not not common as I haven't queried for month), does that mean it has to scan all million entries to match WHERE clause of my username? that would be disaster in case of DB
Get your Disaster Recovery as a Service basics

Disaster Recovery as a Service is one go-to solution that revolutionizes DR planning. Implementing DRaaS could be an efficient process, easily accessible to non-DR experts. Learn about monitoring, testing, executing failovers and failbacks to ensure a "healthy" DR environment.

LVL 43

Expert Comment

ID: 39924658
The basic optimization uses indexes obviously - the query is analysed for possible index usage and if appropriate index exists then it is used to optimize the query...

For example job_level=4  may reduce the number of investigated records significantly. Engine looks into index and retrieves just records having job_level=4. The result is then combined with records for department=account condition etc. etc. and the result is no need for sequential data comparison because the index is organised as a tree hierarchy and to find certain value is very fast.

Of course, optimization does not work in all possible situations. E.g. if all your records (or 95% of them) would have job_level=4 then engine cannot optimize this part of the WHERE clause etc.

Another optimization is necessary when you join two or more tables or when you have subqueries. Each data engine has its own optimization algorithm which is secret and still under development obviously.

If you need some simple optimization description then you may look how Visual FoxPro optimizes queries:

Author Comment

ID: 39924699
ok I understand indexing is one way of doing it.
So lets say if a row has 10 columns.
A SQL query can do 9 combinations WHERE X=, WHERE Y-,......
one way is create a key (lets say some hash value of these 9 values)

The problem is there be 9! possible key. Does the database stores 9! keys and corresponding record as value? This would require so much memroy

ALSO in my case value of these columns are server specific pre-define and not user-value

Author Comment

ID: 39924737
ALSO in my case value of these columns are server specific pre-define and not user-value
so the user can do a crazy combination of 9 WHERE clause
LVL 43

Accepted Solution

pcelba earned 2000 total points
ID: 39924747
Database designers must analyze queries used and create appropriate indexes. (Some data engines can do it automatically.)

It is not necessary to create indexes for all possible data combinations but the query time is significantly reduced even when just part of the WHERE clause is optimized. The rest is then processed sequentially.

So if you have
WHERE x=1 AND y=2 OR z="text" etc. etc.
engine could use just index on x column and index on y column. (No combination is necessary, two separate indexes are OK.)

If I simplify it each index contains the indexed values and pointers to original data. SQL engine can count the number of x values equal to 1 and retrieve a set of pointers then the engine can count number of y values equal to 2 and combine previously obtained pointers with this new set etc. etc. If the number of pointers is too large then the engine may decide not to use such index for optimization. If the number is small then engine can decide to stop further optimization.

The pointers combination depends on operators used.
x=1 AND y=2  means you have to do intersection between two sets of pointers
OR operator means you have to do an union operation.

And to do an intersection with a small number of pointers is very fast. FoxPro does not use pointers but single bits which is even better...
LVL 74

Expert Comment

ID: 39924796
If you want to really get into the internals of the optimizer and how it makes its decisions  I suggest reading the blog of its authors.    It's not updated frequently but the information is tremendous.


Another great resource that IS updated regularly is Jonathan Lewis' blog


And his book Cost-Based Oracle Fundamentals is awesome.

These resources will explain how and why statistics and histograms are gathered and used.  They'll explain join types and join decision trees.  Lewis will show how to read parts of a 10053 trace file, as well as how to generate them.

Author Comment

ID: 39924826
Thank You guys,

I was trying to use this concept of multiple query in my project. I get the whole point I don't have to use 9! keys but I can use 2 or 3 or may be 4 which are the most important. This can have max of 24 Keys and rest I can do sequential search

Author Comment

ID: 39924975
I can think of one more issue. What if I index by  X and Y and Z but user sends request based on other value say WHERE=M

does this mean I scan all entries? I guess there is no definite solution
LVL 43

Expert Comment

ID: 39925000
If user can select data column for the WHERE clause then you have to restrict his possibilities to columns having index or user must be informed about the possible time consuming query parameters due to the sequential data read.

Also remember the fact 100.000 records means nothing obviously so the restriction is suitable for large data only.

Featured Post

Industry Leaders: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Lotus Notes has been used since a very long time as an e-mail client and is very popular because of it's unmatched security. In this article we are going to learn about  RRV Bucket corruption and understand various methods to Fix "RRV Bucket Corrupt…
Windocks is an independent port of Docker's open source to Windows.   This article introduces the use of SQL Server in containers, with integrated support of SQL Server database cloning.
Viewers will learn how to use the INSERT statement to insert data into their tables. It will also introduce the NULL statement, to show them what happens when no value is giving for any given column.
In this video, Percona Solutions Engineer Barrett Chambers discusses some of the basic syntax differences between MySQL and MongoDB. To learn more check out our webinar on MongoDB administration for MySQL DBA: https://www.percona.com/resources/we…

782 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question