complex sql query and performace

Posted on 2014-03-12
Last Modified: 2014-05-07
I am trying to understand how does databases like Oracle handles sql query that has many where (and) clause specially when databases has 1000's of entries.

Does it break it many smaller queris and then merge or something else?

example) consder a DB that has 1000 records
 give me all records if department=account, AND age>=50 and job_level=4.

simple ugly way is to compare sequential all 1000 records but that would be bad if we have million entries. Does the Database maintains some additional structures also for faster processing
Question by:perlperl
LVL 29

Expert Comment

ID: 39924595
Check all about Optimizer here
LVL 22

Expert Comment

by:Steve Wales
ID: 39924604
Relational Database Management Systems (RDBMS) usually have a thing called the Optimizer inside the engine.  Tables usually (or should) have indexes on them.  Indexes are effectively access paths to data.  

Heavily queried patterns should have indexes on them - that way the optimizer (which knows about the indexes) can find the required rows quickly.

Sometimes there's no choice but to scan the entire table, since you can't access everything though.

It's a fairly broad question, but at a very high level that's what happens.

You can do an EXPLAIN PLAN for any query in Oacle to find out what goes on under the covers and it will tell you what the optimizer worked out and how it is going about finding the data.

Author Comment

ID: 39924656
>>Sometimes there's no choice but to scan the entire table, since you can't access everything though.

So consider if a database has million users (which is not primary key) and I query my record (which lets say is not not common as I haven't queried for month), does that mean it has to scan all million entries to match WHERE clause of my username? that would be disaster in case of DB
What is SQL Server and how does it work?

The purpose of this paper is to provide you background on SQL Server. It’s your self-study guide for learning fundamentals. It includes both the history of SQL and its technical basics. Concepts and definitions will form the solid foundation of your future DBA expertise.

LVL 42

Expert Comment

ID: 39924658
The basic optimization uses indexes obviously - the query is analysed for possible index usage and if appropriate index exists then it is used to optimize the query...

For example job_level=4  may reduce the number of investigated records significantly. Engine looks into index and retrieves just records having job_level=4. The result is then combined with records for department=account condition etc. etc. and the result is no need for sequential data comparison because the index is organised as a tree hierarchy and to find certain value is very fast.

Of course, optimization does not work in all possible situations. E.g. if all your records (or 95% of them) would have job_level=4 then engine cannot optimize this part of the WHERE clause etc.

Another optimization is necessary when you join two or more tables or when you have subqueries. Each data engine has its own optimization algorithm which is secret and still under development obviously.

If you need some simple optimization description then you may look how Visual FoxPro optimizes queries:

Author Comment

ID: 39924699
ok I understand indexing is one way of doing it.
So lets say if a row has 10 columns.
A SQL query can do 9 combinations WHERE X=, WHERE Y-,......
one way is create a key (lets say some hash value of these 9 values)

The problem is there be 9! possible key. Does the database stores 9! keys and corresponding record as value? This would require so much memroy

ALSO in my case value of these columns are server specific pre-define and not user-value

Author Comment

ID: 39924737
ALSO in my case value of these columns are server specific pre-define and not user-value
so the user can do a crazy combination of 9 WHERE clause
LVL 42

Accepted Solution

pcelba earned 500 total points
ID: 39924747
Database designers must analyze queries used and create appropriate indexes. (Some data engines can do it automatically.)

It is not necessary to create indexes for all possible data combinations but the query time is significantly reduced even when just part of the WHERE clause is optimized. The rest is then processed sequentially.

So if you have
WHERE x=1 AND y=2 OR z="text" etc. etc.
engine could use just index on x column and index on y column. (No combination is necessary, two separate indexes are OK.)

If I simplify it each index contains the indexed values and pointers to original data. SQL engine can count the number of x values equal to 1 and retrieve a set of pointers then the engine can count number of y values equal to 2 and combine previously obtained pointers with this new set etc. etc. If the number of pointers is too large then the engine may decide not to use such index for optimization. If the number is small then engine can decide to stop further optimization.

The pointers combination depends on operators used.
x=1 AND y=2  means you have to do intersection between two sets of pointers
OR operator means you have to do an union operation.

And to do an intersection with a small number of pointers is very fast. FoxPro does not use pointers but single bits which is even better...
LVL 74

Expert Comment

ID: 39924796
If you want to really get into the internals of the optimizer and how it makes its decisions  I suggest reading the blog of its authors.    It's not updated frequently but the information is tremendous.

Another great resource that IS updated regularly is Jonathan Lewis' blog

And his book Cost-Based Oracle Fundamentals is awesome.

These resources will explain how and why statistics and histograms are gathered and used.  They'll explain join types and join decision trees.  Lewis will show how to read parts of a 10053 trace file, as well as how to generate them.

Author Comment

ID: 39924826
Thank You guys,

I was trying to use this concept of multiple query in my project. I get the whole point I don't have to use 9! keys but I can use 2 or 3 or may be 4 which are the most important. This can have max of 24 Keys and rest I can do sequential search

Author Comment

ID: 39924975
I can think of one more issue. What if I index by  X and Y and Z but user sends request based on other value say WHERE=M

does this mean I scan all entries? I guess there is no definite solution
LVL 42

Expert Comment

ID: 39925000
If user can select data column for the WHERE clause then you have to restrict his possibilities to columns having index or user must be informed about the possible time consuming query parameters due to the sequential data read.

Also remember the fact 100.000 records means nothing obviously so the restriction is suitable for large data only.

Featured Post

Netscaler Common Configuration How To guides

If you use NetScaler you will want to see these guides. The NetScaler How To Guides show administrators how to get NetScaler up and configured by providing instructions for common scenarios and some not so common ones.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Checking the Alert Log in AWS RDS Oracle can be a pain through their user interface.  I made a script to download the Alert Log, look for errors, and email me the trace files.  In this article I'll describe what I did and share my script.
Ever needed a SQL 2008 Database replicated/mirrored/log shipped on another server but you can't take the downtime inflicted by initial snapshot or disconnect while T-logs are restored or mirror applied? You can use SQL Server Initialize from Backup…
This video shows how to copy an entire tablespace from one database to another database using Transportable Tablespace functionality.
Polish reports in Access so they look terrific. Take yourself to another level. Equations, Back Color, Alternate Back Color. Write easy VBA Code. Tighten space to use less pages. Launch report from a menu, considering criteria only when it is filled…

830 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question