complex sql query and performace

Posted on 2014-03-12
Last Modified: 2014-05-07
I am trying to understand how does databases like Oracle handles sql query that has many where (and) clause specially when databases has 1000's of entries.

Does it break it many smaller queris and then merge or something else?

example) consder a DB that has 1000 records
 give me all records if department=account, AND age>=50 and job_level=4.

simple ugly way is to compare sequential all 1000 records but that would be bad if we have million entries. Does the Database maintains some additional structures also for faster processing
Question by:perlperl
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
LVL 29

Expert Comment

ID: 39924595
Check all about Optimizer here
LVL 22

Expert Comment

by:Steve Wales
ID: 39924604
Relational Database Management Systems (RDBMS) usually have a thing called the Optimizer inside the engine.  Tables usually (or should) have indexes on them.  Indexes are effectively access paths to data.  

Heavily queried patterns should have indexes on them - that way the optimizer (which knows about the indexes) can find the required rows quickly.

Sometimes there's no choice but to scan the entire table, since you can't access everything though.

It's a fairly broad question, but at a very high level that's what happens.

You can do an EXPLAIN PLAN for any query in Oacle to find out what goes on under the covers and it will tell you what the optimizer worked out and how it is going about finding the data.

Author Comment

ID: 39924656
>>Sometimes there's no choice but to scan the entire table, since you can't access everything though.

So consider if a database has million users (which is not primary key) and I query my record (which lets say is not not common as I haven't queried for month), does that mean it has to scan all million entries to match WHERE clause of my username? that would be disaster in case of DB
Upcoming Webinar: Securing your MySQL/MariaDB data

Join Percona’s Chief Evangelist, Colin Charles as he presents Securing your MySQL®/MariaDB® data on Tuesday, July 11, 2017 at 7:00 am PDT / 10:00 am EDT (UTC-7).

LVL 42

Expert Comment

ID: 39924658
The basic optimization uses indexes obviously - the query is analysed for possible index usage and if appropriate index exists then it is used to optimize the query...

For example job_level=4  may reduce the number of investigated records significantly. Engine looks into index and retrieves just records having job_level=4. The result is then combined with records for department=account condition etc. etc. and the result is no need for sequential data comparison because the index is organised as a tree hierarchy and to find certain value is very fast.

Of course, optimization does not work in all possible situations. E.g. if all your records (or 95% of them) would have job_level=4 then engine cannot optimize this part of the WHERE clause etc.

Another optimization is necessary when you join two or more tables or when you have subqueries. Each data engine has its own optimization algorithm which is secret and still under development obviously.

If you need some simple optimization description then you may look how Visual FoxPro optimizes queries:

Author Comment

ID: 39924699
ok I understand indexing is one way of doing it.
So lets say if a row has 10 columns.
A SQL query can do 9 combinations WHERE X=, WHERE Y-,......
one way is create a key (lets say some hash value of these 9 values)

The problem is there be 9! possible key. Does the database stores 9! keys and corresponding record as value? This would require so much memroy

ALSO in my case value of these columns are server specific pre-define and not user-value

Author Comment

ID: 39924737
ALSO in my case value of these columns are server specific pre-define and not user-value
so the user can do a crazy combination of 9 WHERE clause
LVL 42

Accepted Solution

pcelba earned 500 total points
ID: 39924747
Database designers must analyze queries used and create appropriate indexes. (Some data engines can do it automatically.)

It is not necessary to create indexes for all possible data combinations but the query time is significantly reduced even when just part of the WHERE clause is optimized. The rest is then processed sequentially.

So if you have
WHERE x=1 AND y=2 OR z="text" etc. etc.
engine could use just index on x column and index on y column. (No combination is necessary, two separate indexes are OK.)

If I simplify it each index contains the indexed values and pointers to original data. SQL engine can count the number of x values equal to 1 and retrieve a set of pointers then the engine can count number of y values equal to 2 and combine previously obtained pointers with this new set etc. etc. If the number of pointers is too large then the engine may decide not to use such index for optimization. If the number is small then engine can decide to stop further optimization.

The pointers combination depends on operators used.
x=1 AND y=2  means you have to do intersection between two sets of pointers
OR operator means you have to do an union operation.

And to do an intersection with a small number of pointers is very fast. FoxPro does not use pointers but single bits which is even better...
LVL 74

Expert Comment

ID: 39924796
If you want to really get into the internals of the optimizer and how it makes its decisions  I suggest reading the blog of its authors.    It's not updated frequently but the information is tremendous.

Another great resource that IS updated regularly is Jonathan Lewis' blog

And his book Cost-Based Oracle Fundamentals is awesome.

These resources will explain how and why statistics and histograms are gathered and used.  They'll explain join types and join decision trees.  Lewis will show how to read parts of a 10053 trace file, as well as how to generate them.

Author Comment

ID: 39924826
Thank You guys,

I was trying to use this concept of multiple query in my project. I get the whole point I don't have to use 9! keys but I can use 2 or 3 or may be 4 which are the most important. This can have max of 24 Keys and rest I can do sequential search

Author Comment

ID: 39924975
I can think of one more issue. What if I index by  X and Y and Z but user sends request based on other value say WHERE=M

does this mean I scan all entries? I guess there is no definite solution
LVL 42

Expert Comment

ID: 39925000
If user can select data column for the WHERE clause then you have to restrict his possibilities to columns having index or user must be informed about the possible time consuming query parameters due to the sequential data read.

Also remember the fact 100.000 records means nothing obviously so the restriction is suitable for large data only.

Featured Post

Why You Need a DevOps Toolchain

IT needs to deliver services with more agility and velocity. IT must roll out application features and innovations faster to keep up with customer demands, which is where a DevOps toolchain steps in. View the infographic to see why you need a DevOps toolchain.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Azure Functions is a solution for easily running small pieces of code, or "functions," in the cloud. This article shows how to create one of these functions to write directly to Azure Table Storage.
In this blog post, we’ll look at how ClickHouse performs in a general analytical workload using the star schema benchmark test.
Polish reports in Access so they look terrific. Take yourself to another level. Equations, Back Color, Alternate Back Color. Write easy VBA Code. Tighten space to use less pages. Launch report from a menu, considering criteria only when it is filled…
This is a high-level webinar that covers the history of enterprise open source database use. It addresses both the advantages companies see in using open source database technologies, as well as the fears and reservations they might have. In this…

726 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question