Solved

complex sql query and performace

Posted on 2014-03-12
11
419 Views
Last Modified: 2014-05-07
I am trying to understand how does databases like Oracle handles sql query that has many where (and) clause specially when databases has 1000's of entries.

Does it break it many smaller queris and then merge or something else?

example) consder a DB that has 1000 records
 give me all records if department=account, AND age>=50 and job_level=4.

simple ugly way is to compare sequential all 1000 records but that would be bad if we have million entries. Does the Database maintains some additional structures also for faster processing
0
Comment
Question by:perlperl
11 Comments
 
LVL 29

Expert Comment

by:MikeOM_DBA
ID: 39924595
Check all about Optimizer here
0
 
LVL 22

Expert Comment

by:Steve Wales
ID: 39924604
Relational Database Management Systems (RDBMS) usually have a thing called the Optimizer inside the engine.  Tables usually (or should) have indexes on them.  Indexes are effectively access paths to data.  

Heavily queried patterns should have indexes on them - that way the optimizer (which knows about the indexes) can find the required rows quickly.

Sometimes there's no choice but to scan the entire table, since you can't access everything though.

It's a fairly broad question, but at a very high level that's what happens.

You can do an EXPLAIN PLAN for any query in Oacle to find out what goes on under the covers and it will tell you what the optimizer worked out and how it is going about finding the data.
0
 

Author Comment

by:perlperl
ID: 39924656
>>Sometimes there's no choice but to scan the entire table, since you can't access everything though.

So consider if a database has million users (which is not primary key) and I query my record (which lets say is not not common as I haven't queried for month), does that mean it has to scan all million entries to match WHERE clause of my username? that would be disaster in case of DB
0
 
LVL 41

Expert Comment

by:pcelba
ID: 39924658
The basic optimization uses indexes obviously - the query is analysed for possible index usage and if appropriate index exists then it is used to optimize the query...

For example job_level=4  may reduce the number of investigated records significantly. Engine looks into index and retrieves just records having job_level=4. The result is then combined with records for department=account condition etc. etc. and the result is no need for sequential data comparison because the index is organised as a tree hierarchy and to find certain value is very fast.

Of course, optimization does not work in all possible situations. E.g. if all your records (or 95% of them) would have job_level=4 then engine cannot optimize this part of the WHERE clause etc.

Another optimization is necessary when you join two or more tables or when you have subqueries. Each data engine has its own optimization algorithm which is secret and still under development obviously.

If you need some simple optimization description then you may look how Visual FoxPro optimizes queries:
http://www.universalthread.com/ViewPageArticle.aspx?ID=736
http://msdn.microsoft.com/en-us/library/1f5d2sa3(v=vs.80).aspx
http://fox.wikis.com/wc.dll?Wiki~UnderstandingRushmore
0
 

Author Comment

by:perlperl
ID: 39924699
ok I understand indexing is one way of doing it.
So lets say if a row has 10 columns.
A SQL query can do 9 combinations WHERE X=, WHERE Y-,......
one way is create a key (lets say some hash value of these 9 values)

The problem is there be 9! possible key. Does the database stores 9! keys and corresponding record as value? This would require so much memroy

ALSO in my case value of these columns are server specific pre-define and not user-value
0
Backup Your Microsoft Windows Server®

Backup all your Microsoft Windows Server – on-premises, in remote locations, in private and hybrid clouds. Your entire Windows Server will be backed up in one easy step with patented, block-level disk imaging. We achieve RTOs (recovery time objectives) as low as 15 seconds.

 

Author Comment

by:perlperl
ID: 39924737
ALSO in my case value of these columns are server specific pre-define and not user-value
so the user can do a crazy combination of 9 WHERE clause
0
 
LVL 41

Accepted Solution

by:
pcelba earned 500 total points
ID: 39924747
Database designers must analyze queries used and create appropriate indexes. (Some data engines can do it automatically.)

It is not necessary to create indexes for all possible data combinations but the query time is significantly reduced even when just part of the WHERE clause is optimized. The rest is then processed sequentially.

So if you have
WHERE x=1 AND y=2 OR z="text" etc. etc.
engine could use just index on x column and index on y column. (No combination is necessary, two separate indexes are OK.)

If I simplify it each index contains the indexed values and pointers to original data. SQL engine can count the number of x values equal to 1 and retrieve a set of pointers then the engine can count number of y values equal to 2 and combine previously obtained pointers with this new set etc. etc. If the number of pointers is too large then the engine may decide not to use such index for optimization. If the number is small then engine can decide to stop further optimization.

The pointers combination depends on operators used.
x=1 AND y=2  means you have to do intersection between two sets of pointers
OR operator means you have to do an union operation.

And to do an intersection with a small number of pointers is very fast. FoxPro does not use pointers but single bits which is even better...
0
 
LVL 73

Expert Comment

by:sdstuber
ID: 39924796
If you want to really get into the internals of the optimizer and how it makes its decisions  I suggest reading the blog of its authors.    It's not updated frequently but the information is tremendous.

https://blogs.oracle.com/optimizer/


Another great resource that IS updated regularly is Jonathan Lewis' blog

http://jonathanlewis.wordpress.com/

And his book Cost-Based Oracle Fundamentals is awesome.

These resources will explain how and why statistics and histograms are gathered and used.  They'll explain join types and join decision trees.  Lewis will show how to read parts of a 10053 trace file, as well as how to generate them.
0
 

Author Comment

by:perlperl
ID: 39924826
Thank You guys,

I was trying to use this concept of multiple query in my project. I get the whole point I don't have to use 9! keys but I can use 2 or 3 or may be 4 which are the most important. This can have max of 24 Keys and rest I can do sequential search
0
 

Author Comment

by:perlperl
ID: 39924975
I can think of one more issue. What if I index by  X and Y and Z but user sends request based on other value say WHERE=M

does this mean I scan all entries? I guess there is no definite solution
0
 
LVL 41

Expert Comment

by:pcelba
ID: 39925000
If user can select data column for the WHERE clause then you have to restrict his possibilities to columns having index or user must be informed about the possible time consuming query parameters due to the sequential data read.

Also remember the fact 100.000 records means nothing obviously so the restriction is suitable for large data only.
0

Featured Post

Maximize Your Threat Intelligence Reporting

Reporting is one of the most important and least talked about aspects of a world-class threat intelligence program. Here’s how to do it right.

Join & Write a Comment

Slowly Changing Dimension Transformation component in data task flow is very useful for us to manage and control how data changes in SSIS.
Ever wondered why sometimes your SQL Server is slow or unresponsive with connections spiking up but by the time you go in, all is well? The following article will show you how to install and configure a SQL job that will send you email alerts includ…
This video explains what a user managed backup is and shows how to take one, providing a couple of simple example scripts.
Viewers will learn how to use the SELECT statement in SQL and will be exposed to the many uses the SELECT statement has.

744 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

11 Experts available now in Live!

Get 1:1 Help Now