[Last Call] Learn how to a build a cloud-first strategyRegister Now

x
?
Solved

optimize

Posted on 2007-10-09
13
Medium Priority
?
1,327 Views
Last Modified: 2008-03-16
hi guys

any one knows what optimize  means

select * from ...

where ...

ORDER BY ID  
OPTIMIZE FOR 1 ROW            // what does this mean??
FETCH FIRST 10 ROWS ONLY

any information appreciated

thanks
J
0
Comment
Question by:jaggernat
  • 4
  • 3
  • 3
  • +2
13 Comments
 
LVL 1

Expert Comment

by:wbhinson
ID: 20042220
DB2 usually optimizes queries to retrieve all rows that qualify. But sometimes you want to retrieve only the first row or first few rows.

Use OPTIMIZE FOR 1 ROW  is used to influence the access path. OPTIMIZE FOR 1 ROW tells DB2 to select an access path that returns the first qualifying row quickly.  It should be used typically for small data sets.

Read more here: http://publib.boulder.ibm.com/infocenter/dzichelp/v2r2/index.jsp?topic=/com.ibm.db2.doc.apsg/qipfrr.htm


0
 
LVL 10

Author Comment

by:jaggernat
ID: 20042338
>>>OPTIMIZE FOR 1 ROW tells DB2 to select an access path that returns the first qualifying row quickly

what happens if first qualifying row is returned quickly? i mean how does it improve performace.
why cant i use 'OPTIMIZE FOR 1 ROW ' for large result sets or can i?

thanks for responding!
J
0
 
LVL 1

Assisted Solution

by:wbhinson
wbhinson earned 300 total points
ID: 20042417
It is really about CPU cost and query cost.  By using this option you are minimize the impact on the system as opposed to asking for all ROWS that match which is potentially a much more impactful query from a cost and system CPU utililization.  As a rule of thumb.

Use OPTIMIZE FOR 1 ROW when:
1 - you truly only need the first row from a set of qualified data
2 - you want to force immediate access, with the short route, for the first row and maybe others

You may want to explore the FETCH command more as well.  It would help to know what you are trying to accompolish with the query as to whether this is a good use of it or not.

0
Get your Disaster Recovery as a Service basics

Disaster Recovery as a Service is one go-to solution that revolutionizes DR planning. Implementing DRaaS could be an efficient process, easily accessible to non-DR experts. Learn about monitoring, testing, executing failovers and failbacks to ensure a "healthy" DR environment.

 
LVL 46

Expert Comment

by:Kent Olsen
ID: 20042686
Hi J,

I don't get anything following your link.  The page never displays.


Hi jaggernat,

Here's another link.  I assume that it contains something very similar to what J posted.  It's got a good desription of how/why OPTIMIZE works, though it may be just a tad technical.


Good Luck,
Kent
0
 
LVL 1

Expert Comment

by:wbhinson
ID: 20043005
From IBM's web site:

OPTIMIZE FOR n ROWS Clause

The OPTIMIZE FOR clause declares the intent to retrieve only a subset of the result or to give priority to retrieving only the first few rows. The optimizer can then prefer access plans that minimize the response time for retrieving the first few rows. In addition, the number of rows that are sent to the client as a single block are bounded by the value of "n" in the OPTIMIZE FOR clause. Thus the OPTIMIZE FOR clause affects both how the server retrieves the qualifying rows from the database by the server, and how it returns the qualifying rows to the client.

For example, suppose you are querying the employee table for the employees with the highest salary on a regular basis.

     SELECT LASTNAME,FIRSTNAME,EMPNO,SALARY
     FROM EMPLOYEE
     ORDER BY SALARY DESC You have defined a descending index on the SALARY column. However, since employees are ordered by employee number, the salary index is likely to be very poorly clustered. To avoid many random synchronous I/Os, the optimizer would probably choose to use the list prefetch access method, which requires sorting the row identifiers of all rows that qualify. This sort causes a delay before the first qualifying rows can be returned to the application. To prevent this delay, add the OPTIMIZE FOR clause to the statement as follows:

     SELECT LASTNAME,FIRSTNAME,EMPNO,SALARY
     FROM EMPLOYEE
     ORDER BY SALARY DESC
     OPTIMIZE FOR 20 ROWS In this case, the optimizer probably chooses to use the SALARY index directly because only the twenty employees with the highest salaries are retrieved. Regardless of how many rows might be blocked, a block of rows is returned to the client every twenty rows.

With the OPTIMIZE FOR clause the optimizer favors access plans that avoid bulk operations or interrupt the flow of rows, such as sorts. You are most likely to influence an access path by using OPTIMIZE FOR 1 ROW. Using this clause might have the following effects:

Join sequences with composite inner tables are less likely because they require a temporary table.
The join method might change. A nested loop join is the most likely choice, because it has low overhead cost and is usually more efficient to retrieve a few rows.
An index that matches the ORDER BY clause is more likely because no sort is required for the ORDER BY.
List prefetch is less likely because this access method requires a sort.
Sequential prefetch is less likely because of the understanding that only a small number of rows is required.
In a join query, the table with the columns in the ORDER BY clause is likely to be picked as the outer table if an index on the outer table provides the ordering needed for the ORDER BY clause.
Although the OPTIMIZE FOR clause applies to all optimization levels, it works best for optimization class 3 and higher because classes below 3 use Greedy join enumeration method. This method sometimes results in access plans for multi-table joins that do not lend themselves to quick retrieval of the first few rows.

The OPTIMIZE FOR clause does not prevent you from retrieving all the qualifying rows. If you do retrieve all qualifying rows, the total elapsed time might be significantly greater than if the optimizer had optimized for the entire answer set.

If a packaged application uses the call level interface (DB2 CLI or ODBC), you can use the OPTIMIZEFORNROWS keyword in the db2cli.ini configuration file to have DB2 CLI automatically append an OPTIMIZE FOR clause to the end of each query statement.

When data is selected from nicknames, results may vary depending on data source support. If the data source referenced by the nickname supports the OPTIMIZE FOR clause and the DB2 optimizer pushes down the entire query to the data source, then the clause is generated in the remote SQL sent to the data source. If the data source does not support this clause or if the optimizer decides that the least-cost plan is local execution, the OPTIMIZE FOR clause is applied locally. In this case, the DB2 optimizer prefers access plans that minimize the response time for retrieving the first few rows of a query, but the options available to the optimizer for generating plans are slightly limited and performance gains from the OPTIMIZE FOR clause may be negligible.

If both the FETCH FIRST clause and the OPTIMIZE FOR clause are specified, the lower of the two values affects the communications buffer size. The two values are considered independent of each other for optimization purposes.

0
 
LVL 46

Expert Comment

by:Kent Olsen
ID: 20043069
Hi wbhinson,

That's the exact verbage from the link that I THOUGHT that I'd posted.  :)

It's shaping up to be a long "rest of the week".......


Kent
0
 
LVL 10

Author Comment

by:jaggernat
ID: 20043132
ok, i got it guys
but i still dont understand one thing. When to use OPTIMIZE FOR 1 ROW and when to not?

I am working on a web application where in my back end i am writing a select query and fetching only the first 200 records and displaying those 200 records on screen.

and hence this is how my query looks
select * from ...
where ...
ORDER BY ID  
OPTIMIZE FOR 1 ROW            
FETCH FIRST 201 ROWS ONLY

Therefore, eventhough the query might return 4000 rows but i am fetching the first 200 only
so  is it a good idea to use OPTIMIZE FOR 1 ROW     in above query.

In what scenarios do i AVOID using it? when i have large result sets?not sure..

thanks
0
 
LVL 27

Assisted Solution

by:tliotta
tliotta earned 100 total points
ID: 20043498
jaggernat:

You might use it when a process would be returning rows one at a time for a human to process. Since people (and their interactive response) can be slow compared to the background database, you might want the first row to appear quickly.

The first row appears and the person can take a few seconds to do whatever is needed. In the meantime in the background, the database can get more work done on the rest of the query without causing interactive delay.

If optimization is directed instead, for example, to perform the entire query before returning the first row, the person might need to wait a very long time before seeing a single row on the screen.

Tom
0
 
LVL 1

Expert Comment

by:wbhinson
ID: 20043578
0
 
LVL 13

Expert Comment

by:ghp7000
ID: 20048458
you can use the OPTIMIZE clause in small and large result sets, the impact will be more noticeable in large result sets---why, you may ask-the reason is simple
if you ask for all qualifying rows for a large result set, the dbm will most likely have to perform a sort or use a temp table to write out the result set before returning the data to the user-this means cpu overhead and disk i/o, not to mention bufferpool activity, agent utilization, page swaping etc etc. All of this is avoided when the optimize clause is specified becausethe result is immediately returned to the user as soon as that first row is retrieved, it is similar to using fetch first n rows only, except that with fetch, the query is processed in full and only the first n rows are returned.
0
 
LVL 46

Accepted Solution

by:
Kent Olsen earned 400 total points
ID: 20048516
Hi jaggernat,

If you're going to fetch 200 rows, you're probably best served with an 'OPTIMIZE FOR 200 ROWS' clause.


Kent
0
 
LVL 13

Expert Comment

by:ghp7000
ID: 20056622
in other words, optimize is related to row blocking, that means fetching of the data buffers and is tied to the communication buffers, but I  would think that it does also have an impact on the bufferpool.
0

Featured Post

Nothing ever in the clear!

This technical paper will help you implement VMware’s VM encryption as well as implement Veeam encryption which together will achieve the nothing ever in the clear goal. If a bad guy steals VMs, backups or traffic they get nothing.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

November 2009 Recently, a question came up in the DB2 forum regarding the date format in DB2 UDB for AS/400.  Apparently in UDB LUW (Linux/Unix/Windows), the date format is a system-wide setting, and is not controlled at the session level.  I'm n…
Recursive SQL in UDB/LUW (it really isn't that hard to do) Recursive SQL is most often used to convert columns to rows or rows to columns.  A previous article described the process of converting rows to columns.  This article will build off of th…
this video summaries big data hadoop online training demo (http://onlineitguru.com/big-data-hadoop-online-training-placement.html) , and covers basics in big data hadoop .
This lesson discusses how to use a Mainform + Subforms in Microsoft Access to find and enter data for payments on orders. The sample data comes from a custom shop that builds and sells movable storage structures that are delivered to your property. …
Suggested Courses

834 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question