Query Performance Concepts

k_murli_krishna
k_murli_krishna used Ask the Experts™
on
When we fire a query will only fetched data go into bufferpool OR scanned/joined data OR grouped/sorted data OR all tables entire data in FROM clause. Will the query be stored in bufferpool and the fetched data mapped to the query? What happens if data involved in query is modified? On what condition/situation the hard disk is used for query when it is fired second, third etc. time? These are all related questions, hence I am asking under a common title & have given suitable points as well.
Comment
Watch Question

Do more with

Expert Office
EXPERT OFFICE® is a registered trademark of EXPERTS EXCHANGE®
Hi k_murli_kirshna,

The bufferpool is basically a memory cache for database objects.  DB2 tries to keep as much of the "active" data in the bufferpool as possible.  "Active" data includes index pages and data pages from both permanent and derived tables.

Ideally, the bufferpool should be sized as large as is practical so that DB2 can keep as much in the bufferpool as possible.  But care needs to be taken that you don't define the bufferpool so large that it interferes with the paging of the underlying operating system.  If the bufferpool is so large the it causes excessive paging you've stepped backwards.

The bufferpool management is a bit of magic.  DB2 keeps track of items in the bufferpool and when it needs to free up space in the bufferpool for new items, will delete from the bufferpool those items that it expects to have the least impact on the system.  The items that have been the least recently used are the most likely candidates to be purged from the bufferpool, but the last usage time isn't the only criteria.  Data pages are typically accessed less frequently than index pages.  Some of the system tables should never be flushed from the bufferpool, etc.

There is some magic there, but tuning what you can control so that DB2 has as much memory without causing thrashing is usually the most efficient.


Kent
k_murli_krishna:

You tagged your question with both openSUSE Linux 10.2 and DB2 V5R3 on iseries AS/400.

Be aware that elements such as 'bufferpools' are fairly low-level implementation details of DB2. They are not particularly parts of DB2, but rather underlying elements that support DB2.

This is important because DB2 on AS/400s has no direct need for those nor any way to affect them directly. They are handled automatically as part of general work management of the entire system. In general, changes to work management affect all tasks in a given subsystem, not just DB2.

"Performance tuning" of DB2 on AS/400s consists more of proper database design in terms of tables, views, indexes, relationships, etc., along with proper construction of the queries that interact with the database.

Tom

Author

Commented:
Thanks, kent and tom. You mentioned good stuff about buffer pool etc. with DB2 on Linux and AS/400. Now consider that we are talking about only Linux DB2. I asked w.r.t. a query. I am repeating my question since it is not that long. Please try to answer specifically to each part.

When we fire a query will only fetched data go into bufferpool OR scanned/joined data OR grouped/sorted data OR all tables entire data in FROM clause. Will the query be stored in bufferpool and the fetched data mapped to the query? What happens if data involved in query is modified? On what condition/situation the hard disk is used for query when it is fired second, third etc. time?
Exploring SharePoint 2016

Explore SharePoint 2016, the web-based, collaborative platform that integrates with Microsoft Office to provide intranets, secure document management, and collaboration so you can develop your online and offline capabilities.

i don't completly understand your question, but i think i know the answer.. :-)

when you invoke a query, db2 finds the pages it needs to bring to the bufferpool (either by tablespace scan or index scan - does not matter right now), when a page that is brought to the buffer is relevant for your query, db2 places a lock on that page, so that the data is not updated until it is returned to your application (that is not entirly correct, it depends on the isolation level you choose).
The data in the bufferpool is not associated to your query, only the lock is. So for that instance, if you have a table which is central and many queries access it, chances are that it will stay in the bufferpool all the time.

A page that is updated, is written to the containers according to several thresholds such as the checkpoint interval, when a certain percent of the buffer is occupied by updated pages etc...

Author

Commented:
Thanks, momi. One question remains. We have SELECT, FROM, JOIN, ON, WHERE, GROUP BY, ORDER BY. Now when buffer pool does not contain any data/index pages of the table i.e. query is fired for the first time, what all data gets into buffer pool provided there is plenty of space i.e. fetched, joined, scanned, grouped, ordered - Will all this data go into buffer pool whether table/index scan.
again, i'm not sure don't understand the question, but this is my answer( i hope)

the first thing db2 does is read the data into the bufferpool, the only way db2 can eliminate some of the table data before it reads it into the bufferpool is by using an index (which has to be read to the buffer pool... :-))
only after the data is in the bufferpool, db2 is able to perform join / group by / order by  etc... (again, unless order by and group by can be eliminated by using an index)

hope this answers your question

Author

Commented:
Thanks. Much more clear now.

Do more with

Expert Office
Submit tech questions to Ask the Experts™ at any time to receive solutions, advice, and new ideas from leading industry professionals.

Start 7-Day Free Trial