What DB2 bufferpool strategy would you use for OLTP vs. OLAP/data warehouse databases?

Dear gurus,

Can I get some opinions and comments on strategy and best practices for defining bufferpools for an online transaction processing (OLTP) system vs. a data warehouse/OLAP/DSS system?

All comments, links, etc. welcomed.

Thanks

Enyimba
 
Kent Olsen (Data Warehouse Architect / DBA) commented:
I always start with default tuning of the bufferpools.  If I had created a monstrous data warehouse for a Home Depot, Walmart, etc. I might do differently, but the built-in tuning has really matured over the years, so I assume that DB2 has a pretty good idea about what it needs to do.
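A minimal sketch of what leaning on that built-in tuning can look like, assuming the self-tuning memory manager (STMM) of DB2 9 and later; the database name SALESDB is a placeholder:

    -- Run from the DB2 command line processor; SALESDB is a placeholder name
    UPDATE DB CFG FOR SALESDB USING SELF_TUNING_MEM ON;

    -- Let STMM grow and shrink the default bufferpool on its own
    ALTER BUFFERPOOL IBMDEFAULTBP SIZE AUTOMATIC;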

And memory is so cheap these days that one could easily have a 128 GB server that supports approximately 100 GB of bufferpool area.  That's huge.  An awful lot of databases become "memory resident" with that much memory.  Design almost becomes secondary, as the only page faults are write faults as modified pages are flushed back to disk.  I can buy that server a lot more cheaply than I can put an architect on site for 24 months.

But that doesn't really answer your question.

Activity on an OLTP system leans heavily toward write operations.  Most of the writes are sequential (time-based) writes to the transaction table.  Peripheral tables may be modified occasionally, perhaps even with some regularity, but the lion's share of the writes just add rows to the transaction table.  Each write operation also needs to update the indexes.  The transaction timestamp would be one likely index; other indexes depend on the design and application of the database.  So the first tuning step would be to force the indexes into their own tablespace and make sure there's plenty of bufferpool assigned to it (a sketch follows).
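A minimal sketch of that first tuning step, assuming DB2 LUW with automatic storage; the bufferpool, tablespace, table, and column names are all placeholders:

    -- Dedicated bufferpool for index pages (262144 * 4K pages = 1 GB)
    CREATE BUFFERPOOL BP_IDX SIZE 262144;

    -- Separate tablespaces for data and indexes; the index tablespace
    -- is backed by the dedicated bufferpool
    CREATE TABLESPACE TS_DATA MANAGED BY AUTOMATIC STORAGE;
    CREATE TABLESPACE TS_IDX  MANAGED BY AUTOMATIC STORAGE BUFFERPOOL BP_IDX;

    -- Route the transaction table's indexes into TS_IDX
    CREATE TABLE TRANSACTIONS (
        TX_ID   BIGINT        NOT NULL GENERATED ALWAYS AS IDENTITY,
        TX_TS   TIMESTAMP     NOT NULL,
        AMOUNT  DECIMAL(12,2)
    ) IN TS_DATA INDEX IN TS_IDX;

    -- The likely timestamp index mentioned above
    CREATE INDEX IX_TX_TS ON TRANSACTIONS (TX_TS);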

Activity on an OLAP system leans heavily toward read operations.  And if the OLAP is a star schema data warehouse, almost all of the activity is in the index pages.  Essentially, DB2 will join the indexes of the dimension tables (where it can) and read the fact table last.  If the relevant data is contained in an index that's part of the query, DB2 will not bother reading the fact table and will build the result set(s) from the data in the indexes.  Assuming that the fact table is derived from the OLTP transaction table, the indexes in the OLAP are often smaller than those in the OLTP; the transaction table often has a single index on the IDENTITY column.  So again, it's advantageous to assign plenty of bufferpool space to the indexes.

The tricky part is how much to assign to the fact table.  That depends on the design and the reporting habits.  Is there enough bufferpool space for the indexes and the current month's, quarter's, or year's worth of transaction data?  If so, the first query against the database reads and caches the indexes and transaction data, and the second query uses the cached data and can be lightning fast (a sizing sketch follows).
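A back-of-the-envelope sketch of that sizing question, with purely hypothetical numbers and names:

    -- Hypothetical sizing: indexes ~6 GB plus one quarter of fact data
    -- ~18 GB; at a 4K page size that is 24 GB / 4 KB = 6291456 pages
    CREATE BUFFERPOOL BP_FACT SIZE 6291456;

    -- Back the fact tablespace with it so the first query warms the
    -- cache and later queries read from memory
    CREATE TABLESPACE TS_FACT MANAGED BY AUTOMATIC STORAGE BUFFERPOOL BP_FACT;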

Sidebar -- Several years ago I built a data warehouse from an OLTP.  A set of regularly run reports each ran in 1:31 to 1:35 (minutes:seconds) on the OLTP, no matter what order they were run in.  When run on the OLAP, the first report ran in about 6 seconds, and the others were sub-second, almost instantaneous.  That's because the first report cached all the data used by each of the reports; only the first report did physical reads from disk.  The six reports in the set ran for just over 9 minutes on the OLTP and about 7 seconds on the OLAP.

So the key in both models is to get the caching of the indexes right.  Then adjust the bufferpools for the data pages.  That adjustment requires knowing the design and usage of the database; the monitoring sketch below shows one way to check where you stand.
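One way to check whether the index caching is right, sketched with the MON_GET_BUFFERPOOL table function (available from DB2 9.7 on); index hit ratios near 100% mean the index pages are staying resident:

    -- Index and data hit ratios per bufferpool: the share of logical
    -- reads satisfied from memory rather than disk
    SELECT BP_NAME,
           DECIMAL(100 * (1 - DOUBLE(POOL_INDEX_P_READS) /
                              NULLIF(POOL_INDEX_L_READS, 0)), 5, 2) AS INDEX_HIT_PCT,
           DECIMAL(100 * (1 - DOUBLE(POOL_DATA_P_READS) /
                              NULLIF(POOL_DATA_L_READS, 0)), 5, 2) AS DATA_HIT_PCT
    FROM TABLE(MON_GET_BUFFERPOOL(NULL, -2)) AS BP;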


Kent
 
Kent Olsen (Data Warehouse Architect / DBA) commented:
Hi Enyimba,

Is this a hypothetical question or do you have a real need to do this?


Kent
 
Enyimba (Author) commented:
A real need to do it...you see, I consult on various engagements and have a fair bit of experience.  But I place high value on, and respect for, the experience I glean from talking and interacting with seasoned gurus like yourself and many others, primarily on IBM DB2 products.

For every good or elementary question I ask, I probably have my own opinion, but I always like to vet it against the opinions of people whose knowledge I consider superior to my own.  I do not have any "complex"; I just happen to think that at any time in life "...there are greater and lesser persons than myself."  I don't remember where the quote comes from, but I think it holds true for anybody.

So, can I get an answer from you?

Enyimba
 
waynezhu commented:
My own experience:
-- For OLTP, the bufferpools don't need to be very big; they should just be reasonably large enough to hold the most active data, given the nature of OLTP's small, quick queries.
-- For DW/OLAP, since reporting queries touch lots of data, you can allocate more memory to the bufferpools; a resizing sketch follows.
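A minimal sketch along those lines; the bufferpool names and page counts are hypothetical:

    -- OLTP: a modest pool, just big enough for the hot working set
    -- (131072 * 4K pages = 512 MB)
    ALTER BUFFERPOOL BP_OLTP SIZE 131072;

    -- DW/OLAP: a much larger pool for scan-heavy reporting queries
    -- (4194304 * 4K pages = 16 GB)
    ALTER BUFFERPOOL BP_DW SIZE 4194304;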
 
waynezhu commented:
I believe kdo and I have provided enough information to answer the question.