Go Premium for a chance to win a PS4. Enter to Win


Yet another performance conundrum

Posted on 2011-09-02
Medium Priority
Last Modified: 2012-05-12
Hi All,

I have posted earlier in regards to this topic and how got a lot of help from the experts. Was hoping for one more piece of advise.

We have 4 MQT's (with refresh deferred) that refresh in parallel. Each of the MQT's have close to 200 million records. I used a lot of advise for improving the performance:

1) Journal are turned off
2) Commitment control is off
3) Indexes are dropped before refresh and created after the refresh
4) We have used service programs to improve efficiency
5) MQT's are refreshed in parallel.
6) All queries in the program use SQE.
8) DEGREE is set to *OPTIMIZE

After all this, before we deployed to our production schema we ran the jobs on a staging region on our production server. The  refresh of all 4 MQT's in parallel took 60 minutes on the production server on the staging schema. Unfortunately, when the job moved to production and ran on the production schema (on the same server as the staging region), the job took more than twice the time (i ran the refresh multiple times to ensure plan caches were built).

One of the differences that i noticed was that the longest running MQT was assigned multiple threads when i do a "Work with Threads" on the staging schema. On production schema, however, this same job is being assigned only thread. These are threads that the system is assigning and is not something i am doing in my program. Is there any way i can control this? Would this be causing the job to perform slower?

Any ideas on what else could be the difference?

Question by:bhagatali
LVL 46

Expert Comment

by:Kent Olsen
ID: 36474559
Hi bhagatali,

Compare the database configurations for both databases.  If the schemas are in separate instances, compare the instance configurations, too.

In the LUW world, you can generate them with the commands:

  db2 get db cfg
  db2 get dbm cfg

I don't know what the equivalent commands are on the AS/400.

Another consideration is the back end storage.  Are the disks used for each of the databases comparable?

LVL 35

Accepted Solution

Gary Patterson earned 2000 total points
ID: 36475692
AS/400 DB2 doesn't support multiple database instances in a single system (or LPAR).  AS/400 DB2 just simply isn't configured the same way as DB2 LUW or mainframe.

However, Kent's advice still applies, to a certain extent.

The concept of a DB2 "region" is a mainframe concept that doesn't apply to the AS/400, so I'm a little confused by your use of the term.  When you say "region", what exactly do you mean?  Are you referring to different LPARs, or just collections of libraries withing the same partition (or unpartitioned system)?

I'm going to assume for now that we are talking about one of more "staging" libraries, and one or more "production" libraries on the same LPAR (on same unpartitioned system).

There are lots of reasons that you may see a difference:

1) Schema differences between the two sets of libraries.
2) Differences in the size of the table in the staging and production libraries.
3) Overall load on the system during the "staging" run vs. the "production" run.
4) Possible differing QAQQQINI settings between jobs.
5) Production and staging jobs running in different subsystems and/or memory pools.
6) Production and staging jobs running under different priorities
7) Production and staging jobs running against differing workloads in the system (system busier when production run happens, for example)

A very likely cause is that I/O activity on the system at the time of the production run was higher than at the time of the staging run.  If the rebuild process is going to be I/O constrained, then the system will start fewer threads, since the extra threads cannot be kept busy if disk arm usage is too high.  I cannot tell you for sure without reviewing comparative performance data.

By the way, is the DB2 Symmetric Multiprocessing (SMP) installed?

You don't mention your OS version, but assuming V5R3 or later, here is the documentation on QAQQQINI PARALLEL_DEGREE *OPTIMIZE from: http://publib.boulder.ibm.com/infocenter/iseries/v5r3/index.jsp?topic=%2Frzajq%2Fcreateqaqqini.htm

The query optimizer can choose to use any number of tasks for either I/O or SMP parallel processing to process the query or database file keyed access path build, rebuild, or maintenance. SMP parallel processing is used only if the system feature, DB2® Symmetric Multiprocessing for OS/400®, is installed. Use of parallel processing and the number of tasks used is determined with respect to the number of processors available in the server, this job has a share of the amount of active memory available in the pool in which the job is run, and whether the expected elapsed time for the query or database file keyed access path build or rebuild is limited by CPU processing or I/O resources. The query optimizer chooses an implementation that minimizes elapsed time based on the job has a share of the memory in the pool.

So available memory is an issue, available CPU is a factor, presence or absence of SMP feature is a factor, IO utilization is a factor, etc.  Other factors that can have an impact on runtime include overall system activity level, QMAXACTLVL system value setting, pool activity levels, competing jobs in the system at runtime, job priorities, and more.

Can you control it?  Well, you can make sure that there is plenty of memory and CPU available during the production run (limit other activity).  I haven't tested it, but it is possible that you can improve build performance by specifying PARALLEL_DEGREE *MAX, but you need to make sure that there is adequate memory available in the pool.  

You definitely need to limit I/O activity, if possible.

It is really tough to give concrete advice without understanding the subsystem, memory, and CPU configuration of your system, plus the workload characteristics of your system during the production run.

Bottom line:  in order for the system to be able to use more threads in these rebuilds, you need three things:  Plenty of memory available to the job, low I/O levels in other jobs (low disk arm utilization, generally), and available processor resources.

- Gary Patterson

Featured Post

Free Tool: Site Down Detector

Helpful to verify reports of your own downtime, or to double check a downed website you are trying to access.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

November 2009 Recently, a question came up in the DB2 forum regarding the date format in DB2 UDB for AS/400.  Apparently in UDB LUW (Linux/Unix/Windows), the date format is a system-wide setting, and is not controlled at the session level.  I'm n…
Recursive SQL in UDB/LUW (it really isn't that hard to do) Recursive SQL is most often used to convert columns to rows or rows to columns.  A previous article described the process of converting rows to columns.  This article will build off of th…
Want to learn how to record your desktop screen without having to use an outside camera. Click on this video and learn how to use the cool google extension called "Screencastify"! Step 1: Open a new google tab Step 2: Go to the left hand upper corn…
Look below the covers at a subform control , and the form that is inside it. Explore properties and see how easy it is to aggregate, get statistics, and synchronize results for your data. A Microsoft Access subform is used to show relevant calcul…
Suggested Courses

783 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question