?
Solved

Howto get rid of oracle "message from client" wait time?

Posted on 2008-11-11
17
Medium Priority
?
1,276 Views
Last Modified: 2013-12-19
Case: Websphere talking to Oracle server (both unix) at its busiest both not more then 20% CPU in use each.

What parameter should we tweak/tune to get rid of the "76807.81 Total Waited" in 2,5 hours?
call     count       cpu    elapsed       disk      query    current        rows
 
------- ------  -------- ---------- ---------- ---------- ----------  ----------
 
Parse      202      0.09       0.10          0          0          0           0
 
Execute 100525     43.63      44.31         47      55742     757408       50858
 
Fetch    77564    206.83     211.67       3273    6049646         91      304311
 
------- ------  -------- ---------- ---------- ---------- ----------  ----------
 
total   178291    250.55     256.09       3320    6105388     757499      355169
 
 
 
Misses in library cache during parse: 48
 
Misses in library cache during execute: 6
 
 
 
Elapsed times include waiting on following events:
 
  Event waited on                             Times   Max. Wait  Total Waited
 
  ----------------------------------------   Waited  ----------  ------------
 
  SQL*Net message to client                  144677        0.00          0.21
 
  SQL*Net message from client                144675       29.43      76807.81
 
  log file sync                                 838        0.02          1.62
 
  latch free                                    134        0.02          0.48
 
  buffer busy waits                              51        0.02          0.25
 
  db file sequential read                      1659        0.02          0.25
 
  SQL*Net more data to client                   298        0.00          0.01
 
  SQL*Net more data from client                1182        0.05          0.08
 
  direct path write (lob)                        68        0.00          0.00
 
  SQL*Net break/reset to client                  74        0.00          0.03
 
  db file scattered read                        678        0.05          2.35
 
  log file switch completion                      2        0.00          0.01
 
  log buffer space                               11        0.01          0.08

Open in new window

0
Comment
Question by:paulqna
  • 8
  • 6
  • 3
17 Comments
 
LVL 74

Expert Comment

by:sdstuber
ID: 22934528
The message is just what it sounds like.

The database is waiting on the client.
The most common cause of it is "people time" or "think time"

If I connect to the database, pull some data back to my screen and then spend 10 minutes reading it you'll accumulate 600 seconds of SQL*Net message from client, because the database is idle and is simply waiting on me to do something.

It can also happen if the client is slow to do processing.
And it can happen in loops, where you pull one row from the database do something to it then pull the next row and do something to it.  Each one action might be fast, but if you do it 144 thousand times the accumulation of all those tiny pauses can be significant.

Given your "to" and "from" wait counts it looks like your application probably works in a loop like that.
The database is able to send data to your app very quickly, so even though you sent data from the db to the client 144K times, your client then paused to do something with that data, most of the time that pause was tiny, with a few spikes, one of those spikes was 29 seconds.  So, in general the client processed each trip quickly but it did it 144675 times.

If you can transmit your data in one large chunk or at least a few large chunks that will help.    Each individual wait will be longer but you won't have to do thousands and thousands of handshaking steps.
0
 
LVL 74

Expert Comment

by:sdstuber
ID: 22934550
Even better, check what the client side of the application is doing.  Is there any processing that could be left on the database?  Far too many applications try to pull data out under the mistaken assumption they can "control" the operation on the client side better when all that really accomplishes is adding an extra performance bottleneck, extra handshaking steps and an extra break point.
0
 
LVL 5

Author Comment

by:paulqna
ID: 22934662
Ok, but on the application side we see the same waiting times in this 2,5 hour timespan, why are they (websphere and oracle) waiting for eachother?

It does confirm the 80% CPU idle time at both ends...

Is it the sqlnet/tcp handshaking 144K times?
0
Veeam Disaster Recovery in Microsoft Azure

Veeam PN for Microsoft Azure is a FREE solution designed to simplify and automate the setup of a DR site in Microsoft Azure using lightweight software-defined networking. It reduces the complexity of VPN deployments and is designed for businesses of ALL sizes.

 
LVL 74

Expert Comment

by:sdstuber
ID: 22934777
it could be network lag,  eliminating that or at least reducing the number of handshakes will help.

There is no "parameter" to adjust, it's simple mathematics of number of steps.  Reduce the steps and you'll be much faster.
0
 
LVL 5

Author Comment

by:paulqna
ID: 22935152
We have just tested on a test environment...
 - same job,
 - same job data
 - slower servers! (both websphere and oracle)
 - same os version
 - same webphere version
 - same application version
 - same oracle version

But there it takes only 30 minutes... CPU load again 20% both on websphere and oracle...

Same result in the stats, 80% "wait for client" again...

Copying files directly between the servers is as fast as should be...

I'm out of options... any clue...?
0
 
LVL 74

Expert Comment

by:sdstuber
ID: 22935291
if you were doing the exact same work in both environments then it must be your network lag
0
 
LVL 27

Expert Comment

by:sujith80
ID: 22936477
Looks like  you are using "connection pooling". is it?
If you have connection pools, the connections will stay open and idle, the idle time will add to the SQL*Net message from client event.

This wait event you see is not as that harmful as such and with a connection pool, I dont think it can be avoided as well.
0
 
LVL 74

Expert Comment

by:sdstuber
ID: 22936623
yes, good catch, that's another great way to artificially inflate the number
0
 
LVL 27

Expert Comment

by:sujith80
ID: 22936632
wow sean, still awake? (lol)
0
 
LVL 74

Expert Comment

by:sdstuber
ID: 22936674
ha, just sat down a little bit ago to do some catchup, I've got a few questions open I need to try to wrap up or at least point them in the right direction.  I really am cutting back on my EE time, I swear!  :)
0
 
LVL 5

Author Comment

by:paulqna
ID: 22937531
I use the connection pooling from websphere.

However I don't understand while both servers have 80% idle time, while waiting for eachother, this "artificially inflate" is harmless and unavoidable...?
0
 
LVL 27

Expert Comment

by:sujith80
ID: 22937686
Yes, the idle time resulted by the "idle" connections in the pool are harmless. In this case it is "just" a number indicating the idle time.

But you need to make sure that your network has a good response time as well.
0
 
LVL 5

Author Comment

by:paulqna
ID: 23084599
I hope all contributors agree that none of the comments have explained how to resolve the issue and I hereby close the question.
0
 
LVL 74

Accepted Solution

by:
sdstuber earned 1000 total points
ID: 23085810
no, I don't think we all agree (I don't anyway)

:)

it's not necessarily something that needs to be resolved it "could" be fine
that message simply means it's idle.  Which is ok if the database is waiting legitimately for "stuff" outside the database to happen.

It is NOT ok, if the events happen due to excessive handshaking.  It's still proper, but it means the application needs to
change to pump data in larger chunks to keep the db busy without having to talk back to the client so much and/or ensure all connections in the pool remain active, which may mean shrinking it or changing the pool configuration.

In either case, the database isn't having a problem.  It's simply telling you it's bored and the application needs to change to make that event count/time change.

Here I'm using the word "application" to mean everything that happens outside of the db, including people, network, the app code, pool, cpu, io lag, integration bottlenecks, etc.

If you're looking for a setting recommendation like "change value X in init.ora" then no, none of the above posts, nor this one will give you what you're looking for.
Such a setting simply doesn't exist. However, that doesn't mean the question hasn't been answered.
0
 
LVL 5

Author Closing Comment

by:paulqna
ID: 31515721
One can't always get what one wants.

My conclusion is that this question is not appropriate for answering via a forum.

Thanks to the experts for the thoughts and time spent on this one.

Case closed.
0
 
LVL 74

Expert Comment

by:sdstuber
ID: 23092266
I suggest rather than giving a C to me alone, you should split the points to everyone, I suggest a higher grade too

Split because others contributed to the answer.
Higher grade because, while you didn't get the answer you wanted, we did answer the question.
If you think a penalty grade is in order, please explain how we failed.
0
 
LVL 5

Author Comment

by:paulqna
ID: 23522809
The answer to this initial question:
"What parameter should we tweak/tune to get rid of the "76807.81 Total Waited" in 2,5 hours?"

Is:
Install oracle bugfix for the CBO in Oracle 9i
0

Featured Post

Industry Leaders: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Cursors in Oracle: A cursor is used to process individual rows returned by database system for a query. In oracle every SQL statement executed by the oracle server has a private area. This area contains information about the SQL statement and the…
Using SQL Scripts we can save all the SQL queries as files that we use very frequently on our database later point of time. This is one of the feature present under SQL Workshop in Oracle Application Express.
This video shows how to set up a shell script to accept a positional parameter when called, pass that to a SQL script, accept the output from the statement back and then manipulate it in the Shell.
Via a live example, show how to take different types of Oracle backups using RMAN.
Suggested Courses
Course of the Month15 days, 12 hours left to enroll

850 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question