tuning a query over DBLINK

Posted on 2006-04-03
Last Modified: 2010-04-05
  3 Servers, all with Oracle 9.2
  * PROD    --- live production server
  * STANDBY  --- standby database, updated every night by applying the logs, the rest of the day it is open as read_only
  * DW   --- a database that reports using data on the main database.

On PROD (and hence by STANDBY), there is a "big" table (10M rows), partitionned on a time field (creationtime).
+ index on some_field

I have a query that queries this table with a range condition on PROD/STANDBY:

select * from big_table where creationtime >= x and creationtime < y and some_field in ( list of values ) ;
returns in 8:35 minutes (0 rows)

I run the same query over DBLINK:
select * from big_table@PROD where creationtime >= x and creationtime < y and some_field in ( list of values ) ;
returns in 1:44:32 minutes (0 rows)

what could be the cause of this ?
what could be done to optimize this query over the dblink?

* I cannot add any view, functions, MViews or anything else to the PROD database. It's a custom application database, I have no change permissions to it at all.
* I tried to add the following hints with no (positive) effect:
  - driving_site(big_table)  
  - parallel  
  - all_rows
  - index ( myindex )


    LVL 4

    Accepted Solution

    Please reviews the following points:

    The SORT Area . Temporary tablespace

    If the database, exists in differents hardwares, can be a communication problems.


    Attempting to contact (DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)
    (HOST = COROZAL.acp)(PORT = 1526))) (CONNECT_DATA = (SERVICE_NAME = chiru)))
    OK (110 msec)

    Look the milliseconds.

    LVL 23

    Assisted Solution

    I often use DRIVING_SITE along with INDEX hint when joining small local tables to big remote tables. But, I first run an explain plan to see what the optimizer is doing.

    Can you post the two explain plans?  
    LVL 142

    Author Comment

    by:Guy Hengel [angelIII / a3]
    tnsping only reflects the time to connect, not the query performance as far as I know. if I am wrong there, let me read some article, please.

    >The SORT Area
    in how far the sort area is considered if there is no ORDER BY or GROUP BY in the query?

    > Temporary tablespace
    has 6 GB of which 1.5 are free. should be fine so far.

    I tried with both with and without (additional) index hint, no change.

    the explain plan, when run on the PROD, uses the INDEX
    the explain plan, when run on the DW, only shows FILTER on REMOTE table
    LVL 7

    Expert Comment

    Which user do you use in the DBLINK. Is it the same user that you use when running the query directly on PROD or a different user
    LVL 142

    Author Comment

    by:Guy Hengel [angelIII / a3]
    same user/schema.
    LVL 13

    Assisted Solution

    Can you use following meothod in order to get explain plan?

      1* explain plan for select * from t1@server2
    SQL> /


    SQL> select * from table( dbms_xplan.display )


    | Id  | Operation              |  Name        | Rows  | Bytes | Cost  | Inst   |
    |   0 | SELECT STATEMENT REMOTE|              | 21473 |  6416K|   553 |        |
    |   1 |  TABLE ACCESS FULL     | T1           | 21473 |  6416K|   553 | SERVER2|

    Note: fully remote operation, cpu costing is off
    LVL 7

    Assisted Solution

    are the x & y in your query bind variables. I remember seeing on metalink some bugs about bad query performance over dblink when using bind variables. If x & y are bind variables, try to run the query with literals, just to check if you are hitting one of the above bugs.
    LVL 142

    Author Comment

    by:Guy Hengel [angelIII / a3]
    >Can you use following meothod in order to get explain plan?
    nice information, good to know.

    however, this output confirms that the good index (Index Range Scan) of the table is used.

    >are the x & y in your query bind variables.
    I use simple variables in the code, but for testing I use litteral values (both same execution time)
    LVL 7

    Expert Comment

    If the explain plan is same, then the most probably the problem is in the underlying network. Check the wait events for the session. Check specifically for the waits events like 'SQL*Net .....' wait events. If you are getting high values for these events then the problem would be within the underlying network.

    LVL 7

    Expert Comment

    Another test that you can run is
    On DW system
    start sql*plus and connect to the PROD instance using the same tnsnames entry that your DBLINK uses.
    Run the query and check the performance.

    If it is a pure network problem, then this query performance should also be slow.

    LVL 142

    Author Comment

    by:Guy Hengel [angelIII / a3]
    Here a report of the wait events, seems to confirm your last guess goes into the right direction:

    now, what part to check first, the SQL*Net seconds waiting (1/820) seconds in average, or rather the latch free timeouts for example?

    Ma Avr 04                                                                        page
                                    System-wide Wait Analysis
                                     for current wait events

    Event                                 Total  Seconds        Total      Wait
    Name                                  Waits  Waiting     Timeouts (in secs)
    ------------------------------ ------------ -------- ------------ ---------
    SQL*Net more data from dblink    82,052,571  115,298            0      .000
    db file sequential read          61,054,126   74,980            0      .000
    SQL*Net message to dblink        56,003,325      560            0      .000
    SQL*Net message from dblink      56,003,324  453,138            0      .010
    db file scattered read           52,383,908   93,782            0      .000
    direct path read                 11,853,067  417,496            0      .040
    log file parallel write           3,022,317   51,753            0      .020
    direct path write                 1,422,830  310,557            0      .220
    log file sync                     1,319,468   36,425            7      .030
    PX Deq: Table Q Normal            1,189,847   24,136        1,019      .020
    latch free                          764,165   29,369      673,500      .040
    control file sequential read        341,706      209            0      .000
    control file parallel write         308,090   28,788            0      .090
    PX Deq: Execute Reply               270,819    6,080          937      .020
    buffer busy waits                   198,275   20,527          605      .100
    PX Deq Credit: send blkd            142,000    5,506          511      .040
    db file parallel write               89,905  124,229            0     1.380
    free buffer waits                    89,485   86,982       86,507      .970
    PX qref latch                        83,901    1,334       79,869      .020
    direct path read (lob)               56,558      360            0      .010
    PX Deq: Execution Msg                38,633    5,034        2,089      .130
    wakeup time manager                  30,527  887,645       30,507    29.080
    LGWR wait for redo copy              27,439      121        6,094      .000
    PX Deq Credit: need buffer           12,511      747          140      .060
    PX Deq: Table Q qref                 12,452      123           44      .010
    rdbms ipc reply                      11,562    4,304        1,400      .370
    enqueue                               9,742   24,246        8,021     2.490
    write complete waits                  8,326    8,133        8,122      .980
    SQL*Net more data from client         7,636       19            0      .000
    jobq slave wait                       4,824   14,045        4,687     2.910
    log buffer space                      3,882      875            1      .230
    library cache pin                     2,167    5,243        1,773     2.420
    SQL*Net more data to dblink           2,068        1            0      .000
    PX Deq: Signal ACK                    1,658       52          852      .030
    single-task message                   1,454      130            0      .090
    library cache lock                    1,358    3,959        1,343     2.920
    PX Deq: Msg Fragment                  1,165       16            3      .010
    PX Deq: Parse Reply                   1,010      110           44      .110
    undo segment extension                  498        0          498      .000
    local write wait                        421      195          153      .460
    async disk IO                           400       61            0      .150
    process startup                         275       32            0      .120
    db file parallel read                   239       32            0      .140
    PX Deq: Join ACK                        233        1            0      .000
    SQL*Net break/reset to dblink           198        0            0      .000
    wait for stopper event to be i          120       13          117      .100

    PX Deq: Table Q Sample                  116        2            0      .010
    log file switch completion              115       67           32      .590
    row cache lock                           89      181           59     2.030
    db file single write                     67        4            0      .060
    log file sequential read                 54        0            0      .010
    log file single write                    54        1            0      .020
    wait list latch free                     52        1            0      .020
    inactive session                         45       44           45      .990
    PX Deq: Table Q Get Keys                 42        1            0      .020
    library cache load lock                  20        2            0      .100
    PX Deq: Txn Recovery Start                5        0            0      .010
    reliable message                          3        3            2      .930
    buffer deadlock                           2        0            2      .000
    refresh controlfile command               1        0            0      .000
    control file heartbeat                    1        4            1     3.910

    61 ligne(s) s&#920;lectionn&#920;e(s).
    LVL 7

    Expert Comment


    following is the "SQL*Net message from dblink" Reference Note from Metalink. The note id is 34396.1. See if it can help you in isolating the problem


    _________START OF THE REFERENCE NOTE_________________-

    This is a reference note for the wait event "SQL*Net message from dblink" which includes the following subsections:
    Brief definition
    Individual wait details (eg: For waits seen in <View:V$SESSION_WAIT>)
    Systemwide wait details (eg: For waits seen in <View:V$SYSTEM_EVENT>)
    Reducing waits / wait times
    See Note 61998.1 for an introduction to Wait Events.

    Versions:7.0 - 9.2 Documentation: 9.0
    The Oracle shadow process is waiting for a message over a database link from a remote process. Note that this wait is also used when waiting for data from "extproc" or from a remote gateway process.
    Individual Waits:
    P1 = driver id
    P2 = bytes#
    P3 = Not used
    driver id
    In Oracle8i onwards P1RAW can be decoded into ASCII characters to give a clue as to which Net driver is used.
    Eg: P1RAW=0x62657100 = 'beq\0' , P1RAW=0x54435000 = 'TCP\0' etc.
    In earlier releases the value here is the value of the disconnect function of the Net driver being used (which is not much use).

    bytes# (number of bytes received)
    The number of bytes we need to receive. (This is often just 1 even though the expected packet will be much larger)
      Wait Time:
    This wait blocks until a message is received from the remote connection (or until an abnormal end of file condition occurs on the underlying Net transport layer). There is no Oracle timeout on the wait.
      Finding Blockers:
    The blocker is the network plus the remote process. If the remote process is another database instance (accessed via a database link) then the information in <View:V$SESSION> on the REMOTE database will have the same client details as the V$SESSION entry on the local database for the blocked session. You need to look at the remote instance to determine where that session is spending time.
    Systemwide Waits:
    If systemwide waits for this event are significant it is best to determine where the remote connections are to and switch attention to the remote instance / instances to determine where they are spending time. One can also look at:
    Sessions with high values in <View:V$SESSTAT> for:
    SQL*Net roundtrips to/from dblink
    bytes sent via SQL*Net to dblink
    bytes received via SQL*Net from dblink
    The Network between the local and remote systems (problems are usually related to time spent ON the remote instance rather than in the network but it is worth checking if the network between the instances is slow / not)
    Reducing Waits / Wait times:
    It is normally best to see what propertion of the time is actually spent on the remote instance as the most common cause of long waits over a DB link is that the time is actually all spent doing work on the remote instance (either waiting or working).
    Trace the session locally and at the remote instance to see what work a user "transaction" actually consists of
    Check the execution plans for any distributed queries
    See if frequently accessed remote data can be kept in a local snapshot (Materialized View).

    LVL 142

    Author Comment

    by:Guy Hengel [angelIII / a3]
    Ok guys, I found that my database suffered several problems:

    TEMP tablespace close to full (and no autogrow)
    DATA tablespace close to full , but with LOW file increase
    jobs that failed and rerun automatically all the time

    Problems all solved now, tips used, performance is up again.

    Thanks for the support
    I will distribute the points for some new things learned!

    pbocanegra (temp tablespace)
    paquicuba (DRIVING_SITE hint)
    riazpk (dbms_xplan.display)
    vishal68 (bind variables / DBLINK)

    Featured Post

    Looking for New Ways to Advertise?

    Engage with tech pros in our community with native advertising, as a Vendor Expert, and more.

    Join & Write a Comment

    Article by: Swadhin
    From the Oracle SQL Reference ( we are told that a join is a query that combines rows from two or more tables, views, or materialized views. This article provides a glimps…
    Note: this article covers simple compression. Oracle introduced in version 11g release 2 a new feature called Advanced Compression which is not covered here. General principle of Oracle compression Oracle compression is a way of reducing the d…
    This video explains at a high level with the mandatory Oracle Memory processes are as well as touching on some of the more common optional ones.
    This video shows how to set up a shell script to accept a positional parameter when called, pass that to a SQL script, accept the output from the statement back and then manipulate it in the Shell.

    755 members asked questions and received personalized solutions in the past 7 days.

    Join the community of 500,000 technology professionals and ask your questions.

    Join & Ask a Question

    Need Help in Real-Time?

    Connect with top rated Experts

    23 Experts available now in Live!

    Get 1:1 Help Now