Solved

Union All causing headache

Posted on 2010-08-24
22
928 Views
Last Modified: 2012-05-10
I wrote a query for DB2 that takes about 4 seconds to run once and subsequent runs take 250 ms. I wrote another query that takes about 7 seconds to run once and 300 ms afterwards. If i UNION ALL them it'll take 90-120 seconds every time. I'm only producing a single result even. That's a tremendous increase in processing time for what seems like a simple operation. Is there a strategy to run it faster?

I've included the 2 queries in question. I just can't figure out why UNION ALL slows them down so much.
select oainv
from TESTDATA.MR20100823, TESTDATA.VCODETL, TESTDATA.VCOHEAD, TESTDATA.VINITEM, TESTDATA.VININDEX, TESTDATA.VARCUST
left outer join TESTDATA.VARSHIP on rvcmp = 1 and rvcust = oacust and rvship = oaship
where obcmp = 1 and obdel = 'A'
and oacmp = 1 and oadel = 'A'
and oaord = obord and oabocd = obbocd
and oaord = mrord
and obitem = icitem and ictrln = i001trln and i001001 = 'manufacturer'
and rmcmp = 1 and rmcust = oacust
 
union all
 
select sainv oainv
from TESTDATA.MR20100823, TESTDATA.VSADETL, TESTDATA.VSAHEAD, TESTDATA.VINITEM, TESTDATA.VININDEX, TESTDATA.VARCUST
left outer join TESTDATA.VARSHIP on rvcmp = 1 and rvcust = sacust and rvship = saship
where sbcmp = 1
and sacmp = 1
and saord = sbord and sabocd = sbbocd and saindt = sbindt
and saord = mrord
and sbitem = icitem and ictrln = i001trln and i001001 = 'manufacturer'
and rmcmp = 1 and rmcust = sacust

Open in new window

0
Comment
Question by:wellhole
  • 9
  • 7
  • 4
  • +2
22 Comments
 
LVL 45

Expert Comment

by:Kdo
ID: 33516096
Hi Wellhole,


Are you saying that if you run lines 1-9 of your SQL and then run lines 13-21 you get answers back in about 1/10 the time that you do when you run all 21 lines at once?

I'd like to see the explain plans for all 3 queries.


Kent
0
 
LVL 37

Expert Comment

by:momi_sabag
ID: 33516350
just couple of questions
i know it will sound silly, but maybe you can try these things

try the same union all but reverse the order of the queries

or try the query i added in the code section
select * from (

select oainv

from TESTDATA.MR20100823, TESTDATA.VCODETL, TESTDATA.VCOHEAD, TESTDATA.VINITEM, TESTDATA.VININDEX, TESTDATA.VARCUST

left outer join TESTDATA.VARSHIP on rvcmp = 1 and rvcust = oacust and rvship = oaship

where obcmp = 1 and obdel = 'A'

and oacmp = 1 and oadel = 'A'

and oaord = obord and oabocd = obbocd

and oaord = mrord

and obitem = icitem and ictrln = i001trln and i001001 = 'manufacturer'

and rmcmp = 1 and rmcust = oacust

 fetch first 1 rows only) t

union all

 select * from (

select sainv oainv

from TESTDATA.MR20100823, TESTDATA.VSADETL, TESTDATA.VSAHEAD, TESTDATA.VINITEM, TESTDATA.VININDEX, TESTDATA.VARCUST

left outer join TESTDATA.VARSHIP on rvcmp = 1 and rvcust = sacust and rvship = saship

where sbcmp = 1

and sacmp = 1

and saord = sbord and sabocd = sbbocd and saindt = sbindt

and saord = mrord

and sbitem = icitem and ictrln = i001trln and i001001 = 'manufacturer'

and rmcmp = 1 and rmcust = sacust

fetch first 1 rows only

) t2

Open in new window

0
 
LVL 9

Author Comment

by:wellhole
ID: 33516832
Kdo: Yes, the union all runs very very slowly. I'll try to get the explain tomorrow morning (off work now).

momi_sabag: I don't think reversing it will help because I tried union all with the same top query on the top & bottom and its also very slow in repeated runs (nowhere close to 500ms). Fetch first wouldn't work out - I'm saying 1 but that's just a sample from the database; It's actually unknown results. Sorry for the confusion. I've already tried using select * (select x) union all select * (select y) and it produces same slow result.
0
 
LVL 27

Expert Comment

by:tliotta
ID: 33517387
Just to be certain, is line 13 a typo?

select sainv oainv
Tom
0
 
LVL 45

Expert Comment

by:Kdo
ID: 33517400
Hi Tom,

I looked at that, too.  I think that it's just an alias...


Kent
0
 
LVL 9

Author Comment

by:wellhole
ID: 33517417
tliotta: [as] is optional, so i dont use it often. i needed the columns to have the same name when i eventually group them to do a summary. so just didnt bother stripping the alias out.
0
 
LVL 45

Expert Comment

by:Kdo
ID: 33517429
Hi Wellhole,

What O/S and version of DB2 are you using?


Kent
0
 
LVL 27

Expert Comment

by:tliotta
ID: 33517433
Is there any chance you could assign correlation-names to the tables and correlation references for the columns? (Or supply the table definitions?)

Tom
0
 
LVL 27

Assisted Solution

by:tliotta
tliotta earned 250 total points
ID: 33517464
If you specify your two SELECTs as CTEs, does it still take excessive time?

Tom
0
 
LVL 45

Expert Comment

by:Kdo
ID: 33517511
Hi Wellhole,

Try something else.

Run the top query, then the bottom query.  Repeat the top query, then the bottom query.

Get timing results.

There are significant differences in the table list in the two queries, and ultimately the indexes being used.  I'm wondering if if there's much (anything) left of one of the subqueries in the bufferpool so that the other subquery has to do physical I/O to repopulate the buffer pool.

Running the same subquery in succession minimizes physical I/O by reusing the buffer pool.  I'm hoping that the queries above will maximize physical I/O, replicating your problem.


Kent
0
 
LVL 9

Author Comment

by:wellhole
ID: 33517687
Kdo: It's running on AS400 v5r3m0.

tliotta: I tried renaming them as a b c d e f g on the bottom and doing a.x b.y etc and no improvement. I don't think this older version of db2 supports CTEs.

Kdo: I ran them - top, bottom, top, bottom - 3 seconds, 6 seconds, 218 ms, 266 ms. I guess that doesn't replicate my problem.

I'm stumped. I'd be happy with 3 + 6 = 9 second runs or some more, but no it gives me 90+ seconds.
0
6 Surprising Benefits of Threat Intelligence

All sorts of threat intelligence is available on the web. Intelligence you can learn from, and use to anticipate and prepare for future attacks.

 
LVL 45

Expert Comment

by:Kdo
ID: 33520294
Hi WellHole,

How many rows are returned?

0
 
LVL 9

Author Comment

by:wellhole
ID: 33520830
Kdo: 4 rows
0
 
LVL 45

Expert Comment

by:Kdo
ID: 33520947

[amazement]

2 minutes for 4 rows?????

[/amazement]


Did you run the query that momi_sabaq posted above?  It more clearly defines the scope of each subquery.  It's repeated below, with the row limiter removed.

You mix join styles, using the obsolete form for the inner joins and the modern style for the outer joins.  I wouldn't expect it to make a difference, but without an explain plan to check the parser logic it may be that the joins happen in a different order, increasing the workload.


Kent

select * 

from 

(

  select oainv

  from TESTDATA.MR20100823, TESTDATA.VCODETL, TESTDATA.VCOHEAD,

    TESTDATA.VINITEM, TESTDATA.VININDEX, TESTDATA.VARCUST

  left outer join TESTDATA.VARSHIP 

    on rvcmp = 1 

   and rvcust = oacust 

   and rvship = oaship

  where obcmp = 1 and obdel = 'A'

    and oacmp = 1 and oadel = 'A'

    and oaord = obord 

    and oabocd = obbocd

    and oaord = mrord

    and obitem = icitem 

    and ictrln = i001trln 

    and i001001 = 'manufacturer'

    and rmcmp = 1 

    and rmcust = oacust

) t1

union all

select * 

from 

(

  select sainv oainv

  from TESTDATA.MR20100823, TESTDATA.VSADETL, TESTDATA.VSAHEAD,

    TESTDATA.VINITEM, TESTDATA.VININDEX, TESTDATA.VARCUST

  left outer join TESTDATA.VARSHIP 

    on rvcmp = 1 

   and rvcust = sacust 

   and rvship = saship

  where sbcmp = 1

    and sacmp = 1

    and saord = sbord 

    and sabocd = sbbocd 

    and saindt = sbindt

    and saord = mrord

    and sbitem = icitem 

    and ictrln = i001trln 

    and i001001 = 'manufacturer'

    and rmcmp = 1 

    and rmcust = sacust

) t2

Open in new window

0
 
LVL 9

Author Comment

by:wellhole
ID: 33521412
That took extra long to run - 228 seconds. Rerun took 111 seconds. I wish I could post an explain plan.

Back to original query - I only know how to create a visual explain through the iseries navigator and that's easy to show because theres a lot of screens and info. It looks like separately they're just nested joins using indexes, but when unioned it uses temporary indexes for VCOHEAD and VSAHEAD and a different index for VSADETL is used and table scan for MR20100824 (it only has 2 records).
0
 
LVL 45

Expert Comment

by:Kdo
ID: 33521567

MR20100824 only has 2 records.  :)

Let's rewrite the INNER joins in the modern style.  That way we can control the order of the joins.

I've start the process in the code below.  Sorry, but I don't know your database structure well enough to convert the matches in the WHERE clause to the ON {key = key} format.


Kent

select * 

from 

(

  select oainv

  from TESTDATA.MR20100823

  inner join TESTDATA.VCODETL, 

  inner join TESTDATA.VCOHEAD,

  inner join TESTDATA.VINITEM, 

  inner join TESTDATA.VININDEX, 

  inner join TESTDATA.VARCUST

  left outer join TESTDATA.VARSHIP 

    on rvcmp = 1 

   and rvcust = oacust 

   and rvship = oaship

  where obcmp = 1 and obdel = 'A'

    and oacmp = 1 and oadel = 'A'

    and oaord = obord 

    and oabocd = obbocd

    and oaord = mrord

    and obitem = icitem 

    and ictrln = i001trln 

    and i001001 = 'manufacturer'

    and rmcmp = 1 

    and rmcust = oacust

) t1

union all

select * 

from 

(

  select sainv oainv

  from TESTDATA.MR20100823, 

  inner join TESTDATA.VSADETL, 

  inner join TESTDATA.VSAHEAD,

  inner join TESTDATA.VINITEM, 

  inner join TESTDATA.VININDEX, 

  inner join TESTDATA.VARCUST

  left outer join TESTDATA.VARSHIP 

    on rvcmp = 1 

   and rvcust = sacust 

   and rvship = saship

  where sbcmp = 1

    and sacmp = 1

    and saord = sbord 

    and sabocd = sbbocd 

    and saindt = sbindt

    and saord = mrord

    and sbitem = icitem 

    and ictrln = i001trln 

    and i001001 = 'manufacturer'

    and rmcmp = 1 

    and rmcust = sacust

) t2

Open in new window

0
 
LVL 9

Author Comment

by:wellhole
ID: 33521718
This took 140 seconds. Running as t1/t2 took 130 seconds.
select oainv

from TESTDATA.MR20100824

join TESTDATA.VCODETL on obcmp = 1 and obdel = 'A' and obord = mrord

join TESTDATA.VCOHEAD on oacmp = 1 and oadel = 'A' and oaord = obord and oabocd = obbocd

join TESTDATA.VINITEM on icitem = obitem

join TESTDATA.VININDEX on i001trln = ictrln and i001001 = 'manufacturer'

join TESTDATA.VARCUST on rmcmp = 1 and rmcust = oacust

left outer join TESTDATA.VARSHIP on rvcmp = 1 and rvcust = oacust and rvship = oaship

 

union all

 

select sainv

from TESTDATA.MR20100824

join TESTDATA.VSADETL on sbcmp = 1 and sbord = mrord

join TESTDATA.VSAHEAD on sacmp = 1 and saord = sbord and sabocd = sbbocd and saindt = sbindt

join TESTDATA.VINITEM on icitem = sbitem

join TESTDATA.VININDEX on i001trln = ictrln and i001001 = 'manufacturer'

join TESTDATA.VARCUST on rmcmp = 1 and rmcust = sacust

left outer join TESTDATA.VARSHIP on rvcmp = 1 and rvcust = sacust and rvship = saship

Open in new window

0
 
LVL 45

Expert Comment

by:Kdo
ID: 33521919
Can we change the order of the tables in the inner joins?  Since MR20100824 has only 2 rows, I'd like to structure the joins so that any cartesian product occurs last, or is followed immediately by a join that reduces the size of the result set.  This may not be possible, but if we can force the cartesian join to be last, we'll use the indexes to our best advantage.


Kent
0
 
LVL 27

Expert Comment

by:tliotta
ID: 33527560
It's running on AS400 v5r3m0.

V5R3 supports CTEs.

I tried renaming them as a b c d e f g on the bottom and doing a.x b.y etc and no improvement.

It wasn't as much for improvement as it was for readability to tell us something about which columns are coming from which tables. Knowing something about the table structures could tell us a lot about potential changes to suggest. However, the various JOIN clauses in the latest examples look possibly sufficient.

Tom
0
 
LVL 2

Accepted Solution

by:
dcgrindle earned 250 total points
ID: 33534143
Have you tried using WITH clauses?  I find it speeds things up for me often.


With table1 as (
select oainv
from TESTDATA.MR20100823, TESTDATA.VCODETL, TESTDATA.VCOHEAD, TESTDATA.VINITEM, TESTDATA.VININDEX, TESTDATA.VARCUST
left outer join TESTDATA.VARSHIP on rvcmp = 1 and rvcust = oacust and rvship = oaship
where obcmp = 1 and obdel = 'A'
and oacmp = 1 and oadel = 'A'
and oaord = obord and oabocd = obbocd
and oaord = mrord
and obitem = icitem and ictrln = i001trln and i001001 = 'manufacturer'
and rmcmp = 1 and rmcust = oacust
 ),

table2 as (
 
select sainv oainv
from TESTDATA.MR20100823, TESTDATA.VSADETL, TESTDATA.VSAHEAD, TESTDATA.VINITEM, TESTDATA.VININDEX, TESTDATA.VARCUST
left outer join TESTDATA.VARSHIP on rvcmp = 1 and rvcust = sacust and rvship = saship
where sbcmp = 1
and sacmp = 1
and saord = sbord and sabocd = sbbocd and saindt = sbindt
and saord = mrord
and sbitem = icitem and ictrln = i001trln and i001001 = 'manufacturer'
and rmcmp = 1 and rmcust = sacust
 )

table1
union all
table2

Open in new window

0
 
LVL 9

Author Closing Comment

by:wellhole
ID: 33534390
You rocked it! Run took 14 seconds on first run. Looking back, tliotta actually came up with this first. I was confused about what the system could do and just wasn't able to solve it. Thanks everyone.
0
 
LVL 45

Expert Comment

by:Kdo
ID: 33534460
Using CTEs never occurred to me, but it will next time.  :)


A day where you learn something is always a good day.....

0

Featured Post

What Security Threats Are You Missing?

Enhance your security with threat intelligence from the web. Get trending threat insights on hackers, exploits, and suspicious IP addresses delivered to your inbox with our free Cyber Daily.

Join & Write a Comment

Entity Framework is a powerful tool to help you interact with the DataBase but still doesn't help much when we have a Stored Procedure that returns more than one resultset. The solution takes some of out-of-the-box thinking; read on!
Read about achieving the basic levels of HRIS security in the workplace.
Video by: Steve
Using examples as well as descriptions, step through each of the common simple join types, explaining differences in syntax, differences in expected outputs and showing how the queries run along with the actual outputs based upon a simple set of dem…
Polish reports in Access so they look terrific. Take yourself to another level. Equations, Back Color, Alternate Back Color. Write easy VBA Code. Tighten space to use less pages. Launch report from a menu, considering criteria only when it is filled…

760 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

24 Experts available now in Live!

Get 1:1 Help Now