Go Premium for a chance to win a PS4. Enter to Win

x
?
Solved

Union All causing headache

Posted on 2010-08-24
22
Medium Priority
?
952 Views
Last Modified: 2012-05-10
I wrote a query for DB2 that takes about 4 seconds to run once and subsequent runs take 250 ms. I wrote another query that takes about 7 seconds to run once and 300 ms afterwards. If i UNION ALL them it'll take 90-120 seconds every time. I'm only producing a single result even. That's a tremendous increase in processing time for what seems like a simple operation. Is there a strategy to run it faster?

I've included the 2 queries in question. I just can't figure out why UNION ALL slows them down so much.
select oainv
from TESTDATA.MR20100823, TESTDATA.VCODETL, TESTDATA.VCOHEAD, TESTDATA.VINITEM, TESTDATA.VININDEX, TESTDATA.VARCUST
left outer join TESTDATA.VARSHIP on rvcmp = 1 and rvcust = oacust and rvship = oaship
where obcmp = 1 and obdel = 'A'
and oacmp = 1 and oadel = 'A'
and oaord = obord and oabocd = obbocd
and oaord = mrord
and obitem = icitem and ictrln = i001trln and i001001 = 'manufacturer'
and rmcmp = 1 and rmcust = oacust
 
union all
 
select sainv oainv
from TESTDATA.MR20100823, TESTDATA.VSADETL, TESTDATA.VSAHEAD, TESTDATA.VINITEM, TESTDATA.VININDEX, TESTDATA.VARCUST
left outer join TESTDATA.VARSHIP on rvcmp = 1 and rvcust = sacust and rvship = saship
where sbcmp = 1
and sacmp = 1
and saord = sbord and sabocd = sbbocd and saindt = sbindt
and saord = mrord
and sbitem = icitem and ictrln = i001trln and i001001 = 'manufacturer'
and rmcmp = 1 and rmcust = sacust

Open in new window

0
Comment
Question by:wellhole
  • 9
  • 7
  • 4
  • +2
22 Comments
 
LVL 46

Expert Comment

by:Kent Olsen
ID: 33516096
Hi Wellhole,


Are you saying that if you run lines 1-9 of your SQL and then run lines 13-21 you get answers back in about 1/10 the time that you do when you run all 21 lines at once?

I'd like to see the explain plans for all 3 queries.


Kent
0
 
LVL 37

Expert Comment

by:momi_sabag
ID: 33516350
just couple of questions
i know it will sound silly, but maybe you can try these things

try the same union all but reverse the order of the queries

or try the query i added in the code section
select * from (
select oainv
from TESTDATA.MR20100823, TESTDATA.VCODETL, TESTDATA.VCOHEAD, TESTDATA.VINITEM, TESTDATA.VININDEX, TESTDATA.VARCUST
left outer join TESTDATA.VARSHIP on rvcmp = 1 and rvcust = oacust and rvship = oaship
where obcmp = 1 and obdel = 'A'
and oacmp = 1 and oadel = 'A'
and oaord = obord and oabocd = obbocd
and oaord = mrord
and obitem = icitem and ictrln = i001trln and i001001 = 'manufacturer'
and rmcmp = 1 and rmcust = oacust
 fetch first 1 rows only) t
union all
 select * from (
select sainv oainv
from TESTDATA.MR20100823, TESTDATA.VSADETL, TESTDATA.VSAHEAD, TESTDATA.VINITEM, TESTDATA.VININDEX, TESTDATA.VARCUST
left outer join TESTDATA.VARSHIP on rvcmp = 1 and rvcust = sacust and rvship = saship
where sbcmp = 1
and sacmp = 1
and saord = sbord and sabocd = sbbocd and saindt = sbindt
and saord = mrord
and sbitem = icitem and ictrln = i001trln and i001001 = 'manufacturer'
and rmcmp = 1 and rmcust = sacust
fetch first 1 rows only
) t2

Open in new window

0
 
LVL 9

Author Comment

by:wellhole
ID: 33516832
Kdo: Yes, the union all runs very very slowly. I'll try to get the explain tomorrow morning (off work now).

momi_sabag: I don't think reversing it will help because I tried union all with the same top query on the top & bottom and its also very slow in repeated runs (nowhere close to 500ms). Fetch first wouldn't work out - I'm saying 1 but that's just a sample from the database; It's actually unknown results. Sorry for the confusion. I've already tried using select * (select x) union all select * (select y) and it produces same slow result.
0
Concerto's Cloud Advisory Services

Want to avoid the missteps to gaining all the benefits of the cloud? Learn more about the different assessment options from our Cloud Advisory team.

 
LVL 27

Expert Comment

by:tliotta
ID: 33517387
Just to be certain, is line 13 a typo?

select sainv oainv
Tom
0
 
LVL 46

Expert Comment

by:Kent Olsen
ID: 33517400
Hi Tom,

I looked at that, too.  I think that it's just an alias...


Kent
0
 
LVL 9

Author Comment

by:wellhole
ID: 33517417
tliotta: [as] is optional, so i dont use it often. i needed the columns to have the same name when i eventually group them to do a summary. so just didnt bother stripping the alias out.
0
 
LVL 46

Expert Comment

by:Kent Olsen
ID: 33517429
Hi Wellhole,

What O/S and version of DB2 are you using?


Kent
0
 
LVL 27

Expert Comment

by:tliotta
ID: 33517433
Is there any chance you could assign correlation-names to the tables and correlation references for the columns? (Or supply the table definitions?)

Tom
0
 
LVL 27

Assisted Solution

by:tliotta
tliotta earned 1000 total points
ID: 33517464
If you specify your two SELECTs as CTEs, does it still take excessive time?

Tom
0
 
LVL 46

Expert Comment

by:Kent Olsen
ID: 33517511
Hi Wellhole,

Try something else.

Run the top query, then the bottom query.  Repeat the top query, then the bottom query.

Get timing results.

There are significant differences in the table list in the two queries, and ultimately the indexes being used.  I'm wondering if if there's much (anything) left of one of the subqueries in the bufferpool so that the other subquery has to do physical I/O to repopulate the buffer pool.

Running the same subquery in succession minimizes physical I/O by reusing the buffer pool.  I'm hoping that the queries above will maximize physical I/O, replicating your problem.


Kent
0
 
LVL 9

Author Comment

by:wellhole
ID: 33517687
Kdo: It's running on AS400 v5r3m0.

tliotta: I tried renaming them as a b c d e f g on the bottom and doing a.x b.y etc and no improvement. I don't think this older version of db2 supports CTEs.

Kdo: I ran them - top, bottom, top, bottom - 3 seconds, 6 seconds, 218 ms, 266 ms. I guess that doesn't replicate my problem.

I'm stumped. I'd be happy with 3 + 6 = 9 second runs or some more, but no it gives me 90+ seconds.
0
 
LVL 46

Expert Comment

by:Kent Olsen
ID: 33520294
Hi WellHole,

How many rows are returned?

0
 
LVL 9

Author Comment

by:wellhole
ID: 33520830
Kdo: 4 rows
0
 
LVL 46

Expert Comment

by:Kent Olsen
ID: 33520947

[amazement]

2 minutes for 4 rows?????

[/amazement]


Did you run the query that momi_sabaq posted above?  It more clearly defines the scope of each subquery.  It's repeated below, with the row limiter removed.

You mix join styles, using the obsolete form for the inner joins and the modern style for the outer joins.  I wouldn't expect it to make a difference, but without an explain plan to check the parser logic it may be that the joins happen in a different order, increasing the workload.


Kent

select * 
from 
(
  select oainv
  from TESTDATA.MR20100823, TESTDATA.VCODETL, TESTDATA.VCOHEAD,
    TESTDATA.VINITEM, TESTDATA.VININDEX, TESTDATA.VARCUST
  left outer join TESTDATA.VARSHIP 
    on rvcmp = 1 
   and rvcust = oacust 
   and rvship = oaship
  where obcmp = 1 and obdel = 'A'
    and oacmp = 1 and oadel = 'A'
    and oaord = obord 
    and oabocd = obbocd
    and oaord = mrord
    and obitem = icitem 
    and ictrln = i001trln 
    and i001001 = 'manufacturer'
    and rmcmp = 1 
    and rmcust = oacust
) t1
union all
select * 
from 
(
  select sainv oainv
  from TESTDATA.MR20100823, TESTDATA.VSADETL, TESTDATA.VSAHEAD,
    TESTDATA.VINITEM, TESTDATA.VININDEX, TESTDATA.VARCUST
  left outer join TESTDATA.VARSHIP 
    on rvcmp = 1 
   and rvcust = sacust 
   and rvship = saship
  where sbcmp = 1
    and sacmp = 1
    and saord = sbord 
    and sabocd = sbbocd 
    and saindt = sbindt
    and saord = mrord
    and sbitem = icitem 
    and ictrln = i001trln 
    and i001001 = 'manufacturer'
    and rmcmp = 1 
    and rmcust = sacust
) t2

Open in new window

0
 
LVL 9

Author Comment

by:wellhole
ID: 33521412
That took extra long to run - 228 seconds. Rerun took 111 seconds. I wish I could post an explain plan.

Back to original query - I only know how to create a visual explain through the iseries navigator and that's easy to show because theres a lot of screens and info. It looks like separately they're just nested joins using indexes, but when unioned it uses temporary indexes for VCOHEAD and VSAHEAD and a different index for VSADETL is used and table scan for MR20100824 (it only has 2 records).
0
 
LVL 46

Expert Comment

by:Kent Olsen
ID: 33521567

MR20100824 only has 2 records.  :)

Let's rewrite the INNER joins in the modern style.  That way we can control the order of the joins.

I've start the process in the code below.  Sorry, but I don't know your database structure well enough to convert the matches in the WHERE clause to the ON {key = key} format.


Kent

select * 
from 
(
  select oainv
  from TESTDATA.MR20100823
  inner join TESTDATA.VCODETL, 
  inner join TESTDATA.VCOHEAD,
  inner join TESTDATA.VINITEM, 
  inner join TESTDATA.VININDEX, 
  inner join TESTDATA.VARCUST
  left outer join TESTDATA.VARSHIP 
    on rvcmp = 1 
   and rvcust = oacust 
   and rvship = oaship
  where obcmp = 1 and obdel = 'A'
    and oacmp = 1 and oadel = 'A'
    and oaord = obord 
    and oabocd = obbocd
    and oaord = mrord
    and obitem = icitem 
    and ictrln = i001trln 
    and i001001 = 'manufacturer'
    and rmcmp = 1 
    and rmcust = oacust
) t1
union all
select * 
from 
(
  select sainv oainv
  from TESTDATA.MR20100823, 
  inner join TESTDATA.VSADETL, 
  inner join TESTDATA.VSAHEAD,
  inner join TESTDATA.VINITEM, 
  inner join TESTDATA.VININDEX, 
  inner join TESTDATA.VARCUST
  left outer join TESTDATA.VARSHIP 
    on rvcmp = 1 
   and rvcust = sacust 
   and rvship = saship
  where sbcmp = 1
    and sacmp = 1
    and saord = sbord 
    and sabocd = sbbocd 
    and saindt = sbindt
    and saord = mrord
    and sbitem = icitem 
    and ictrln = i001trln 
    and i001001 = 'manufacturer'
    and rmcmp = 1 
    and rmcust = sacust
) t2

Open in new window

0
 
LVL 9

Author Comment

by:wellhole
ID: 33521718
This took 140 seconds. Running as t1/t2 took 130 seconds.
select oainv
from TESTDATA.MR20100824
join TESTDATA.VCODETL on obcmp = 1 and obdel = 'A' and obord = mrord
join TESTDATA.VCOHEAD on oacmp = 1 and oadel = 'A' and oaord = obord and oabocd = obbocd
join TESTDATA.VINITEM on icitem = obitem
join TESTDATA.VININDEX on i001trln = ictrln and i001001 = 'manufacturer'
join TESTDATA.VARCUST on rmcmp = 1 and rmcust = oacust
left outer join TESTDATA.VARSHIP on rvcmp = 1 and rvcust = oacust and rvship = oaship
 
union all
 
select sainv
from TESTDATA.MR20100824
join TESTDATA.VSADETL on sbcmp = 1 and sbord = mrord
join TESTDATA.VSAHEAD on sacmp = 1 and saord = sbord and sabocd = sbbocd and saindt = sbindt
join TESTDATA.VINITEM on icitem = sbitem
join TESTDATA.VININDEX on i001trln = ictrln and i001001 = 'manufacturer'
join TESTDATA.VARCUST on rmcmp = 1 and rmcust = sacust
left outer join TESTDATA.VARSHIP on rvcmp = 1 and rvcust = sacust and rvship = saship

Open in new window

0
 
LVL 46

Expert Comment

by:Kent Olsen
ID: 33521919
Can we change the order of the tables in the inner joins?  Since MR20100824 has only 2 rows, I'd like to structure the joins so that any cartesian product occurs last, or is followed immediately by a join that reduces the size of the result set.  This may not be possible, but if we can force the cartesian join to be last, we'll use the indexes to our best advantage.


Kent
0
 
LVL 27

Expert Comment

by:tliotta
ID: 33527560
It's running on AS400 v5r3m0.

V5R3 supports CTEs.

I tried renaming them as a b c d e f g on the bottom and doing a.x b.y etc and no improvement.

It wasn't as much for improvement as it was for readability to tell us something about which columns are coming from which tables. Knowing something about the table structures could tell us a lot about potential changes to suggest. However, the various JOIN clauses in the latest examples look possibly sufficient.

Tom
0
 
LVL 2

Accepted Solution

by:
dcgrindle earned 1000 total points
ID: 33534143
Have you tried using WITH clauses?  I find it speeds things up for me often.


With table1 as (
select oainv
from TESTDATA.MR20100823, TESTDATA.VCODETL, TESTDATA.VCOHEAD, TESTDATA.VINITEM, TESTDATA.VININDEX, TESTDATA.VARCUST
left outer join TESTDATA.VARSHIP on rvcmp = 1 and rvcust = oacust and rvship = oaship
where obcmp = 1 and obdel = 'A'
and oacmp = 1 and oadel = 'A'
and oaord = obord and oabocd = obbocd
and oaord = mrord
and obitem = icitem and ictrln = i001trln and i001001 = 'manufacturer'
and rmcmp = 1 and rmcust = oacust
 ),

table2 as (
 
select sainv oainv
from TESTDATA.MR20100823, TESTDATA.VSADETL, TESTDATA.VSAHEAD, TESTDATA.VINITEM, TESTDATA.VININDEX, TESTDATA.VARCUST
left outer join TESTDATA.VARSHIP on rvcmp = 1 and rvcust = sacust and rvship = saship
where sbcmp = 1
and sacmp = 1
and saord = sbord and sabocd = sbbocd and saindt = sbindt
and saord = mrord
and sbitem = icitem and ictrln = i001trln and i001001 = 'manufacturer'
and rmcmp = 1 and rmcust = sacust
 )

table1
union all
table2

Open in new window

0
 
LVL 9

Author Closing Comment

by:wellhole
ID: 33534390
You rocked it! Run took 14 seconds on first run. Looking back, tliotta actually came up with this first. I was confused about what the system could do and just wasn't able to solve it. Thanks everyone.
0
 
LVL 46

Expert Comment

by:Kent Olsen
ID: 33534460
Using CTEs never occurred to me, but it will next time.  :)


A day where you learn something is always a good day.....

0

Featured Post

Industry Leaders: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

This post looks at MongoDB and MySQL, and covers high-level MongoDB strengths, weaknesses, features, and uses from the perspective of an SQL user.
Instead of error trapping or hard-coding for non-updateable fields when using QODBC, let VBA automatically disable them when forms open. This way, users can view but not change the data. Part 1 explained how to use schema tables to do this. Part 2 h…
Video by: Steve
Using examples as well as descriptions, step through each of the common simple join types, explaining differences in syntax, differences in expected outputs and showing how the queries run along with the actual outputs based upon a simple set of dem…
In this video, Percona Solution Engineer Dimitri Vanoverbeke discusses why you want to use at least three nodes in a database cluster. To discuss how Percona Consulting can help with your design and architecture needs for your database and infras…
Suggested Courses

972 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question