I'm a bit rusty is it something to do with nulls?
have you tried
count(wp.cardnbr)
..
btw
i usually specify my on condition in the
same context as the join...
left
wp.cardnbr = pol_policy
hth
Main Topics
Browse All TopicsHello,
I've two tables, one called westpac_data and the other called pol_policy. The linking field between those two tables is cardnbr on the westpac_data table and plc_bill_acctnum on the pol_policy table. I want all the records from westpac_data and whatever matches on the pol_policy table. When I run the following query it acts like an equal join, it only gives me data from both tables where those two fields match.
There are 38442 records on westpac_data and 406000 records on pol_policy, the query below only returns 22622 records, i.e. only the matching records. Does anyone know why this does not work???
select count(*)
from admin.westpac_data wp left outer join admin.pol_policy s6 on wp.cardnbr = s6.plc_bill_acctnum
I don't think this question is difficult, I'm just not doing something right.
Cheers,
Chris
This Question has been solved and asker verified All Experts Exchange premium technology solutions are available to subscription members.
Experts Exchange has been collecting answers to technology questions since 1996…3 million and counting! If you have a question, chances are we already have your answer.
If you can't find the exact answer you're looking for, ask our exclusive community of 50,000 experts. You’ll get a personalized answer from a trusted professional.
Thousands of free tech tips, tricks, how-to’s and tutorials are available in our peer reviewed articles section. See for yourself how smart our experts are, no login required.
Access the answers to your technology questions today.
30-day free trial. Register in 60 seconds.
Members of the expert community talk about why the experience at Experts Exchange is different than what you will find anywhere else.

Try it out and discover for yourself.
30-day free trial. Register in 60 seconds.
Join the community of experts here and help other tech pros by answering question in your area of expertise. You can earn FREE access to all Experts Exchange's premium features and resources.
select count(*) from admin.westpac_data wp left outer join admin.pol_policy s6 on wp.cardnbr = s6.plc_bill_acctnum
weird...I created tables to test this as follows...
test(acct int)
test2(acct2 int,data char(1))
then inserted the following data into test...
1,2,3,4
then inserted into test2...
1,'y'
3,'y'
7,'y'
then ran the following...
select count(*) from test1 a left outer join test2 b on a.acct=b.acct2
and it returned
1
-----------
4
1 record(s) selected.
as expected.
so there is nothing wrong with your sql that's for certain...what db2 version and fixpack level are you at? there has been some problems with certain joins at different fixpack levels (mostly hybrid joins though).
Hello,
Thanks all for your replies. Very much appreciated.
1. I've checked and re-checked the total number of records on each table. There are 38,442 records on the westpac_data table and a litte over 406,000 on the pol_policy table.
2. We are using IBM DB2 7 fix pack 5.
If this helps anyone that could assist then that would be perfect. I've put on some more points as this seems a little more complicated than I first thought.
Thanks again,
Chris
yongsing, since you are on fixpack 5 could you create two tables like I did and test the sql on fixpack 5? I'm running fp6 and had no problems. I just wondering if the sql is hitting some weird bug in the optimizer that was fixed in fixpack 6. I know from experience that neither fixpack 4 or 5 are really that stable and had some serious issues.
Thanks
>> I've two tables, one called westpac_data and the other called pol_policy. The linking field [sic] between those two tables is cardnbr on the westpac_data table and plc_bill_acctnum on the pol_policy table. <<
This is SQL and not IMS, isn't it? Tables have rows and columns, which are totally different from the records and fields found in file systems.
>> I want all the records [sic] from westpac_data and whatever matches on the pol_policy table. <<
Where is the DDL for these tables? I suppose we can just guess at the keys, DRI, datatypes, constraints and everything else.
SELECT * -- change to real column names!!
FROM Westpac_data AS wp,
Pol_policy AS S6
WHERE WP.cardnbr = S6.plc_bill_acctnum;
If you want all the data, why are you asking for the cardinality of the outer join, which has to be at least equal to cardinality of the preserved table (Westpac_data)? YOur spec and your only sample do not match.
Are WP.cardnbr and S6.plc_bill_acctnum synomyms for the same data element? If so, why do you have two names for the same data element? Are they keys?
Here is how OUTER JOINs work in SQL-92. Assume you are given:
Table1 Table2
a b a c
====== ======
1 w 1 r
2 x 2 s
3 y 3 t
4 z
and the outer join expression:
Table1
LEFT OUTER JOIN
Table2
ON Table1.a = Table2.a <== join condition
AND Table2.c = 't'; <== single table condition
We call Table1 the "preserved table" and Table2 the "unpreserved table" in the query. What I am going to give you is a little different, but equivalent to the ANSI/ISO standards.
1) We build the CROSS JOIN of the two tables. Scan each row in the result set.
2) If the predicate tests TRUE for that row, then you keep it. You also remove all rows derived from it from the CROSS JOIN
3) If the predicate tests FALSE or UNKNOWN for that row, then keep the columns from the preserved table, convert all the columns from the unpreserved table to NULLs and remove the duplicates.
So let us execute this by hand:
Let @ = passed the first predicate
Let * = passed the second predicate
Table1 CROSS JOIN Table2
a b a c
=========================
1 w 1 r @
1 w 2 s
1 w 3 t *
2 x 1 r
2 x 2 s @
2 x 3 t *
3 y 1 r
3 y 2 s
3 y 3 t @* <== the TRUE set
4 z 1 r
4 z 2 s
4 z 3 t *
Table1 LEFT OUTER JOIN Table2
a b a c
=========================
3 y 3 t <= only TRUE row
-----------------------
1 w NULL NULL Sets of duplicates
1 w NULL NULL
1 w NULL NULL
-----------------------
2 x NULL NULL
2 x NULL NULL
2 x NULL NULL
3 y NULL NULL <== derived from the TRUE set - Remove
3 y NULL NULL
-----------------------
4 z NULL NULL
4 z NULL NULL
4 z NULL NULL
the final results:
Table1 LEFT OUTER JOIN Table2
a b a c
=========================
1 w NULL NULL
2 x NULL NULL
3 y 3 t
4 z NULL NULL
The basic rule is that every row in the preserved table is represented in the results in at least one result row.
There are limitations and very serious problems with the extended equality version of an outer join used in some diseased mutant products. Consider the two Chris Date tables
Suppliers SupParts
supno supno partno qty
========= ==============
S1 S1 P1 100
S2 S1 P2 250
S3 S2 P1 100
S2 P2 250
and let's do an extended equality outer join like this:
SELECT *
FROM Supplier, SupParts
WHERE Supplier.supno *= SupParts.supno
AND qty < 200;
If I do the outer first, I get:
Suppliers LOJ SupParts
supno supno partno qty
=======================
S1 S1 P1 100
S1 S1 P2 250
S2 S2 P1 100
S2 S2 P2 250
S3 NULL NULL NULL
Then I apply the (qty < 200) predicate and get
Suppliers LOJ SupParts
supno supno partno qty
===================
S1 S1 P1 100
S2 S2 P1 100
Doing it in the opposite order
Suppliers LOJ SupParts
supno supno partno qty
===================
S1 S1 P1 100
S2 S2 P1 100
S3 NULL NULL NULL
Sybase does it one way, Oracle does it the other and Centura (nee Gupta) lets you pick which one -- the worst of both non-standard worlds! In SQL-92, you have a choice and can force the order of execution. Either do the predicates after the join ...
SELECT *
FROM Supplier
LEFT OUTER JOIN
SupParts
ON Supplier.supno = SupParts.supno
WHERE qty < 200;
... or do it in the joining:
SELECT *
FROM Supplier
LEFT OUTER JOIN
SupParts
ON Supplier.supno = SupParts.supno
AND qty < 200;
Another problem is that you cannot show the same table as preserved and unpreserved in the extended equality version, but it is easy in SQL-92. For example to find the students who have taken Math 101 and might have taken Math 102:
SELECT C1.student, C1.math, C2.math
FROM (SELECT * FROM Courses WHERE math = 101) AS C1
LEFT OUTER JOIN
(SELECT * FROM Courses WHERE math = 102) AS C2
ON C1.student = C2.student;
2 more ideas:
1/ db2set parameters
Can you although provide us with the "DB2SET -all" results as you may have some specific join type disabled ?
==> I bet your shirt on this one ;-)
2/ Different datatype for the joined fields ?
Does westpac_data.cardnbr and pol_policy table.plc_bill_acctnum use exactly the same datatype or is there any difference between them ?
And thanks to jcelko for giving us his next unusefull "The Theory lesson" using an URL instead of a 5 pages text :-(
Hello,
I do appreciate your efforts on this for me. I've got the DDL and the DB2SET -all (interesting your willing to bet my shirt on that :) results.
The DB2SET -all...
C:\>db2set -all
[e] DB2PATH=C:\Program Files\SQLLIB
[i] DB2SATELLITEID=admin
[i] DB2NBADAPTERS=1
[i] DB2INSTPROF=C:\PROGRAM FILES\SQLLIB
[i] DB2COMM=NETBIOS,TCPIP,NPIP
[g] DB2SYSTEM=WLIDB201
[g] DB2PATH=C:\Program Files\SQLLIB
[g] DB2INSTDEF=DB2
[g] DB2COMM=NETBIOS,TCPIP,NPIP
[g] DB2ADMINSERVER=DB2DAS00
C:\>
The DDL for POL_POLICY...
--------------------------
-- DDL Statements for table "ADMIN "."POL_POLICY"
--------------------------
CREATE TABLE "ADMIN "."POL_POLICY" (
"PLC_START" SMALLINT NOT NULL ,
"PLC_END" SMALLINT NOT NULL ,
"PLC_POLICY" CHAR(15) NOT NULL ,
"PLC_BILL_METHOD" CHAR(2) ,
"PLC_BILL_FREQ" CHAR(2) ,
"PLC_BILL_ACCTNUM" CHAR(20) ,
"PLC_CCEXPDATE" CHAR(5) ,
"PLC_BANKCODE" CHAR(10) NOT NULL ,
"PLC_BRANCHCD" CHAR(10) NOT NULL ,
"PLC_CREDIT_CARD" CHAR(2) NOT NULL ,
"PLC_TAX_JURIS" CHAR(2) NOT NULL WITH DEFAULT '',
"PLC_REF1" VARCHAR(100) ,
"PLC_REF2" VARCHAR(100) ,
"PLC_REF3" VARCHAR(100) ,
"PLC_BILLNAME" CHAR(50) NOT NULL WITH DEFAULT '',
"PLC_PRESENT_CUR" CHAR(2) NOT NULL WITH DEFAULT ' ',
"PLC_BILL_CUR" CHAR(2) NOT NULL WITH DEFAULT ' ',
"PLC_ACCT_DEBIT_DT" DATE )
IN "USERSPACE1" ;
ALTER TABLE "ADMIN "."POL_POLICY" PCTFREE 0;
-- DDL Statements for indexes on Table "ADMIN "."POL_POLICY"
CREATE INDEX "ADMIN "."PLC_ACCOUNTNUM" ON "ADMIN "."POL_POLICY"
("PLC_BILL_ACCTNUM" ASC);
-- DDL Statements for primary key on Table "ADMIN "."POL_POLICY"
ALTER TABLE "ADMIN "."POL_POLICY"
ADD PRIMARY KEY
("PLC_POLICY",
"PLC_START");
And finally the DDL for the Westpac_Data table...
--------------------------
-- DDL Statements for table "ADMIN "."WESTPAC_DATA"
--------------------------
CREATE TABLE "ADMIN "."WESTPAC_DATA" (
"ID" INTEGER NOT NULL ,
"ORG" INTEGER ,
"ACCTNBR" CHAR(20) ,
"CARDNBR" CHAR(30) ,
"CUSTNBR" CHAR(30) ,
"BLOCK1" CHAR(50) ,
"BLOCK2" CHAR(50) ,
"OPENDATE" DATE ,
"INS1CDE" CHAR(50) ,
"MAILNAME" CHAR(50) ,
"SALUTAT" CHAR(50) ,
"DOB" DATE ,
"ADDR1" CHAR(50) ,
"ADDR2" CHAR(50) ,
"ADDR3" CHAR(50) ,
"POSTCDE" CHAR(50) ,
"HOMEPHON" CHAR(50) )
IN "USERSPACE1" ;
I hope this assists in solving this most troublesome issue. I've upped the points to 500, thanks again for all your suggestions.
Chris
ok, think I may see where your problem lies....you are allowing the account number to be null on both tables and it is not guaranteed to be unique. so if you select count(*) from admin.pol_policy where PLC_BILL_ACCTNUM is null you will find your remaining rows. basically with no uniqueness ensured you will get one row back for all rows that contain a null account number in plc_bill_acctnum. You should redefine this table as not null and look into defining it as a primary key or creating a unique index on it.
never mind...I take that last one back. at least as far as the nulls dorking up the count returned. however, the query won't produce truly accurate results as long as you are allowing nulls in the account number columns (since theoretically, and from my testing, you could have multiple null values for account number in the first table and multiple null values in the second...which one matches the other???)
Sorry about that Shirras but Null values NEVER joins as they never matchs ("NULL = NULL" is NULL means neither FALSE nor TRUE)
==> I still do not see any non-BUG related reason for your simple count query not to retrieve your 38442 rows (and I loose a shirt)
You can although try to modify your columns datatype (or cast your query) to be exactly the same as it is not very normal to join using different column datatype (even if it is totally possible "theorically")
westpac_data.cardnbr is a char(30) (Make it 20)
pol_policy table.plc_bill_acctnum is a char(20) (Or make this one 30)
I agree BigSchmuh....but multiple null values in the policy table should all return the first null value row from westpac (which is probably not a good thing), but 38422 rows should still be returned. I've tried using char of different lengths and the same lengths on the join and both worked, even so 38422 rows should still be returned even if spaces are filled in different places. Haven't been able to find a specific apar in fixpack 5 or higher dealing with an issue like this yet. but I'll keep digging.
no NULL does not equal NULL
so you never match Table1 to Table2
and because your join condition is itself Null i think it
gets excluded....
you'll need to check Codd & Date, and go into relational/set theory to see why
try
select count(*)
from
(select case when cardnbr is null then 0 else 1 end as joinid
, cardnbr from admin.westpac_data a ) as wp
left outer join
(select 1 as joinid , plc_bill_acctnum from table b
where plc_bill_acctnum is not null) as s6
on a.joinid = b.joinid
where wp.cardnbr=s6.plc_bill_acc
or wp.cardnbr is null
hth
you are correct...you will get the null values from the first table and nulls returned for the values (or what should be the values) of the second making every null account number in pol_policy appear (and the count should come out correctly) but all columns selected from westpac will have null values for these null pol_policy account number rows. Hopefully that clarifies the point that I was trying to say...
Chris: can you just add a condition in your left join for us to get rid of this specific NULL case:
select count(*)
from admin.westpac_data wp
left outer join admin.pol_policy s6
on wp.cardnbr = s6.plc_bill_acctnum
AND s6.plc_bill_acctnum IS NOT NULL
==> This is my final post on your question as if you still do NOT have your 38442 rows, I will conclude "That is definitely a BUG. Please open a case at IBM support and ask for an APAR number (The BUG #reference you got when IBM can reproduce it) and its first FixPak".
Business Accounts
Answer for Membership
by: yongsingPosted on 2003-05-13 at 03:03:28ID: 8515355
The result is unexpected. Are you sure about the number of rows in each table?