Solved

DB2 Query Joining Two Tables, One Needing to be Distinct Rows

Posted on 2013-11-07
4
1,489 Views
Last Modified: 2013-11-15
I am trying to preform a left join with two DB2 tables. My primary table may or may not contain a value in the field I am joining on but I still will need all rows from my primary table. My secondary table contains some reference fields that I need. If the secondary table contained only the reference tables I needed then I would have a perfect one to many relationship but thats not the case. The table contains other fields that need to be eliminated thus the reason for wanting distinct rows from the secondary table.

I need some assistance in making this query more efficient. I know that the query will work because I have broken out the two tables, extracted a joining value and queried them seperatly and it works. Its extremely slow right now. The primary keys are the claim_no, policy_no, base_policy_no and customer_id fields. Below is the code that I am trying to run.

SELECT
	DATE(A.CALL_LOG_DT) AS CALLDATE,
	A.CALL_LOG_TIME,
	
	CASE LENGTH(RTRIM(A.CUSTOMER_ID))
		WHEN 8 THEN '0000'||RTRIM(A.CUSTOMER_ID)
		WHEN 7 THEN '00000'||RTRIM(A.CUSTOMER_ID)
		WHEN 6 THEN '000000'||RTRIM(A.CUSTOMER_ID)
		WHEN 5 THEN '0000000'||RTRIM(A.CUSTOMER_ID)
		WHEN 4 THEN '00000000'||RTRIM(A.CUSTOMER_ID)
		WHEN 3 THEN '000000000'||RTRIM(A.CUSTOMER_ID)
	ELSE A.CUSTOMER_ID END AS CUSTOMER_ID,
	A.COMMTYPE_CD,
	A.REQ_AGT_CD,
	A.CLAIM_NO,
	A.POLICY_NO,
	A.REQ_TYP_CD,
	A.BASE_POLICY_NO,
	A.LOG_USER_ID,
	
	B.CLAIM_TYP_CD,
	B.CLOSE_DT,
	B.OPEN_DT,
	B.OPEN_IND
	
FROM CALLDETAILS_CURRENT A

LEFT JOIN CLAIMSTABLE B ON 
	B.CLAIM_NO = A.CLAIM_NO
	AND B.BASE_POLICY_NO = A.BASE_POLICY_NO

WHERE
A.REQ_TYP_CD IN ('CU', 'PA','SR','BR')
AND LEFT(A.LOG_USER_ID,3) IN ('C75', 'C80')
AND DATE(A.CALL_LOG_DT) >= '10/01/2013';

Open in new window

0
Comment
Question by:spaced45
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 2
  • 2
4 Comments
 
LVL 48

Expert Comment

by:PortletPaul
ID: 39632533
There's not a lot of scope to tune this query

however the last 2 where conditions apply functions to the data to suit the condition
this should be reversed, change the condition to suit the data

line 34 could be:
AND ( A.LOG_USER_ID LIKE 'C75%' OR  A.LOG_USER_ID LIKE 'C80%' )

line 35 doesn't need the function:
AND A.CALL_LOG_DT >= '10/01/2013';

(a call_log_dt of 10/01/2013 15:16:17 is >= 10/01/2013, it does not need to be truncated to date only)
see: http://en.wikipedia.org/wiki/Sargable

Rules of thumb
Avoid applying functions on data values in a sql condition.
Avoid non-sargable predicates and replace them with sargable equivalents.
Beyond those simple changes I would expect that only inspecting indexes would reveal further tuning potential.
0
 
LVL 1

Author Comment

by:spaced45
ID: 39632622
I was able to run it quicker but I am still pulling more rows than I need from the secondary table. Here is why I am thinking that there is a problem. The primary table can contain duplicate claim numbers. The secondary table if could also contain duplicate rows with the same claim number. So if I have I have 2 rows from on and 2 matching rows from the other I get four rows where I only need the 2 rows from the primary table and only 1 from the seondary.

So I am still have the issue. In other words I am using the seondary table as a reference table although its not really a reference table. Here is an additional example.
One claim can be only one claim type, have one close date and only one open date. Mixed into the table though is a field for payments where there can be more than one. So now I have all of the fields below more than once. The only way that I know how to get rid of those extra rows is by using distinct but I know thats not the efficient way to do it.
B.CLAIM_TYP_CD,
B.CLOSE_DT,
B.OPEN_DT,
B.OPEN_IND
0
 
LVL 48

Accepted Solution

by:
PortletPaul earned 500 total points
ID: 39632731
you could try this
SELECT
        DATE(A.CALL_LOG_DT) AS CALLDATE
      , A.CALL_LOG_TIME
      , RIGHT('00000000' || RTRIM(A.CUSTOMER_ID), 12)
      , A.COMMTYPE_CD
      , A.REQ_AGT_CD
      , A.CLAIM_NO
      , A.POLICY_NO
      , A.REQ_TYP_CD
      , A.BASE_POLICY_NO
      , A.LOG_USER_ID
      , B.CLAIM_TYP_CD
      , B.CLOSE_DT
      , B.OPEN_DT
      , B.OPEN_IND

FROM CALLDETAILS_CURRENT A

        LEFT JOIN (
                SELECT
                        CLAIM_NO
                      , BASE_POLICY_NO
                      , MAX(CLAIM_TYP_CD) AS CLAIM_TYP_CD
                      , MAX(CLOSE_DT)     AS CLOSE_DT
                      , MIN(OPEN_DT)      AS OPEN_DT
                      , MIN(OPEN_IND)     AS OPEN_IND
                FROM CLAIMSTABLE
                GROUP BY
                        CLAIM_NO
                      , BASE_POLICY_NO
        ) B ON B.CLAIM_NO = A.CLAIM_NO
                        AND B.BASE_POLICY_NO = A.BASE_POLICY_NO

WHERE A.REQ_TYP_CD IN ('CU', 'PA', 'SR', 'BR')
        AND (A.LOG_USER_ID LIKE 'C75%'
        OR A.LOG_USER_ID LIKE 'C80%')
        AND A.CALL_LOG_DT >= '10/01/2013';

Open in new window

Notes:
I have chosen to use MAX() and MIN() in the new subquery, but you need to verify that is what you actually want.

I think that case expression you had in the original may be replaced by using RIGHT() but In not entirely sure how long the final result is, I have chosen 12. Adjust that number to suit.

The above is untested and might have syntax errors I haven't caught.
0
 
LVL 1

Author Closing Comment

by:spaced45
ID: 39651839
Worked great! thank you,
0

Featured Post

Industry Leaders: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Ever needed a SQL 2008 Database replicated/mirrored/log shipped on another server but you can't take the downtime inflicted by initial snapshot or disconnect while T-logs are restored or mirror applied? You can use SQL Server Initialize from Backup…
A company’s centralized system that manages user data, security, and distributed resources is often a focus of criminal attention. Active Directory (AD) is no exception. In truth, it’s even more likely to be targeted due to the number of companies …
Via a live example combined with referencing Books Online, show some of the information that can be extracted from the Catalog Views in SQL Server.
Via a live example, show how to set up a backup for SQL Server using a Maintenance Plan and how to schedule the job into SQL Server Agent.

756 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question