Solved

Joining 2 tables in SQL

Posted on 2014-07-31
20
164 Views
Last Modified: 2014-09-04
Hello Experts,

Can someone please help me join tables with different number of fields.  I am trying to select records from one table and do a left join with another table that i have created from 2 other tables.  My left table is the customer table.  I want all the records in the customer table and all the records in the Here is an example query that I have:

select ID
,CUSTID
,BPCUST
,BPV_USER_ID
,PRDUSER
,ENROLLMENT_DATE
,LOGINDATE
,LSTUSEDATE
,LSTSRTDATE
from CUSTOMER x

LEFT JOIN

select ID, CUSTID,'Online' as Online
from (

select ID,CUSTID,TRANSDT
from (
select ID,CUSTID
,max(b.TRANSDT) as TRANSDT
from tbl1
group by di_id,CUSTID)

union all

select DI_ID ,CUSTID
,max(TRANSDT) as TRANSDT
from tbl2
group by ID, CUSTID) t
on x.id = t.id and x.custid = t.custid

This is just a concept, but it did not work.  Can someone please come my aid?

Thanks.
0
Comment
Question by:fb1990
  • 8
  • 6
  • 3
  • +2
20 Comments
 
LVL 12

Expert Comment

by:Harish Varghese
ID: 40232404
Please provide the structure of the tables and state the requirement clearly.
0
 
LVL 1

Author Comment

by:fb1990
ID: 40232418
My apologies for the ambiguity.  It is like i have 2 tables. The query works individually.  For example when i run this query:
select ID
,CUSTID
,BPCUST
,BPV_USER_ID
,PRDUSER
,ENROLLMENT_DATE
,LOGINDATE
,LSTUSEDATE
,LSTSRTDATE
from CUSTOMER x

I get my results.  But i want combine the results to the data that i am getting from this other query

select ID,CUSTID,TRANSDT
from (
select ID,CUSTID
,max(b.TRANSDT) as TRANSDT
from tbl1
group by di_id,CUSTID)

union all

select DI_ID ,CUSTID
,max(TRANSDT) as TRANSDT
from tbl2
group by ID, CUSTID

The 2 resultsets have ID and CUSTID in common.  Does that bring some perspective to what I am trying to do?

Thanks for the swift response.
0
 
LVL 13

Expert Comment

by:Russell Fox
ID: 40232429
You're close, just over-complicating things a bit. This should work - feel free to reformat, this is just how I tend to indent things for clarity:
SELECT ID
	,CUSTID
	,BPCUST
	,BPV_USER_ID
	,PRDUSER
	,ENROLLMENT_DATE
	,LOGINDATE
	,LSTUSEDATE
	,LSTSRTDATE
FROM CUSTOMER x

LEFT JOIN
	(
	SELECT ID
		, CUSTID
		, MAX(b.TRANSDT) AS TRANSDT
	FROM tbl1 
	GROUP BY di_id,CUSTID
	-- -------------------------------
	UNION ALL 
	-- -------------------------------
	SELECT DI_ID
		, CUSTID
		, MAX(TRANSDT) AS TRANSDT
	FROM tbl2
	GROUP BY ID, CUSTID
	) t
ON x.id = y.id and x.custid = y.custid

Open in new window

0
 
LVL 12

Expert Comment

by:Harish Varghese
ID: 40232446
select ID
	,CUSTID
	,BPCUST
	,BPV_USER_ID
	,PRDUSER
	,ENROLLMENT_DATE
	,LOGINDATE
	,LSTUSEDATE
	,LSTSRTDATE
from CUSTOMER x
LEFT OUTER JOIN 
	(select ID,CUSTID,TRANSDT
	from ( select ID,CUSTID
			 ,max(b.TRANSDT) as TRANSDT
			 from tbl1 
			 group by id,CUSTID)

	union all

	select DI_ID ,CUSTID
	,max(TRANSDT) as TRANSDT
	from tbl2
	group by DI_ID, CUSTID) T
ON x.ID = T.ID 
AND x.CUSTID = T.CUSTID

Open in new window

0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 40233085
Please note your existing UNION ALL query could lead to "duplicates" in the final result. Is that expected/wanted?

Also note that the UNION ALL query you provided has some syntax errors e.g. "max(b.TRANSDT) " but no alias b, and there is mention of DI_ID and ID but they are inconsistent.

SELECT
      t.ID
    , t.CUSTID
    , c.BPCUST
    , c.BPV_USER_ID
    , c.PRDUSER
    , c.ENROLLMENT_DATE
    , c.LOGINDATE
    , c.LSTUSEDATE
    , c.LSTSRTDATE
FROM (
      SELECT
            di_id        AS ID
          , CUSTID
          , MAX(TRANSDT) AS TRANSDT
      FROM tbl1
      GROUP BY
            di_id
          , CUSTID

      UNION ALL

            SELECT
                  di_id        AS ID
                , CUSTID
                , MAX(TRANSDT) AS TRANSDT
            FROM tbl2
            GROUP BY
                  di_id
                , CUSTID
      ) t
      INNER JOIN CUSTOMER c
                  ON t.id = c.id
                        AND t.custid = c.custid

Open in new window

0
 

Expert Comment

by:FutureDBA-
ID: 40233122
I don't have data to test the query, and it all comes down to writing preferences, but for things like this, I like using with clause.

Try something like this. This will also get rid of any duplicate rows by using distinct clause.

WITH X AS 
	(select 
		ID,
		CUSTID,
		BPCUST,
		BPV_USER_ID,
		PRDUSER,
		ENROLLMENT_DATE,
		LOGINDATE,
		LSTUSEDATE,
		LSTSRTDATE,
	from CUSTOMER),
	 Q AS
		(select 
			ID,
			CUSTID,
			max(b.TRANSDT) as TRANSDT
		from tbl1 
		group by di_id,CUSTID

		union all

		select 
			DI_ID AS ID,
			CUSTID,
			max(TRANSDT) as TRANSDT
		from tbl2
		group by ID, CUSTID)

 SELECT DISTINCT * FROM X LEFT OUTER JOIN Q on X.ID = Q.ID AND Q.CUSTID = X.CUSTID

Open in new window

0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 40233133
My reference to duplicates wasn't meant to alarm, or to criticize, it is simply a fact that the existing union all query could produce duplicates. But we have no way of knowing if it will. For example "table1" and "table2" might be mutually exclusive sets of customers and hence no duplication can ever occur. Or it may be that the possible repetition is actually needed.

----
Why would you prefer a CTE when there is absolutely no advantage from it* & using distinct across the final result would be the least efficient way to remove duplication too (just use union instead of union all).

* my rule of thumb is that a CTE is worth it if:
a. there is a need for recursion
b. there is re-use potential (avoid doing the same thing over and over)

I utterly disagree with those who do it for cosmetic reasons, it just does not make the query simpler to read
(because the CTE is still part of the query, all you achieve is having to "look upward" to comprehend the whole query)
0
 
LVL 1

Author Comment

by:fb1990
ID: 40233157
You guys are so great.  I am testing PortletPaul and Harish Varghese solutions right now with my live data.  I will back soon.

Thanks for the help thus far...
0
 
LVL 1

Author Comment

by:fb1990
ID: 40233164
I am testing the solution provided by FutureDBA as well.  I want examine my results and get back...All 3 solution are running as we speak

Thanks
0
Better Security Awareness With Threat Intelligence

See how one of the leading financial services organizations uses Recorded Future as part of a holistic threat intelligence program to promote security awareness and proactively and efficiently identify threats.

 
LVL 1

Author Comment

by:fb1990
ID: 40233439
whew..The codes have running for about 5 hour still not results.  I usually get data out of the environment within a maximum of 1 hour..What could be going here?
0
 
LVL 12

Expert Comment

by:Harish Varghese
ID: 40233472
How many rows are there in CUSTOMER table? Since you are using a LEFT OUTER JOIN here, you are going to fetch all records from CUSTOMER table. And do you have proper indexes in tbl1 and tbl2? You need to have a clustered index on ID (or DI_ID?) and CUSTID columns in tbl1 and tbl2 OR a non-clustered index on same columns with an INCLUDE (TRANSDT).
0
 
LVL 1

Author Comment

by:fb1990
ID: 40233913
Hello Harish, You are definitely right.  The customer table has about 16M records while tbl1 and tabl2 will yield approximately 6 million record each with the max TRANSDT.  I might have to work on a sample of the data
0
 
LVL 12

Expert Comment

by:Harish Varghese
ID: 40233948
Fetching 16M records should not be a routine activity. You can also split the query into multiple statements and see how much time each takes and then fine tune from that point:
Create Table TempTblAll (
	ID int,
	CUSTID int,
	TRANSDT datetime,
	Primary Key (ID, CUSTID) WITH FILLFACTOR = 90
)

Insert into TempTblAll
Select ID,CUSTID
	 ,max(b.TRANSDT) as TRANSDT
from tbl1 
group by di_id,CUSTID

Insert into TempTblAll
select DI_ID ,CUSTID
	,max(TRANSDT) as TRANSDT
from tbl2 T
Where Not Exists (Select 1 from TempTblAll A WHERE A.ID = T.DI_ID And A.CUSTID = T.CUSTID)
group by DI_ID, CUSTID

SELECT ID
	,CUSTID
	,BPCUST
	,BPV_USER_ID
	,PRDUSER
	,ENROLLMENT_DATE
	,LOGINDATE
	,LSTUSEDATE
	,LSTSRTDATE
FROM CUSTOMER x
LEFT JOIN TempTblAll T
	ON x.ID = T.ID 
	AND x.CUSTID = T.CUSTID

Open in new window

0
 
LVL 1

Author Comment

by:fb1990
ID: 40233968
Thanks for this suggestion Harish.  One question for you.  I used a where clause on the customer table based on your original solution
-------
from customer x
where x.ID in ('114','806','111','520','340','391','316','364','536','533','728','102','135')
-----
, i am getting SQL Command Not Properly ended error.  Do you you any suggestion for me.

Also,  I will not pulling this as a routine.  I am try to find a pattern and then decide on what i really wanted.  For example i do not need all the field.  The best option for would have been pull a small subset of the data  and decide from that point. You have some great suggestions.  Thank you!
0
 
LVL 12

Expert Comment

by:Harish Varghese
ID: 40233977
I do not see any issue with the WHERE clause as such. Can you please post the complete query that you are running? You can also give just one value first and see if the error still occurs - WHERE x.ID in  ('114')
0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 40234111
ID would be integer wouldn't it?
why are you using strings?
you could be causing "implicit conversion" which will slow your query
0
 
LVL 1

Author Comment

by:fb1990
ID: 40234363
Here is my code based on Harish Solution and my ID is varchar2 not integer

select 
substr(DI_ID,3,4) as diid
,x.MEM_NUMBER
,x.IS_BP_USER
,x.ENROLLMENT_DATE

from CUSTOMER x
where substr(x.DI_ID,3,4) in ('114','806','111','520','340','391','316','364','536','533','728','102','135')

LEFT OUTER JOIN 
    (select diid,MEM_NUMBER,trans_dt
from (select DI_ID as diid,MEM_NUM as MEM_NUMBER
,max(TRANS_DT) as trans_dt
from TBL1
where DI_ID in ('114','806','111','520','340','391','316','364','536','533','728','102','135')

group by di_id,mem_num)

    union all

select DI_ID as diid,MEM_NUM as MEM_NUMBER
,max(TRANS_DT) as trans_dt
from TBL2
where DI_ID in ('114','806','111','520','340','391','316','364','536','533','728','102','135')
group by di_id,mem_num) T
ON substr(x.DI_ID,3,4)  = t.diid 
AND x.MEM_NUMBER = T.MEM_NUMBER

Open in new window

0
 
LVL 12

Accepted Solution

by:
Harish Varghese earned 500 total points
ID: 40235973
Yes, PortletPaul is right. In fact, you can skip the wrapper SELECT on the UNION sqls:
select 
	substring(DI_ID,3,4) as diid
	,x.MEM_NUMBER
	,x.IS_BP_USER
	,x.ENROLLMENT_DATE

from CUSTOMER x
--where substring(x.DI_ID,3,4) in ('114','806','111','520','340','391','316','364','536','533','728','102','135')

LEFT OUTER JOIN 
    (select DI_ID as diid,MEM_NUM as MEM_NUMBER
		,max(TRANS_DT) as trans_dt
		from TBL1
		where DI_ID in ('114','806','111','520','340','391','316','364','536','533','728','102','135')
		group by di_id,mem_num

    union all

	select DI_ID as diid,MEM_NUM as MEM_NUMBER
		,max(TRANS_DT) as trans_dt
		from TBL2
		where DI_ID in ('114','806','111','520','340','391','316','364','536','533','728','102','135')
		group by di_id,mem_num
	) T
ON substring(x.DI_ID,3,4)  = t.diid 
       AND x.MEM_NUMBER = T.MEM_NUMBER
--where substring(x.DI_ID,3,4) in ('114','806','111','520','340','391','316','364','536','533','728','102','135')

Open in new window

And the WHERE clause for CUSTOMER table should be at the end. Again, using a substring in your only WHERE condition is definitely a performance concern with 16M records. If that cannot be avoided, try using a computed column on this table using substring and create an index on it.
0
 
LVL 1

Author Closing Comment

by:fb1990
ID: 40303755
Thank you very much to all that contributed.  I have been on assignment and away from the office.  Please excuse my inaction.
0

Featured Post

Backup Your Microsoft Windows Server®

Backup all your Microsoft Windows Server – on-premises, in remote locations, in private and hybrid clouds. Your entire Windows Server will be backed up in one easy step with patented, block-level disk imaging. We achieve RTOs (recovery time objectives) as low as 15 seconds.

Join & Write a Comment

In this article I will describe the Backup & Restore method as one possible migration process and I will add the extra tasks needed for an upgrade when and where is applied so it will cover all.
Checking the Alert Log in AWS RDS Oracle can be a pain through their user interface.  I made a script to download the Alert Log, look for errors, and email me the trace files.  In this article I'll describe what I did and share my script.
This video shows syntax for various backup options while discussing how the different basic backup types work.  It explains how to take full backups, incremental level 0 backups, incremental level 1 backups in both differential and cumulative mode a…
This video shows how to Export data from an Oracle database using the Datapump Export Utility.  The corresponding Datapump Import utility is also discussed and demonstrated.

760 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

20 Experts available now in Live!

Get 1:1 Help Now