Solved

How to write the efficient query to merge two big tables

Posted on 2011-02-17
Last Modified: 2012-05-11
I have a question, which I have simplified as follows:

table T1 (ID number(5), FID number(10) )
table T2  (ID number(5), FID number(10) )
ID is the primary key (PK) of each table.

T1 contains
ID   FID
10   2
30   5
45   1

T2 contains
ID    FID
5     5
10   3
30   4

Would like to get:
ID  FID
5    5
10  3
30  5
45  1

That is, if a record is in T1 but not in T2, take it, and vice versa.
If both tables have the same PK, compare the FIDs and keep the larger one.
Since the tables are big, efficiency should be taken into consideration. Is there an Oracle package for this?

I would greatly appreciate the gurus' tips and code.
Question by:jl66
9 Comments
 
LVL 143

Accepted Solution

by:
Guy Hengel [angelIII / a3] earned 1000 total points
ID: 34918934
I see 2 options:

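-- Note: MAX is an aggregate and does not accept two arguments in Oracle;
-- the row-wise function intended here is GREATEST, as the follow-up comments point out.
select nvl(t1.ID, t2.ID) ID , max(t1.FID, t2.FID) fid
  from table1 t1
  full outer join table2 t2
   on t1.ID = t2.ID 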


select ID, FID
  from ( select sq.*, row_number() over ( partition by ID order by FID desc) rn
              from ( select ID, FID from table1 union all select ID, FID from table2 ) sq
        ) q
 where q.RN = 1 

 
LVL 77

Assisted Solution

by:slightwv (Netminder)
slightwv (Netminder) earned 480 total points
ID: 34918970
I was thinking along the lines of the first one but think it needs some tweaks:


select nvl(t1.ID, t2.ID) ID , greatest(nvl(t1.FID,0), nvl(t2.FID,0)) fid
  from tab1 t1
  full outer join tab2 t2
   on t1.ID = t2.ID  
/
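
One caveat with the nvl(..., 0) defaults: for an ID that exists in only one table, a negative FID would lose to the substituted 0. The following is a minimal sketch, not part of slightwv's post, that falls back to the other side's FID instead of 0. The inline sample rows (including the negative FID) are assumptions made for illustration.

```sql
-- Hypothetical sample data; ID 45 has a negative FID and exists only in t1.
with t1 as (
  select 10 id, 2 fid from dual union all
  select 45, -1 from dual
),
t2 as (
  select 5 id, 5 fid from dual union all
  select 10, 3 from dual
)
select nvl(t1.id, t2.id) id,
       -- Use the other side's FID as the default, so no sentinel value is needed.
       greatest(nvl(t1.fid, t2.fid), nvl(t2.fid, t1.fid)) fid
  from t1
  full outer join t2
    on t1.id = t2.id
 order by 1;
```

With these rows, ID 45 keeps its FID of -1, whereas the nvl(..., 0) form would report 0 for it. If FIDs are known to be non-negative, the two forms give the same result.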
 
LVL 143

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 34919021
Indeed, it should be GREATEST and not MAX.

 
LVL 7

Assisted Solution

by:tlovie
tlovie earned 520 total points
ID: 34919265
I actually think something like this would have the best performance... but it would have to be tested.

select ID, MAX(FID) FID
from ( select ID, FID from T1 union all select ID, FID from T2 ) u
group by ID
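
To check this against the sample rows in the question without creating tables, the data can be supplied inline. This is only a sketch: the WITH blocks stand in for the real T1 and T2, and the ORDER BY is added purely to make the output easy to compare.

```sql
-- Sample rows from the question, supplied inline so no tables are needed.
with t1 as (
  select 10 id, 2 fid from dual union all
  select 30, 5 from dual union all
  select 45, 1 from dual
),
t2 as (
  select 5 id, 5 fid from dual union all
  select 10, 3 from dual union all
  select 30, 4 from dual
)
select id, max(fid) fid
  from ( select id, fid from t1 union all select id, fid from t2 ) u
 group by id
 order by id;
```

This returns the four rows the question asks for: (5, 5), (10, 3), (30, 5) and (45, 1).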
 
LVL 77

Expert Comment

by:slightwv (Netminder)
ID: 34919529
For what it's worth: I was trying to set up some more realistic test cases to compare performance, and the code I tweaked from angelIII returns incorrect results if the IDs repeat within the same table.
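
A sketch of how that can happen, using made-up rows (the actual test data is not shown in the thread): if ID 10 appears twice in one table, the full outer join returns one row per matching pair, so ID 10 comes back twice.

```sql
-- Hypothetical data: ID 10 appears twice in t1, which the PK in the question would normally prevent.
with t1 as (
  select 10 id, 2 fid from dual union all
  select 10, 7 from dual
),
t2 as (
  select 10 id, 3 fid from dual
)
select nvl(t1.id, t2.id) id,
       greatest(nvl(t1.fid, 0), nvl(t2.fid, 0)) fid
  from t1
  full outer join t2
    on t1.id = t2.id;
-- Returns two rows for ID 10 (FID 3 and FID 7) instead of a single row with FID 7.
```

The UNION ALL variants that use GROUP BY or ROW_NUMBER collapse such duplicates to one row per ID, so they are not affected in the same way.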
 

Author Comment

by:jl66
ID: 34920910
Thanks so much for the inputs. It is hard for me to select the best since everyone got the right answer.
 
LVL 7

Expert Comment

by:tlovie
ID: 34921053
I'm curious - you asked for an efficient query - which query works best with your data set?
 
LVL 77

Expert Comment

by:slightwv (Netminder)
ID: 34921483
The explain plans will give you a good estimate of 'better'.

Execution times are important as well.

If you must have a definite 'best' run tkprof stats.  That will decide a winner.

If they all appear equal, feel free to award points to all contributors.
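
As a rough sketch of the steps mentioned above, assuming the T1 and T2 tables from the question exist and the session has the privileges needed for tracing, an explain plan and a trace for tkprof could be produced like this. The query shown is just one of the candidates; substitute each in turn.

```sql
-- Generate the optimizer's plan for one candidate query without executing it.
explain plan for
select id, max(fid) fid
  from ( select id, fid from t1 union all select id, fid from t2 ) u
 group by id;

-- Display the plan that was just stored in PLAN_TABLE.
select * from table(dbms_xplan.display);

-- For tkprof, enable tracing in the session, run each candidate query,
-- then turn tracing off and format the resulting trace file with tkprof.
alter session set sql_trace = true;
-- ... run the candidate queries here ...
alter session set sql_trace = false;
```

The trace file name and location depend on the instance, so they are left out here. Once the file is located, tkprof is run from the operating system prompt with the trace file and an output file name as arguments.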
 

Author Closing Comment

by:jl66
ID: 34922015
With 2 million records of test data:
angelIII's 2nd query is the best: 32 time units
tlovie's: 54
slightwv's: 60
Considering ease of use when extending this to the real world, however, the order is different.
I greatly appreciate everyone's tips. They help a lot.