[Okta Webinar] Learn how to a build a cloud-first strategyRegister Now

x
?
Solved

Oracle SQL distinct query

Posted on 2007-10-17
5
Medium Priority
?
5,818 Views
Last Modified: 2013-12-07
Hello All,

I have two tables in oracle database.
1) MOTDC_STG_SHIP_CONF_HEADER_TMP
2) MOTDC_STG_SHIP_CONF_HEADER

Both tables are identical, only first table (MOTDC_STG_SHIP_CONF_HEADER_TMP)  does not have any unique constraint and has lots of duplicate data.  And 2nd Main table has unique contraint on transaction_id, CONTAINER_ID, CONTAINER_TYPE, and does not have duplicate data.  

I want to select all distinct records based on transaction_id, CONTAINER_ID, CONTAINER_TYPE from MOTDC_STG_SHIP_CONF_HEADER_TMP table and insert into the MOTDC_STG_SHIP_CONF_HEADER table.

Schema for MOTDC_STG_SHIP_CONTAINER

CREATE TABLE MOTDC_STG_SHIP_CONTAINER
(
  TRANSACTION_ID       NUMBER,
  CONTAINER_TYPE       VARCHAR2(10 BYTE),
  CONTAINER_ID         VARCHAR2(40 BYTE),
  CONTAINER_LENGTH     NUMBER(9,3),
  CONTAINER_WIDTH      NUMBER(9,3),
  CONTAINER_HEIGHT     NUMBER(9,3),
  CONTAINER_WEIGHT     NUMBER(9,3)              NOT NULL,
  CONTAINER_PACK_TYPE  VARCHAR2(10 BYTE),
  CONTAINER_TRK_NUM    VARCHAR2(50 BYTE)        NOT NULL,
  CREATION_DATE        DATE                     NOT NULL,
  CREATED_BY           NUMBER                   NOT NULL,
  LAST_UPDATE_DATE     DATE                     NOT NULL,
  LAST_UPDATED_BY      NUMBER                   NOT NULL
)

Schema for MOTDC_STG_SHIP_CONTAINER_TMP

CREATE TABLE MOTDC_STG_SHIP_CONTAINER_TMP
(
  TRANSACTION_ID       NUMBER,
  CONTAINER_TYPE       VARCHAR2(10 BYTE),
  CONTAINER_ID         VARCHAR2(40 BYTE),
  CONTAINER_LENGTH     NUMBER(9,3),
  CONTAINER_WIDTH      NUMBER(9,3),
  CONTAINER_HEIGHT     NUMBER(9,3),
  CONTAINER_WEIGHT     NUMBER(9,3)              NOT NULL,
  CONTAINER_PACK_TYPE  VARCHAR2(10 BYTE),
  CONTAINER_TRK_NUM    VARCHAR2(50 BYTE)        NOT NULL,
  CREATION_DATE        DATE                     NOT NULL,
  CREATED_BY           NUMBER                   NOT NULL,
  LAST_UPDATE_DATE     DATE                     NOT NULL,
  LAST_UPDATED_BY      NUMBER                   NOT NULL
)

0
Comment
Question by:huzefaq
5 Comments
 
LVL 14

Expert Comment

by:mherchl
ID: 20099188
INSERT INTO MOTDC_STG_SHIP_CONTAINER
SELECT a.* FROM
MOTDC_STG_SHIP_CONTAINER_TMP a
JOIN (SELECT MAX(LAST_UPDATE_DATE) as last, TRANSACTION_ID, CONTAINER_TYPE,  CONTAINER_ID FROM MOTDC_STG_SHIP_CONTAINER_TMP GROUP BY TRANSACTION_ID, CONTAINER_TYPE,  CONTAINER_ID) b on a.TRANSACTION_ID = b.TRANSACTION_ID AND a.CONTAINER_TYPE = b.CONTAINER_TYPE AND a.CONTAINER_ID = b.CONTAINER_ID and a.LAST_UPDATE_DATE = b.last


0
 
LVL 28

Expert Comment

by:Naveen Kumar
ID: 20099268
taking distinct data from _TMP table ( which has duplicates ) to _HEADER table is easy but you need to tell what values should
go into all other fields apart from  transaction_id, CONTAINER_ID, CONTAINER_TYPE when insert into _HEADER table.

let us say if there are 2 records in _TMP with same transaction_id, CONTAINER_ID, CONTAINER_TYPE in _TMP table but other fields have
different values ( say container_weight has 100 for 1 record and 300 for 2nd record ). which value should we use for the record
which we are going to insert into _HEADER table.

similary, you need to tell based on what logic should we pick up the rest of the fields like as i said above.

Thanks
0
 
LVL 28

Expert Comment

by:Naveen Kumar
ID: 20099279
the below will work as long as we do not get more than 1 record for the same LAST_UPDATE_DATE and for the same TRANSACTION_ID, CONTAINER_TYPE,  CONTAINER_ID.

If we have 2 records with same value to all the 4 fields then the insert will fail because of unique constraint error in the table which we are inserting.

INSERT INTO MOTDC_STG_SHIP_CONTAINER
SELECT a.* FROM
MOTDC_STG_SHIP_CONTAINER_TMP a
JOIN (SELECT MAX(LAST_UPDATE_DATE) as last, TRANSACTION_ID, CONTAINER_TYPE,  CONTAINER_ID FROM MOTDC_STG_SHIP_CONTAINER_TMP GROUP BY TRANSACTION_ID, CONTAINER_TYPE,  CONTAINER_ID) b on a.TRANSACTION_ID = b.TRANSACTION_ID AND a.CONTAINER_TYPE = b.CONTAINER_TYPE AND a.CONTAINER_ID = b.CONTAINER_ID and a.LAST_UPDATE_DATE = b.last
0
 
LVL 9

Accepted Solution

by:
konektor earned 2000 total points
ID: 20099298
if it doesn't matter which "duplicate" row will be choosen, you can use rowid.

eg. you can delete duplicates as

delete from MOTDC_STG_SHIP_CONTAINER_TMP o where rowid > (select min(rowid) from MOTDC_STG_SHIP_CONTAINER_TMP where   TRANSACTION_ID = o.TRANSACTION_ID and CONTAINER_TYPE = o.CONTAINER_TYPE and CONTAINER_ID = o.CONTAINER_ID)

or select-insert to MOTDC_STG_SHIP_CONTAINER

insert into MOTDC_STG_SHIP_CONTAINER (TRANSACTION_ID, CONTAINER_TYPE, CONTAINER_ID, CONTAINER_LENGTH, CONTAINER_WIDTH, CONTAINER_HEIGHT, CONTAINER_WEIGHT, CONTAINER_PACK_TYPE, CONTAINER_TRK_NUM, CREATION_DATE, CREATED_BY, LAST_UPDATE_DATE, LAST_UPDATED_BY)
  select TRANSACTION_ID, CONTAINER_TYPE, CONTAINER_ID, CONTAINER_LENGTH, CONTAINER_WIDTH, CONTAINER_HEIGHT, CONTAINER_WEIGHT, CONTAINER_PACK_TYPE, CONTAINER_TRK_NUM, CREATION_DATE, CREATED_BY, LAST_UPDATE_DATE, LAST_UPDATED_BY
  from MOTDC_STG_SHIP_CONTAINER_TMP o where rowid = (select min(rowid) from MOTDC_STG_SHIP_CONTAINER_TMP where   TRANSACTION_ID = o.TRANSACTION_ID and CONTAINER_TYPE = o.CONTAINER_TYPE and CONTAINER_ID = o.CONTAINER_ID)

0
 
LVL 74

Expert Comment

by:sdstuber
ID: 20102197
I wouldn't do a double query on your temp table.  Simple analytics will let you pick one row from each duplicate and copy it over.

You didn't specify which rows you wanted to keep out of your duplicates though
so for my example below, I simply pick the first one, based on last_update_date, for each
transaction_id,container_id,container_type combination.

This should run more efficiently than a self join or query with requery


INSERT INTO motdc_stg_ship_container
            (transaction_id, container_type, container_id, container_length,
             container_width, container_height, container_weight,
             container_pack_type, container_trk_num, creation_date,
             created_by, last_update_date, last_updated_by)
    SELECT transaction_id, container_type, container_id, container_length,
           container_width, container_height, container_weight,
           container_pack_type, container_trk_num, creation_date, created_by,
           last_update_date, last_updated_by
      FROM (SELECT transaction_id, container_type, container_id,
                   container_length, container_width, container_height,
                   container_weight, container_pack_type, container_trk_num,
                   creation_date, created_by, last_update_date,
                   last_updated_by,
                   ROW_NUMBER() OVER(PARTITION BY transaction_id, container_id, container_type ORDER BY last_update_date)
                                                                           rn
              FROM motdc_stg_ship_container_tmp)
     WHERE rn = 1
0

Featured Post

Technology Partners: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

A Stored Procedure in Microsoft SQL Server is a powerful feature that it can be used to execute the Data Manipulation Language (DML) or Data Definition Language (DDL). Depending on business requirements, a single Stored Procedure can return differe…
Windocks is an independent port of Docker's open source to Windows.   This article introduces the use of SQL Server in containers, with integrated support of SQL Server database cloning.
This video shows how to recover a database from a user managed backup
Via a live example, show how to backup a database, simulate a failure backup the tail of the database transaction log and perform the restore.
Suggested Courses

872 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question