Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people, just like you, are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
Solved

How to remove identify the duplicate with two distinct rows?

Posted on 2009-04-13
2
721 Views
Last Modified: 2013-12-19
Hi,
   I need to identify the duplicate data with two distinct rows. See my data structure below.
NAME      NAME_1   VALUE    START_DATE      END_DATE      FLAG     INDEX
SURI         SE            275        13/12/2005            31/12/2010         B          1
SURI         SE            375                                                                 A          1
SURI         SE            475                                                                 A          1
SURI         SE            275        13/12/2005            31/12/2010         B          2
SURI         SE           375                                                                  A          2
SURI         SE           475                                                                  A          2
SURI         SE           175        13/12/2006            31/12/2010         B          3
SURI         SE           375                                                                  A          3
SURI         SE           475                                                                  A          3

                This is my sample data. Here data are duplicate with two different index column. INDEX 1 and 2 contains same group of combination. So i need to identify any one of duplicate combination(i.e INDEX 1 or 2). Can anyone come up with exact solution?

Thanks
0
Comment
2 Comments
 
LVL 25

Assisted Solution

by:lwadwell
lwadwell earned 250 total points
ID: 24128739
Hi Suriyaraj_Sudalaiappan,

I see combinations that occur more than twice - if you ignore the value in the INDEX column.

This row is unique
SURI         SE     175    13/12/2006      31/12/2010       B         3

These two are duplicate
SURI         SE     275    13/12/2005      31/12/2010       B         1
SURI         SE     275    13/12/2005      31/12/2010       B         2

These three are triplicate
SURI         SE     375                                     A         1
SURI         SE     375                                     A         2
SURI         SE     375                                     A         3

So are these three
SURI         SE     475                                     A         1
SURI         SE     475                                     A         2
SURI         SE     475                                     A         3

Is the below what you where looking for?
--If you want to identify the doubles/triples/... combinations etc 
SELECT NAME, NAME_1, VALUE, START_DATE, END_DATE, FLAG
  FROM yourtable
 GROUP BY NAME, NAME_1, VALUE, START_DATE, END_DATE, FLAG
HAVING COUNT(*) > 1 
-- If you want the individual rows, this assumes you do not have duplicates with the same INDEX value
SELECT *
  FROM yourtable t,
       (SELECT NAME, NAME_1, VALUE, START_DATE, END_DATE, FLAG, MIN(INDEX) as min_index
          FROM yourtable
         GROUP BY NAME, NAME_1, VALUE, START_DATE, END_DATE, FLAG
        HAVING COUNT(*) > 1) v
 WHERE t.NAME        = v.NAME
   AND t.NAME_1      = v.NAME_1
   AND t.VALUE       = v.VALUE
   AND t.START_DATE  = v.START_DATE
   AND t.END_DATE    = v.END_DATE
   AND t.FLAG        = v.FLAG
   AND v.INDEX       > v.min_index

Open in new window

0
 

Accepted Solution

by:
Suriyaraj_Sudalaiappan earned 0 total points
ID: 24773616
Hi,

Below query is giving the exact answer.

SELECT t4.*,
                               MIN(t4.diff_flag) over(PARTITION BY t4.LINE_INDEX) min_flag
                        FROM (SELECT t3.*,
                                        CASE
                                            WHEN lag(ATTR_VALUE_FROM, lag_flag)OVER(ORDER BY LINE_INDEX, START_DATE_ACTIVE, END_DATE_ACTIVE, ATTR_VALUE_FROM) != ATTR_VALUE_FROM AND BASE_ASSOCIATE_FLAG = 'A'
                                            THEN
                                                1
                                            WHEN (lag(ATTR_VALUE_FROM, lag_flag)OVER(ORDER BY LINE_INDEX, START_DATE_ACTIVE, END_DATE_ACTIVE, ATTR_VALUE_FROM) != ATTR_VALUE_FROM
                                                OR lag(START_DATE_ACTIVE, lag_flag) OVER(ORDER BY LINE_INDEX, START_DATE_ACTIVE, END_DATE_ACTIVE, ATTR_VALUE_FROM) != START_DATE_ACTIVE
                                                OR lag(END_DATE_ACTIVE, lag_flag) OVER(ORDER BY LINE_INDEX, START_DATE_ACTIVE, END_DATE_ACTIVE, ATTR_VALUE_FROM) != END_DATE_ACTIVE)
                                                AND BASE_ASSOCIATE_FLAG = 'B'
                                            THEN
                                                1
                                            WHEN lag_flag = 0
                                            THEN
                                                1
                                            ELSE
                                                2
                                        END diff_flag
                                  FROM (SELECT t2.*,
                                                first_value(lag_count_flag) OVER(PARTITION BY LINE_INDEX
                                                                                    ORDER BY LINE_INDEX, START_DATE_ACTIVE, END_DATE_ACTIVE) lag_flag
                                        FROM (SELECT t1.*,
                                                        CASE
                                                            WHEN BASE_ASSOCIATE_FLAG = 'B'
                                                            THEN
                                                                lag(COUNT, 1, 0) OVER(ORDER BY LINE_INDEX, START_DATE_ACTIVE, END_DATE_ACTIVE, ATTR_VALUE_FROM)
                                                        END lag_count_flag
                                                FROM (SELECT t.*,
                                                                COUNT(*) over(PARTITION BY LINE_INDEX) COUNT,
                                                                row_number() over(PARTITION BY NAME, FACTOR_ATTR_CONTEXT, FACTOR_ATTRIBUTE, COMP_OPERATOR, ATTR_VALUE_FROM,
                                                                                               ADJUSTMENT_FACTOR, START_DATE_ACTIVE, END_DATE_ACTIVE, BASE_ASSOCIATE_FLAG
                                                                                  ORDER BY LINE_INDEX) row_number
                                                        FROM xxqp_pam_factor_stg t
                                                        WHERE
                                                            t.process_flag = 1 AND t.request_id = v_conc_req_id) t1) t2) t3) t4
                        ORDER BY t4.LINE_INDEX,
                                 t4.START_DATE_ACTIVE,
                                 t4.ATTR_VALUE_FROM
0

Featured Post

Free Tool: Path Explorer

An intuitive utility to help find the CSS path to UI elements on a webpage. These paths are used frequently in a variety of front-end development and QA automation tasks.

One of a set of tools we're offering as a way of saying thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Cursors in Oracle: A cursor is used to process individual rows returned by database system for a query. In oracle every SQL statement executed by the oracle server has a private area. This area contains information about the SQL statement and the…
From implementing a password expiration date, to datatype conversions and file export options, these are some useful settings I've found in Jasper Server.
This video shows syntax for various backup options while discussing how the different basic backup types work.  It explains how to take full backups, incremental level 0 backups, incremental level 1 backups in both differential and cumulative mode a…
This video explains what a user managed backup is and shows how to take one, providing a couple of simple example scripts.

839 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question