Link to home
Start Free TrialLog in
Avatar of Techneut
TechneutFlag for United States of America

asked on

sql find duplicate records in same row then display on same row

I want to find duplicates in one Column, then display it on one Row. For instance:

DATA:

Row 1:  ID1 Test1
Row 2:  ID2 Test2
Row 3:  ID1 Test3
Row 4:  ID3 Test4
Row 5:  ID3 Test5

Output:
Row1:  ID1 Test1 Test3
Row2:  ID3 Test4 Test5
Avatar of Brian Crowe
Brian Crowe
Flag of United States of America image

Providing schema helps in these examples.

SELECT DISTINCT ID,
   (
      SELECT ' ' + TestString
      FROM myTable
      WHERE ID = T.ID
      FOR XML PATH('')
   ) AS TestSTrings
FROM myTable AS T
SELECT
       dups.id
     , oa.TestConcat
FROM (
      SELECT ID
      FROM MyTable
      GROUP BY id
      HAVING COUNT(*) > 1
      ) DUPS
OUTER APPLY (
      SELECT ' ' + TestString
      FROM myTable
      WHERE ID = DUPS.ID
      FOR XML PATH('')
      ) OA (TestConcat)

Result:
| id |   TestConcat |
|----|--------------|
|  1 |  Test1 Test3 |
|  3 |  Test4 Test5 |
        

Open in new window


Details:
    CREATE TABLE MyTable
        ([ID] int, [TestString] varchar(5))
    ;
        
    INSERT INTO MyTable
        ([ID], [TestString])
    VALUES
        (1, 'Test1'),
        (2, 'Test2'),
        (1, 'Test3'),
        (3, 'Test4'),
        (3, 'Test5')
    ;
    
**Query 1**:

    SELECT
           dups.id
         , oa.TestConcat
    FROM (
          SELECT ID
          FROM MyTable
          GROUP BY id
          HAVING COUNT(*) > 1
          ) DUPS
    OUTER APPLY (
          SELECT ' ' + TestString
          FROM myTable
          WHERE ID = DUPS.ID
          FOR XML PATH('')
          ) OA (TestConcat)

**[Results][2]**:
    | id |   TestConcat |
    |----|--------------|
    |  1 |  Test1 Test3 |
    |  3 |  Test4 Test5 |

  [1]: http://sqlfiddle.com/#!6/31311/1
  [2]: http://sqlfiddle.com/#!6/31311/1/0

Open in new window


{+edit}
You can also remove the leading space introduced to the concatenation by using STUFF(), like this:
SELECT
       dups.id
     , oa.TestConcat
FROM (
      SELECT ID
      FROM MyTable
      GROUP BY id
      HAVING COUNT(*) > 1
      ) DUPS
OUTER APPLY (
      SELECT STUFF((
            SELECT ' ' + TestString
            FROM myTable
            WHERE ID = DUPS.ID
            FOR XML PATH('')
       ), 1, 1, '')
      ) OA (TestConcat)

Open in new window

ASKER CERTIFIED SOLUTION
Avatar of Techneut
Techneut
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of Techneut

ASKER

Okay, so I apologize. After trying out the two scripts I see that I did not articulate myself correctly. The chart should had helped, but I see that misleading column and row is a significant issue.

The correct question should had been, "find duplicate in column (id), then display the values (value) in the same row".

DATA:

Row 1:  ID1 Test1
Row 2:  ID2 Test2
Row 3:  ID1 Test3
Row 4:  ID3 Test4
Row 5:  ID3 Test5

Output:
Row1:  ID1 Test1 Test3
Row2:  ID3 Test4 Test5
Actually believe my approach not only works, but is more efficient.

Plus, if there are more than two records for an id my approach will list all. The join approach is limited.
Consider this set of data:

CREATE TABLE MyTable
    ([ID] int, [TestString] varchar(50))
;
   
INSERT INTO MyTable
    ([ID], [TestString])
VALUES
    (1, 'Test1'),
    (2, 'Test2'),
    (1, 'Test3'),
    (1, 'Test3a'), --<< new
    (3, 'Test4'),
    (3, 'Test5')
;

Using the join approach this is the result:
| Value |  Value |
|-------|--------|
| Test3 | Test3a |
| Test4 |  Test5 |

Open in new window


Note how "Test1" is excluded from that result

Using the approach I proposed, this is the result:
| id |          TestConcat |
|----|---------------------|
|  1 |  Test1 Test3 Test3a |
|  3 |         Test4 Test5 |

Open in new window


The 2 approaches may be compared here: http://sqlfiddle.com/#!6/baf30/1
Done