Want to protect your cyber security and still get fast solutions? Ask a secure question today.Go Premium

x
?
Solved

SELECT DISTINCT

Posted on 2011-02-25
6
Medium Priority
?
572 Views
Last Modified: 2013-11-10
I have a dtsx package that I built and within one of the data flows, I'm manipulating the data in a way that makes me end up with a bunch of duplicates. Is there a way to produce the same effect as SELECT DISTINCT within one of the transformations?

Here's a mini version of what I start with:

ContactId   Code   Name
35              NK1     Joe Blow
35              NK3     Joe Blow
35              NP6     Joe Blow
19              SJ1     Jane Blow
19              SJ4     Jane Blow
19              NK1    Jane Blow

In one of my existing transformations I'm taking off the numeric character of the codes and ending up with:

ContactId   Code   Name
35              NK      Joe Blow
35              NK      Joe Blow
35              NP      Joe Blow
19              SJ      Jane Blow
19              SJ      Jane Blow
19              NK     Jane Blow

I want to end up with:

ContactId   Code   Name
35              NK      Joe Blow
35              NP      Joe Blow
19              SJ      Jane Blow
19              NK     Jane Blow

Can I achieve this with a transformation, or will it require a SQL Task?

In a really perfect world I would end up with
ContactId   Code   Name
35              NK,NP      Joe Blow
19              SJ,NK      Jane Blow
0
Comment
Question by:MuffyBunny
  • 3
  • 2
6 Comments
 
LVL 57

Expert Comment

by:Raja Jegan R
ID: 34980291
This would do:
SELECT t1.ContactId, t1.NAME, SUBSTRING((SELECT ', ' + t2.Code FROM cte t2 WHERE t1.ContactId = t2.ContactId AND t1.NAME = t2.name FOR XML PATH('')), 3, 1000) Code
FROM cte t1
group BY ContactId, NAME

Open in new window

0
 
LVL 6

Author Comment

by:MuffyBunny
ID: 34980330
I guess I should have specified that the data is originating from an IBM UniVerse ODBC data source. I do not have the ability to take the numbers off at the source. In fact, I don't have the ability to do much of anything with the data from the source other than a standard no-frills SELECT statement. All manipulations have to take place AFTER getting the dataset from the source.
0
 
LVL 57

Expert Comment

by:Raja Jegan R
ID: 34980364
Ok, in that case pull data from your IBM data source and insert records into temp tables..
Once done, you can run the query provided above to get the records transformed like this using either transformation or just the query above..
0
Concerto Cloud for Software Providers & ISVs

Can Concerto Cloud Services help you focus on evolving your application offerings, while delivering the best cloud experience to your customers? From DevOps to revenue models and customer support, the answer is yes!

Learn how Concerto can help you.

 
LVL 7

Accepted Solution

by:
rmm2001 earned 2000 total points
ID: 34989599
So you can do it using ssis. This way will get

ContactId   Code   Name
35              NK      Joe Blow
35              NP      Joe Blow
19              SJ      Jane Blow
19              NK     Jane Blow

First - add your source and logic to strip off the numeric char at the end of the code. And whatever logic you have. After you've done all of that, add a Fuzzy Grouping component. And tell it to group off of your ContactID, NewCode (the one where you stripped the number) and Name.

This will give you a "typical" set of fuzzy results. What you're wanting though is when the _key_in == _key_out. So if it was in a table - you could put a checksum on the column and determine dupes that way. That's the same thinking as this.

So that will give you the 4 rows of distinct values that you're looking for.

You'd have to concat the rows in sql or by writing the data to a raw file and reading it back in, putting a script component in to do the work. (Or just inserting with an oledb command)

data flow fuzzy lookup transformation conditional split to get distinct rows
0
 
LVL 6

Author Comment

by:MuffyBunny
ID: 34997368
You ROCK
0
 
LVL 6

Author Closing Comment

by:MuffyBunny
ID: 34997370
Thanks a million
0

Featured Post

Industry Leaders: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

What if you have to shut down the entire Citrix infrastructure for hardware maintenance, software upgrades or "the unknown"? I developed this plan for "the unknown" and hope that it helps you as well. This article explains how to properly shut down …
One of the most important things in an application is the query performance. This article intends to give you good tips to improve the performance of your queries.
Familiarize people with the process of retrieving data from SQL Server using an Access pass-thru query. Microsoft Access is a very powerful client/server development tool. One of the ways that you can retrieve data from a SQL Server is by using a pa…
Viewers will learn how the fundamental information of how to create a table.
Suggested Courses

571 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question