Solved

CSV Flat File Source to OLE DB Destination in SSIS

Posted on 2014-07-28
Last Modified: 2016-02-11
Dear Experts;

My brain just ran out of go juice, but I need to query a .csv file and return only DISTINCT records into a SQL 2008 table using SSIS.  I'm cool on using the Data Flow Task to pull all of the records from the .csv file, but can I query only the DISTINCT records from the .csv flat file?  I have included a screen shot of the Flat File Source Editor and the OLE DB Destination Editor.
FlatandOLE.docx
Question by:wdbates
6 Comments
 
LVL 40

Expert Comment

by:lcohan
ID: 40225202
Why necessarily SSIS? You can import the CSV directly into a (temp) table and then just select your DISTINCT records from that table. Please see the pseudo code below, and note that the location of the CSV file is relative to the SQL Server, not the client running the query.



--IMPORT
--Create the table that will receive the CSV rows.
CREATE TABLE dbo.SampleOutput
(ID INT,
FirstName VARCHAR(40),
LastName VARCHAR(40),
BirthDate SMALLDATETIME)
GO

--Create a CSV file on drive C: named Sample-Output.csv.txt with the following content. The location of the file is C:\Sample-Output.csv.txt
--1,James,Smith,19750101
--2,Meggie,Smith,19790122
--3,Robert,Smith,20071101
--4,Alex,Smith,20040202


--Now run the following script to load all the data from the CSV file into the table. A row that fails conversion is skipped and the other rows are still inserted, as long as the number of bad rows stays within MAXERRORS (10 by default).
BULK INSERT dbo.SampleOutput
FROM 'C:\Sample-Output.csv.txt'
WITH
(
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n'
)
GO
--Check the content of the table.
SELECT DISTINCT * --or your own criteria for DISTINCT here
FROM dbo.SampleOutput
GO
 

Author Comment

by:wdbates
ID: 40225234
Hello lcohan;

This is just one file of many very large files.  Presently I am loading the .csv files into a staging table, as seen in the attachments.  After the staging tables are loaded I run an editing process that checks for errors, etc., and then I use MERGE to UPDATE or INSERT the records into the final table.  The client performs very little checking and is known for sending duplicate records.  I thought that removing them even before they reach the staging table would save some processing time.
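For readers following along, the de-duplication can also sit between the staging load and the MERGE. The sketch below is only an illustration under assumed names: #StudentStaging, #Student, STUD_ID, LAST_NAME and LOAD_SEQ are hypothetical and are not taken from the attachment. It keeps one row per key with ROW_NUMBER() and feeds only those rows to MERGE, which also sidesteps the error MERGE raises when several source rows match the same target row.

```sql
-- Hypothetical tables; the real staging and final tables will differ.
CREATE TABLE #StudentStaging
(
    LOAD_SEQ  INT IDENTITY(1,1),   -- assumed load-order column
    STUD_ID   INT,
    LAST_NAME VARCHAR(40)
);
CREATE TABLE #Student (STUD_ID INT PRIMARY KEY, LAST_NAME VARCHAR(40));

-- Simulate a file that contains a duplicate record.
INSERT #StudentStaging (STUD_ID, LAST_NAME)
VALUES (1, 'Smith'), (1, 'Smith'), (2, 'Jones');

-- Keep one row per STUD_ID (the last one loaded), then MERGE only those rows.
WITH Deduped AS
(
    SELECT STUD_ID, LAST_NAME,
           ROW_NUMBER() OVER (PARTITION BY STUD_ID ORDER BY LOAD_SEQ DESC) AS rn
    FROM #StudentStaging
)
MERGE #Student AS tgt
USING (SELECT STUD_ID, LAST_NAME FROM Deduped WHERE rn = 1) AS src
    ON tgt.STUD_ID = src.STUD_ID
WHEN MATCHED THEN
    UPDATE SET LAST_NAME = src.LAST_NAME
WHEN NOT MATCHED BY TARGET THEN
    INSERT (STUD_ID, LAST_NAME) VALUES (src.STUD_ID, src.LAST_NAME);
```

This does not save the cost of loading the duplicates into staging, but it keeps them out of the final table.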
 
LVL 40

Expert Comment

by:lcohan
ID: 40225263
As far as I'm aware, nothing like that exists, I mean eliminating duplicates during the insert while reading the CSV file. You may need an intermediate staging table to hold the current CSV, and then select the DISTINCT records from there to insert into the Staging table. Or, if you think you can add a PK on the Staging table to ignore the duplicates (although this may be costly if many columns are part of the PK), IGNORE_DUP_KEY = ON may help you. Please see the code sample below:


CREATE TABLE dbo.foo (col1 int,col2 sysname PRIMARY KEY WITH (FILLFACTOR=90, IGNORE_DUP_KEY = ON))
GO
INSERT dbo.foo VALUES (1,'Fname')
GO
INSERT dbo.foo VALUES (1,'Fname')
GO
INSERT dbo.foo VALUES (1,'Fname')
GO
--gives only  

(1 row(s) affected)
Duplicate key was ignored.

(0 row(s) affected)
Duplicate key was ignored.

(0 row(s) affected)
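
The same option also applies when many rows arrive in a single statement, which is closer to a staging load than three single-row inserts. A minimal sketch, assuming a hypothetical staging table keyed on a hypothetical STUD_ID column:

```sql
-- Hypothetical staging table; the key column is an assumption.
CREATE TABLE #StagingDemo
(
    STUD_ID   INT NOT NULL,
    LAST_NAME VARCHAR(40) NULL,
    PRIMARY KEY (STUD_ID) WITH (IGNORE_DUP_KEY = ON)
);

-- One statement, two rows sharing STUD_ID = 1.
INSERT #StagingDemo (STUD_ID, LAST_NAME)
VALUES (1, 'Smith'), (2, 'Jones'), (1, 'Smith');

-- One row per STUD_ID remains; the extra row is discarded with a warning.
SELECT STUD_ID, LAST_NAME FROM #StagingDemo;
```

Keep in mind that only the key columns are compared, so a row with the same key but different values in the other columns is discarded just as silently.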
 
LVL 5

Expert Comment

by:rtay
ID: 40225692
By DISTINCT records, do you mean that no duplicate STUD_ID values should be entered into the database?  If so, use a Lookup component to check for the STUD_ID before importing into the DB.
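
In T-SQL terms, a Lookup whose no-match rows are sent on to the destination behaves roughly like the anti-join below. This is a sketch only; #Target, #Incoming and LAST_NAME are hypothetical names. As far as I know, a fully cached Lookup reads the destination before the data flow starts, so, like this query, it filters rows already in the destination but would let two identical rows from the same file both pass.

```sql
-- Hypothetical destination and incoming tables.
CREATE TABLE #Target   (STUD_ID INT PRIMARY KEY, LAST_NAME VARCHAR(40));
CREATE TABLE #Incoming (STUD_ID INT, LAST_NAME VARCHAR(40));

INSERT #Target   VALUES (1, 'Smith');
INSERT #Incoming VALUES (1, 'Smith'), (2, 'Jones'), (3, 'Brown');

-- Insert only the incoming rows whose STUD_ID is not yet in the destination.
INSERT #Target (STUD_ID, LAST_NAME)
SELECT i.STUD_ID, i.LAST_NAME
FROM #Incoming AS i
WHERE NOT EXISTS (SELECT 1 FROM #Target AS t WHERE t.STUD_ID = i.STUD_ID);
```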
 
LVL 15

Accepted Solution

by:
Vikas Garg earned 1000 total points
ID: 40226277
Hi,

You can use the Sort Transformation, which gives you an option to remove duplicate records, so you get only the distinct records from the CSV file.

Sort Transformation Image
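
One point worth checking when configuring it: as far as I know, the Sort Transformation's "Remove rows with duplicate sort values" option compares only the columns chosen as sort keys. The T-SQL sketch below, using made-up data in a table variable, shows how the result differs depending on whether every column or only a key column is compared.

```sql
-- Hypothetical sample rows: one exact duplicate and one key-only duplicate.
DECLARE @Rows TABLE (STUD_ID INT, LAST_NAME VARCHAR(40));
INSERT @Rows VALUES (1, 'Smith'), (1, 'Smith'), (2, 'Jones'), (2, 'Jonas');

-- All columns compared (like sorting on every column): three rows remain.
SELECT DISTINCT STUD_ID, LAST_NAME
FROM @Rows;

-- Only STUD_ID compared (like sorting on STUD_ID alone): two rows remain.
SELECT STUD_ID, LAST_NAME
FROM (SELECT STUD_ID, LAST_NAME,
             ROW_NUMBER() OVER (PARTITION BY STUD_ID ORDER BY LAST_NAME) AS rn
      FROM @Rows) AS r
WHERE rn = 1;
```

To get the equivalent of SELECT DISTINCT * out of the Sort Transformation, every column would need to be selected as a sort key. The sort also has to read all rows before it can emit any, which may matter for very large files.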
 

Author Closing Comment

by:wdbates
ID: 40227018
Dear Vikas Garg;

Your solution was great and thank you for the screen shot.  I forgot all about the Sort Transformation.