Still celebrating National IT Professionals Day with 3 months of free Premium Membership. Use Code ITDAY17


SSIS CSV conversion to Pipe Flat File

Posted on 2014-01-21
Medium Priority
Last Modified: 2016-02-26
Implemented the solution below to convert extra csv commas to a pipe flat file, now getting an error regarding commas at the end of the file or outside of the delimiters:  

Solution for SSIS’s Flat File Connection which does not support alternating text qualifiers.
Replace the comma as our delimiter to the vertical pipe (|) character thus leaving commas intact. The quotes become unnecessary. For Tab Use:      '''' + REPLACE(REPLACE([Column1], '''', ''''''), CHAR(13) + CHAR(10), CHAR(9)) + ''''

Created a new DFT to reformat the input file
Configured the Flat File Source to “see” the input file as a single column file:
Added the output that will be the modified row:
     Imports System
     Imports System.Data
     Imports System.Math
     Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper
     Imports Microsoft.SqlServer.Dts.Runtime.Wrapper
    Public Class ScriptMain
        Inherits UserComponent
   Public Overrides Sub UnTransformedInput_ProcessInputRow(ByVal Row As UnTransformedInputBuffer)
            ' Unquote and  replace the comma to pipe (|) to become the delimiter
            Dim raw_string As String = String.Empty
            Dim seacrhedPos As Int32
            Dim firstCommaPos As Int32
            Dim nextQuotePos As Int32
            Dim section_to_modify As String
            raw_string = Row.UnTransformedOrgRow
            'MesssageBox to see the initial value
            If raw_string.Length > 0 Then
                seacrhedPos = raw_string.IndexOf(",""", 0)
                While seacrhedPos <> -1
                    firstCommaPos = raw_string.IndexOf(",", seacrhedPos + 1)
                    nextQuotePos = raw_string.IndexOf(""",", firstCommaPos + 1)
                    If nextQuotePos > 0 Then
                        section_to_modify = raw_string.Substring(seacrhedPos + 1, raw_string.Length - seacrhedPos - (raw_string.Length - nextQuotePos))
                        section_to_modify = section_to_modify.Replace(",", "~").Replace("""", "")
                        raw_string = raw_string.Substring(0, seacrhedPos + 1) & section_to_modify & _
                        raw_string.Substring(nextQuotePos + 1, raw_string.Length - nextQuotePos - 1)
                        section_to_modify = raw_string.Substring(seacrhedPos + 1, raw_string.Length - seacrhedPos - 1)
                        section_to_modify = section_to_modify.Replace(",", "~").Replace("""", "")
                        raw_string = raw_string.Substring(0, seacrhedPos + 1) & section_to_modify
                    End If
                    'Get the next set
                    seacrhedPos = raw_string.IndexOf(",""", 0)
                End While
       raw_string = raw_string.Replace(",", "|").Replace("~", ",")
            End If
           'MesssageBox to see the final value
            Row.TransformedRow = raw_string
        End Sub
    End Class

Open in new window

Question by:JGH5
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions

Author Comment

ID: 39798549
Name             Number
Should Be:        Apple Inc            12345
Getting Results Of:       Apple                 Inc
LVL 37

Accepted Solution

ValentinoV earned 2000 total points
ID: 39799730
Sounds to me like you're making this way too complicated.  If all you want/need to do is replace the comma separator with a pipe then I wouldn't bother implementing a data flow transformation.  Just use a Script Task in the Control Flow with following code (or similar).  It assumes that there are two package variables: Input that contains full path + name of input file and Output with full path + name for output file.

//additional using
using System.IO;

        public void Main()
            string input = Dts.Variables["Input"].Value.ToString();
            string output = Dts.Variables["Output"].Value.ToString();

            string fileContent = File.ReadAllText(input);
            fileContent = fileContent.Replace(",", "|");
            File.WriteAllText(output, fileContent);

            Dts.TaskResult = (int)ScriptResults.Success;

Open in new window

BTW: I don't have the time to analyze your code but have the impression, based on the length and the comment at the top, that it's doing more than just replacing comma with pipe.  You should be able to achieve this through my code as well, by adding additional calls to .Replace.  Please post more details (in words and with examples, not code) on what you need to achieve if you need further assistance with this.
LVL 44

Expert Comment

by:Rainer Jeschor
ID: 39800857
which version of SQL Server / SSIS?

Can you perhaps please explain your original issue?
As far as I understand, you have a CSV file with comma as delimiter and the quotes as text separators to encapsulate commas inside text columns.
You want now to import / read this CSV and export it as flat file with the pipe as delimiter without text separators - correct?


Author Comment

ID: 39801852

2008 R2

Yes read CSV and export to a pipe delimiter without text separators.

Expert Comment

ID: 40202417
This solution has one failure point, Escaped text qualifiers (0,"This is a ""test""!", 0.00) will cause it to fail. I put this script in the middle of a for each file loop to process all of the files before i began the import. Not the most elegant, but it solves the problem for me.

using System;
using System.IO;
using System.Data;
using Microsoft.SqlServer.Dts.Runtime;
using System.Windows.Forms;
using System.Text.RegularExpressions;

        public void Main()
            try {
                string _srcFile = Dts.Variables["CurrentFile"].Value.ToString();
                string _tmpFile = _srcFile + ".tmp";
                string _inputLine = "";
                string _outputLine = "";

                //build the regex
                string _expression = String.Format("{0}(?=(?:[^{1}]*{1}[^{1}]*{1})*(?![^{1}]*{1}))",Regex.Escape(","), Regex.Escape("\""));
                RegexOptions _regexOptions = RegexOptions.Compiled | RegexOptions.Multiline;
                Regex _regex = new Regex(_expression, _regexOptions);

                //rename the file to tmp
                File.Move(_srcFile, _tmpFile);

                using (StreamReader sr = new StreamReader(_tmpFile))
                      //open the output file
                    using (StreamWriter sw = new StreamWriter(_srcFile))
                            //read the file in line by line
                        while (sr.Peek() >= 0)
                            _inputLine = sr.ReadLine();

                                  try {
                                        //rip, strip and replace
                                        _outputLine = string.Join("|", _regex.Split(_inputLine));
                                  } catch (Exception p_Ex) {

                            //close the output file

                      //close the input file

                //delete the input file
            } catch (Exception p_Ex) {
                Dts.Log(p_Ex.Message,0,new byte[0]);
                Dts.TaskResult = (int)ScriptResults.Failure;

           Dts.TaskResult = (int)ScriptResults.Success;

Featured Post


Modern healthcare requires a modern cloud. View this brief video to understand how the Concerto Cloud for Healthcare can help your organization.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

It is possible to export the data of a SQL Table in SSMS and generate INSERT statements. It's neatly tucked away in the generate scripts option of a database.
An alternative to the "For XML" way of pivoting and concatenating result sets into strings, and an easy introduction to "common table expressions" (CTEs). Being someone who is always looking for alternatives to "work your data", I came across this …
The viewer will learn how to simulate a series of sales calls dependent on a single skill level and learn how to simulate a series of sales calls dependent on two skill levels. Simulating Independent Sales Calls: Enter .75 into cell C2 – “skill leve…
Viewers will learn how to use the INSERT statement to insert data into their tables. It will also introduce the NULL statement, to show them what happens when no value is giving for any given column.

722 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question