SSIS CSV conversion to Pipe Flat File

Posted on 2014-01-21
Last Modified: 2016-02-26
Implemented the solution below to convert extra csv commas to a pipe flat file, now getting an error regarding commas at the end of the file or outside of the delimiters:  

Solution for SSIS’s Flat File Connection which does not support alternating text qualifiers.
Replace the comma as our delimiter to the vertical pipe (|) character thus leaving commas intact. The quotes become unnecessary. For Tab Use:      '''' + REPLACE(REPLACE([Column1], '''', ''''''), CHAR(13) + CHAR(10), CHAR(9)) + ''''

Created a new DFT to reformat the input file
Configured the Flat File Source to “see” the input file as a single column file:
Added the output that will be the modified row:
     Imports System
     Imports System.Data
     Imports System.Math
     Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper
     Imports Microsoft.SqlServer.Dts.Runtime.Wrapper
    Public Class ScriptMain
        Inherits UserComponent
   Public Overrides Sub UnTransformedInput_ProcessInputRow(ByVal Row As UnTransformedInputBuffer)
            ' Unquote and  replace the comma to pipe (|) to become the delimiter
            Dim raw_string As String = String.Empty
            Dim seacrhedPos As Int32
            Dim firstCommaPos As Int32
            Dim nextQuotePos As Int32
            Dim section_to_modify As String
            raw_string = Row.UnTransformedOrgRow
            'MesssageBox to see the initial value
            If raw_string.Length > 0 Then
                seacrhedPos = raw_string.IndexOf(",""", 0)
                While seacrhedPos <> -1
                    firstCommaPos = raw_string.IndexOf(",", seacrhedPos + 1)
                    nextQuotePos = raw_string.IndexOf(""",", firstCommaPos + 1)
                    If nextQuotePos > 0 Then
                        section_to_modify = raw_string.Substring(seacrhedPos + 1, raw_string.Length - seacrhedPos - (raw_string.Length - nextQuotePos))
                        section_to_modify = section_to_modify.Replace(",", "~").Replace("""", "")
                        raw_string = raw_string.Substring(0, seacrhedPos + 1) & section_to_modify & _
                        raw_string.Substring(nextQuotePos + 1, raw_string.Length - nextQuotePos - 1)
                        section_to_modify = raw_string.Substring(seacrhedPos + 1, raw_string.Length - seacrhedPos - 1)
                        section_to_modify = section_to_modify.Replace(",", "~").Replace("""", "")
                        raw_string = raw_string.Substring(0, seacrhedPos + 1) & section_to_modify
                    End If
                    'Get the next set
                    seacrhedPos = raw_string.IndexOf(",""", 0)
                End While
       raw_string = raw_string.Replace(",", "|").Replace("~", ",")
            End If
           'MesssageBox to see the final value
            Row.TransformedRow = raw_string
        End Sub
    End Class

Open in new window

Question by:JGH5
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions

Author Comment

ID: 39798549
Name             Number
Should Be:        Apple Inc            12345
Getting Results Of:       Apple                 Inc
LVL 37

Accepted Solution

ValentinoV earned 500 total points
ID: 39799730
Sounds to me like you're making this way too complicated.  If all you want/need to do is replace the comma separator with a pipe then I wouldn't bother implementing a data flow transformation.  Just use a Script Task in the Control Flow with following code (or similar).  It assumes that there are two package variables: Input that contains full path + name of input file and Output with full path + name for output file.

//additional using
using System.IO;

        public void Main()
            string input = Dts.Variables["Input"].Value.ToString();
            string output = Dts.Variables["Output"].Value.ToString();

            string fileContent = File.ReadAllText(input);
            fileContent = fileContent.Replace(",", "|");
            File.WriteAllText(output, fileContent);

            Dts.TaskResult = (int)ScriptResults.Success;

Open in new window

BTW: I don't have the time to analyze your code but have the impression, based on the length and the comment at the top, that it's doing more than just replacing comma with pipe.  You should be able to achieve this through my code as well, by adding additional calls to .Replace.  Please post more details (in words and with examples, not code) on what you need to achieve if you need further assistance with this.
LVL 44

Expert Comment

by:Rainer Jeschor
ID: 39800857
which version of SQL Server / SSIS?

Can you perhaps please explain your original issue?
As far as I understand, you have a CSV file with comma as delimiter and the quotes as text separators to encapsulate commas inside text columns.
You want now to import / read this CSV and export it as flat file with the pipe as delimiter without text separators - correct?


Author Comment

ID: 39801852

2008 R2

Yes read CSV and export to a pipe delimiter without text separators.

Expert Comment

ID: 40202417
This solution has one failure point, Escaped text qualifiers (0,"This is a ""test""!", 0.00) will cause it to fail. I put this script in the middle of a for each file loop to process all of the files before i began the import. Not the most elegant, but it solves the problem for me.

using System;
using System.IO;
using System.Data;
using Microsoft.SqlServer.Dts.Runtime;
using System.Windows.Forms;
using System.Text.RegularExpressions;

        public void Main()
            try {
                string _srcFile = Dts.Variables["CurrentFile"].Value.ToString();
                string _tmpFile = _srcFile + ".tmp";
                string _inputLine = "";
                string _outputLine = "";

                //build the regex
                string _expression = String.Format("{0}(?=(?:[^{1}]*{1}[^{1}]*{1})*(?![^{1}]*{1}))",Regex.Escape(","), Regex.Escape("\""));
                RegexOptions _regexOptions = RegexOptions.Compiled | RegexOptions.Multiline;
                Regex _regex = new Regex(_expression, _regexOptions);

                //rename the file to tmp
                File.Move(_srcFile, _tmpFile);

                using (StreamReader sr = new StreamReader(_tmpFile))
                      //open the output file
                    using (StreamWriter sw = new StreamWriter(_srcFile))
                            //read the file in line by line
                        while (sr.Peek() >= 0)
                            _inputLine = sr.ReadLine();

                                  try {
                                        //rip, strip and replace
                                        _outputLine = string.Join("|", _regex.Split(_inputLine));
                                  } catch (Exception p_Ex) {

                            //close the output file

                      //close the input file

                //delete the input file
            } catch (Exception p_Ex) {
                Dts.Log(p_Ex.Message,0,new byte[0]);
                Dts.TaskResult = (int)ScriptResults.Failure;

           Dts.TaskResult = (int)ScriptResults.Success;

Featured Post

PeopleSoft Has Never Been Easier

PeopleSoft Adoption Made Smooth & Simple!

On-The-Job Training Is made Intuitive & Easy With WalkMe's On-Screen Guidance Tool.  Claim Your Free WalkMe Account Now

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Technology opened people to different means of presenting information, but PowerPoint remains to be above competition. Know why PPT still works today.
For both online and offline retail, the cross-channel business is the most recent pattern in the B2C trade space.
The viewer will learn how to use the =DISCRINV command to create a discrete random variable, use this command to model a set of probabilities and outcomes in a Monte Carlo simulation, and learn how to find the standard deviation of a set of probabil…
The viewer will learn how to create a normally distributed random variable in Excel, use a normal distribution to simulate the return on an investment over a period of years, Create a Monte Carlo simulation using a normal random variable, and calcul…

749 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question