• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 1103
  • Last Modified:

SSIS CSV conversion to Pipe Flat File

Implemented the solution below to convert extra csv commas to a pipe flat file, now getting an error regarding commas at the end of the file or outside of the delimiters:  

Solution for SSIS’s Flat File Connection which does not support alternating text qualifiers.
Replace the comma as our delimiter to the vertical pipe (|) character thus leaving commas intact. The quotes become unnecessary. For Tab Use:      '''' + REPLACE(REPLACE([Column1], '''', ''''''), CHAR(13) + CHAR(10), CHAR(9)) + ''''

Created a new DFT to reformat the input file
Configured the Flat File Source to “see” the input file as a single column file:
Added the output that will be the modified row:
     Imports System
     Imports System.Data
     Imports System.Math
     Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper
     Imports Microsoft.SqlServer.Dts.Runtime.Wrapper
    Public Class ScriptMain
        Inherits UserComponent
   Public Overrides Sub UnTransformedInput_ProcessInputRow(ByVal Row As UnTransformedInputBuffer)
            ' Unquote and  replace the comma to pipe (|) to become the delimiter
            Dim raw_string As String = String.Empty
            Dim seacrhedPos As Int32
            Dim firstCommaPos As Int32
            Dim nextQuotePos As Int32
            Dim section_to_modify As String
            raw_string = Row.UnTransformedOrgRow
            'MesssageBox to see the initial value
            If raw_string.Length > 0 Then
                seacrhedPos = raw_string.IndexOf(",""", 0)
                While seacrhedPos <> -1
                    firstCommaPos = raw_string.IndexOf(",", seacrhedPos + 1)
                    nextQuotePos = raw_string.IndexOf(""",", firstCommaPos + 1)
                    If nextQuotePos > 0 Then
                        section_to_modify = raw_string.Substring(seacrhedPos + 1, raw_string.Length - seacrhedPos - (raw_string.Length - nextQuotePos))
                        section_to_modify = section_to_modify.Replace(",", "~").Replace("""", "")
                        raw_string = raw_string.Substring(0, seacrhedPos + 1) & section_to_modify & _
                        raw_string.Substring(nextQuotePos + 1, raw_string.Length - nextQuotePos - 1)
                        section_to_modify = raw_string.Substring(seacrhedPos + 1, raw_string.Length - seacrhedPos - 1)
                        section_to_modify = section_to_modify.Replace(",", "~").Replace("""", "")
                        raw_string = raw_string.Substring(0, seacrhedPos + 1) & section_to_modify
                    End If
                    'Get the next set
                    seacrhedPos = raw_string.IndexOf(",""", 0)
                End While
       raw_string = raw_string.Replace(",", "|").Replace("~", ",")
            End If
           'MesssageBox to see the final value
            Row.TransformedRow = raw_string
        End Sub
    End Class

Open in new window

1 Solution
JGH5Author Commented:
Name             Number
Should Be:        Apple Inc            12345
Getting Results Of:       Apple                 Inc
ValentinoVBI ConsultantCommented:
Sounds to me like you're making this way too complicated.  If all you want/need to do is replace the comma separator with a pipe then I wouldn't bother implementing a data flow transformation.  Just use a Script Task in the Control Flow with following code (or similar).  It assumes that there are two package variables: Input that contains full path + name of input file and Output with full path + name for output file.

//additional using
using System.IO;

        public void Main()
            string input = Dts.Variables["Input"].Value.ToString();
            string output = Dts.Variables["Output"].Value.ToString();

            string fileContent = File.ReadAllText(input);
            fileContent = fileContent.Replace(",", "|");
            File.WriteAllText(output, fileContent);

            Dts.TaskResult = (int)ScriptResults.Success;

Open in new window

BTW: I don't have the time to analyze your code but have the impression, based on the length and the comment at the top, that it's doing more than just replacing comma with pipe.  You should be able to achieve this through my code as well, by adding additional calls to .Replace.  Please post more details (in words and with examples, not code) on what you need to achieve if you need further assistance with this.
Rainer JeschorCommented:
which version of SQL Server / SSIS?

Can you perhaps please explain your original issue?
As far as I understand, you have a CSV file with comma as delimiter and the quotes as text separators to encapsulate commas inside text columns.
You want now to import / read this CSV and export it as flat file with the pipe as delimiter without text separators - correct?

JGH5Author Commented:

2008 R2

Yes read CSV and export to a pipe delimiter without text separators.
This solution has one failure point, Escaped text qualifiers (0,"This is a ""test""!", 0.00) will cause it to fail. I put this script in the middle of a for each file loop to process all of the files before i began the import. Not the most elegant, but it solves the problem for me.

using System;
using System.IO;
using System.Data;
using Microsoft.SqlServer.Dts.Runtime;
using System.Windows.Forms;
using System.Text.RegularExpressions;

        public void Main()
            try {
                string _srcFile = Dts.Variables["CurrentFile"].Value.ToString();
                string _tmpFile = _srcFile + ".tmp";
                string _inputLine = "";
                string _outputLine = "";

                //build the regex
                string _expression = String.Format("{0}(?=(?:[^{1}]*{1}[^{1}]*{1})*(?![^{1}]*{1}))",Regex.Escape(","), Regex.Escape("\""));
                RegexOptions _regexOptions = RegexOptions.Compiled | RegexOptions.Multiline;
                Regex _regex = new Regex(_expression, _regexOptions);

                //rename the file to tmp
                File.Move(_srcFile, _tmpFile);

                using (StreamReader sr = new StreamReader(_tmpFile))
                      //open the output file
                    using (StreamWriter sw = new StreamWriter(_srcFile))
                            //read the file in line by line
                        while (sr.Peek() >= 0)
                            _inputLine = sr.ReadLine();

                                  try {
                                        //rip, strip and replace
                                        _outputLine = string.Join("|", _regex.Split(_inputLine));
                                  } catch (Exception p_Ex) {

                            //close the output file

                      //close the input file

                //delete the input file
            } catch (Exception p_Ex) {
                Dts.Log(p_Ex.Message,0,new byte[0]);
                Dts.TaskResult = (int)ScriptResults.Failure;

           Dts.TaskResult = (int)ScriptResults.Success;
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

Join & Write a Comment

Featured Post

Free Tool: ZipGrep

ZipGrep is a utility that can list and search zip (.war, .ear, .jar, etc) archives for text patterns, without the need to extract the archive's contents.

One of a set of tools we're offering as a way to say thank you for being a part of the community.

Tackle projects and never again get stuck behind a technical roadblock.
Join Now