Go Premium for a chance to win a PS4. Enter to Win

x
  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 913
  • Last Modified:

SSIS CSV conversion to Pipe Flat File

Implemented the solution below to convert extra csv commas to a pipe flat file, now getting an error regarding commas at the end of the file or outside of the delimiters:  



Solution for SSIS’s Flat File Connection which does not support alternating text qualifiers.
Replace the comma as our delimiter to the vertical pipe (|) character thus leaving commas intact. The quotes become unnecessary. For Tab Use:      '''' + REPLACE(REPLACE([Column1], '''', ''''''), CHAR(13) + CHAR(10), CHAR(9)) + ''''

Created a new DFT to reformat the input file
 
Configured the Flat File Source to “see” the input file as a single column file:
 
Added the output that will be the modified row:
 
      
     Imports System
     Imports System.Data
     Imports System.Math
     Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper
     Imports Microsoft.SqlServer.Dts.Runtime.Wrapper
     
    Public Class ScriptMain
        Inherits UserComponent
     
   Public Overrides Sub UnTransformedInput_ProcessInputRow(ByVal Row As UnTransformedInputBuffer)
            '
            ' Unquote and  replace the comma to pipe (|) to become the delimiter
            '
     
            Dim raw_string As String = String.Empty
            Dim seacrhedPos As Int32
            Dim firstCommaPos As Int32
            Dim nextQuotePos As Int32
            Dim section_to_modify As String
     
            raw_string = Row.UnTransformedOrgRow
     
            'MesssageBox to see the initial value
            System.Windows.Forms.MessageBox.Show(raw_string)
     
            If raw_string.Length > 0 Then
     
                seacrhedPos = raw_string.IndexOf(",""", 0)
    
                While seacrhedPos <> -1
                    firstCommaPos = raw_string.IndexOf(",", seacrhedPos + 1)
     
                    nextQuotePos = raw_string.IndexOf(""",", firstCommaPos + 1)
     
                    If nextQuotePos > 0 Then
                        section_to_modify = raw_string.Substring(seacrhedPos + 1, raw_string.Length - seacrhedPos - (raw_string.Length - nextQuotePos))
     
                        section_to_modify = section_to_modify.Replace(",", "~").Replace("""", "")
     
                        'Replace 
                        raw_string = raw_string.Substring(0, seacrhedPos + 1) & section_to_modify & _
                        raw_string.Substring(nextQuotePos + 1, raw_string.Length - nextQuotePos - 1)
     
                    Else
                        section_to_modify = raw_string.Substring(seacrhedPos + 1, raw_string.Length - seacrhedPos - 1)
     
                        section_to_modify = section_to_modify.Replace(",", "~").Replace("""", "")
     
                        'Replace 
                        raw_string = raw_string.Substring(0, seacrhedPos + 1) & section_to_modify
     
                    End If
     
                    'Get the next set
                    seacrhedPos = raw_string.IndexOf(",""", 0)
                End While
     
       raw_string = raw_string.Replace(",", "|").Replace("~", ",")
     
            End If
     
           'MesssageBox to see the final value
           System.Windows.Forms.MessageBox.Show(raw_string)
     
            Row.TransformedRow = raw_string
        End Sub
     
    End Class

Open in new window

0
JGH5
Asked:
JGH5
1 Solution
 
JGH5Author Commented:
Name             Number
Should Be:        Apple Inc            12345
Getting Results Of:       Apple                 Inc
0
 
ValentinoVBI ConsultantCommented:
Sounds to me like you're making this way too complicated.  If all you want/need to do is replace the comma separator with a pipe then I wouldn't bother implementing a data flow transformation.  Just use a Script Task in the Control Flow with following code (or similar).  It assumes that there are two package variables: Input that contains full path + name of input file and Output with full path + name for output file.

//additional using
using System.IO;

        public void Main()
        {
            string input = Dts.Variables["Input"].Value.ToString();
            string output = Dts.Variables["Output"].Value.ToString();

            string fileContent = File.ReadAllText(input);
            fileContent = fileContent.Replace(",", "|");
            File.WriteAllText(output, fileContent);

            Dts.TaskResult = (int)ScriptResults.Success;
        }

Open in new window

BTW: I don't have the time to analyze your code but have the impression, based on the length and the comment at the top, that it's doing more than just replacing comma with pipe.  You should be able to achieve this through my code as well, by adding additional calls to .Replace.  Please post more details (in words and with examples, not code) on what you need to achieve if you need further assistance with this.
0
 
Rainer JeschorCommented:
Hi,
which version of SQL Server / SSIS?

Can you perhaps please explain your original issue?
As far as I understand, you have a CSV file with comma as delimiter and the quotes as text separators to encapsulate commas inside text columns.
You want now to import / read this CSV and export it as flat file with the pipe as delimiter without text separators - correct?

Thanks.
Rainer
0
 
JGH5Author Commented:
@Rainer,

2008 R2

Yes read CSV and export to a pipe delimiter without text separators.
0
 
tbruinsmaCommented:
This solution has one failure point, Escaped text qualifiers (0,"This is a ""test""!", 0.00) will cause it to fail. I put this script in the middle of a for each file loop to process all of the files before i began the import. Not the most elegant, but it solves the problem for me.

using System;
using System.IO;
using System.Data;
using Microsoft.SqlServer.Dts.Runtime;
using System.Windows.Forms;
using System.Text.RegularExpressions;

        public void Main()
        {
            try {
                string _srcFile = Dts.Variables["CurrentFile"].Value.ToString();
                string _tmpFile = _srcFile + ".tmp";
                string _inputLine = "";
                string _outputLine = "";

                //build the regex
                string _expression = String.Format("{0}(?=(?:[^{1}]*{1}[^{1}]*{1})*(?![^{1}]*{1}))",Regex.Escape(","), Regex.Escape("\""));
                RegexOptions _regexOptions = RegexOptions.Compiled | RegexOptions.Multiline;
                Regex _regex = new Regex(_expression, _regexOptions);

                //rename the file to tmp
                File.Move(_srcFile, _tmpFile);

                using (StreamReader sr = new StreamReader(_tmpFile))
                {
                      //open the output file
                    using (StreamWriter sw = new StreamWriter(_srcFile))
                    {
                            //read the file in line by line
                        while (sr.Peek() >= 0)
                        {
                            _inputLine = sr.ReadLine();

                                  try {
                                        //rip, strip and replace
                                        _outputLine = string.Join("|", _regex.Split(_inputLine));
                                sw.WriteLine(_outputLine);
                                  } catch (Exception p_Ex) {
                                        Console.WriteLine(p_Ex);
                                Console.WriteLine(_inputLine);
                                  }
                            }

                            //close the output file
                        sw.Close();
                      }

                      //close the input file
                      sr.Close();
                }

                //delete the input file
            } catch (Exception p_Ex) {
                Dts.Log(p_Ex.Message,0,new byte[0]);
                Dts.TaskResult = (int)ScriptResults.Failure;
                return;
            }

           Dts.TaskResult = (int)ScriptResults.Success;
            return;
        }
    }
0

Featured Post

What is SQL Server and how does it work?

The purpose of this paper is to provide you background on SQL Server. It’s your self-study guide for learning fundamentals. It includes both the history of SQL and its technical basics. Concepts and definitions will form the solid foundation of your future DBA expertise.

Tackle projects and never again get stuck behind a technical roadblock.
Join Now