Solved

Parse a text file based on column numbers in C#

Posted on 2009-03-31
Last Modified: 2013-12-17
Hello Experts,

I have a large text file of data which is not delimited.  It appears that each field begins at a fixed character offset within each line (a fixed-width layout).

Like so:
Field001             Field002            Field003Field004       Field005
Field0010003     Field00234        Field003Field004       Field005
Field00123         Field0030          Field004Field004       Field005

I am comfortable working with CSV and other delimited files.  What is the best way to clean this file up using C#?

Thanks for any help.
Question by:soapygus
2 Comments
 
LVL 86

Expert Comment

by:Mike Tomlinson
ID: 24035266
 
LVL 6

Accepted Solution

by:
HarryNS earned 2000 total points
ID: 24037195
Check this code. I found it online some time ago.

using System;
using System.Data;
using System.IO;

// Example call (place the method below inside a class):
// DataSet data = BuildDataSet("C:\\Test.txt", "Table", ",");

        #region BuildDataSet
        /// <summary>
        /// method to read a delimited text file into a DataSet
        /// </summary>
        /// <param name="file">file to read from</param>
        /// <param name="tableName">name of the DataTable we want to add</param>
        /// <param name="delimiter">delimiter to split on</param>
        /// <returns>a populated DataSet, or null if the file is missing or cannot be read</returns>
        public DataSet BuildDataSet(string file, string tableName, string delimiter)
        {
            //create our DataSet and add our table
            DataSet domains = new DataSet();
            DataTable table = domains.Tables.Add(tableName);
            try
            {
                //first make sure the file exists
                if (!File.Exists(file))
                {
                    throw new FileNotFoundException("The file " + file + " could not be found");
                }
                //create a StreamReader and open our text file;
                //the using block closes it even if parsing throws
                using (StreamReader reader = new StreamReader(file))
                {
                    //read the first line in and split it into columns
                    string header = reader.ReadLine();
                    if (header == null)
                    {
                        //empty file: return the DataSet with an empty table
                        return domains;
                    }
                    string[] columns = header.Split(delimiter.ToCharArray());
                    foreach (string col in columns)
                    {
                        //if the column name already exists, append a counter until it is unique
                        //(the original loop never changed the name and would spin forever on a duplicate)
                        string columnName = col;
                        int i = 0;
                        while (table.Columns.Contains(columnName))
                        {
                            i++;
                            columnName = col + i;
                        }
                        table.Columns.Add(columnName, typeof(string));
                    }
                    //now read the rest of the text file one line at a time;
                    //ReadLine handles \r\n, \n and \r, so no stray line-feed characters end up in the data
                    string line;
                    while ((line = reader.ReadLine()) != null)
                    {
                        if (line.Length == 0)
                        {
                            continue; //skip blank lines
                        }
                        //split the row at the delimiter and add it to our DataTable
                        string[] items = line.Split(delimiter.ToCharArray());
                        table.Rows.Add(items);
                    }
                }
            }
            catch (FileNotFoundException)
            {
                //_message = ex.Message;
                return null;
            }
            catch (Exception)
            {
                //_message = ex.Message;
                return null;
            }

            //now return the DataSet
            return domains;
        }
        #endregion
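
Note that the code above splits each line on a delimiter, while the file in the question is laid out by column position. For that case, here is a minimal sketch that slices each line at fixed start offsets with Substring and trims the padding. The names FixedWidthDemo and SplitFixedWidth, the sample line, and the offsets { 0, 5, 9 } are made up for illustration; the real offsets have to be measured from the actual file, and are assumed to be 0-based and in ascending order.

```csharp
using System;

public static class FixedWidthDemo
{
    // Splits one line into fields using 0-based start offsets (assumed ascending).
    // Each field runs from its offset up to the next offset; the last field
    // runs to the end of the line. Short lines yield empty trailing fields.
    public static string[] SplitFixedWidth(string line, int[] starts)
    {
        string[] fields = new string[starts.Length];
        for (int i = 0; i < starts.Length; i++)
        {
            int start = starts[i];
            int end = (i + 1 < starts.Length) ? starts[i + 1] : line.Length;
            if (start >= line.Length)
            {
                // the line ends before this column begins
                fields[i] = "";
                continue;
            }
            end = Math.Min(end, line.Length);
            // Trim removes the space padding between columns
            fields[i] = line.Substring(start, end - start).Trim();
        }
        return fields;
    }

    public static void Main()
    {
        // hypothetical offsets for a made-up three-column line
        int[] starts = { 0, 5, 9 };
        string[] fields = SplitFixedWidth("AAA  BBB CC", starts);
        // joining with commas turns the row into a delimited line
        Console.WriteLine(string.Join(",", fields)); // prints "AAA,BBB,CC"
    }
}
```

To convert a whole file, the same method could be called for each line returned by File.ReadLines(path) and the joined result written out, after which the usual CSV tooling applies. If a field can itself contain a comma, it would need quoting before being written as CSV.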
