Solved

Parse a text file based on column numbers in C#

Posted on 2009-03-31
Last Modified: 2013-12-17
Hello Experts,

I have a large text file of data which is not delimited.  It appears that each field begins a specific number of characters into each line.

Like so:
Field001             Field002            Field003Field004       Field005
Field0010003     Field00234        Field003Field004       Field005
Field00123         Field0030          Field004Field004       Field005

I am comfortable working with CSV and other delimited files.  What is the best way to clean this file up using C#?

Thanks for any help.
Question by:soapygus
2 Comments
 
Expert Comment

by:Mike Tomlinson
ID: 24035266
Accepted Solution

by:
HarryNS earned 500 total points
ID: 24037195
Check this code. I found it online some time ago. Note that it splits each line on a delimiter character, so it expects a delimited file.

// Namespaces this snippet needs
using System;
using System.Data;
using System.IO;

// Example call: reads C:\Test.txt into a table named "Table", splitting on commas
DataSet data = BuildDataSet("C:\\Test.txt", "Table", ",");

 #region BuildDataSet
        /// <summary>
        /// method to read a delimited text file into a DataSet
        /// </summary>
        /// <param name="file">file to read from</param>
        /// <param name="tableName">name of the DataTable we want to add</param>
        /// <param name="delimiter">delimiter character(s) to split on</param>
        /// <returns>a populated DataSet, or null if the file could not be read</returns>
        public DataSet BuildDataSet(string file, string tableName, string delimiter)
        {
            //create our DataSet and add our table
            DataSet domains = new DataSet();
            DataTable table = domains.Tables.Add(tableName);
            try
            {
                //first make sure the file exists
                if (!File.Exists(file))
                {
                    throw new FileNotFoundException("The file " + file + " could not be found");
                }

                char[] separators = delimiter.ToCharArray();
                //the using block closes the reader even if an exception is thrown
                using (StreamReader reader = new StreamReader(file))
                {
                    //read the first line in and split it into column names
                    string header = reader.ReadLine();
                    if (header == null)
                    {
                        //empty file: return the table with no columns or rows
                        return domains;
                    }
                    foreach (string col in header.Split(separators))
                    {
                        //if the name is already taken, append a counter until it is unique
                        //(the original loop never changed the name, so a duplicate header looped forever)
                        string columnName = col;
                        int i = 0;
                        while (table.Columns.Contains(columnName))
                        {
                            i++;
                            columnName = col + i;
                        }
                        table.Columns.Add(columnName, typeof(string));
                    }
                    //read the remaining lines one at a time; ReadLine handles both \r\n and \n,
                    //so no stray line-feed characters end up in the first field
                    string line;
                    while ((line = reader.ReadLine()) != null)
                    {
                        if (line.Length == 0)
                        {
                            continue; //skip blank lines
                        }
                        //split the row at the delimiter and add it to our DataTable
                        table.Rows.Add(line.Split(separators));
                    }
                }
            }
            catch (FileNotFoundException)
            {
                return null;
            }
            catch (Exception)
            {
                //e.g. a row with more fields than the header has columns
                return null;
            }

            //now return the DataSet
            return domains;
        }
        #endregion
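
The code above handles delimited files. For the fixed-width layout described in the question, where fields can sit directly next to each other (as in "Field003Field004"), each line has to be sliced by character position instead. The following is only a minimal sketch of that idea, not a tested solution for the asker's file: the class name FixedWidthParser, both method names, and the offsets used in the usage note are hypothetical, since the real column positions are not given in the question.

```csharp
using System;

public static class FixedWidthParser
{
    // Slices one line into trimmed fields.
    // columnStarts holds the 0-based offset where each field begins (assumed
    // to be in ascending order); the last field runs to the end of the line.
    public static string[] SplitFixedWidth(string line, int[] columnStarts)
    {
        string[] fields = new string[columnStarts.Length];
        for (int i = 0; i < columnStarts.Length; i++)
        {
            int start = columnStarts[i];
            int end = (i + 1 < columnStarts.Length) ? columnStarts[i + 1] : line.Length;
            // Short lines are allowed: clamp to the actual line length
            end = Math.Min(end, line.Length);
            fields[i] = (start < end)
                ? line.Substring(start, end - start).Trim()
                : string.Empty;
        }
        return fields;
    }

    // Joins fields into one CSV line, quoting any field that contains
    // a comma, a double quote, or a line break.
    public static string ToCsvLine(string[] fields)
    {
        string[] escaped = new string[fields.Length];
        for (int i = 0; i < fields.Length; i++)
        {
            string f = fields[i];
            bool needsQuotes = f.IndexOfAny(new[] { ',', '"', '\r', '\n' }) >= 0;
            escaped[i] = needsQuotes ? "\"" + f.Replace("\"", "\"\"") + "\"" : f;
        }
        return string.Join(",", escaped);
    }
}
```

To use it, work out the real offsets by opening the file in an editor with a fixed-width font, then call something like FixedWidthParser.SplitFixedWidth(line, new[] { 0, 21, 41, 49, 64 }) for each line returned by File.ReadAllLines. Those five offsets are placeholders. Passing the result to ToCsvLine produces a comma-delimited line that can be handled like any other CSV file.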
