Parser in C#

I would like to use C# to parse lines with the following format:

#Tom,234234,345,456

Every line starts with the symbol #, then a name, and then some numbers.
The items (the name and each number) are separated by commas.

How should I get started with this?
Tom3333Asked:
SammyCommented:
Try this

var text = "#Tom,234234,345,456";
var values = text.Split(",".ToCharArray(), StringSplitOptions.RemoveEmptyEntries);

// values[0] will have #Tom
// values[1] will have 234234
// values[2] will have 345
// values[3] will have 456


0
Jaime OlivaresSoftware ArchitectCommented:
Indeed it can be as simple as:
var values = text.Split(',');


I wouldn't recommend using RemoveEmptyEntries, as you will lose the absolute position of the parsed elements.
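
To make the point about positions concrete, here is a small sketch comparing the two calls on a line with a missing field. The class and method names are only illustrative and do not come from this thread.

```csharp
using System;

static class SplitDemo
{
    // Keeps empty fields, so every field stays at a fixed index.
    public static string[] SplitKeepEmpty(string line)
    {
        return line.Split(',');
    }

    // Drops empty fields, so the fields after a gap shift to the left.
    public static string[] SplitDropEmpty(string line)
    {
        return line.Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries);
    }

    static void Main()
    {
        var line = "#Tom,010214,,123";
        Console.WriteLine(SplitKeepEmpty(line).Length); // 4
        Console.WriteLine(SplitDropEmpty(line).Length); // 3
    }
}
```

With the empty field kept, 123 stays at index 3; with RemoveEmptyEntries it moves to index 2.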
0
Tom3333Author Commented:
If one of the parameters is in the format 020314, which means the date 02/03/14, how do I convert it?

For example: "#Tom,020314,345,456"
The first parameter is Tom.
The second is 020314, and I would like to receive it as 02 03 14.

How do I do it?
0
Fernando SotoRetiredCommented:
Hi Tom3333;

This will do what you need. The array result has all four elements in it.

// Test string
var input = "#Tom,020314,345,456";
// Convert a comma separated string into an array of 4 strings
var result = input.Split(",".ToCharArray());
// The first element in the array is Tom, remove the # sign
result[0] = result[0].TrimStart("#".ToCharArray());
// The second element is the date and this reformats it mm dd yy
result[1] = Regex.Replace(result[1], @"(\d\d)(\d\d)(\d\d)", @"$1 $2 $3");
// The last two elements need not be formatted so they are as is


0
Jaime OlivaresSoftware ArchitectCommented:
What about using DateTime.ParseExact() to interpret the date field?

result[1] = DateTime.ParseExact(result[1], "MMddyy", CultureInfo.InvariantCulture).ToString("MM/dd/yy);
0
Tom3333Author Commented:
When I used
 result[1] = DateTime.ParseExact(result[1], "MMddyy", CultureInfo.InvariantCulture).ToString("MM/dd/yy);
I get the error:
Error 1: The name 'CultureInfo' does not exist in the current context
0
Fernando SotoRetiredCommented:
In order to use that function you will need to add the following using statement to the top of your code file.

using System.Globalization;
0
Tom3333Author Commented:
I added the statement and the program compiles, but when I try to run it I get the following:
An unhandled exception of type 'System.FormatException' occurred in mscorlib.dll

The text I used is:
#Tom,200914,345,456
0
Fernando SotoRetiredCommented:
Hi Tom;

There is a syntax error in the line you posted, it is missing a double quote at the end. It should be as shown below.

result[1] = DateTime.ParseExact(result[1], "MMddyy", CultureInfo.InvariantCulture).ToString("MM/dd/yy");
 
In a previous post you stated the following, "the second is: 020314 i would like to receive it as 02 03 14 ", if that is the case you would need to change that line like this.

result[1] = DateTime.ParseExact(result[1], "MMddyy", CultureInfo.InvariantCulture).ToString("MM dd yy");
0
Tom3333Author Commented:
I added the missing double quote.
In the text I have both formats:
#Tom,020314,165053.00

020314 means 02/03/2014 (date)
165053.00 means 16:50:53 (time)

I used
result[1] = DateTime.ParseExact(result[1], "MMddyy", CultureInfo.InvariantCulture).ToString("MM dd yy");



As I said, it compiles, but when I run it I see the following messages:

FormatException was unhandled
An unhandled exception of type 'System.FormatException' occurred in mscorlib.dll
Additional information: The DateTime represented by the string is not supported in calendar System.Globalization.GregorianCalendar.
0
Fernando SotoRetiredCommented:
What the error is saying is that "The DateTime represented by the string", in this case your data of 020314, is not supported in the GregorianCalendar. You will need to put it in the correct format for your locale or do it in a different way.

I originally suggested using this line of code.

result[1] = Regex.Replace(result[1], @"(\d\d)(\d\d)(\d\d)", @"$1 $2 $3");

That puts the string into the format you wanted without any dependency on other things, such as the part of the world the code is running in.

By the way the Regex class needs the following using statement.

using System.Text.RegularExpressions;
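
A possible explanation for the FormatException, assuming the input really was 200914 as in the earlier post: with the pattern "MMddyy" the leading 20 is read as a month, which does not exist. If the data is actually day-month-year, "ddMMyy" may be the pattern to use, but that is an assumption about the data and not something confirmed in this thread (020314 parses under either pattern). A minimal sketch using DateTime.TryParseExact, which returns false instead of throwing; the class and helper names are made up for illustration.

```csharp
using System;
using System.Globalization;

static class DateField
{
    // Hypothetical helper: reformats a six-digit date using the given input pattern.
    // Returns the field unchanged when it does not match the pattern.
    public static string Reformat(string field, string inputPattern)
    {
        DateTime parsed;
        bool ok = DateTime.TryParseExact(field, inputPattern,
            CultureInfo.InvariantCulture, DateTimeStyles.None, out parsed);
        return ok ? parsed.ToString("dd MM yy", CultureInfo.InvariantCulture) : field;
    }

    static void Main()
    {
        Console.WriteLine(Reformat("200914", "ddMMyy")); // 20 09 14
        Console.WriteLine(Reformat("200914", "MMddyy")); // 200914 (month 20 is invalid)
    }
}
```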
0
Tom3333Author Commented:
I tried it this way and it works fine. What about the time?
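
For the time field described earlier (165053.00 meaning 16:50:53), one option is the same regex approach used for the date. This is only a sketch: it assumes the field is always six digits with an optional fractional part, and the class and helper names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

static class TimeField
{
    // Hypothetical helper: turns "165053.00" into "16:50:53".
    // The optional fractional part is matched and then dropped.
    // A field that does not match the pattern is returned unchanged.
    public static string Reformat(string field)
    {
        return Regex.Replace(field, @"^(\d\d)(\d\d)(\d\d)(\.\d+)?$", "$1:$2:$3");
    }

    static void Main()
    {
        Console.WriteLine(Reformat("165053.00")); // 16:50:53
    }
}
```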
0
Fernando SotoRetiredCommented:
What Time??
0
Tom3333Author Commented:
Now it is working. Ignore my last post.
When the line contains ,, (meaning that some data is missing), the code I use skips that field, but in this case I need an empty string.
For example: #Tom,010214,,123
i need to have :
result[0]= Tom
result[1]= 01 02 14
result[2]= 123

but i need to have
result[0]= Tom
result[1]= 01 02 14
result[2]=
result[3]= 123

How do I do this?
0
Tom3333Author Commented:
I mean,
I have:
result[0]= Tom
result[1]= 01 02 14
result[2]= 123

but i need to have
result[0]= Tom
result[1]= 01 02 14
result[2]=
result[3]= 123


How do I do it?
0
Fernando SotoRetiredCommented:
Please post the code from your project so I know what code from previous posts you are using.

But basically every comma, ",", in the string separates two array elements, so a line with n commas gives you n + 1 elements, unless you added StringSplitOptions.RemoveEmptyEntries or the original string does not contain a comma for each field.
0
Tom3333Author Commented:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Globalization;
using System.Text.RegularExpressions;

namespace try_parser_1
{
    class Program
    {
        static void Main(string[] args)
        {
            var text = "#Tom,200914,165053.00,,345,456";
           
            var result = text.Split(",#".ToCharArray(), StringSplitOptions.RemoveEmptyEntries);

            Console.WriteLine(result[0]);
            result[1] = Regex.Replace(result[1], @"(\d\d)(\d\d)(\d\d)", @"$1 $2 $3");
            Console.WriteLine(result[1]);
            result[2] = Regex.Replace(result[2], @"(\d\d)(\d\d)(\d\d)", @"$1 $2 $3");
            Console.WriteLine(result[3]);
            Console.WriteLine(result[4]);
            Console.WriteLine(result[5]);
            Console.Read();
        }
    }
}




In this case result[3] has to be empty, but instead result[3] = 345.
0
Fernando SotoRetiredCommented:
Hi Tom;

Please see the comments and changes I made in your code. Also be aware that this test string has more than the 4 original fields. This will be fine as long as the first 4 are in the correct order; otherwise this will not work for you.

var text = "#Tom,200914,165053.00,,345,456";

// Splitting on both # and , would produce an empty array element 0 and move Tom to element 1.
// The next commented-out line removes empty elements, which you stated you do not want.
// var result= text.Split(",#".ToCharArray(), StringSplitOptions.RemoveEmptyEntries);
var result= text.Split(",".ToCharArray());
// Now you can strip the # from Tom's name.
result[0] = result[0].TrimStart("#".ToCharArray());

Console.WriteLine(result[0]);
result[1] = Regex.Replace(result[1], @"(\d\d)(\d\d)(\d\d)", @"$1 $2 $3");
Console.WriteLine(result[1]);
Console.WriteLine(result[2]);
Console.WriteLine(result[3]);
Console.WriteLine(result[4]);
Console.WriteLine(result[5]);
Console.Read();


0

Tom3333Author Commented:
OK, now it is working fine.

Final question: if I want to read from a file with multiple lines in the same format, how do I convert the code?
0
Fernando SotoRetiredCommented:
You do not need to convert the code. Open the file, then create a while loop that reads the next line from the file, and place the code we worked out inside the loop. The next step in the loop is to process the information just parsed, after which execution goes back to the top of the while loop.
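
The loop described above could look roughly like the sketch below. The file name data.txt, the class name, and the ParseLine helper are placeholders that do not come from this thread, and the "process" step here just prints the fields.

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;

static class FileParser
{
    // Parses one line using the steps worked out earlier in this thread.
    public static string[] ParseLine(string line)
    {
        // Keep empty fields so every field stays at its position.
        var result = line.Split(',');
        // Strip the leading # from the name.
        result[0] = result[0].TrimStart('#');
        // Reformat the six-digit date field, if the line has one.
        if (result.Length > 1)
            result[1] = Regex.Replace(result[1], @"(\d\d)(\d\d)(\d\d)", "$1 $2 $3");
        return result;
    }

    static void Main(string[] args)
    {
        // "data.txt" is a placeholder; pass a real path as the first argument.
        var path = args.Length > 0 ? args[0] : "data.txt";
        using (var reader = new StreamReader(path))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                if (line.Length == 0) continue; // skip blank lines
                var fields = ParseLine(line);
                // Process the parsed fields; here they are simply printed.
                Console.WriteLine(string.Join("|", fields));
            }
        }
    }
}
```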
0
Fernando SotoRetiredCommented:
If this question has been answered please close the question out.

Thank you.
0
Tom3333Author Commented:
Helpful answer
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.