Solved

parser in C#

Posted on 2014-09-05
Last Modified: 2014-09-25
I would like to use C# to parse lines with the following format:

#Tom,234234,345,456

Every line starts with the symbol #, followed by a name and then some numbers.
The fields (name and numbers) are separated by commas.

How should I get started?
Question by:Tom3333
LVL 27

Expert Comment

by:Sammy
ID: 40306309
Try this

var text = "#Tom,234234,345,456";
var values = text.Split(",".ToCharArray(), StringSplitOptions.RemoveEmptyEntries);

// values[0] will have #Tom
// values[1] will have 234234
// values[2] will have 345
// values[3] will have 456

 
LVL 55

Expert Comment

by:Jaime Olivares
ID: 40307522
Indeed, it can be as simple as:
var values = text.Split(',');

I wouldn't recommend using RemoveEmptyEntries, as you will lose the absolute position of the parsed elements.
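
To make the point about positions concrete, here is a small sketch comparing the two calls. The input line with an empty third field is a made-up example, and the class and method names are only illustrative.

```csharp
using System;

static class SplitDemo
{
    // Splits on commas and keeps empty fields, so indexes stay aligned with columns.
    public static string[] SplitKeepingEmpty(string line)
    {
        return line.Split(',');
    }

    // Splits on commas and drops empty fields, so later fields shift left.
    public static string[] SplitDroppingEmpty(string line)
    {
        return line.Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries);
    }

    static void Main()
    {
        var line = "#Tom,234234,,456"; // hypothetical input with a missing third field

        var kept = SplitKeepingEmpty(line);
        var dropped = SplitDroppingEmpty(line);

        Console.WriteLine(kept.Length);    // 4
        Console.WriteLine(dropped.Length); // 3
        Console.WriteLine(kept[3]);        // 456
        Console.WriteLine(dropped[2]);     // 456 (the same value, now at a different index)
    }
}
```

With the empty field kept, the fourth column is always at index 3, whether or not the third column has data.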
 

Author Comment

by:Tom3333
ID: 40334684
If one of the parameters is in the format 020314, which represents the date 02/03/14, how do I convert it?

For example: "#Tom,020314,345,456"
The first parameter is Tom.
The second is 020314, and I would like to receive it as 02 03 14.

How can I do this?
LVL 63

Expert Comment

by:Fernando Soto
ID: 40334912
Hi Tom3333;

This will do what you need. The array result has all four elements in it.

// Test string
var input = "#Tom,020314,345,456";
// Convert a comma separated string into an array of 4 strings
var result = input.Split(",".ToCharArray());
// The first element in the array is Tom, remove the # sign
result[0] = result[0].TrimStart("#".ToCharArray());
// The second element is the date and this reformats it mm dd yy
result[1] = Regex.Replace(result[1], @"(\d\d)(\d\d)(\d\d)", @"$1 $2 $3");
// The last two elements need not be formatted so they are as is

 
LVL 55

Expert Comment

by:Jaime Olivares
ID: 40334917
What about using DateTime.ParseExact() to interpret the date field?

result[1] = DateTime.ParseExact(result[1], "MMddyy", CultureInfo.InvariantCulture).ToString("MM/dd/yy);
 

Author Comment

by:Tom3333
ID: 40335447
When I used
 result[1] = DateTime.ParseExact(result[1], "MMddyy", CultureInfo.InvariantCulture).ToString("MM/dd/yy);
I get the error:
Error 1: The name 'CultureInfo' does not exist in the current context
 
LVL 63

Expert Comment

by:Fernando Soto
ID: 40335452
To use that function, add the following using directive at the top of your code file:

using System.Globalization;
 

Author Comment

by:Tom3333
ID: 40335475
I added the statement and the program compiles, but when I try to run it I get the following:
An unhandled exception of type 'System.FormatException' occurred in mscorlib.dll

The text I used is:
#Tom,200914,345,456
 
LVL 63

Expert Comment

by:Fernando Soto
ID: 40335486
Hi Tom;

There is a syntax error in the line you posted: it is missing a double quote at the end. It should be as shown below.

result[1] = DateTime.ParseExact(result[1], "MMddyy", CultureInfo.InvariantCulture).ToString("MM/dd/yy");

In a previous post you stated, "the second is: 020314 i would like to receive it as 02 03 14". If that is the case, you would need to change that line like this.

result[1] = DateTime.ParseExact(result[1], "MMddyy", CultureInfo.InvariantCulture).ToString("MM dd yy");
 

Author Comment

by:Tom3333
ID: 40335511
I added the missing double quote.
The text contains both formats:
#Tom,020314,165053.00.

020314     --> means 02/03/2014 (date)
165053.00. --> means 16:50:53 (time)

I used:
result[1] = DateTime.ParseExact(result[1], "MMddyy", CultureInfo.InvariantCulture).ToString("MM dd yy");

As I said, it compiles, but when I run it I see the following messages:

FormatException was unhandled
An unhandled exception of type 'System.FormatException' occurred in mscorlib.dll
Additional information: The DateTime represented by the string is not supported in calendar System.Globalization.GregorianCalendar.
 
LVL 63

Expert Comment

by:Fernando Soto
ID: 40335540
The error says that "the DateTime represented by the string", in this case your data of 020314, is not supported in the GregorianCalendar. You will need to put it in the correct format for your locale or do it in a different way.

I originally suggested using this line of code:

result[1] = Regex.Replace(result[1], @"(\d\d)(\d\d)(\d\d)", @"$1 $2 $3");

It puts the string into the format you wanted without depending on other things, such as which part of the world the code runs in.

By the way, the Regex class needs the following using directive.

using System.Text.RegularExpressions;
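
A possible explanation for the FormatException, offered as an assumption since the thread does not confirm it: the sample line posted earlier contains 200914, which only makes sense day-first (20 September 2014). Read with "MMddyy", the month would be 20, which no calendar accepts. If the data really is day-first, a format of "ddMMyy" should parse it, and the time field 165053.00 could be handled similarly with "HHmmss.ff". The helper names below are hypothetical.

```csharp
using System;
using System.Globalization;

static class FieldFormats
{
    // Reformats a six-digit date, ASSUMED to be day-first (ddMMyy), as "dd MM yy".
    // Returns the input unchanged if it does not parse.
    public static string FormatDate(string raw)
    {
        DateTime date;
        bool ok = DateTime.TryParseExact(raw, "ddMMyy", CultureInfo.InvariantCulture,
                                         DateTimeStyles.None, out date);
        return ok ? date.ToString("dd MM yy", CultureInfo.InvariantCulture) : raw;
    }

    // Reformats a time such as 165053.00 (ASSUMED to be HHmmss plus two
    // fractional digits) as "HH:mm:ss". Returns the input unchanged on failure.
    public static string FormatTime(string raw)
    {
        DateTime time;
        bool ok = DateTime.TryParseExact(raw, "HHmmss.ff", CultureInfo.InvariantCulture,
                                         DateTimeStyles.None, out time);
        return ok ? time.ToString("HH:mm:ss", CultureInfo.InvariantCulture) : raw;
    }

    static void Main()
    {
        Console.WriteLine(FormatDate("200914"));    // 20 09 14
        Console.WriteLine(FormatTime("165053.00")); // 16:50:53
    }
}
```

TryParseExact is used instead of ParseExact so that a malformed field is passed through rather than throwing; whether that is the right behaviour depends on how strict the input handling needs to be.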
 

Author Comment

by:Tom3333
ID: 40335545
I tried it this way and it works fine. What about the time?
 
LVL 63

Expert Comment

by:Fernando Soto
ID: 40335548
What Time??
 

Author Comment

by:Tom3333
ID: 40335562
Now it is working; ignore my last post.
When the input contains ,, (meaning some data is missing), the code I use skips the empty field, but in this case I need an empty string.
For example: #Tom,010214,,123
I need to have:
result[0]= Tom
result[1]= 01 02 14
result[2]= 123

but i need to have
result[0]= Tom
result[1]= 01 02 14
result[2]=
result[3]= 123

how to do this ?
 

Author Comment

by:Tom3333
ID: 40335564
I mean, I have:
result[0]= Tom
result[1]= 01 02 14
result[2]= 123

but i need to have
result[0]= Tom
result[1]= 01 02 14
result[2]=
result[3]= 123


how to do it ?
 
LVL 63

Expert Comment

by:Fernando Soto
ID: 40335580
Please post the code from your project so I know what code from previous posts you are using.

Basically, Split gives you one array element per field (one more than the number of commas in the string), unless you added StringSplitOptions.RemoveEmptyEntries or the original string does not contain a comma for each field.
 

Author Comment

by:Tom3333
ID: 40335591
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Globalization;
using System.Text.RegularExpressions;

namespace try_parser_1
{
    class Program
    {
        static void Main(string[] args)
        {
            var text = "#Tom,200914,165053.00,,345,456";

            var result = text.Split(",#".ToCharArray(), StringSplitOptions.RemoveEmptyEntries);

            Console.WriteLine(result[0]);
            result[1] = Regex.Replace(result[1], @"(\d\d)(\d\d)(\d\d)", @"$1 $2 $3");
            Console.WriteLine(result[1]);
            result[2] = Regex.Replace(result[2], @"(\d\d)(\d\d)(\d\d)", @"$1 $2 $3");
            Console.WriteLine(result[3]);
            Console.WriteLine(result[4]);
            Console.WriteLine(result[5]);
            Console.Read();
        }
    }
}

In this case result[3] should be empty, but instead result[3] is 345.
 
LVL 63

Accepted Solution

by:
Fernando Soto earned 500 total points
ID: 40335620
Hi Tom;

Please see the comments and changes I made in your code. Also be aware that this test string has more than the original 4 fields. This will be fine as long as the first 4 are in the correct order; otherwise this will not work for you.

var text = "#Tom,200914,165053.00,,345,456";

// Splitting on both # and , produces an empty element at index 0, moving Tom to element 1.
// The next, commented-out line removes empty elements, which you stated you do not want.
// var result= text.Split(",#".ToCharArray(), StringSplitOptions.RemoveEmptyEntries);
var result= text.Split(",".ToCharArray());
// Now you can strip the # from Tom's name.
result[0] = result[0].TrimStart("#".ToCharArray());

Console.WriteLine(result[0]);
result[1] = Regex.Replace(result[1], @"(\d\d)(\d\d)(\d\d)", @"$1 $2 $3");
Console.WriteLine(result[1]);
Console.WriteLine(result[2]);
Console.WriteLine(result[3]);
Console.WriteLine(result[4]);
Console.WriteLine(result[5]);
Console.Read();

 

Author Comment

by:Tom3333
ID: 40335685
OK, now it is working fine.

Final question: if I want to read from a file with multiple lines in the same format, how do I adapt the code?
 
LVL 63

Expert Comment

by:Fernando Soto
ID: 40335775
You do not need to convert the code. Open the file, then create a while loop that reads the next line from the file, and place the code we worked out inside the loop. The next step inside the loop is to process the information just parsed, and then go back to the top of the while loop.
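
The loop described above could look roughly like the following sketch. The file name data.txt and the helper names are assumptions for illustration; the per-line parsing is the code worked out earlier in this thread, except that the regex is anchored here so only a field of exactly six digits is rewritten.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

static class LineParser
{
    // Parses one line such as "#Tom,200914,165053.00,,345,456" into its fields.
    public static string[] ParseLine(string line)
    {
        var fields = line.Split(',');          // keep empty fields in place
        fields[0] = fields[0].TrimStart('#');  // strip the leading # from the name
        if (fields.Length > 1)
        {
            // Insert spaces into a six-digit date field, e.g. 200914 -> 20 09 14
            fields[1] = Regex.Replace(fields[1], @"^(\d\d)(\d\d)(\d\d)$", "$1 $2 $3");
        }
        return fields;
    }

    // Reads lines until the end of the input, skipping blank ones.
    public static List<string[]> ParseAll(TextReader reader)
    {
        var records = new List<string[]>();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (line.Trim().Length == 0) continue;
            records.Add(ParseLine(line));
        }
        return records;
    }

    static void Main(string[] args)
    {
        // "data.txt" is a placeholder path; pass a real one as the first argument.
        var path = args.Length > 0 ? args[0] : "data.txt";
        using (var reader = new StreamReader(path))
        {
            foreach (var record in ParseAll(reader))
            {
                Console.WriteLine(string.Join(" | ", record));
            }
        }
    }
}
```

ParseAll takes a TextReader rather than a path so the same loop can be exercised with a StringReader in a quick test before pointing it at a real file.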
 
LVL 63

Expert Comment

by:Fernando Soto
ID: 40341431
If this question has been answered please close the question out.

Thank you.
 

Author Closing Comment

by:Tom3333
ID: 40344600
Helpful answer