Solved

C# remove trailing commas from csv file

Posted on 2016-10-26
10
164 Views
Last Modified: 2016-10-27
We get csv files that have been written by macros in Excel and have no control over the content.  so some rows have trailing commas, some have leading commas and some have both.

My goal is to remove these but I am having an issue either reading or writing the file and then applying the trimend, trimstart, etc.

So I get the file as a byte but you cannot perform those functions on a byte.  I get the file to a String array String mystring[] but you cannot perform that either once you File.ReadAllLines into the String array....StringBuilder has no "row" so it isn't like a DataTable where I can do a foreach (DataRow DR in DT.Rows.Count)
DR[0]....TrimEnd(Char) etc.

There HAS to be an easy way to do this right?  I am using file I/O native to the .NET framework version 4.51

Any help would be greatly appreciated.

Thanks
0
Comment
Question by:cyimxtck
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 3
  • 2
  • 2
  • +2
10 Comments
 
LVL 40

Expert Comment

by:Kyle Abrahams
ID: 41861240
    string line;
    while ((line = reader.ReadLine()) != null)
    {
        line = line.Trim(new char[] {','});
        //output line to a NEW file.
    }

Open in new window

1
 
LVL 13

Expert Comment

by:Dustin Saunders
ID: 41861243
Can you give an example of a file and the expected output?

I'm not entirely sure I understand the reasoning, the commas indicate vacant spaces in a table structure, removing them destroys the ability to associate them with a specific column value.  Maybe there is a different way to accomplish your objective, with an example it might be more clear.
1
 

Author Comment

by:cyimxtck
ID: 41861257
I get what you are saying about the integrity of the file but the macro creates it with a header which is correct and then a bunch of "phantom" columns that are nothing more than commas that do not exist in header.

I just left the office so don't have a sample handy.

Kyle I will try your solution tomorrow morning to see what happens.

Thanks for the speedy replies
0
Independent Software Vendors: We Want Your Opinion

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 
LVL 40

Assisted Solution

by:Kyle Abrahams
Kyle Abrahams earned 125 total points
ID: 41861272
Saunders brings up a good point though.  Trimming won't be enough.

After you trim you should get a count of commas, and ensure that you have the same count in the header row.  If not, you'll need to determine which commas to re-add.  (Probably taking away from the end first).

EG:
Header with 3 columns
A,          B,       C,
1,          2,       3,
,1,          2,       3,,
,,            ,          3,,,,,

ETC.
1
 
LVL 30

Assisted Solution

by:anarki_jimbel
anarki_jimbel earned 125 total points
ID: 41861463
I believe that there is some chance that commas at the end and at the start are valid - if "empty" values are valid. E.g.:
lastname, firstname, middlename
John, Smith,

Above the middlename is an empty string. This may also happen for "inside" values:
 firstname, middlename,lastname
John, ,Smith

If "empty" values are not allowed the task becomes easier. May be something like below:

        static void Main(string[] args)
        {
             string[] csvLines = new string[]{"name,price,quantity","plate,5,6",",spoon,2,6","kettle,33,1,",",knife,4,2,", ",bowl,,1,,7,,"};
            printSplit(csvLines);
            Console.ReadLine();

        }
        private static void printSplit(string[] csvLines)
        {
            foreach (string s in csvLines)
            {
                string[] values = s.Split(new char[]{','}, StringSplitOptions.RemoveEmptyEntries);
                foreach (string value in values)
                {
                    Console.Write(value + "\t");
                }
                Console.WriteLine();
            }
        }

Open in new window


0
 
LVL 44

Expert Comment

by:AndyAinscow
ID: 41861584
I agree with anarki_jimbel.
If you think about it empty cells still have to be exported to the csv file.
consider the following
A,B,C,D,E
2;3
p,q,r,s

What is the first column in the excel sheet?
A  2  p
A      p
A
   2  p
The correct answer from the excel sheet I envisage is in fact
       p

but all of the others were valid possibilities
0
 

Author Comment

by:cyimxtck
ID: 41862253
I totally get what you guys are saying and you have seemingly valid concerns BUT the array is jagged and the software we are handing this into doesn't take trailing commas; the process fails.

Here is the file which when opened in Excel looks fine.  Open it in notepad and you can see these trailing commas.  The number of trailing commas can vary but the issue is that if any trailing commas exist the entire process fails and managers are alerted.

We just need to remove the nth number of commas from this csv file and write it out "corrected" so we can process it.

PROP TYPE,TICKET,AUDIT,AUDIT TYPE,METERED ID,MONTH,TOTAL COST,STARTED,ENDED,FINALIZED
FUEL,ARK-2112,BYRON,EDIT,SWPA512,1/1/2017,91899,1/1/2017,12/31/2017,1102788
,,,,SWPA513,2/1/2017,91899,,,
,,,,SWPA514,3/1/2017,91899,,,
,,,,SWPA515,4/1/2017,91899,,,
,,,,SWPA516,5/1/2017,91899,,,
,,,,SWPA517,6/1/2017,91899,,,
,,,,SWPA518,7/1/2017,91899,,,
,,,,SWPA519,8/1/2017,91899,,,
,,,,SWPA520,9/1/2017,91899,,,
,,,,SWPA521,10/1/2017,91899,,,
,,,,SWPA522,11/1/2017,91899,,,
,,,,SWPA523,12/1/2017,91899,,,
FUEL,ARK-2112,ARLEEN,INSERT,NEPA12,1/1/2017,91899,1/1/2017,12/31/2017,1102788
,,,,NEPA13,2/1/2017,91899,,,
,,,,NEPA14,3/1/2017,91899,,,
,,,,NEPA15,4/1/2017,91899,,,
,,,,NEPA16,5/1/2017,91899,,,
,,,,NEPA17,6/1/2017,91899,,,
,,,,NEPA18,7/1/2017,91899,,,
,,,,NEPA19,8/1/2017,91899,,,
,,,,NEPA20,9/1/2017,91899,,,
,,,,NEPA21,10/1/2017,91899,,,
,,,,NEPA22,11/1/2017,91899,,,
,,,,NEPA23,12/1/2017,91899,,,

Leading commas are fine after further investigation (and the point to which you folks alluded just in a different way).

This upload can have 20 rows or 1K rows and is very similar as above.

Thanks for all the help so far!
0
 
LVL 13

Accepted Solution

by:
Dustin Saunders earned 250 total points
ID: 41862325
So it sounds like there are a certain number of columns that come with every record, and then some have multiple start/end/finalized dates that others don't?  If that's the case, then it can be that the trailing commas technically aren't needed.  But that seems like some bad code you might want to address in the macro.  You should push back on those coders.

Removing only the trailing commas is pretty simple, just use TrimEnd.

string line = ",,,,abc, def, efg, hik, jkl, mno, pqr,,,,";
            line = line.TrimEnd(',');

Open in new window


Result:
,,,,abc, def, efg, hik, jkl, mno, pqr

Open in new window


If you need to know how many commas were trimmed for whatever reason, you could do something like:
string line = ",,,,abc, def, efg, hik, jkl, mno, pqr,,,,";
            int i = line.Length;
            line = line.TrimEnd(',');
            i = i - line.Length;

Open in new window


or another way:
 int x = 0;
            while (line.Substring(line.Length - 1) == ",")
            {
                line = line.Substring(0, line.Length - 1);
                x++;
            }

Open in new window

0
 
LVL 30

Expert Comment

by:anarki_jimbel
ID: 41862946
PROP TYPE,TICKET,AUDIT,AUDIT TYPE,METERED ID,MONTH,TOTAL COST,STARTED,ENDED,FINALIZED
FUEL,ARK-2112,BYRON,EDIT,SWPA512,1/1/2017,91899,1/1/2017,12/31/2017,1102788
,,,,SWPA513,2/1/2017,91899,,,
,,,,SWPA514,3/1/2017,91899,,,
,,,,SWPA518,7/1/2017,91899,,,
,,,,SWPA519,8/1/2017,91899,,,
,,,,SWPA520,9/1/2017,91899,,,
,,,,SWPA521,10/1/2017,91899,,,
,,,,SWPA522,11/1/2017,91899,,,
,,,,SWPA523,12/1/2017,91899,,,
FUEL,ARK-2112,ARLEEN,INSERT,NEPA12,1/1/2017,91899,1/1/2017,12/31/2017,1102788
,,,,NEPA13,2/1/2017,91899,,,
,,,,NEPA14,3/1/2017,91899,,,

Hmmmmm...

May be I'm missing something but the above looks as a valid csv with 10 columns, and trailing commas   just show empty values...
1
 
LVL 13

Expert Comment

by:Dustin Saunders
ID: 41862978
@anarki I agree with you, the example provided is a normal CSV structure, and I think it would be better to pull these into a datatable or array of classes.  The only reason to trim would be if there are extra commas at the end that get added beyond the normal amount by the macro.

Good thing about CSV at least is that if only trailing commas are lost, you don't lose any information in terms of column correlation (any following columns would be NULL)
1

Featured Post

Is Your Team Achieving Their Full Potential?

74% of employees feel they are not achieving their full potential. With Linux Academy, not only will you strengthen your team's core competencies but also their knowledge of of the newest IT topics.

With new material every week, we'll make sure that you stay ahead of the game.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

More often than not, we developers are confronted with a need: a need to make some kind of magic happen via code. Whether it is for a client, for the boss, or for our own personal projects, the need must be satisfied. Most of the time, the Framework…
Real-time is more about the business, not the technology. In day-to-day life, to make real-time decisions like buying or investing, business needs the latest information(e.g. Gold Rate/Stock Rate). Unlike traditional days, you need not wait for a fe…
Michael from AdRem Software explains how to view the most utilized and worst performing nodes in your network, by accessing the Top Charts view in NetCrunch network monitor (https://www.adremsoft.com/). Top Charts is a view in which you can set seve…
Do you want to know how to make a graph with Microsoft Access? First, create a query with the data for the chart. Then make a blank form and add a chart control. This video also shows how to change what data is displayed on the graph as well as form…

729 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question