Solved

How do I parse quotation marks in C#?

Posted on 2009-04-04
Last Modified: 2013-12-17
I'm trying to read a CSV file that may have quotation marks in its values.  This affects cases where the quotation marks denote a string, which could have commas in it.

Is this the correct way of detecting quotation marks in C#?

while (!lineValues[j].Contains("\""))

That's throwing an out of index exception, so I want to make sure I'm correctly saying "until we find a field that has quotation marks in it...", especially since the file is formatted correctly.

Question by:hyliandanny
13 Comments
 
LVL 23

Expert Comment

by:Tiggerito
ID: 24070716
You probably also need to stop your loop when you reach the end of the array:

while (j < lineValues.Length && !lineValues[j].Contains("\""))
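To make the bounds check concrete, here is a minimal sketch of a bounded search. The class and method names (QuoteSearch, IndexOfQuotedField) are hypothetical and not from the original code; it assumes lineValues is a string[] and that start is not negative.

```csharp
public static class QuoteSearch
{
    // Returns the index of the first field at or after 'start' that contains
    // a straight double quote, or -1 if no such field exists.
    // Assumes 'start' is not negative.
    public static int IndexOfQuotedField(string[] fields, int start)
    {
        for (int i = start; i < fields.Length; i++)
        {
            if (fields[i].Contains("\""))
                return i;
        }
        return -1; // reached the end of the array without finding a quote
    }
}
```

Returning -1 instead of running past the end lets the caller treat "no quote in this row" as an ordinary case rather than an exception.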

 
LVL 6

Expert Comment

by:hehdaddy
ID: 24070727
This is probably information overload, but take a look at this. It will handle reading in the CSV with all the scenarios that you are facing. It's kind of nice to use some pre-written code to make your life easier.

http://www.codeproject.com/KB/database/CsvReader.aspx

 

Author Comment

by:hyliandanny
ID: 24070783
@ Tiggerito:

I can see why you would say that.  Here's the rest of the code.  Perhaps I'm just tired and my logic is indeed faulty, but I want the while loop to go on until quotation marks are found because this loop only happens when there are more commas than there should be.  Only when there are strings with commas in them (within quotation marks) will this loop be entered.
using (StreamReader reader = new StreamReader(@"C:\sample.csv")) {
    string buffer;
    int violatingRow = 0;

    // Iterate through all lines
    while ((buffer = reader.ReadLine()) != null) {
        string[] lineValues = buffer.Split(',');

        // Size of the row array.  Can't change the member, so use a local variable.
        int trueCSVLength = lineValues.Length;

        //  It's possible that the number of strings in the "split" array is > n , where n is the
        // number of column headers. We expect the rest of the comma-separated values to behave
        // according to the guidelines; that is, there are (n - (this column's index)) values
        // left to be read.  Those strings with quotation marks tell us where a value truly begins
        // and ends; combine them so indices continue to fall neatly as designated.

        if (lineValues.Length > (int)EColumns.e_colCount) {
            // First string is in Name index.
            int i = (int) EColumns.e_colName;

            try {
                // Find the value with the first set of quotation marks.

                while (!lineValues[i].Contains("\""))
                    i++;

                //  We've got our index.  Now find the next set, but put together the single field
                // along the way
                int j = i;
                while (!lineValues[j].Contains("\"")) {
                    lineValues[i].Insert(lineValues[i].Length, lineValues[++j]);

                    // Set the remaining fields back one index
                    for (int k = j; k < lineValues.Length; k++) {
                        lineValues.SetValue(lineValues[k + 1], k);
                    }
                    trueCSVLength--;
                }
            }
            catch (Exception ex) {
                // If there's any problems, we're not going to read the file anymore. Report. Bail.
                csv_labelWarning.Text = ex.Message;
                return;
            }

        }

...

 

Author Comment

by:hyliandanny
ID: 24070788
@ hehdaddy

I've seen that project; in addition to information overload (the Web namespace usually intermingles and is used for whatever I am trying to do), 2008 C# Express doesn't seem to convert it properly since it was created with an older version of Visual Studio.
 
LVL 6

Accepted Solution

by:
Ramone_Hamilton earned 50 total points
ID: 24070841
I just ran a couple of tests, and .Contains("\"") works as far as recognizing quotation marks.
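For reference, a small sketch of the ways a straight double quote can be written in C#. The class and member names (QuoteLiterals, HasStraightQuote) are hypothetical. All three spellings denote the same character, U+0022, so none of them will match typographic (curly) quotes.

```csharp
public static class QuoteLiterals
{
    // Three spellings of the same straight double quote (U+0022).
    public const string Escaped = "\"";    // regular literal with a backslash escape
    public const string Verbatim = @"""";  // verbatim literal: "" stands for one quote
    public const char AsChar = '"';        // a char literal needs no escape

    // True if the field contains at least one straight double quote.
    public static bool HasStraightQuote(string field)
    {
        return field.IndexOf(AsChar) >= 0;
    }
}
```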
 

Author Closing Comment

by:hyliandanny
ID: 31566709
Thank you.  It's probably my logic, then...

 
LVL 23

Expert Comment

by:Tiggerito
ID: 24071280
I'm a little confused. The question was not if the .Contains("\"") works, but why the code provided was returning an exception.

How did Ramone_Hamilton's answer solve the problem?
 

Author Comment

by:hyliandanny
ID: 24093421
Hi, Tiggerito.

I wanted to rule out the actual C#-specific element: the escape operator, backslash (\).  This is because I'm a C# newbie; if I know that it's actually correctly detecting quotation marks, then my own logic is probably at fault, which I actually did not expect further assistance with.

The program is probably entering this case incorrectly; since the backslash operator is the correct way of detection, then the code is returning an exception because I must have done something foolish when entering this case.

If you're curious to actually help me further, a larger chunk of code will be necessary.  That would be amazing, but it wasn't what I was expecting.  Thanks for your concern!
 
LVL 6

Expert Comment

by:hehdaddy
ID: 24093962
Most likely your out of index exception is being thrown because you are inserting values into the array that you are looping through. You need to create a 2nd array to use for inserting, etc., so that your lineValues don't go out of index.
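One way to follow this suggestion is to build the merged fields in a separate list instead of shifting elements inside lineValues. The sketch below is only an illustration: the names FieldMerger and MergeQuotedFields are hypothetical, it assumes straight quotes only, and it leaves the quote characters in the merged value. Two facts are worth noting alongside it: C# arrays have a fixed length, and string.Insert returns a new string rather than changing the original, so its result has to be stored somewhere.

```csharp
using System.Collections.Generic;

public static class FieldMerger
{
    // Rejoins pieces that Split(',') separated inside a quoted value.
    // Assumes straight quotes only; quote characters are kept in the result.
    public static List<string> MergeQuotedFields(string[] pieces)
    {
        var merged = new List<string>();
        string pending = null; // non-null while a quoted value is still open

        foreach (string piece in pieces)
        {
            // Put back the comma that Split removed when continuing a value.
            string current = (pending == null) ? piece : pending + "," + piece;

            // An odd number of quote characters means the value is still open.
            int quoteCount = 0;
            foreach (char c in current)
            {
                if (c == '"') quoteCount++;
            }

            if (quoteCount % 2 == 1)
            {
                pending = current;
            }
            else
            {
                merged.Add(current);
                pending = null;
            }
        }

        // Unterminated quote: keep what was collected rather than drop it.
        if (pending != null)
            merged.Add(pending);

        return merged;
    }
}
```

Because the input array is only read and never written, no index can move past the end while the fields are being joined.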
 

Author Comment

by:hyliandanny
ID: 24094192
I was about to put your theory to the test, hehdaddy, but the portion before I insert into lineValues is actually throwing the out of bounds exception (as well?).  Since this is where I find the comma-containing strings I'll be joining together, I can't really make a secondary array to verify the behavior you're suggesting.

That is the code in lines 25-26 in the earlier post.  I'll copy/paste for convenience.

Additionally, I'll supply my sample file I'm using when causing this exception to be thrown so we can figure this out.
while (!lineValues[i].Contains("\""))
    i++;

sample-txt-for-upload.txt
 

Author Comment

by:hyliandanny
ID: 24094266
Perhaps also relevant is the fact that when I trace through the program while debugging, what would be the quotation marks (") appear as squares, almost as if it does not recognize that character.

If it indeed is some exceptional character situation, then this would all make sense.
 
LVL 23

Expert Comment

by:Tiggerito
ID: 24095257
Your sample text contains two different types of double quotes which are not the type your code uses. Maybe this is related to the problem.

Quotes (||): "
Open Quotes (\\): “
Close Quotes (//): ”
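If the file really does use typographic quotes, one option is to convert them to straight quotes before looking for them. This is a minimal sketch with hypothetical names (QuoteNormalizer, ToStraightQuotes); it assumes the characters involved are U+201C and U+201D and that the file was decoded correctly when it was read.

```csharp
public static class QuoteNormalizer
{
    // Replaces typographic double quotes with the straight quote (U+0022).
    public static string ToStraightQuotes(string line)
    {
        return line
            .Replace('\u201C', '"')   // left double quotation mark
            .Replace('\u201D', '"');  // right double quotation mark
    }
}
```

If the debugger shows the quotes as squares, it may also be that the file is in a different encoding than the one StreamReader assumes (UTF-8 by default). In that case the characters would not survive decoding and this replacement would not find them; passing the file's actual encoding to the StreamReader constructor would be the thing to try first.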


 

Author Comment

by:hyliandanny
ID: 24103957
Tiggerito, I think you're on to it.  I posted a question which specifically addresses the patterns you've informed me about.

http://www.experts-exchange.com/Programming/Languages/C_Sharp/Q_24308232.html
