Parsing a CSV

i am reading in a text file line by line -- i need to parse each line by commas

i am running into trouble when i try to grab an element out of the line

buffer[1] returns a single character, not everything between the commas.

what i would like is the ability to read each line and reference the entire element as buffer[1] buffer[2] etc

code is below

#include <iostream.h>
#include <fstream.h>
#include <stdlib.h>


int main ()

{
   char buffer[500];
   char dataline[500];

   int strcmp( const char* s1,
                     const char* s2 );
   
   int count = 0;
   
   ifstream thefile;

   thefile.open ("c:/file.txt", ios::in);    
                                                   
                                                   
   if (! thefile.is_open()) {
      cout << "Error opening file";
      exit (1);
   }


   while (! thefile.eof() ) {
     
       thefile.getline(buffer, 499, '/o');
      
       cout << buffer << endl;

     }

  thefile.close();
  return 0;
}
tpiazzaAsked:

imladrisCommented:
char buffer[500]; represents a single line of 500 characters.
The getline method reads a single line.

To read multiple lines into memory and be able to address them you would need a two dimensional array:

char buffer[10][500];

That declaration represents 10 lines of 500 characters. You could read into it like:

int i = 0;
while (! thefile.eof() ) {
     
      thefile.getline(buffer[i], 499);   // read one line into row i
     
      cout << buffer[i] << endl;         // print the row just read
      i++;

     }

However that would, of course, run into trouble after 10 lines. Along this route you would have to read the whole file into buffer, which would mean you would have to know or find out beforehand how many lines there are in the file.

It is more common to read and process one line at a time, similar to what you are doing now.

0
tpiazzaAuthor Commented:
need to do it line by line -- the files range in size
0
dhyaneshCommented:
Hi

I think this should be something like:

while (! thefile.eof() ) {
     
     thefile.getline(dataline, 499);         //  You do not need to pass the third argument; it is optional
     
      cout << dataline << endl;

    }

Now to get data between commas you could use strtok() function. It gets all data until a delimiter.

You will have to declare buffer something like:

char (*buffer)[15];                //If you have 15 fields at max

This makes buffer as array of 15 pointers to characters.

In strtok() you have to pass dataline as first argument and second argument will be delimiter. It will return a pointer to first field i.e. all characters before the first comma.

First call to strtok() makes it return a pointer to string before the first delimiter. It also puts a '\0' just before the delimiter.

Subsequent calls to strtok() with NULL as first argument and delimiter as second argument will parse the string and return the subsequent fields until the end. When no more fields are left NULL is returned.
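
One caveat worth knowing: strtok treats a run of adjacent commas as a single delimiter, so an empty field (the middle one in ktc,,A) is skipped and the later fields shift down by one. If empty fields matter, a rough alternative, which is not what this thread uses, is std::getline with a comma as the delimiter. This is only a sketch: splitKeepEmpty is a made-up name, it assumes the standard <string> and <sstream> headers rather than the older iostream.h, and it does not handle quoted fields.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits one line on commas and keeps empty fields
// that sit between two commas. A trailing empty field after a final comma
// is not reported.
std::vector<std::string> splitKeepEmpty(const std::string &line)
{
    // Assumption of this sketch: no quoted fields containing commas.
    assert(line.find('"') == std::string::npos);

    std::vector<std::string> fields;
    std::istringstream in(line);
    std::string field;
    // Each call extracts the text up to the next comma (or end of line).
    while (std::getline(in, field, ',')) {
        fields.push_back(field);
    }
    return fields;
}
```

For the line ktc,,A this gives three fields, with the second one empty.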

Dhyanesh


0
tpiazzaAuthor Commented:
mind posting some code with the strtok()

mine keeps erroring out
0
imladrisCommented:
char *p;

while (! thefile.eof() ) {
     
      thefile.getline(buffer, 499);
      p=strtok(buffer,",");
      if(p!=NULL)
      {   // process first field
      }
      while((p=strtok(buffer,NULL))!=NULL)
      {   // process next field
      }
     
      cout << buffer << endl;

     }
0
merphleCommented:
Or, if the code to process the first field and all subsequent fields is the same:

char *p;
while (! thefile.eof() ) {

      thefile.getline(buffer, 499);
      for (p=strtok(buffer,","); p != NULL; p=strtok(buffer,NULL)) {
           // process field
      }

}
0
dhyaneshCommented:
Hi

I do not think strtok() works the way it is posted above.

As given in documentation of Turbo C++ it should be something like:

char *p;

while (! thefile.eof() ) {
     
     thefile.getline(buffer, 499);
      p=strtok(buffer,",");
     if(p!=NULL)
     {   // process first field
     }
     while((p=strtok(NULL,","))!=NULL)   //first argument should be NULL and not buffer and 2nd argument should be the delimiter
     {   // process next fields
     }
     
     cout << buffer << endl;

    }


Dhyanesh
0
dhyaneshCommented:
Hi

Also if you want to reference each field like buf[0], buf[1] then you would have to do something like:

int i;
char *p;
char (*buf)[15];

while (! thefile.eof() ) {
   
     thefile.getline(buffer, 499);
      p=strtok(buffer,",");
    if(p!=NULL)
    {   buf[0] = p;
    }
    i = 1;
    while((p=strtok(NULL,","))!=NULL) //first argument should be NULL and not 'buffer' and 2nd argument should be the delimiter
    {  
          buf[i++] = p;
    }
   
   }

After using strtok() the original string i.e. 'buffer' will have a '\0' placed just before each delimiter. So if you do

cout << buffer <<endl;


You will see only the first field. However you can access the other fields by buf[0], buf[1], buf[2], .....
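
The behaviour described here can be tried out on its own with a small sketch. splitFields is a hypothetical helper name, and the bound check on maxFields is an addition that the loops in this thread do not have. It stores the pointers in an array of char pointers.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: splits `line` in place on commas and stores a pointer
// to each field in `fields`. Returns the number of fields stored, never more
// than `maxFields`. strtok overwrites each comma it finds with '\0'.
int splitFields(char *line, char *fields[], int maxFields)
{
    assert(line != NULL && fields != NULL);
    int count = 0;
    // The first call passes the buffer; later calls pass NULL to continue.
    for (char *p = std::strtok(line, ",");
         p != NULL && count < maxFields;
         p = std::strtok(NULL, ","))
    {
        fields[count++] = p;
    }
    return count;
}
```

Called on a buffer holding ktc,163008,A,3458.8221,N it returns 5, fields[1] points at 163008, and the buffer itself now reads as just ktc because the first comma has been replaced by '\0'.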

Dhyanesh
0
tpiazzaAuthor Commented:
i keep getting the following error

C:\Program Files\Microsoft Visual Studio\MyProjects\parse\parse.cpp(36) : error C2440: '=' : cannot convert from 'char *' to 'char [500]'
        There are no conversions to array types, although there are conversions to references or pointers to arrays

#include <iostream.h>
#include <fstream.h>
#include <stdlib.h>
#include <string.h>


int main ()

{
   char buffer[500];
   
    int i;
      char *p;
      char (*buf)[500];

   ifstream thefile;

   thefile.open ("c:/file.txt", ios::in);    
                                                   
                                                   
   if (! thefile.is_open()) {
      cout << "Error opening file";
      exit (1);
   }


   while (! thefile.eof() ) {
   
     thefile.getline(buffer, 499);
      p=strtok(buffer,",");
    if(p!=NULL)
    {   buf[0] = p;
    }
    i = 1;
    while((p=strtok(NULL,","))!=NULL)
   {  
          buf[i++] = p;
    }
     
      cout << buffer << endl;
        cout << buf[0] << endl;

     }

  thefile.close();
  return 0;
}
0
imladrisCommented:
This line:

char (*buf)[500];

declares a pointer named buf, which points to an array of 500 characters.

Thus buf[0] is the "first" array of 500 characters, and buf[1] is the "second" array of 500 characters. So

buf[0]=p;

where p is a pointer to a single character is going to cause a conversion error. If you want to save the pointers to the tokens you find in buffer you could declare:

char *buf[500];

This is an array of 500 pointers to character. So buf[0] is a pointer to character, just like p is, so the assignment will now work.

Note also that strtok changes buffer, so you will not be able to use buffer to emit the line to cout at the end.
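
To see the two declarations side by side, here is a small sketch. The names demoDeclarations, storage, rows and tokens are invented for illustration and are not from the code above.

```cpp
#include <cassert>
#include <cstring>

// Illustration only: contrasts a pointer to an array with an array of pointers.
bool demoDeclarations()
{
    char storage[2][500];          // two rows of 500 characters
    char (*rows)[500] = storage;   // pointer to an array of 500 chars
    char *tokens[500];             // array of 500 pointers to char

    std::strcpy(storage[1], "second row");
    char word[] = "ktc";
    tokens[0] = word;              // fine: tokens[0] is a char*
    // rows[0] = word;             // does not compile: rows[0] is a char[500]

    // rows[0] is a whole char[500]; tokens holds 500 separate pointers.
    assert(sizeof(rows[0]) == 500);
    assert(sizeof(tokens) == 500 * sizeof(char *));

    return std::strcmp(rows[1], "second row") == 0
        && std::strcmp(tokens[0], "ktc") == 0;
}
```

The commented-out assignment is the one that produces the C2440 error quoted earlier.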
0
tpiazzaAuthor Commented:
that gets me the output i want -- however buf[0] returns the first element once and buf[1], buf[2], etc returns each element twice.

also

i = 1;
    while((p=strtok(NULL,","))!=NULL)
   {  
          buf[i++] = p;
    }
     
      if i change i=0 i get the second element in the list

    please explain how this iterates



0
tpiazzaAuthor Commented:
actually it does buf[0] twice -- it only displays output once
0
imladrisCommented:
I would expect each element of buf to contain 1 token.

I'm not sure what you mean by "buf[1] returns each element twice".

I would expect:

for(int j=0; j<i; ++j)
{   cout << buf[j] << endl;
}

to show a list of the tokens that were found.

If that didn't clear it up, please post the code you are using, and explain exactly what output you are getting.
0
tpiazzaAuthor Commented:
each line in the text file is like

  ktc,163008,A,3458.8221,N

with the following code if i ask for buf[1]

i get

163008
163008





#include <iostream.h>
#include <fstream.h>
#include <stdlib.h>
#include <string.h>

int main ()

{
   char buffer[500];
   
    int i;
      char *p;
      char *buf[500];

   ifstream thefile;

   thefile.open ("c:/file.txt", ios::in);    
                                                   
                                                   
   if (! thefile.is_open()) {
      cout << "Error opening file";
      exit (1);
   }


   while (! thefile.eof() ) {
   
     thefile.getline(buffer, 499);

      p=strtok(buffer,",");
   
        if(p!=NULL)
    {   buf[0] = p;
    }
   
        i = 1;
   
      while((p=strtok(NULL,","))!=NULL)
    {  
          buf[i++] = p;
    }
     
        cout  << buf[0] <<endl;

     }

  thefile.close();
  return 0;
}


0
tpiazzaAuthor Commented:
if i move  

while((p=strtok(NULL,","))!=NULL)
    {  
          buf[i++] = p;
    }
     
       
     }

cout  << buf[0] <<endl;

  thefile.close();

i dont get any results for buf[0]
 
0
tpiazzaAuthor Commented:
ok this is odd -- i originally only had one line of text in the file -- works like a champ with more than one line -- the output mentioned before only shows up when the file has just one line
0
imladrisCommented:
Ah, I see.

For a single line the loop will proceed as follows:

while (! thefile.eof() ) {
     thefile.getline(buffer, 499);
     
     //process line

     cout  << buf[0] <<endl;
}

end of file yet, no
get next line
process line
output first token
go back to top of loop
end of file yet, no
get next line
(end of file condition is now raised)
process contents of buffer (still contains first line)
output first token

So you see, the loop is going one iteration too far. You need something like:

while(!thefile.eof())
{   thefile.getline(buffer,499);
     if(!thefile.eof())
     {   //process line
          cout << buf[0] << endl;
     }
}
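
Another way to avoid the extra iteration is to test the result of getline itself instead of calling eof(). One thing to watch with the eof() check after the read: if the last line of the file has no trailing newline, getline reaches end of file while reading it, so that line would be skipped. The sketch below assumes the standard <istream> and <sstream> headers rather than iostream.h; countLines is a hypothetical name, and any istream (including an ifstream) can be passed in.

```cpp
#include <cassert>
#include <istream>
#include <sstream>

// Hypothetical helper: counts the lines that getline actually delivers.
int countLines(std::istream &in)
{
    assert(in.good());   // sketch assumes the stream was opened successfully

    char buffer[500];
    int lines = 0;
    // getline returns the stream, which tests false once a read fails,
    // so the body only runs when buffer holds a freshly read line.
    while (in.getline(buffer, sizeof buffer)) {
        ++lines;         // process buffer here
    }
    return lines;
}
```

A line longer than the buffer also makes the read fail and ends the loop, so the 500-character limit still applies.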

0
dhyaneshCommented:
Hi

Sorry for my mistake with declaration of buf.

It should be char *buf[500] as imladris pointed out

Dhyanesh
0
tpiazzaAuthor Commented:
thanks so much -- appreciate the detailed explanations

last question

if i want to only output the line where buf[0] = ktc how would i go about it

my file contains info in the following form

ktc,163008,A,3458.8221,N,08200.8754,W,61.7,137.3,120603,6.2,W,A*24
gmg,163008,A,3458.8221,N,08200.8754,W,61.7,137.3,120603,6.2,W,A*24
ktc,163008,A,3458.8221,N,08200.8754,W,61.7,137.3,120603,6.2,W,A*24
gmg,163008,A,3458.8221,N,08200.8754,W,61.7,137.3,120603,6.2,W,A*24

i only need the ktc

if i throw

 if(!thefile.eof())
     {  
           
             if(buf[0] = "ktc")
             {
             cout  << buf[0] << "  " << buf[1] <<endl;
             }
     }
      
      
or move it on top of the while statement it still outputs every line with buf[0] as ktc and then buf[1] as what it's supposed to be on every other line
0
imladrisCommented:
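buf[0] = "ktc" is an assignment, not a comparison: it points buf[0] at the string literal "ktc", and since the result is never null the condition is always true. That is why every line prints with ktc as its first field. The equality operator is '=='. But even that doesn't work here, because buf[0] == "ktc" compares the two pointers, not the characters they point to. Assuming you want to compare the token that buf[0] points to with "ktc" you should use strcmp: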

if(strcmp(buf[0],"ktc")==0)
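
Wrapped up as a small test, the comparison might look like the sketch below. firstFieldIs is a hypothetical name, and the fieldCount parameter is an assumption: it stands for the value of i after the strtok loop, so that an empty line is never compared.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: true when the first field of a tokenized line
// matches `tag` character by character.
bool firstFieldIs(char *const fields[], int fieldCount, const char *tag)
{
    assert(tag != NULL);
    // strcmp returns 0 for equal strings; == would compare addresses only.
    return fieldCount > 0 && std::strcmp(fields[0], tag) == 0;
}
```

Inside the read loop this would be used as if (firstFieldIs(buf, i, "ktc")) before printing buf[0] and buf[1].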

0

tpiazzaAuthor Commented:
thank you for your help
0