Solved

Reading from a csv file

Posted on 1998-07-29
Last Modified: 2010-04-02
Hi,
I need to read data from an existing CSV file into a memory structure. Has anyone already written routines that can read a CSV file? They also have to deal with quoting such as ...,...,"...,..." and with "" standing for a single double quote mark. I would be very happy if someone could share source code. Any help would be appreciated.
Many thanks in advance
Question by:trouvain
10 Comments
 
LVL 2

Expert Comment

by:rayb
ID: 1168910
Look into creating a text file based ODBC datasource.  This will help you a great deal.  It's very flexible, powerful and it comes in a can!  It will save you much work and headaches.


0
 

Author Comment

by:trouvain
ID: 1168911
Thank you for your answer, but unfortunately I could not work out where to look for how to create text-file-based ODBC data sources. Is there source code to be found? Please give me a more specific clue.
Thanx in advance
0
 
LVL 4

Expert Comment

by:erajoj
ID: 1168912
What do (or would) your memory structures look like?
Exactly what kind of functionality are you looking for?
/// John
0
 

Author Comment

by:trouvain
ID: 1168913
The given CSV file is a data matrix with a column header and a row header; the cells of the matrix hold the values. I want to read the data into an array of arrays of a value type (e.g. float). From the column header I can determine the number of columns. As stated in my question, I am looking for a link to, or the source code of, a routine that reads data from a CSV file. This routine should be able to deal with quoted strings ("...,...") and quoted quotation marks (...""...). I hope this is enough to answer your question.
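The target structure described above could be sketched roughly as follows, assuming the lines have already been split into string fields by some CSV routine. The names CsvMatrix and BuildMatrix are made up for this illustration and are not from any library; the first cell of the header line (the corner above the row header) is simply ignored.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical container for the matrix described above.
struct CsvMatrix
{
   std::vector<std::string> columnNames;    // first line, without the corner cell
   std::vector<std::string> rowNames;       // first field of every later line
   std::vector<std::vector<float> > values; // values[row][column]
};

// Builds the matrix from already-split fields. Returns false if a row has a
// different number of fields than the header or a cell is not a number.
bool BuildMatrix(const std::vector<std::vector<std::string> >& lines, CsvMatrix& out)
{
   if (lines.empty() || lines[0].empty())
      return false;

   out = CsvMatrix();
   out.columnNames.assign(lines[0].begin() + 1, lines[0].end());

   for (std::size_t r = 1; r < lines.size(); ++r)
   {
      const std::vector<std::string>& fields = lines[r];
      if (fields.size() != lines[0].size())
         return false;

      out.rowNames.push_back(fields[0]);

      std::vector<float> row;
      for (std::size_t c = 1; c < fields.size(); ++c)
      {
         char* pEnd = 0;
         const double d = std::strtod(fields[c].c_str(), &pEnd);
         if (pEnd == fields[c].c_str() || *pEnd != 0)
            return false; // Empty cell or trailing non-numeric characters.
         row.push_back(static_cast<float>(d));
      }
      out.values.push_back(row);
   }
   return true;
}
```

The number of columns then falls out of the header line, as described above: columnNames.size() is the width of every row in values.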
0
 
LVL 1

Expert Comment

by:slinky
ID: 1168914
Use strtok with a comma as the delimiter.
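For reference, a small sketch of what the strtok approach does with the inputs from the question. SplitWithStrtok is a name made up for this illustration. strtok treats a run of delimiters as a single one and knows nothing about quotes, so empty cells disappear and a quoted comma still splits the field.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Splits a copy of the line on commas using strtok.
std::vector<std::string> SplitWithStrtok(const std::string& line)
{
   // strtok modifies its input, so work on a writable, terminated copy.
   std::vector<char> buffer(line.begin(), line.end());
   buffer.push_back('\0');

   std::vector<std::string> fields;
   for (char* p = std::strtok(&buffer[0], ","); p != NULL; p = std::strtok(NULL, ","))
      fields.push_back(p);
   return fields;
}
```

With this helper, a,b,c gives three fields as expected, but a,,c gives only two (the empty cell is lost), and 1,"x,y",2 gives four fields because the quoted comma is treated as a separator. So strtok alone only fits CSV files that have neither empty cells nor quoted commas.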
0
 
LVL 3

Accepted Solution

by:
stefanr earned 200 total points
ID: 1168915
Try something like this:

#include <iostream>
#include <fstream>
#include <stdlib.h>
#include <string.h>
#include <assert.h>

// Splits one CSV line into columns. Every column is duplicated with strdup and
// appended to ppColumn, which is grown with realloc; the caller frees both.
// szColumn cannot overflow here because main() reads lines of at most 1023 characters.
bool ProcessLine(char**& ppColumn, int& nColumnCount, const char* pszLine)
{
   bool bProcessingColumn = false; // true from the start to the next ',' or end of line.
   bool bQuote = false; // true if processing a quoted column.
   bool bSkipToNextColumn = false; // true to skip to the next ',' or end of line during processing.

   char szColumn[1024] = { 0 };
   unsigned nIndex = 0;

   // i runs up to and including strlen so that the terminating 0 closes the last column.
   for (unsigned i = 0; i <= ::strlen(pszLine); i++)
   {
      if (bProcessingColumn)
      {
         if (bQuote)
         {
            if ('"' == pszLine[i])
            {
               bQuote = false; // Quoted column string terminated.
               bSkipToNextColumn = true;
            }
            else
            {
               szColumn[nIndex] = pszLine[i];
               nIndex++;
            }
         }
         else if (',' == pszLine[i] || 0 == pszLine[i])
         {
            bProcessingColumn = false;
            bSkipToNextColumn = false;

            szColumn[nIndex] = 0; // Terminate temporary column string.

            nColumnCount++;
            ppColumn = (char**) ::realloc(ppColumn, nColumnCount * sizeof(char*)); // Adjust size of array of pointers.
            ppColumn[nColumnCount-1] = ::strdup(szColumn); // Duplicate new string.
         }
         else if (!bSkipToNextColumn)
         {
            szColumn[nIndex] = pszLine[i];
            nIndex++;
         }
      }
      else
      {
         nIndex = 0;

         if (',' == pszLine[i] || 0 == pszLine[i])
         {
            // Column contains empty string.

            szColumn[nIndex] = 0; // Terminate temporary column string.

            nColumnCount++;
            ppColumn = (char**) ::realloc(ppColumn, nColumnCount * sizeof(char*)); // Adjust size of array of pointers.
            ppColumn[nColumnCount-1] = ::strdup(szColumn); // Duplicate new string.
         }
         else if ('"' == pszLine[i])
         {
            bProcessingColumn = true;
            bQuote = true;
         }
         else
         {
            bProcessingColumn = true;

            szColumn[nIndex] = pszLine[i];
            nIndex++;
         }
      }
   }

   return true;
}

int main()
{
   try
   {
      // An input stream never creates the file, so the pre-standard ios::nocreate flag is not needed.
      std::ifstream fs("Test.CSV");
      if (!fs)
      {
         std::cerr << "Cannot open Test.CSV" << std::endl;
         return EXIT_FAILURE;
      }

      int nLineCount = 0;
      char szLine[1024] = { 0 };
      char** ppHeader = NULL; // Pointer to array of strings that contains the names of the columns.
      char*** pppRecord = NULL; // Pointer to arrays of record strings.
      int nColumnCount = 0; // When reading the first line that is supposed to contain the names of the columns the number of columns is determined.
      int nRecordCount = 0; // Count of records read.

      while (fs.getline(szLine, sizeof(szLine)))
      {
         if (0 == nLineCount)
         {
            ProcessLine(ppHeader, nColumnCount, szLine);
         }
         else
         {
            char** ppRecord = NULL;
            int nTemp = 0;
            ProcessLine(ppRecord, nTemp, szLine);
            assert(nTemp == nColumnCount); // Every record must have as many columns as the header.
            nRecordCount++;
            pppRecord = (char***) ::realloc(pppRecord, nRecordCount * sizeof(char**));
            pppRecord[nRecordCount-1] = ppRecord;
         }

         nLineCount++;
      }

      fs.close();

      for (int i = 0; i < nColumnCount; i++)
      {
         std::cout << ppHeader[i] << ";"; // Print column names.
         ::free(ppHeader[i]);
      }
      std::cout << std::endl;
      ::free(ppHeader);

      // i is declared again: under standard scoping the first loop's i is no longer visible here.
      for (int i = 0; i < nRecordCount; i++)
      {
         for (int j = 0; j < nColumnCount; j++)
         {
            std::cout << pppRecord[i][j] << ";"; // Print content of each column.
            ::free(pppRecord[i][j]);
         }
         std::cout << std::endl; // Prepare to print the next record's column content.
         ::free(pppRecord[i]);
      }
      ::free(pppRecord);
   }
   catch (...)
   {
      return EXIT_FAILURE;
   }

   std::cout << "Press Enter to exit..." << std::endl;
   std::cin.get(); // Portable replacement for _getch() from <conio.h>.
   return EXIT_SUCCESS;
}

0
 
LVL 3

Expert Comment

by:stefanr
ID: 1168916
To handle escaped quotes ("" inside a quoted column), replace the ProcessLine above with:

bool ProcessLine(char**& ppColumn, int& nColumnCount, const char* pszLine)
{
   bool bProcessingColumn = false; // true from the start to the next ',' or end of line.
   bool bQuote = false; // true if processing a quoted column.
   bool bSkipToNextColumn = false; // true to skip to the next ',' or end of line during processing.
   bool bEscapedQuote = false; // true after the first '"' of a "" pair inside a quoted column.

   char szColumn[1024] = { 0 };
   unsigned nIndex = 0;

   for (unsigned i = 0; i <= ::strlen(pszLine); i++)
   {
      if (bProcessingColumn)
      {
         if (bQuote)
         {
            if ('"' == pszLine[i])
            {
               if (bEscapedQuote)
               {
                  bEscapedQuote = false;
                  szColumn[nIndex] = pszLine[i];
                  nIndex++;
               }
               else if (i < ::strlen(pszLine) && '"' == pszLine[i+1])
               {
                  bEscapedQuote = true;
               }
               else
               {
                  bQuote = false; // Quoted column string terminated.
                  bSkipToNextColumn = true;
               }
            }
            else
            {
               szColumn[nIndex] = pszLine[i];
               nIndex++;
            }
         }
         else if (',' == pszLine[i] || 0 == pszLine[i])
         {
            bProcessingColumn = false;
            bSkipToNextColumn = false;

            szColumn[nIndex] = 0; // Terminate temporary column string.

            nColumnCount++;
            ppColumn = (char**) ::realloc(ppColumn, nColumnCount * sizeof(char*)); // Adjust size of array of pointers.
            ppColumn[nColumnCount-1] = ::strdup(szColumn); // Duplicate new string.
         }
         else if (!bSkipToNextColumn)
         {
            szColumn[nIndex] = pszLine[i];
            nIndex++;
         }
      }
      else
      {
         nIndex = 0;

         if (',' == pszLine[i] || 0 == pszLine[i])
         {
            // Column contains empty string.

            szColumn[nIndex] = 0; // Terminate temporary column string.

            nColumnCount++;
            ppColumn = (char**) ::realloc(ppColumn, nColumnCount * sizeof(char*)); // Adjust size of array of pointers.
            ppColumn[nColumnCount-1] = ::strdup(szColumn); // Duplicate new string.
         }
         else if ('"' == pszLine[i])
         {
            bProcessingColumn = true;
            bQuote = true;
         }
         else
         {
            bProcessingColumn = true;

            szColumn[nIndex] = pszLine[i];
            nIndex++;
         }
      }
   }

   return true;
}
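
For comparison, the same idea can be sketched with std::string and std::vector, which removes the manual realloc/strdup/free bookkeeping. This is only an illustration under a few assumptions, not a drop-in replacement: SplitCsvLine is a made-up name, characters after a closing quote are appended rather than skipped, a quote in the middle of an unquoted column also starts quoted mode, and an unterminated quote is reported through the return value.

```cpp
#include <string>
#include <vector>

// Splits one CSV line into fields. Inside a quoted field a comma is literal
// and "" stands for one double quote. Returns false on an unterminated quote.
bool SplitCsvLine(const std::string& line, std::vector<std::string>& fields)
{
   fields.clear();
   std::string field;
   bool inQuotes = false;

   for (std::string::size_type i = 0; i < line.size(); ++i)
   {
      const char c = line[i];
      if (inQuotes)
      {
         if (c != '"')
         {
            field += c;
         }
         else if (i + 1 < line.size() && line[i + 1] == '"')
         {
            field += '"'; // "" becomes one literal quote.
            ++i;          // Skip the second quote of the pair.
         }
         else
         {
            inQuotes = false; // Closing quote.
         }
      }
      else if (c == '"')
      {
         inQuotes = true;
      }
      else if (c == ',')
      {
         fields.push_back(field);
         field.clear();
      }
      else
      {
         field += c;
      }
   }

   fields.push_back(field); // Last field, possibly empty.
   return !inQuotes;
}
```

Called on the line a,"b,c",d this yields the three fields a, b,c and d, and an empty line yields one empty field, which matches what ProcessLine does.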

0
 

Author Comment

by:trouvain
ID: 1168917
Hi Stefan,

Thank you very much for your effort. I will try it out immediately.
0
 

Author Comment

by:trouvain
ID: 1168919
Thank you for your work :-)
0
