Solved

Reading from a csv file

Posted on 1998-07-29
Last Modified: 2010-04-02
Hi,
I'm currently faced with the problem of reading data from an existing CSV file into a memory structure. Has anyone already developed routines that can read a CSV file? They also have to deal with quoting, such as ...,...,"...,..." and "" for a literal double quote mark. I would be very glad if someone could share source code. Any help would be appreciated.
Many thanks in advance
Question by:trouvain
10 Comments
 
LVL 2

Expert Comment

by:rayb
ID: 1168910
Look into creating a text file based ODBC datasource.  This will help you a great deal.  It's very flexible, powerful and it comes in a can!  It will save you much work and headaches.


 

Author Comment

by:trouvain
ID: 1168911
Thank you for your answer, but unfortunately I didn't understand where to look for information on creating text-file-based ODBC data sources. Is there source code to be found? Please give me a more specific clue.
Thanks in advance
 
LVL 4

Expert Comment

by:erajoj
ID: 1168912
What do (or would) your memory structures look like?
Exactly what kind of functionality are you looking for?
/// John
 

Author Comment

by:trouvain
ID: 1168913
The given CSV file is a data matrix with a column header and a row header; the cells of the matrix hold the values. I want to read the data into an array of arrays of a value type (e.g. float). From the column header I can determine the number of columns. As stated in my question, what I am looking for is a link to, or the source code of, a routine that reads data from a CSV file. The routine should be able to deal with quoted strings ("...,...") and escaped quotation marks (...""...). I hope this is enough to answer your question.
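To make the target structure concrete, here is a minimal sketch of how such a matrix might be held in memory. It assumes the first line holds the column names, the first field of every later line is the row name, and every other cell parses as a number. The names `DataMatrix` and `AddRow` are made up for illustration, and the fields are assumed to have been split already.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical in-memory layout for a CSV matrix with row and column headers.
struct DataMatrix
{
   std::vector<std::string> columnNames;     // From the first line (without the corner cell).
   std::vector<std::string> rowNames;        // First field of each data line.
   std::vector<std::vector<float> > values;  // values[row][column]
};

// Appends one already-split data line. Returns false (and leaves the matrix
// unchanged) if the cell count does not match the header or a cell is not a number.
bool AddRow(DataMatrix& m, const std::vector<std::string>& fields)
{
   if (fields.size() != m.columnNames.size() + 1)
      return false;

   std::vector<float> row;
   for (std::size_t i = 1; i < fields.size(); i++)
   {
      char* pEnd = 0;
      double d = std::strtod(fields[i].c_str(), &pEnd);
      if (pEnd == fields[i].c_str() || *pEnd != 0)
         return false; // Empty cell or trailing garbage.
      row.push_back(static_cast<float>(d));
   }

   m.rowNames.push_back(fields[0]);
   m.values.push_back(row);
   return true;
}
```

Whether an empty cell should be rejected or stored as some default value depends on the data; the sketch simply rejects it.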
 
LVL 1

Expert Comment

by:slinky
ID: 1168914
Use strtok with a comma as the delimiter.
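A rough sketch of that approach follows; the helper name `SplitWithStrtok` is only for illustration. Two properties of strtok matter for this question: it treats a run of adjacent delimiters as one, so empty columns disappear, and it knows nothing about quotes, so a comma inside "...,..." still splits the column.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Splits a line on commas with strtok. Hypothetical helper for illustration.
std::vector<std::string> SplitWithStrtok(const std::string& line)
{
   // strtok modifies its input, so work on a writable, 0-terminated copy.
   std::vector<char> buffer(line.begin(), line.end());
   buffer.push_back('\0');

   std::vector<std::string> fields;
   for (char* p = std::strtok(&buffer[0], ","); p != 0; p = std::strtok(0, ","))
      fields.push_back(p);
   return fields;
}
```

With this helper, `a,b,c` gives three columns as expected, but `a,,c` gives only two, and `x,"y,z"` gives three (`x`, `"y` and `z"`). So it is only suitable for files without empty or quoted columns.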
 
LVL 3

Accepted Solution

by:
stefanr earned 200 total points
ID: 1168915
Try something like this:

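#include <iostream>
#include <fstream>
#include <stdlib.h>
#include <conio.h> // Windows-only; needed for _getch().
#include <string.h>
#include <assert.h>

using namespace std; // The pre-standard <iostream.h> and <fstream.h> declared these names globally.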

bool ProcessLine(char**& ppColumn, int& nColumnCount, const char* pszLine)
{
   bool bProcessingColumn = false; // true from the start to the next ',' or end of line.
   bool bQuote = false; // true if processing a quoted column.
   bool bSkipToNextColumn = false; // true to skip to the next ',' or end of line during processing.

   char szColumn[1024] = { 0 };
   unsigned nIndex = 0;

   for (unsigned i = 0; i <= ::strlen(pszLine); i++)
   {
      if (bProcessingColumn)
      {
         if (bQuote)
         {
            if ('"' == pszLine[i])
            {
               bQuote = false; // Quoted column string terminated.
               bSkipToNextColumn = true;
            }
            else
            {
               szColumn[nIndex] = pszLine[i];
               nIndex++;
            }
         }
         else if (',' == pszLine[i] || 0 == pszLine[i])
         {
            bProcessingColumn = false;
            bSkipToNextColumn = false;

            szColumn[nIndex] = 0; // Terminate temporary column string.

            nColumnCount++;
            ppColumn = (char**) ::realloc(ppColumn, nColumnCount * sizeof(char*)); // Adjust size of array of pointers.
            ppColumn[nColumnCount-1] = ::strdup(szColumn); // Duplicate new string.
         }
         else if (!bSkipToNextColumn)
         {
            szColumn[nIndex] = pszLine[i];
            nIndex++;
         }
      }
      else
      {
         nIndex = 0;

         if (',' == pszLine[i] || 0 == pszLine[i])
         {
            // Column contains empty string.

            szColumn[nIndex] = 0; // Terminate temporary column string.

            nColumnCount++;
            ppColumn = (char**) ::realloc(ppColumn, nColumnCount * sizeof(char*)); // Adjust size of array of pointers.
            ppColumn[nColumnCount-1] = ::strdup(szColumn); // Duplicate new string.
         }
         else if ('"' == pszLine[i])
         {
            bProcessingColumn = true;
            bQuote = true;
         }
         else
         {
            bProcessingColumn = true;

            szColumn[nIndex] = pszLine[i];
            nIndex++;
         }
      }
   }

   return true;
}

int main()
{
   try
   {
      fstream fs("Test.CSV", ios::in); // ios::nocreate is pre-standard; opening with ios::in alone never creates the file.

      int nLineCount = 0;
      char szLine[1024] = { 0 };
      char** ppHeader = NULL; // Pointer to array of strings that contains the names of the columns.
      char*** pppRecord = NULL; // Pointer to arrays of record strings.
      int nColumnCount = 0; // When reading the first line that is supposed to contain the names of the columns the number of columns is determined.
      int nRecordCount = 0; // Count of records read.

      while (fs.getline(szLine, sizeof(szLine)))
      {
         if (0 == nLineCount)
         {
            ProcessLine(ppHeader, nColumnCount, szLine);
         }
         else
         {
            char** ppRecord = NULL;
            int nTemp = 0;
            ProcessLine(ppRecord, nTemp, szLine);
            assert(nTemp == nColumnCount);
            nRecordCount++;
            pppRecord = (char***) ::realloc(pppRecord, nRecordCount * sizeof(char**));
            pppRecord[nRecordCount-1] = ppRecord;
         }

         nLineCount++;
      }

      fs.close();

      for (int i = 0; i < nColumnCount; i++)
      {
         cout << ppHeader[i] << ";"; // Print column names.
         ::free(ppHeader[i]);
      }
      cout << endl;
      ::free(ppHeader);

      for (int i = 0; i < nRecordCount; i++) // The 'i' of the previous loop is out of scope in standard C++.
      {
         for (int j = 0; j < nColumnCount; j++)
         {
            cout << pppRecord[i][j] << ";"; // Print content of each column.
            ::free(pppRecord[i][j]);
         }
         cout << endl; // Prepare to print the next records column content.
         ::free(pppRecord[i]);
      }
      ::free(pppRecord);
   }
   catch (...)
   {
      return EXIT_FAILURE;
   }

   cout << "Press any key to exit..." << endl;
   ::_getch();
   return EXIT_SUCCESS;
}

 
LVL 3

Expert Comment

by:stefanr
ID: 1168916
To handle escaped quotes ("" inside a quoted column), replace the ProcessLine above with:

bool ProcessLine(char**& ppColumn, int& nColumnCount, const char* pszLine)
{
   bool bProcessingColumn = false; // true from the start to the next ',' or end of line.
   bool bQuote = false; // true if processing a quoted column.
   bool bSkipToNextColumn = false; // true to skip to the next ',' or end of line during processing.
   bool bEscapedQuote = false; // true after the first '"' of a "" pair inside a quoted column.

   char szColumn[1024] = { 0 };
   unsigned nIndex = 0;

   for (unsigned i = 0; i <= ::strlen(pszLine); i++)
   {
      if (bProcessingColumn)
      {
         if (bQuote)
         {
            if ('"' == pszLine[i])
            {
               if (bEscapedQuote)
               {
                  bEscapedQuote = false;
                  szColumn[nIndex] = pszLine[i];
                  nIndex++;
               }
               else if (i < ::strlen(pszLine) && '"' == pszLine[i+1])
               {
                  bEscapedQuote = true;
               }
               else
               {
                  bQuote = false; // Quoted column string terminated.
                  bSkipToNextColumn = true;
               }
            }
            else
            {
               szColumn[nIndex] = pszLine[i];
               nIndex++;
            }
         }
         else if (',' == pszLine[i] || 0 == pszLine[i])
         {
            bProcessingColumn = false;
            bSkipToNextColumn = false;

            szColumn[nIndex] = 0; // Terminate temporary column string.

            nColumnCount++;
            ppColumn = (char**) ::realloc(ppColumn, nColumnCount * sizeof(char*)); // Adjust size of array of pointers.
            ppColumn[nColumnCount-1] = ::strdup(szColumn); // Duplicate new string.
         }
         else if (!bSkipToNextColumn)
         {
            szColumn[nIndex] = pszLine[i];
            nIndex++;
         }
      }
      else
      {
         nIndex = 0;

         if (',' == pszLine[i] || 0 == pszLine[i])
         {
            // Column contains empty string.

            szColumn[nIndex] = 0; // Terminate temporary column string.

            nColumnCount++;
            ppColumn = (char**) ::realloc(ppColumn, nColumnCount * sizeof(char*)); // Adjust size of array of pointers.
            ppColumn[nColumnCount-1] = ::strdup(szColumn); // Duplicate new string.
         }
         else if ('"' == pszLine[i])
         {
            bProcessingColumn = true;
            bQuote = true;
         }
         else
         {
            bProcessingColumn = true;

            szColumn[nIndex] = pszLine[i];
            nIndex++;
         }
      }
   }

   return true;
}
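For comparison, the same idea can be sketched with std::string and std::vector, which removes the fixed 1024-byte buffer and the manual realloc/strdup/free bookkeeping. This is only a sketch, not a drop-in replacement: the name `ParseCsvLine` is made up, and unlike ProcessLine above it keeps any text that follows a closing quote instead of skipping it.

```cpp
#include <string>
#include <vector>

// Splits one CSV line into columns. Hypothetical std::string variant of ProcessLine.
std::vector<std::string> ParseCsvLine(const std::string& line)
{
   std::vector<std::string> fields;
   std::string field;
   bool inQuotes = false; // true while inside a quoted column.

   for (std::string::size_type i = 0; i < line.size(); i++)
   {
      char c = line[i];
      if (inQuotes)
      {
         if (c != '"')
            field += c;
         else if (i + 1 < line.size() && line[i + 1] == '"')
         {
            field += '"'; // "" inside quotes is one literal quote.
            i++;          // Skip the second quote of the pair.
         }
         else
            inQuotes = false; // Closing quote.
      }
      else if (c == '"' && field.empty())
         inQuotes = true; // Opening quote at the start of a column.
      else if (c == ',')
      {
         fields.push_back(field);
         field.clear();
      }
      else
         field += c;
   }

   fields.push_back(field); // Last column; also covers a trailing comma or an empty line.
   return fields;
}
```

For the line `a,"b,c","say ""hi""",` this returns four columns: `a`, `b,c`, `say "hi"` and an empty string for the trailing comma.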

 

Author Comment

by:trouvain
ID: 1168917
Hi Stefan,

thank you very much for your effort. I will try it out immediately.
0
 

Author Comment

by:trouvain
ID: 1168919
Thank you for your work :-)
