Solved

Reading from a csv file

Posted on 1998-07-29
Last Modified: 2010-04-02
Hi,
I'm currently faced with the problem of reading data from an existing CSV file into a memory structure. Has anyone already developed routines that can read from a CSV file? They also have to deal with quoting such as ...,...,"...,..." and "" for a literal double quote mark. I would be very happy if someone could share the source code. Any help would be appreciated.
Many thanks in advance
Question by:trouvain
10 Comments
 
LVL 2

Expert Comment

by:rayb
ID: 1168910
Look into creating a text file based ODBC datasource.  This will help you a great deal.  It's very flexible, powerful and it comes in a can!  It will save you much work and headaches.


0
 

Author Comment

by:trouvain
ID: 1168911
Thank you for your answer, but unfortunately I didn't understand where to look for information on creating text-file-based ODBC data sources. Is there source code available somewhere? Please give me a more specific clue.
Thanks in advance
0
 
LVL 4

Expert Comment

by:erajoj
ID: 1168912
What do/would your memory structures look like?
Exactly what kind of functionality are you looking for?
/// John
0

 

Author Comment

by:trouvain
ID: 1168913
The given CSV file is a data matrix with a column header and a row header; the cells of the matrix hold the values. I want to read the data into an array of arrays of a value type (e.g. float). From the column header I can determine the number of columns. As stated in my question, what I am looking for is a link to, or the source code of, a routine that reads data from a CSV file. The routine should be able to deal with quoted strings ("...,...") and quoted quotation marks (...""...). I hope this is enough to answer your question.
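For illustration, the target layout described here could be sketched as follows. This is only a sketch under assumptions: the names Matrix and BuildMatrix are hypothetical, the lines are assumed to be already split into string cells, the first cell of the header line is assumed to be an unused corner cell, and missing or non-numeric cells simply become 0.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical container for the matrix described in the comment.
struct Matrix
{
    std::vector<std::string> columnNames;    // taken from the first line
    std::vector<std::string> rowNames;       // first cell of every later line
    std::vector<std::vector<float>> values;  // values[row][column]
};

// rows[0] is the column header line (its first cell is assumed to be a corner
// placeholder); each later row starts with its row name.
Matrix BuildMatrix(const std::vector<std::vector<std::string>>& rows)
{
    Matrix m;
    if (rows.empty() || rows[0].empty())
        return m;

    const std::vector<std::string>& header = rows[0];
    m.columnNames.assign(header.begin() + 1, header.end());

    for (std::size_t r = 1; r < rows.size(); ++r)
    {
        const std::vector<std::string>& row = rows[r];
        if (row.empty())
            continue;

        m.rowNames.push_back(row[0]);

        // Missing cells keep the default value 0; strtof yields 0 for non-numeric text.
        std::vector<float> values(m.columnNames.size(), 0.0f);
        for (std::size_t c = 0; c < values.size() && c + 1 < row.size(); ++c)
            values[c] = std::strtof(row[c + 1].c_str(), nullptr);

        m.values.push_back(values);
    }
    return m;
}
```

The number of columns comes from the header line, as described above; the splitting of each line into cells is left to whatever CSV routine is used.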
0
 
LVL 1

Expert Comment

by:slinky
ID: 1168914
Use strtok with a comma as the delimiter.
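A minimal sketch of what that looks like, and of its limits for this particular question: strtok knows nothing about quotes, and it treats consecutive delimiters as one, so empty fields disappear. The helper name SplitWithStrtok is hypothetical.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Splits a copy of `line` with strtok, using ',' as the only delimiter.
std::vector<std::string> SplitWithStrtok(const std::string& line)
{
    // strtok modifies its input, so work on a writable, null-terminated copy.
    std::vector<char> buffer(line.begin(), line.end());
    buffer.push_back('\0');

    std::vector<std::string> fields;
    for (char* token = std::strtok(buffer.data(), ",");
         token != nullptr;
         token = std::strtok(nullptr, ","))
    {
        fields.push_back(token);
    }
    return fields;
}
```

For the input a,,"b,c" this yields three tokens: a, "b and c". The empty second field is dropped and the quoted field is cut in two, so the quoting required in the question needs a character-by-character scan instead.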
0
 
LVL 3

Accepted Solution

by:
stefanr earned 200 total points
ID: 1168915
Try something like this:

#include <iostream>
#include <fstream>
#include <stdlib.h>
#include <string.h>
#include <assert.h>

using namespace std;

// Splits one CSV line into columns and appends them to ppColumn.
// The array and its strings are allocated with realloc/strdup; the caller frees them.
bool ProcessLine(char**& ppColumn, int& nColumnCount, const char* pszLine)
{
   bool bProcessingColumn = false; // true from the start to the next ',' or end of line.
   bool bQuote = false; // true if processing a quoted column.
   bool bSkipToNextColumn = false; // true to skip to the next ',' or end of line during processing.

   char szColumn[1024] = { 0 };
   unsigned nIndex = 0;

   // The loop deliberately includes the terminating 0 so that the last column is stored too.
   const unsigned nLength = (unsigned) ::strlen(pszLine);

   for (unsigned i = 0; i <= nLength; i++)
   {
      if (bProcessingColumn)
      {
         if (bQuote)
         {
            if ('"' == pszLine[i])
            {
               bQuote = false; // Quoted column string terminated.
               bSkipToNextColumn = true;
            }
            else
            {
               szColumn[nIndex] = pszLine[i];
               nIndex++;
            }
         }
         else if (',' == pszLine[i] || 0 == pszLine[i])
         {
            bProcessingColumn = false;
            bSkipToNextColumn = false;

            szColumn[nIndex] = 0; // Terminate temporary column string.

            nColumnCount++;
            ppColumn = (char**) ::realloc(ppColumn, nColumnCount * sizeof(char*)); // Adjust size of array of pointers.
            ppColumn[nColumnCount-1] = ::strdup(szColumn); // Duplicate new string.
         }
         else if (!bSkipToNextColumn)
         {
            szColumn[nIndex] = pszLine[i];
            nIndex++;
         }
      }
      else
      {
         nIndex = 0;

         if (',' == pszLine[i] || 0 == pszLine[i])
         {
            // Column contains empty string.

            szColumn[nIndex] = 0; // Terminate temporary column string.

            nColumnCount++;
            ppColumn = (char**) ::realloc(ppColumn, nColumnCount * sizeof(char*)); // Adjust size of array of pointers.
            ppColumn[nColumnCount-1] = ::strdup(szColumn); // Duplicate new string.
         }
         else if ('"' == pszLine[i])
         {
            bProcessingColumn = true;
            bQuote = true;
         }
         else
         {
            bProcessingColumn = true;

            szColumn[nIndex] = pszLine[i];
            nIndex++;
         }
      }
   }

   return true;
}

int main()
{
   try
   {
      ifstream fs("Test.CSV"); // Open for reading; this fails if the file does not exist.
      if (!fs)
      {
         cerr << "Cannot open Test.CSV" << endl;
         return EXIT_FAILURE;
      }

      int nLineCount = 0;
      char szLine[1024] = { 0 };
      char** ppHeader = NULL; // Pointer to array of strings that contains the names of the columns.
      char*** pppRecord = NULL; // Pointer to arrays of record strings.
      int nColumnCount = 0; // When reading the first line that is supposed to contain the names of the columns the number of columns is determined.
      int nRecordCount = 0; // Count of records read.

      while (fs.getline(szLine, sizeof(szLine)))
      {
         if (0 == nLineCount)
         {
            ProcessLine(ppHeader, nColumnCount, szLine);
         }
         else
         {
            char** ppRecord = NULL;
            int nTemp = 0;
            ProcessLine(ppRecord, nTemp, szLine);
            assert(nTemp == nColumnCount); // Every record must have as many columns as the header.
            nRecordCount++;
            pppRecord = (char***) ::realloc(pppRecord, nRecordCount * sizeof(char**));
            pppRecord[nRecordCount-1] = ppRecord;
         }

         nLineCount++;
      }

      fs.close();

      for (int i = 0; i < nColumnCount; i++)
      {
         cout << ppHeader[i] << ";"; // Print column names.
         ::free(ppHeader[i]);
      }
      cout << endl;
      ::free(ppHeader);

      for (int r = 0; r < nRecordCount; r++)
      {
         for (int j = 0; j < nColumnCount; j++)
         {
            cout << pppRecord[r][j] << ";"; // Print content of each column.
            ::free(pppRecord[r][j]);
         }
         cout << endl; // Prepare to print the next record's column content.
         ::free(pppRecord[r]);
      }
      ::free(pppRecord);
   }
   catch (...)
   {
      return EXIT_FAILURE;
   }

   cout << "Press Enter to exit..." << endl;
   cin.get();
   return EXIT_SUCCESS;
}

0
 
LVL 3

Expert Comment

by:stefanr
ID: 1168916
To support escaped quotes (a "" inside a quoted column standing for a single "), replace the ProcessLine above with:

bool ProcessLine(char**& ppColumn, int& nColumnCount, const char* pszLine)
{
   bool bProcessingColumn = false; // true from the start to the next ',' or end of line.
   bool bQuote = false; // true if processing a quoted column.
   bool bSkipToNextColumn = false; // true to skip to the next ',' or end of line during processing.
   bool bEscapedQuote = false;

   char szColumn[1024] = { 0 };
   unsigned nIndex = 0;

   for (unsigned i = 0; i <= ::strlen(pszLine); i++)
   {
      if (bProcessingColumn)
      {
         if (bQuote)
         {
            if ('"' == pszLine[i])
            {
               if (bEscapedQuote)
               {
                  bEscapedQuote = false;
                  szColumn[nIndex] = pszLine[i];
                  nIndex++;
               }
               else if (i < ::strlen(pszLine) && '"' == pszLine[i+1])
               {
                  bEscapedQuote = true;
               }
               else
               {
                  bQuote = false; // Quoted column string terminated.
                  bSkipToNextColumn = true;
               }
            }
            else
            {
               szColumn[nIndex] = pszLine[i];
               nIndex++;
            }
         }
         else if (',' == pszLine[i] || 0 == pszLine[i])
         {
            bProcessingColumn = false;
            bSkipToNextColumn = false;

            szColumn[nIndex] = 0; // Terminate temporary column string.

            nColumnCount++;
            ppColumn = (char**) ::realloc(ppColumn, nColumnCount * sizeof(char*)); // Adjust size of array of pointers.
            ppColumn[nColumnCount-1] = ::strdup(szColumn); // Duplicate new string.
         }
         else if (!bSkipToNextColumn)
         {
            szColumn[nIndex] = pszLine[i];
            nIndex++;
         }
      }
      else
      {
         nIndex = 0;

         if (',' == pszLine[i] || 0 == pszLine[i])
         {
            // Column contains empty string.

            szColumn[nIndex] = 0; // Terminate temporary column string.

            nColumnCount++;
            ppColumn = (char**) ::realloc(ppColumn, nColumnCount * sizeof(char*)); // Adjust size of array of pointers.
            ppColumn[nColumnCount-1] = ::strdup(szColumn); // Duplicate new string.
         }
         else if ('"' == pszLine[i])
         {
            bProcessingColumn = true;
            bQuote = true;
         }
         else
         {
            bProcessingColumn = true;

            szColumn[nIndex] = pszLine[i];
            nIndex++;
         }
      }
   }

   return true;
}
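For readers who prefer standard containers, the same scan can be sketched with std::string and std::vector. This is a compact variant, not the routine above: the name SplitCsvLine is hypothetical, characters that follow a closing quote are kept rather than skipped, and a quote inside an unquoted column starts a quoted section instead of being taken literally.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Splits one CSV line into fields. Commas inside quotes are kept, and a
// doubled quote ("") inside a quoted section yields a single quote character.
std::vector<std::string> SplitCsvLine(const std::string& line)
{
    std::vector<std::string> fields;
    std::string field;
    bool inQuotes = false;

    for (std::size_t i = 0; i < line.size(); ++i)
    {
        const char ch = line[i];
        if (inQuotes)
        {
            if (ch != '"')
            {
                field += ch;
            }
            else if (i + 1 < line.size() && line[i + 1] == '"')
            {
                field += '"'; // Escaped quote: emit one and skip its partner.
                ++i;
            }
            else
            {
                inQuotes = false; // Closing quote.
            }
        }
        else if (ch == '"')
        {
            inQuotes = true; // Opening quote.
        }
        else if (ch == ',')
        {
            fields.push_back(field); // End of field.
            field.clear();
        }
        else
        {
            field += ch;
        }
    }

    fields.push_back(field); // The last field has no trailing comma.
    return fields;
}
```

No manual realloc, strdup or free is needed here, and there is no fixed 1024-character limit per column; an empty line still produces one empty field, as in the routine above.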

0
 

Author Comment

by:trouvain
ID: 1168917
Hi Stefan,

Thank you very much for your effort. I will try it out immediately.
0
 
 

Author Comment

by:trouvain
ID: 1168919
Thank you for your work :-)
0

Question has a verified solution.
