Solved

File I/O, and some pointer or stack problem

Posted on 2003-02-26
Last Modified: 2010-04-01
I have two questions here.

1. For my work, I need to collect all the keywords that are scattered across many files. Basically, I have to read all these files and extract the useful keywords. I know how to do this for one source file. Can anybody tell me how to handle multiple files in this situation?

2. I have a file that contains all the keywords. I need a program to make the keywords unique and sorted alphabetically.

I think I have to read these keywords one by one from the file. Then, each time, I compare the new keyword first with the smallest one and then with those that have already been sorted.

My questions are: How shall I deal with the sorted results? How can I retrieve them when I need them? How shall I save them? Should I use pointers or a stack?

Thank you very much!
Question by:anxx0018
13 Comments
 
LVL 2

Expert Comment

by:bkrahmer
ID: 8031479
My realistic answer for number one is to use 'cat' to get all your data flowing into the stdin of the program.  If that doesn't trip your trigger, you'd have to give more details about your specifications, like what OS.
For two, I think your first assumption is a good one.  The second point can be made extremely easy by using an STL map to store your words after you've parsed them.
brian
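
For readers who want to see the shape of this suggestion, here is a minimal sketch of a program body that reads keywords from an input stream and keeps them unique and sorted. It swaps in std::set for the map mentioned above, since only the keys are needed here. The function names collectKeywords and printKeywords are made up for illustration, and skipping "END" lines is an assumption based on the marker that appears later in this thread.

```cpp
#include <iostream>
#include <set>
#include <sstream>
#include <string>

// Reads keywords line by line from any input stream (std::cin when fed by cat).
// Only the first whitespace-delimited token of each line is kept, so trailing
// padding such as "SIMPLE  " is dropped. Lines whose token is "END" are skipped
// (an assumption carried over from the marker used in this thread).
std::set<std::string> collectKeywords(std::istream& in)
{
    std::set<std::string> keywords;   // unique, ordered by std::string's operator<
    std::string line;
    while (std::getline(in, line))
    {
        std::istringstream fields(line);
        std::string word;
        if (fields >> word && word != "END")
        {
            keywords.insert(word);
        }
    }
    return keywords;
}

// Writes one keyword per line, in sorted order.
void printKeywords(const std::set<std::string>& keywords, std::ostream& out)
{
    for (const std::string& word : keywords)
    {
        out << word << '\n';
    }
}
```

Called from main as printKeywords(collectKeywords(std::cin), std::cout), a program built from this could be fed with something like `cat file1.txt file2.txt | ./keywords`, where the file and program names are placeholders.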

Expert Comment

by:phildsp
ID: 8031795
The cat command would only be available on UNIX and might be slow compared to c++ code.  But it would be a quick way to implement the search.  Otherwise why wouldn't you just cycle through the files, closing and deleting the I/O object for the last file processed and creating a new I/O object for the new file?

If you have a lot of keywords or will be doing a lot of access on them you might consider using a hash table.  That's an extremely efficient and flexible means of storing and looking up keys, although a hash table does not keep them in sorted order, so you would still have to sort before printing.  It's a common enough data structure that it should be available in development libraries such as Rogue Wave, for example.
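
As a rough sketch of the "cycle through the files" idea, the function below opens a fresh ifstream for each file name and lets it close itself at the end of each loop pass. The name collectFromFiles is hypothetical, and the keywords go into a std::set (an assumption of this sketch, not something phildsp specified) so that they stay unique and sorted.

```cpp
#include <cstddef>
#include <fstream>
#include <iostream>
#include <set>
#include <string>
#include <vector>

// Reads whitespace-separated keywords from every file in `filenames` into
// `keywords`. Returns how many files could be opened.
std::size_t collectFromFiles(const std::vector<std::string>& filenames,
                             std::set<std::string>& keywords)
{
    std::size_t opened = 0;
    for (const std::string& name : filenames)
    {
        // A new stream object per file; it is closed automatically when it
        // goes out of scope at the end of this loop pass.
        std::ifstream input(name.c_str());
        if (!input)
        {
            std::cerr << "Could not open " << name << '\n';
            continue;
        }
        ++opened;

        std::string word;
        while (input >> word)
        {
            keywords.insert(word);
        }
    }
    return opened;
}
```

In main, the file list could come straight from the command line, for example std::vector<std::string> files(argv + 1, argv + argc).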
LVL 2

Expert Comment

by:bkrahmer
ID: 8032040
I guarantee that the unix method I mentioned would be faster.  Otherwise, yeah.
Exactly which STL container has the hashtable?  An important lesson to be learned from the principles of eXtreme Programming is to give the most simple solution for a given problem.  Is his program realistically going to take even a second to run?  I doubt it.  I didn't hear execution speed mentioned as the primary requirement.  Therefore, why not let STL do all the work for you and move on to solving interesting problems instead of reinventing the wheel each time?
brian
LVL 2

Expert Comment

by:bkrahmer
ID: 8032068
I might also point out that using a map solves four of the points of the problem very well.  In addition, because of its associative property, one could extend it with a couple more lines of code to show how many times a keyword was found, or to print the keywords sorted by how many times each one was present.
brian
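
The counting extension described above might look roughly like the sketch below. The names KeywordCounts, countKeywords, moreFrequent and byFrequency are invented for illustration. Because a map is always ordered by its key, printing by count means copying the entries into a vector and sorting that copy.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <utility>
#include <vector>

typedef std::map<std::string, int> KeywordCounts;
typedef std::pair<std::string, int> KeywordCount;

// Counts how many times each keyword occurs.
KeywordCounts countKeywords(const std::vector<std::string>& words)
{
    KeywordCounts counts;
    for (const std::string& word : words)
    {
        ++counts[word];   // operator[] creates a missing entry with count 0
    }
    return counts;
}

// Comparison used for sorting: higher counts come first.
static bool moreFrequent(const KeywordCount& a, const KeywordCount& b)
{
    return a.second > b.second;
}

// Returns the entries ordered by descending count. stable_sort keeps the
// map's alphabetical order among keywords with equal counts.
std::vector<KeywordCount> byFrequency(const KeywordCounts& counts)
{
    std::vector<KeywordCount> ordered(counts.begin(), counts.end());
    std::stable_sort(ordered.begin(), ordered.end(), moreFrequent);
    return ordered;
}
```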

Author Comment

by:anxx0018
ID: 8034800
Yes, I am using a Red Hat Linux system, so I think Brian's method will work.

Could you please give me some simple examples for the methods you mentioned above? They all sound strange to me.

Thanks a lot!

Author Comment

by:anxx0018
ID: 8037750
Brian, can you help me check the following program? I really do not understand why it does not go into the loop.

Thank you very much!
#include <stdio.h>
#include <stdlib.h>
#include <fstream>

#include <iostream>
#include <string>
#include <map>

using namespace std;

//Min is to return the smaller value of m1 and m2.
int Min(int m1, int m2)
{
  if (m1<m2)
    return m1;
  else
    return m2;
}

//WordCmp is used to compare two strings, if k1 is alphabetically smaller than k2, return 0; if k1 is bigger than k2, return 1; if k1 and k2 are equal, return 2.

int WordCmp(string k1, string k2)
{
  int len1,len2;

  char *s1 = new char[strlen(k1.c_str()+1)];
  strcpy(s1,k1.c_str());

  char *s2 = new char[strlen(k2.c_str()+1)];
  strcpy(s2,k2.c_str());
 
  int n = Min(strlen(k1.c_str()), strlen(k2.c_str()));
 
   for (int i= 0; i<n; i++)
    {
      if (s1[i]<s2[i])
       {
       return 0;
       break;
       }
      else if (s1[i]>s2[i])
     {      
       return 1;
       break;
     }
    }
   return 2;
}


//This function is to sort names alphabetically. First, delete "END" and "̼ @P" from inflowing stream. Then if read-in is smaller than the first name of the array, put it at the first place and put others larger. If read-in is larger than the biggest one, put it at the end of the array. Otherwise, check where is a good place for the new read-in, and then put it there.

int main()
{
  ifstream source("/tmp_mnt/home/anqian/keywords.txt");
  map<int, string> key_array;

  string str_Line;

  if(!source)
    {
      cerr<<"Error opening the File!"<<endl;
      return 1;
    }

  else
    {
      getline(source,str_Line);
      key_array[0]=str_Line;
      getline(source,str_Line);

      while(!source.eof() && str_Line.compare("END")!=0 && str_Line.compare("̼ @P")!=0)
     {
       if (WordCmp(str_Line,key_array[0])==0)
           {
             cout<<key_array.size()<<endl;
             for (int i = key_array.size(); i<1; i--)
               {
                 cout<<"I am in for loop!"<<endl;
                 key_array[i]=key_array[i-1];
               }
             key_array[0]=str_Line;
             getline(source,str_Line);
           }

            else if(WordCmp(str_Line,key_array[key_array.size()-1])==1)
           {
             key_array[key_array.size()]=str_Line;
             getline(source,str_Line);
           }
           
            else
           {
             for (int i=key_array.size(); i=2; i--)
               {
                 if (WordCmp(key_array[i-1],str_Line)==1 && WordCmp(key_array[i-2],str_Line)==0)
                {
                  cout<<"Yes! I am here"<<endl;
                  for(int j=key_array.size(); j<=i; j--)
                    {
                      key_array[j]=key_array[j-1];
                    }
                  key_array[i-1]=str_Line;
                  break;
                }
               }
           
           getline(source,str_Line);
           }
     }
      return 0;
    }

  for(int k=0; k<key_array.size();k++)
    {
      cout<<key_array[k]<<endl;
    }
}


Source File:
SIMPLE  
BITPIX  
NAXIS  
EXTEND  
NEXTEND
DATE    
FILENAME
FILETYPE
TELESCOP
INSTRUME
EQUINOX
ROOTNAME
PRIMESI
TARGNAME
RA_TARG
DEC_TARG
PROPOSID
LINENUM
PR_INV_L
PR_INV_F
PR_INV_M
TDATEOBS
TTIMEOBS
TEXPSTRT
TEXPEND
TEXPTIME
POSTARG1
POSTARG2
OVERFLOW
CAL_VER
PROCTIME
CFSTATUS
OBSTYPE
OBSMODE
PHOTMOD


Thank you so much!


LVL 2

Expert Comment

by:bkrahmer
ID: 8039201
How's this??  :)  If you are serious about coding, please take note of the OO approach, readable code, well-named variables, simple algorithms, nice use of typedefs, and making the language and the available libraries work for you as much as possible.  There's further polishing that could be done...but I digress.  Also, if this is for use in a class, please do not cut and paste.  Even retyping someone else's code can help you learn it.
brian


#include <stdio.h>
#include <stdlib.h>
#include <fstream>
#include <iostream>
#include <string>
#include <map>

using namespace std;

typedef map<string, string> StringStringMap;
typedef StringStringMap::iterator StringStringMapIter;

class StringSorter
{
public:
   StringSorter(const string &filename) : m_InputFilename(filename)
   {      
   }

   bool ProcessFile()
   {
      bool retval = true;
      string inputBuffer;

      m_InputFile.open(m_InputFilename.c_str());
      if (m_InputFile.good())
      {
         // Loop on getline itself, so a failed read at end of file is
         // never passed on as an (empty) record.
         while (getline(m_InputFile, inputBuffer))
         {
            if (inputBuffer == "END" || inputBuffer == "L< @P")
            {
               break;
            }

            ProcessRecord(inputBuffer);
         }
         m_InputFile.close();
      }
      else
      {
         retval = false;
      }
      return retval;
   }

   void DumpSortedKeywords()
   {
      StringStringMapIter iter;
      for(iter = m_KeywordMap.begin(); iter != m_KeywordMap.end(); iter++)
      {
         cout << (*iter).first << endl;
      }
   }

private:
   void ProcessRecord(const string &data)
   {
      m_KeywordMap[data] = data;
   }

   string m_InputFilename;
   ifstream m_InputFile;
   StringStringMap m_KeywordMap;
};

int main()
{
   StringSorter sorter("D:\\Brian\\test_code\\stringsort\\keywords.txt");
   if (sorter.ProcessFile())
   {
      sorter.DumpSortedKeywords();
   }
   return 0;
}


Author Comment

by:anxx0018
ID: 8044261
Brian,

Yes, you did provide an extremely efficient program. I am serious about learning to code.

Can you explain more about how ProcessRecord works, and about typedef?

Can you explain what the OO approach and readable code are? Thank you very much!

Or may I know your email address? Thank you again!

Author Comment

by:anxx0018
ID: 8044266
Again, why does mine not work?
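
One likely reason, judging only from the program posted earlier: the loop conditions. A loop that starts at key_array.size() and counts down with the test i<1 fails that test on the very first check, so its body never runs. The test i=2 is an assignment rather than a comparison, so it is always true. The return 0 inside the else branch also leaves main before the final printing loop is reached. A minimal sketch of the same insert-into-sorted-order idea, using a vector and a hypothetical helper named insertSorted, could look like this:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Inserts `word` into an already sorted vector, skipping duplicates.
// Returns true if the word was inserted, false if it was already present.
bool insertSorted(std::vector<std::string>& sorted, const std::string& word)
{
    // Find the first position whose element is not smaller than word.
    std::size_t pos = 0;
    while (pos < sorted.size() && sorted[pos] < word)
    {
        ++pos;
    }
    if (pos < sorted.size() && sorted[pos] == word)
    {
        return false;   // already present
    }

    // Grow by one element, then shift the tail one slot to the right.
    // The loop counts DOWN, so it must continue while i > pos.
    sorted.push_back(std::string());
    for (std::size_t i = sorted.size() - 1; i > pos; --i)
    {
        sorted[i] = sorted[i - 1];
    }
    sorted[pos] = word;
    return true;
}
```

std::string already compares alphabetically with < and ==, so no character-by-character helper is needed for this.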

Author Comment

by:anxx0018
ID: 8044712
Brian,

Is the map a very useful class?

typedef map<string, string> StringStringMap;

what does "(*iter).first" mean?

what does "m_KeywordMap[data] = data" mean?

I didn't find any actual sorting code. The tricky part seems to be m_KeywordMap[data]=data, where I would have used m_KeywordMap[i]=data with an int i. Does m_KeywordMap[data]=data contribute to the sorting?

Where could I find a good C++ book or a website which describes some useful functions and classes?

Thank you a lot!




Author Comment

by:anxx0018
ID: 8044791
Now I think m_KeywordMap[data]=data did the sorting. Because the "[data]" must be unique and sorted. Am I right?

Can you give me a simple example of how to cat a lot of files, since I have to read these keywords from more than one file?

Thank you!
LVL 2

Accepted Solution

by:
bkrahmer earned 80 total points
ID: 8049817
anxx0018, I will try to answer some of your questions.  First, I started to work with your code but ended up scrapping it; I don't remember the exact reasons it didn't work.

Maps are very useful for easily storing data associated with a key.  The important point to understand is that the map keeps its entries sorted by key and stores each key only once, which is why I used the string as the key as well as the value.  The STL string class already defines a less-than comparison, which the map uses by default, so I didn't have to provide one.  If I had wanted my data sorted differently, I would have had to provide a comparison function.

There are many books about the STL; there is also http://www.sgi.com/tech/stl/, where you can read about iterators.  My 'object-oriented' (OO) approach is a book in itself.  Notice how each function does one and only one thing.  That's a part of OO.  I also use consistent variable names, like m_XXX for member variables.  Typedefs are useful for template classes because they cut down on typing, and I think they're easier to read.
brian
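
To illustrate the point about supplying a comparison when a different order is wanted, here is a small sketch of a map whose keys are ordered without regard to upper or lower case. The names CaseInsensitiveLess and CaselessKeywordMap are invented for this example, and case-insensitive ordering is only one possible "different" order.

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <string>

// Comparison object for std::map: orders strings alphabetically while
// ignoring the difference between upper and lower case.
struct CaseInsensitiveLess
{
    static bool charLess(char a, char b)
    {
        return std::tolower(static_cast<unsigned char>(a)) <
               std::tolower(static_cast<unsigned char>(b));
    }

    bool operator()(const std::string& a, const std::string& b) const
    {
        return std::lexicographical_compare(a.begin(), a.end(),
                                            b.begin(), b.end(), charLess);
    }
};

// The third template argument replaces the default ordering (std::less).
typedef std::map<std::string, std::string, CaseInsensitiveLess> CaselessKeywordMap;
```

With this typedef, "simple" and "SIMPLE" count as the same key, so only one entry is kept for them, and iteration still runs from the first key to the last in the chosen order.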

Author Comment

by:anxx0018
ID: 8054531
Thank you very much!!