Solved

File I/O, and a pointer or stack problem

Posted on 2003-02-26
Last Modified: 2010-04-01
I have two questions here.

1. For work, I need to collect all the keywords that are scattered across many files. Basically, I have to read all these files and extract the useful keywords. I know how to do this for one source file. Can anybody tell me how to handle multiple files in this situation?

2. I have a file that contains all the keywords. I need a program to make the keywords unique and sorted alphabetically.

I think I have to read these keywords one by one from the file, and each time compare the new one first with the smallest one, then with those that have already been sorted.

My questions are: How should I store the sorted results? How can I retrieve them when I need them? How should I save them? Should I use a pointer or a stack?

Thank you very much!
0
Comment
Question by:anxx0018
13 Comments
 
LVL 2

Expert Comment

by:bkrahmer
ID: 8031479
My realistic answer for number one is to use 'cat' to get all your data flowing into the stdin of the program.  If that doesn't trip your trigger, you'd have to give more details about your specifications, like what OS.
For two, I think your first assumption is a good one.  The second point can be made extremely easy by using an STL map to store your words after you've parsed them.
brian
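
A minimal sketch of the two suggestions above, assuming the keywords are separated by whitespace. The helper name CollectKeywords is hypothetical and does not come from this thread; it reads from any input stream, so it works the same for a file, for stdin, or for in-memory text.

```cpp
#include <iostream>
#include <istream>
#include <map>
#include <sstream>   // std::istringstream, for trying the helper on in-memory text
#include <string>

// Hypothetical helper (not from the thread): reads whitespace-separated
// words from any input stream and stores each one as a map key.
// std::map keeps its keys unique and in sorted order.
std::map<std::string, std::string> CollectKeywords(std::istream &in)
{
    std::map<std::string, std::string> keywords;
    std::string word;
    while (in >> word)           // the loop ends cleanly when input runs out
    {
        keywords[word] = word;   // a repeated word reuses the existing key
    }
    return keywords;
}
```

Calling CollectKeywords(std::cin) from main and running something like `cat file1.txt file2.txt | ./keywords` (the file and program names are placeholders) would feed every file through the same code path.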
0
 

Expert Comment

by:phildsp
ID: 8031795
The cat command would only be available on UNIX and might be slow compared to c++ code.  But it would be a quick way to implement the search.  Otherwise why wouldn't you just cycle through the files, closing and deleting the I/O object for the last file processed and creating a new I/O object for the new file?

If you have a lot of keywords or will be accessing them a lot, you might consider using a hash table.  That's an extremely efficient and flexible means of storing and looking up keys, although it does not keep them in sorted order, so you would still have to sort the keys once at the end.  It's a common enough object that it should be available in development libraries such as Rogue Wave, for example.
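
Cycling through the files could look roughly like this. It is only a sketch: the helper name CollectFromFiles and the failedCount parameter are made up for illustration, and it assumes a std::set rather than a hash table because the end result has to be sorted.

```cpp
#include <fstream>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: opens each named file in turn and adds every
// whitespace-separated word to one shared set. Files that cannot be
// opened are skipped and counted in failedCount.
std::set<std::string> CollectFromFiles(const std::vector<std::string> &filenames,
                                       int &failedCount)
{
    std::set<std::string> keywords;
    failedCount = 0;
    for (std::size_t i = 0; i < filenames.size(); ++i)
    {
        std::ifstream source(filenames[i].c_str());   // a fresh stream per file
        if (!source)
        {
            ++failedCount;
            continue;
        }
        std::string word;
        while (source >> word)
        {
            keywords.insert(word);   // duplicates are ignored by the set
        }
    }   // each ifstream closes its file when it goes out of scope
    return keywords;
}
```

The list of names could come from argv in main, so the shell would expand a pattern such as *.txt into the individual files.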
0
 
LVL 2

Expert Comment

by:bkrahmer
ID: 8032040
I guarantee that the unix method I mentioned would be faster.  Otherwise, yeah.
Exactly which STL container has the hashtable?  An important lesson to be learned from the principles of eXtreme Programming is to give the most simple solution for a given problem.  Is his program realistically going to take even a second to run?  I doubt it.  I didn't hear execution speed mentioned as the primary requirement.  Therefore, why not let STL do all the work for you and move on to solving interesting problems instead of reinventing the wheel each time?
brian
0
 
LVL 2

Expert Comment

by:bkrahmer
ID: 8032068
I might also point out that using a map solves 4 of the points of the problem very well.  In addition, because of its associative property, one could extend it with a couple more lines of code to show how many times a keyword was found, or print the map sorted by how many times the keyword was present.
brian
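
One way the counting extension might look, as a sketch only. The names CountKeywords, MoreFrequent and SortByCount are hypothetical. Because a map is always ordered by key, printing by frequency is assumed here to need a copy into a vector that is then sorted by count.

```cpp
#include <algorithm>
#include <istream>
#include <map>
#include <sstream>   // std::istringstream, for trying the helpers on in-memory text
#include <string>
#include <utility>
#include <vector>

typedef std::map<std::string, int> CountMap;
typedef std::pair<std::string, int> CountEntry;

// Counts how often each whitespace-separated word occurs in the stream.
CountMap CountKeywords(std::istream &in)
{
    CountMap counts;
    std::string word;
    while (in >> word)
    {
        ++counts[word];   // operator[] creates a missing entry with value 0
    }
    return counts;
}

// Orders entries by descending count; ties fall back to alphabetical order.
bool MoreFrequent(const CountEntry &a, const CountEntry &b)
{
    if (a.second != b.second)
        return a.second > b.second;
    return a.first < b.first;
}

// Copies the map entries into a vector and sorts that copy by count.
std::vector<CountEntry> SortByCount(const CountMap &counts)
{
    std::vector<CountEntry> entries(counts.begin(), counts.end());
    std::sort(entries.begin(), entries.end(), MoreFrequent);
    return entries;
}
```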
0
 

Author Comment

by:anxx0018
ID: 8034800
Yes, I am using a Red Hat Linux system, so I think Brian's method will work.

Could you please give me some simple examples for the methods you mentioned above? They all sound strange to me.

Thanks a lot!
0
 

Author Comment

by:anxx0018
ID: 8037750
Brian, can you help me check the following program? I really do not understand why it does not go into the loop.

Thank you very much!
#include <stdio.h>
#include <stdlib.h>
#include <fstream>

#include <iostream>
#include <string>
#include <map>

using namespace std;

//Min is to return the smaller value of m1 and m2.
int Min(int m1, int m2)
{
  if (m1<m2)
    return m1;
  else
    return m2;
}

//WordCmp is used to compare two strings, if k1 is alphabetically smaller than k2, return 0; if k1 is bigger than k2, return 1; if k1 and k2 are equal, return 2.

int WordCmp(string k1, string k2)
{
  int len1,len2;

  char *s1 = new char[strlen(k1.c_str()+1)];
  strcpy(s1,k1.c_str());

  char *s2 = new char[strlen(k2.c_str()+1)];
  strcpy(s2,k2.c_str());
 
  int n = Min(strlen(k1.c_str()), strlen(k2.c_str()));
 
   for (int i= 0; i<n; i++)
    {
      if (s1[i]<s2[i])
       {
       return 0;
       break;
       }
      else if (s1[i]>s2[i])
     {      
       return 1;
       break;
     }
    }
   return 2;
}


//This function is to sort names alphabetically. First, delete "END" and "̼ @P" from inflowing stream. Then if read-in is smaller than the first name of the array, put it at the first place and put others larger. If read-in is larger than the biggest one, put it at the end of the array. Otherwise, check where is a good place for the new read-in, and then put it there.

int main()
{
  ifstream source("/tmp_mnt/home/anqian/keywords.txt");
  map<int, string> key_array;

  string str_Line;

  if(!source)
    {
      cerr<<"Error opening the File!"<<endl;
      return 1;
    }

  else
    {
      getline(source,str_Line);
      key_array[0]=str_Line;
      getline(source,str_Line);

      while(!source.eof() && str_Line.compare("END")!=0 && str_Line.compare("̼ @P")!=0)
     {
       if (WordCmp(str_Line,key_array[0])==0)
           {
             cout<<key_array.size()<<endl;
             for (int i = key_array.size(); i<1; i--)
               {
                 cout<<"I am in for loop!"<<endl;
                 key_array[i]=key_array[i-1];
               }
             key_array[0]=str_Line;
             getline(source,str_Line);
           }

            else if(WordCmp(str_Line,key_array[key_array.size()-1])==1)
           {
             key_array[key_array.size()]=str_Line;
             getline(source,str_Line);
           }
           
            else
           {
             for (int i=key_array.size(); i=2; i--)
               {
                 if (WordCmp(key_array[i-1],str_Line)==1 && WordCmp(key_array[i-2],str_Line)==0)
                {
                  cout<<"Yes! I am here"<<endl;
                  for(int j=key_array.size(); j<=i; j--)
                    {
                      key_array[j]=key_array[j-1];
                    }
                  key_array[i-1]=str_Line;
                  break;
                }
               }
           
           getline(source,str_Line);
           }
     }
      return 0;
    }

  for(int k=0; k<key_array.size();k++)
    {
      cout<<key_array[k]<<endl;
    }
}


Source File:
SIMPLE  
BITPIX  
NAXIS  
EXTEND  
NEXTEND
DATE    
FILENAME
FILETYPE
TELESCOP
INSTRUME
EQUINOX
ROOTNAME
PRIMESI
TARGNAME
RA_TARG
DEC_TARG
PROPOSID
LINENUM
PR_INV_L
PR_INV_F
PR_INV_M
TDATEOBS
TTIMEOBS
TEXPSTRT
TEXPEND
TEXPTIME
POSTARG1
POSTARG2
OVERFLOW
CAL_VER
PROCTIME
CFSTATUS
OBSTYPE
OBSMODE
PHOTMOD


Thank you so much!


0
 
LVL 2

Expert Comment

by:bkrahmer
ID: 8039201
How's this??  :)  If you are serious about coding, please take note of the OO approach, readable code, well-named variables, simple algorithms, nice use of typedefs, and making the language and the available libraries work for you as much as possible.  There's further polishing that could be done...but I digress.  Also, if this is for use in a class, please do not cut and paste.  Even retyping someone else's code can help you learn it.
brian


#include <stdio.h>
#include <stdlib.h>
#include <fstream>
#include <iostream>
#include <string>
#include <map>

using namespace std;

typedef map<string, string> StringStringMap;
typedef StringStringMap::iterator StringStringMapIter;

class StringSorter
{
public:
   StringSorter(const string &filename) : m_InputFilename(filename)
   {      
   }

   bool ProcessFile()
   {
      bool retval = true;
      string inputBuffer;

      m_InputFile.open(m_InputFilename.c_str());
      if (m_InputFile.good())
      {
         // Test the read itself: looping on eof() would process one extra, empty record.
         while (getline(m_InputFile, inputBuffer))
         {
            if (inputBuffer == "END" || inputBuffer == "L< @P")
            {
               break;
            }
           
            ProcessRecord(inputBuffer);
         }
         m_InputFile.close();
      }
      else
      {
         retval = false;
      }
      return retval;
   }

   void DumpSortedKeywords()
   {
      StringStringMapIter iter;
      for(iter = m_KeywordMap.begin(); iter != m_KeywordMap.end(); iter++)
      {
         cout << (*iter).first << endl;
      }
   }

private:
   void ProcessRecord(const string &data)
   {
      m_KeywordMap[data] = data;
   }

   string m_InputFilename;
   ifstream m_InputFile;
   StringStringMap m_KeywordMap;
};

int main()
{
   StringSorter sorter("D:\\Brian\\test_code\\stringsort\\keywords.txt");
   if (sorter.ProcessFile())
   {
      sorter.DumpSortedKeywords();
   }
   return 0;
}

0
 

Author Comment

by:anxx0018
ID: 8044261
Brian,

Yes, you did provide an extremely efficient program. I am serious about learning to code.

Can you explain more about how ProcessRecord works, and about typedef?

Can you explain what the OO approach and readable code are? Thank you very much!

Or may I know your email address? Thank you again!
0
 

Author Comment

by:anxx0018
ID: 8044266
Again, why does mine not work?
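
A likely reason the posted program never prints "I am in for loop!" is the loop header `for (int i = key_array.size(); i<1; i--)`: the map already holds one entry, so i starts at 1 or more and the condition i<1 is false before the first pass. The sketch below isolates just that header; both function names are made up for illustration. Other lines in the posted code also look suspect: `i=2` in the third branch is an assignment, so that condition is always true; `j<=i` has the same reversed-condition problem; `return 0;` sits before the final printing loop, so nothing is ever printed; and `new char[strlen(k1.c_str()+1)]` allocates a buffer that is too small for the strcpy that follows.

```cpp
// Counts how many times the posted loop header would run its body.
// Only meaningful for size >= 1, which is always the case in the posted program.
int IterationsAsPosted(int size)
{
    int count = 0;
    for (int i = size; i < 1; i--)    // false at once when size >= 1
    {
        ++count;
    }
    return count;
}

// Same loop with the condition turned around, which is presumably what was intended.
int IterationsWithFixedCondition(int size)
{
    int count = 0;
    for (int i = size; i >= 1; i--)   // runs for i = size, size-1, ..., 1
    {
        ++count;
    }
    return count;
}
```

With size equal to 3, the first function returns 0 and the second returns 3.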
0
 

Author Comment

by:anxx0018
ID: 8044712
Brian,

Is map a very useful class?

typedef map<string, string> StringStringMap;

what does "(*iter).first" mean?

what does "m_KeywordMap[data] = data" mean?

I didn't find any actual sorting code. The tricky part seems to be m_KeywordMap[data]=data, where I would have used m_KeywordMap[i]=data with an int i. Does m_KeywordMap[data]=data contribute to the sorting?

Where could I find a good C++ book or a website which describes some useful functions and classes?

Thank you a lot!



0
 

Author Comment

by:anxx0018
ID: 8044791
Now I think m_KeywordMap[data]=data did the sorting, because the "[data]" key must be unique and sorted. Am I right?

Can you give me a simple example of how to cat a lot of files? I have to read these keywords from more than one file.

Thank you!
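
A small sketch of what `m_KeywordMap[data] = data` and `(*iter).first` do, under the assumption that the map is the same map<string, string> typedef used in the posted solution. The helpers BuildExample and KeysInOrder are hypothetical.

```cpp
#include <map>
#include <string>
#include <vector>

typedef std::map<std::string, std::string> StringStringMap;

// Builds a tiny map the same way ProcessRecord does, one assignment per word.
StringStringMap BuildExample()
{
    StringStringMap m;
    m["NAXIS"] = "NAXIS";     // new key: an entry is inserted
    m["BITPIX"] = "BITPIX";   // new key: an entry is inserted
    m["NAXIS"] = "NAXIS";     // existing key: value overwritten, no second entry
    return m;
}

// Returns the keys in the order the map's iterator visits them.
std::vector<std::string> KeysInOrder(const StringStringMap &m)
{
    std::vector<std::string> keys;
    for (StringStringMap::const_iterator iter = m.begin(); iter != m.end(); ++iter)
    {
        // *iter is a pair: first is the key, second is the value.
        keys.push_back((*iter).first);   // the same as iter->first
    }
    return keys;
}
```

KeysInOrder(BuildExample()) yields BITPIX followed by NAXIS: two entries, in key order, even though NAXIS was assigned first and twice.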
0
 
LVL 2

Accepted Solution

by:
bkrahmer earned 80 total points
ID: 8049817
anxx0018, I will try to answer some of your questions.  First, I started to work with your code, but ended up scrapping it.  I don't remember the exact reasons it didn't work.  Maps are very useful for easily storing data associated with a key.  The important point to understand is that the map keeps the data sorted by key, which is why I used the string as the key as well as the value.  The STL string class has a default ordering (its operator<), so I didn't have to provide one.  If I wanted my data sorted differently, I would have had to provide a comparison function.  There are many books about STL; there is also http://www.sgi.com/tech/stl/.  You can read about iterators there.  My 'object-oriented' (OO) approach is a book in itself.  Notice how each function does one and only one thing.  That's a part of OO.  I also use consistent variable names, like m_XXX for member variables.  Typedefs are useful for template classes because they cut down on typing, and I think they're easier to read.
brian
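
To illustrate the remark about sorting differently, here is a sketch of a map with its own comparison. The case-insensitive ordering and the names CaseInsensitiveLess and CaseInsensitiveMap are assumptions chosen for the example, not something the thread asked for.

```cpp
#include <cctype>
#include <map>
#include <string>

// Hypothetical comparison object: orders strings without regard to case.
struct CaseInsensitiveLess
{
    bool operator()(const std::string &a, const std::string &b) const
    {
        std::string::size_type n = a.size() < b.size() ? a.size() : b.size();
        for (std::string::size_type i = 0; i < n; ++i)
        {
            int ca = std::tolower(static_cast<unsigned char>(a[i]));
            int cb = std::tolower(static_cast<unsigned char>(b[i]));
            if (ca != cb)
                return ca < cb;
        }
        return a.size() < b.size();   // a shorter prefix sorts first
    }
};

// The third template argument replaces the default ordering of the keys.
typedef std::map<std::string, std::string, CaseInsensitiveLess> CaseInsensitiveMap;
```

With this typedef, "naxis" and "Naxis" count as the same key, whereas a plain map<string, string> would keep them as two separate entries.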
0
 

Author Comment

by:anxx0018
ID: 8054531
Thank you very much!!
0
