anxx0018
asked on
File I/O, and some pointer or stack problem..
I have two questions here.
1. For work, I need to collect all the keywords that are scattered across many files. Basically, I have to read all these files and extract the useful keywords. I know how to do this for one source file. Can anybody tell me how to handle multiple files in this situation?
2. I have a file that contains all the keywords. I need a program to make the keywords unique and sorted alphabetically.
I think I have to read these keywords one by one from the file. Each time, I would compare the new keyword first with the smallest one, then with those that have already been sorted.
My questions are: How shall I deal with the sorted results? How can I retrieve them when I need them? How shall I save them: with a pointer, or a stack?
Thank you very much!
The cat command is only available on UNIX and might be slow compared to C++ code, but it would be a quick way to implement the search. Otherwise, why not just cycle through the files, closing and deleting the I/O object for the last file processed and creating a new I/O object for the next file?
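To make the "cycle through the files" idea concrete, here is a minimal sketch. It assumes one keyword per line, which may not match how your keywords are actually laid out; read_keywords and collect_keywords are made-up names for illustration, and the real extraction logic would go where the lines are read.

```cpp
#include <fstream>
#include <iostream>
#include <set>
#include <string>
#include <vector>

// Read one keyword per line from an already-open stream into `keywords`.
// Blank lines are skipped; std::set keeps the entries unique and sorted.
// (Assumption: one keyword per line. Replace with your own extraction.)
void read_keywords(std::istream& in, std::set<std::string>& keywords)
{
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty()) {
            keywords.insert(line);
        }
    }
}

// Open each file in turn. The ifstream is local to the loop body, so it is
// closed and destroyed before the next file is opened.
std::set<std::string> collect_keywords(const std::vector<std::string>& filenames)
{
    std::set<std::string> keywords;
    for (const std::string& name : filenames) {
        std::ifstream file(name.c_str());
        if (!file) {
            std::cerr << "Could not open " << name << ", skipping\n";
            continue;
        }
        read_keywords(file, keywords);
    }
    return keywords;
}
```

Calling collect_keywords with a list of file names gives back one combined set, so the multi-file case ends up looking the same as the single-file case to the rest of the program.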
If you have a lot of keywords or will be doing a lot of access on them, you might consider using a hash table. That's an extremely efficient and flexible means of storing and looking up entries, although it does not keep them in sorted order by itself. It's a common enough structure that it should be available in development libraries such as Rogue Wave.
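As a rough sketch of the hash table route: the example below uses std::unordered_set from the standard library (C++11 and later) rather than the Rogue Wave class, which is a substitution on my part. Since a hash table has no ordering, the sorted list comes from a separate std::sort. The function name unique_sorted is made up for illustration.

```cpp
#include <algorithm>
#include <string>
#include <unordered_set>
#include <vector>

// Deduplicate with a hash set, then sort a copy, because a hash table
// stores its elements in no particular order.
std::vector<std::string> unique_sorted(const std::vector<std::string>& words)
{
    // Inserting every word into the set drops the repeats.
    std::unordered_set<std::string> seen(words.begin(), words.end());

    // Copy the unique words out and put them in alphabetical order.
    std::vector<std::string> result(seen.begin(), seen.end());
    std::sort(result.begin(), result.end());
    return result;
}
```

For a keyword list of this size the difference from an ordered container is unlikely to be noticeable; the hash set mainly pays off when there are many lookups.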
I guarantee that the UNIX method I mentioned would be faster. Otherwise, yeah.
Exactly which STL container has the hashtable? An important lesson to be learned from the principles of eXtreme Programming is to give the simplest solution for a given problem. Is his program realistically going to take even a second to run? I doubt it. I didn't hear execution speed mentioned as the primary requirement. Therefore, why not let STL do all the work for you and move on to solving interesting problems instead of reinventing the wheel each time?
brian
I might also point out that using a map solves 4 of the points of the problem very well. In addition, because of its associative property, one could extend it with a couple more lines of code to show how many times a keyword was found, or print the map sorted by how many times the keyword was present.
brian
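A possible sketch of the counting extension mentioned above, assuming the keywords have already been parsed into a vector. KeywordCounts, count_keywords and by_frequency are illustrative names, not part of any posted code.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

typedef std::map<std::string, int> KeywordCounts;

// Count occurrences. operator[] creates a missing key with value 0,
// so ++counts[word] works for first and repeated sightings alike.
KeywordCounts count_keywords(const std::vector<std::string>& words)
{
    KeywordCounts counts;
    for (std::size_t i = 0; i < words.size(); ++i) {
        ++counts[words[i]];
    }
    return counts;
}

// Copy the (keyword, count) pairs out of the map and order them by count,
// highest first. stable_sort keeps the map's alphabetical order for ties.
std::vector<std::pair<std::string, int> > by_frequency(const KeywordCounts& counts)
{
    std::vector<std::pair<std::string, int> > ranked(counts.begin(), counts.end());
    std::stable_sort(ranked.begin(), ranked.end(),
        [](const std::pair<std::string, int>& a,
           const std::pair<std::string, int>& b) {
            return a.second > b.second;
        });
    return ranked;
}
```

Iterating over the map itself still gives the alphabetical listing; by_frequency is only needed for the "sorted by how many times" view.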
ASKER
Yes, I am using a Red Hat Linux system, so I think Brian's method will work.
Could you please give me some simple examples for the methods you mentioned above? They all sound strange to me.
Thanks a lot!
ASKER
Brian, can you help me check the following program? I really do not understand why it does not go into the loop.
Thank you very much!
#include <stdio.h>
#include <stdlib.h>
#include <fstream>
#include <iostream>
#include <string>
#include <map>
using namespace std;

//Min is to return the smaller value of m1 and m2.
int Min(int m1, int m2)
{
    if (m1<m2)
        return m1;
    else
        return m2;
}

//WordCmp is used to compare two strings, if k1 is alphabetically smaller than k2, return 0; if k1 is bigger than k2, return 1; if k1 and k2 are equal, return 2.
int WordCmp(string k1, string k2)
{
    int len1,len2;
    char *s1 = new char[strlen(k1.c_str()+1)] ;
    strcpy(s1,k1.c_str());
    char *s2 = new char[strlen(k2.c_str()+1)] ;
    strcpy(s2,k2.c_str());
    int n = Min(strlen(k1.c_str()), strlen(k2.c_str()));
    for (int i= 0; i<n; i++)
    {
        if (s1[i]<s2[i])
        {
            return 0;
            break;
        }
        else if (s1[i]>s2[i])
        {
            return 1;
            break;
        }
    }
    return 2;
}

//This function is to sort names alphabetically. First, delete "END" and "̼@P" from inflowing stream. Then if read-in is smaller than the first name of the array, put it at the first place and put others larger. If read-in is larger than the biggest one, put it at the end of the array. Otherwise, check where is a good place for the new read-in, and then put it there.
int main()
{
    ifstream source("/tmp_mnt/home/anqian/keywords.txt");
    map<int, string> key_array;
    string str_Line;
    if(!source)
    {
        cerr<<"Error opening the File!"<<endl;
        return 1;
    }
    else
    {
        getline(source,str_Line);
        key_array[0]=str_Line;
        getline(source,str_Line);
        while(!source.eof() && str_Line.compare("END")!=0 && str_Line.compare("̼@P")!=0)
        {
            if (WordCmp(str_Line,key_array[0])==0)
            {
                cout<<key_array.size()<<endl;
                for (int i = key_array.size(); i<1; i--)
                {
                    cout<<"I am in for loop!"<<endl;
                    key_array[i]=key_array[i-1];
                }
                key_array[0]=str_Line;
                getline(source,str_Line);
            }
            else if(WordCmp(str_Line,key_array[key_array.size()-1])==1)
            {
                key_array[key_array.size()]=str_Line;
                getline(source,str_Line);
            }
            else
            {
                for (int i=key_array.size(); i=2; i--)
                {
                    if (WordCmp(key_array[i-1],str_Line)==1 && WordCmp(key_array[i-2],str_Line)==0)
                    {
                        cout<<"Yes! I am here"<<endl;
                        for(int j=key_array.size(); j<=i; j--)
                        {
                            key_array[j]=key_array[j-1];
                        }
                        key_array[i-1]=str_Line;
                        break;
                    }
                }
                getline(source,str_Line);
            }
        }
        return 0;
    }
    for(int k=0; k<key_array.size();k++)
    {
        cout<<key_array[k]<<endl;
    }
}
Source File:
SIMPLE
BITPIX
NAXIS
EXTEND
NEXTEND
DATE
FILENAME
FILETYPE
TELESCOP
INSTRUME
EQUINOX
ROOTNAME
PRIMESI
TARGNAME
RA_TARG
DEC_TARG
PROPOSID
LINENUM
PR_INV_L
PR_INV_F
PR_INV_M
TDATEOBS
TTIMEOBS
TEXPSTRT
TEXPEND
TEXPTIME
POSTARG1
POSTARG2
OVERFLOW
CAL_VER
PROCTIME
CFSTATUS
OBSTYPE
OBSMODE
PHOTMOD
Thank you so much!
How's this?? :) If you are serious about coding, please take note of the OO approach, readable code, well-named variables, simple algorithms, nice use of typedefs, and making the language and the available libraries work for you as much as possible. There's further polishing that could be done...but I digress. Also, if this is for use in a class, please do not cut and paste. Even retyping someone else's code can help you learn it.
brian
#include <stdio.h>
#include <stdlib.h>
#include <fstream>
#include <iostream>
#include <string>
#include <map>
using namespace std;

typedef map<string, string> StringStringMap;
typedef StringStringMap::iterator StringStringMapIter;

class StringSorter
{
public:
    StringSorter(const string &filename) : m_InputFilename(filename)
    {
    }

    // Reads the input file line by line; returns false if it cannot be opened.
    bool ProcessFile()
    {
        bool retval = true;
        string inputBuffer;
        m_InputFile.open(m_InputFilename.c_str());
        if (m_InputFile.good())
        {
            // Testing getline itself (rather than eof() before the read)
            // avoids storing an empty record after the last line.
            while (getline(m_InputFile, inputBuffer))
            {
                if (inputBuffer == "END" || inputBuffer == "L< @P")
                {
                    break;
                }
                ProcessRecord(inputBuffer);
            }
            m_InputFile.close();
        }
        else
        {
            retval = false;
        }
        return retval;
    }

    // Prints the keywords; the map hands them back in sorted key order.
    void DumpSortedKeywords()
    {
        StringStringMapIter iter;
        for(iter = m_KeywordMap.begin(); iter != m_KeywordMap.end(); iter++)
        {
            cout << (*iter).first << endl;
        }
    }

private:
    // Using the keyword as the map key makes it unique and keeps it sorted.
    void ProcessRecord(const string &data)
    {
        m_KeywordMap[data] = data;
    }

    string m_InputFilename;
    ifstream m_InputFile;
    StringStringMap m_KeywordMap;
};

int main()
{
    StringSorter sorter("D:\\Brian\\test_code\\stringsort\\keywords.txt");
    if (sorter.ProcessFile())
    {
        sorter.DumpSortedKeywords();
    }
    return 0;
}
brian
ASKER
Brian,
Yes, you did provide an extremely efficient program. I am serious about learning to code.
Can you explain more about how ProcessRecord and typedef work?
Can you explain what the OO approach and readable code mean? Thank you very much!
Or may I know your email address? Thank you again!
ASKER
Again, why does mine not work?
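One reading of the posted program, offered as a sketch rather than a verified diagnosis. The first shifting loop starts with i = key_array.size(), which is at least 1, so its condition i<1 is false on the very first test and the body never runs. In the later loop, i=2 assigns instead of comparing. The return 0; also comes before the printing loop, so nothing is ever printed. The version below shows a shift loop that does count down correctly. It uses a vector and the built-in string comparison instead of the map<int, string> and WordCmp, and insert_sorted is a made-up name.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Insert `word` into an already sorted vector, skipping duplicates.
// The shift loop counts down while i > pos; a condition such as i < 1
// would be false on the first test and the body would never run.
void insert_sorted(std::vector<std::string>& sorted, const std::string& word)
{
    // Find the first position whose element is not less than `word`.
    std::size_t pos = 0;
    while (pos < sorted.size() && sorted[pos] < word) {
        ++pos;
    }
    if (pos < sorted.size() && sorted[pos] == word) {
        return; // already present, keep the list unique
    }

    sorted.push_back(word); // grow the vector by one slot
    for (std::size_t i = sorted.size() - 1; i > pos; --i) {
        sorted[i] = sorted[i - 1]; // shift larger entries up by one
    }
    sorted[pos] = word;
}
```

Calling insert_sorted once per line read from the file builds the unique, sorted list step by step, which is the same idea as the original program with the loop conditions turned around.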
ASKER
Brian,
Is map a very useful class?
typedef map<string, string> StringStringMap;
What does "(*iter).first" mean?
What does "m_KeywordMap[data] = data" mean?
I didn't find any actual sorting code. The tricky part seems to be m_KeywordMap[data]=data, where I would have used m_KeywordMap[int i]=data. Does m_KeywordMap[data]=data contribute to the sorting?
Where could I find a good C++ book or a website that describes some useful functions and classes?
Thanks a lot!
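A small sketch that may help with these questions. A std::map stores (key, value) pairs, keeps each key only once, and visits the pairs in ascending key order, so the sorting happens inside the container. The helper names remember and keys_in_order are made up for illustration; the typedef is the one from the program above.

```cpp
#include <map>
#include <string>
#include <vector>

typedef std::map<std::string, std::string> StringStringMap;

// Mirrors ProcessRecord from the program above: the keyword is used as the
// key, so a repeated keyword overwrites its own entry instead of adding one.
void remember(StringStringMap& m, const std::string& data)
{
    m[data] = data;
}

// Return the keys in iteration order. A std::map visits its elements
// in ascending key order, so no separate sort call is needed.
std::vector<std::string> keys_in_order(const StringStringMap& m)
{
    std::vector<std::string> keys;
    for (StringStringMap::const_iterator iter = m.begin(); iter != m.end(); ++iter) {
        // *iter is a std::pair: .first is the key, .second is the value.
        keys.push_back((*iter).first); // iter->first means the same thing
    }
    return keys;
}
```

With an int key, as in m_KeywordMap[i]=data, the map would order the entries by the numbers and would not notice repeated keywords, which is why the keyword itself is used as the key here.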
ASKER
Now I think m_KeywordMap[data]=data did the sorting, because the key "[data]" must be unique and sorted. Am I right?
Can you give me a simple example of how to cat a lot of files, since I have to read these keywords from more than one file?
Thank you!
ASKER
Thank you very much!!
For two, I think your first assumption is a good one. The second point can be made extremely easy by using an STL map to store your words after you've parsed them.
brian