Solved

URGENT HELP NEEDED ....INPUTTING WORDS FROM A FILE

Posted on 2003-10-24
7
342 Views
Last Modified: 2008-02-01
hi, i have this program. It is supposed to open a file, and output  word statistics. I have the hash table implemented, need help in setting up a function called nextWord,  which will input the words in to the table. Here is the pseduo-code for the code (which i think is confusing, but someone else might have an idea) http://k4shif.netfirms.com/program4/prog4.html . I am including in here the hashTable.h and hashtest.cpp (Please let me know if there are any errors with the hashTable). Thanks for your help in advance , i need to get this done real fast.

hashTable.h
=================================================================================================================
#ifndef _hashTable_h
#define _hashTable_h

#include <iostream>
#include <string>
#include <cassert>

using namespace std;

template<class KeyType, class DataType>
struct mypair
{   KeyType key;//words
    DataType data;//frequencies
};

template<class KeyType, class DataType>
class hashTable
{  
public:
       
           hashTable(long int tSize = 0);
        //Constructor function. Dynamically allocates the arrays, table
        //and status, with dimensions of the next prime number > tSize.
        //Initializes the status array to show all positions in the table
        //are empty and initializes size, tableSize, numSearches and
        //numProbes. Uses non-member function nextPrime.
       
        ~hashTable();
       
    void find(const KeyType& k, long int& i, bool& found);
        //Looks for key, k, in table. If found then sets i to the subscript
        //where the key is located and sets found to true. If not found
        //then sets i to the subscript where the key could be inserted
        //and sets found to false. Uses linear probing for collision
        //resolution with inc determined from the hash value as follows:
        //i = hash(...); inc = i; if (inc == 0) inc = tableSize - 1 ; This  
        //is a little better than always taking inc = 1. Uses non-member
        //function hash.
 
    void insert(const KeyType& k, const DataType& d);
        //Inserts a pair with key k and data d into the table. If a pair
        //with the same key part already occurs in the table then it prints
        //an error message and does not insert the pair. If the table is
        //now more than 75% full then it rehashes into a larger table. Uses
        //the public member function find and the private member function
        //rehash.
       
 //************************************************************************
 //The following functions provide simialr capabilties to iterators in STL.
 //They allow one to examine all elements of the hash table. You could use
 //these for example to output all elements of the hash table or to copy all
 //elements to a vector outside the hashTable class for sorting, etc.
       
    const mypair<KeyType, DataType>& element(long int i) const;
        //Returns the pair at subscript i in the table. Cannot be used
        //to assign a new value to the pair.
       
    DataType& data(long int i) const;
        //Returns a reference to the data at subscript i in the table.
        //Can be used to assign a new value to the data.
       
    long int begin() const;
        //Returns the subscript of the first element in the table; returns
        //the same value as end(), i.e. tableSize, if the table is empty.
       
    long int end() const;
        //Returns subscript just past end of table, i.e. tableSize.
           
    long int next(long int i) const;
        //Returns the subscript of the next element after i; if there is no
        //next element then returns the same as end(), i.e. tableSize.
      
      

//*************************************************************************
       
    float averageSearchLength() const;
        //returns the average length of all searches numProbes/numSearches.
   
       
       
private:
    mypair<KeyType, DataType>* table; //dynamically allocated array of pairs
    short int* status;   //dynamically allocated status array; contains
                         //staus info: 0=empty, 1=occupied
    long int size; //number of elements in table
    long int tableSize; //size of table array
    long int numSearches; //total number of searches done in looking up keys.
                          //Increment numSearches each time find called.
    long int numProbes;   //total number of probes done during all searches.
                          //Increment each time find looks at a spot.
                                 
    void rehash();
        //The rehash function remembers the current table, status and tableSize
        //by assigning them to local variables, oldTable, oldStatus and
        //oldTableSize. It then dynamically allocates new table and status
        //arrays of size the next larger prime than oldTableSize and sets
        //tableSize to the new size. It then goes through the old table
        //and rehashes each element into the new table using the insert
        //function. Uses non-member function nextPrime.
};

template<class KeyType, class DataType>
const mypair<KeyType, DataType>&
hashTable<KeyType, DataType>::element(long int i) const
{   return table[i];
}
   
template<class KeyType, class DataType>
DataType& hashTable<KeyType, DataType>::data(long int i) const
{   return table[i].data;
}  

template<class KeyType, class DataType>
long int hashTable<KeyType, DataType>::begin() const
{   long int i = 0;

    while(i < tableSize && status[i] != 1)
        i++;
    return i;
}

template<class KeyType, class DataType>
long int hashTable<KeyType, DataType>::end() const
{   return tableSize;
}

template<class KeyType, class DataType>
long int hashTable<KeyType, DataType>::next(long int i) const
{  
    i++;
    while(i < tableSize && status[i] != 1)
        i++;
    return i;
}

template<class KeyType, class DataType>
float hashTable<KeyType, DataType>::averageSearchLength()const
{
      assert(numSearches != 0);//assert that numSearches isnt 0
      return float(numProbes/numSearches);
}


template<class KeyType, class DataType>
hashTable<KeyType, DataType>::hashTable(long int tSize=0)
{
      tableSize=nextPrime(tSize);
      table = new mypair<KeyType, DataType>[tableSize];
      status=new short int [tableSize];
      for(int k=0;k<tableSize;k++)status[k]=0;
      size=0;
      numSearches=0;
      numProbes=0;

}

template<class KeyType, class DataType>
hashTable<KeyType, DataType>::~hashTable()
{
      delete [] table;
      delete [] status;
}

template<class KeyType, class DataType>
void hashTable<KeyType, DataType>::insert(const KeyType& k, const DataType& d)
{
      
                  bool found=false;
                  long int i=0;
                  
                  find( k,  i, found);
                  

                  if(found==false)
                  {      
                        table[i].data=d;
                        table[i].key=k;
                        status[i]=1;
                        size++;
                  }
            if(size > .75*(tableSize))rehash();
}

template<class KeyType, class DataType>
void hashTable<KeyType, DataType>::find(const KeyType& k, long int& i, bool& found)
{
      numSearches++;
      i=hash(k, tableSize);
      long int inc = i;

      cout<<"Now in find"<<endl;

      if(inc == 0)inc = tableSize -1;

      for( int g =0; g < tableSize; g++)
      {
            if(status[g] ==1)
            {
                  if(table[g].key == k)
                  {
                        found = true;
                        table[g].data++;
                        break;
                  }
            }
            numProbes++;
      }
      
      if(found == false)//now determine where I can put in in.
      {
            while(status[inc] != 0)
            {
                  inc++;
                  numProbes++;
                  if(inc == tableSize)inc=0;
            }
            i=inc;

      }
}

template<class KeyType, class DataType>
void hashTable<KeyType, DataType>::rehash()
{
      mypair<KeyType, DataType>* oldTable=table;
      short int* oldStatus = status;
      long int oldTableSize = tableSize;
      bool doesIt=false;

      table = new mypair<KeyType, DataType>[nextPrime(oldTableSize)];
      status = new short int[nextPrime(oldTableSize)];
      tableSize=nextPrime(oldTableSize);

      for(int h =0; h < tableSize;h++)status[h]=0;

      for(int j=0; j < oldTableSize;j++)
      {
            while(oldStatus[j] != 0)
            {
                  insert(oldTable[j].key, oldTable[j].data);
            }

      }

      delete[] oldTable;
      delete[] oldStatus;
}



   

//************************************************************************
long int hash(string s, int tableSize)
{   unsigned long int j;
    unsigned long int sum = 0;

    for (j = 0; j < s.length(); j++)
        sum = (sum << 5) + s[j];
       
    return sum % tableSize;
}

long int nextPrime(long int n)
{   long int i;
    long int p[18] = {      31,      61,     127,     251,     509,    1021,                        
                          2039,    4093,    8191,   16381,   32749,   65521,  
                        131071,  262139,  524287, 1048573, 2097143, 4194301
                     };

    for (i = 0; i < 18; i++)
       if (p[i] > n) return p[i];
       
    cerr<<"Next prime too large"<<endl;
    return p[17];
}

bool operator < (const mypair<string, int>& p,const mypair<string, int>& q)
{
      if(p.data < q.data)return true;

       if ( p.data > q.data)return false;

      else
            return (p.key < q.key);
}



//************************************************************************
#endif


=================================================================================================================


hashTest.cpp
=================================================================================================================
#include <iostream>
#include <string>
#include <vector>
#include <algorithm>
#include <fstream>
#include <cctype>
#include <cstdlib>
#include "hashTable.h"

using namespace std;


void nextWord(ifstream& ifs, string& word, bool& found);

int main()
{
      char fname[20];//array to store the name of the file
      ifstream infile1;//the infile stream
      string word;//this while store the word
      
      hashTable<string, int>ht(0);//start off to a table of size 31
      vector< mypair<string, int> >h;//the vector
      
      cout<< "Enter File Name:";
      cin>>fname;

      //after this point the whole nextWord crap occurs
      //and the hash table is formed


      //this bit of code to copy the hash table into the vector for sorting

      infile1.close();

      for(long int l = ht.begin(); l!= ht.end(); l = ht.next(l))
      {
            h.push_back(ht.element(l));
      }

      //once this is done we sort using the generic sort function of STL.

      sort(&h[0],&h[h.size()-1]);

      //and now we output
      cout<<"Average search length in hash table = "<<ht.averageSearchLength()<<endl;
      cout<<"Frequency"<<'\t'<<"Word"<<endl;
      for(unsigned long int u =0; u < h.size(); u++)
      {
            cout<<h[u].data<<'\t'<<h[u].key<<endl;
      }
      system("PAUSE");
      return 0;

}


=================================================================================================================
0
Comment
Question by:wassup84
7 Comments
 

Expert Comment

by:Parchandri
ID: 9618868
while (infile >> nextword) {
  h.insert_element(nextword); //or whatever
}
0
 

Expert Comment

by:Parchandri
ID: 9618876
oh, sorry, the above won't do, I didn't read full problem description.

#include <cctype>

char c;
string s = "";
while (infile1 >> c) {
  if (isalpha(c)) s += tolower(c);
  else if (s != "") {
     h.insert_element(s);
     s = "";
  }
}
0
 
LVL 2

Assisted Solution

by:MeiaDose
MeiaDose earned 250 total points
ID: 9619228
"
#include <string>
#include <iostream>
#include <fstream>

using namespace std;

typedef string::iterator Sitr;
typedef string::const_iterator CSitr;

void formatWord(string &word){
      for(Sitr itr=word.begin();itr!=word.end();++itr){
            if(isupper(*itr)){
                  char letter=*itr;
                  letter=tolower(letter);
                  Sitr nextItr=word.erase(itr);            
                  word.insert(nextItr,letter);

            }
      }
}

string getWord(ifstream& ifs, Sitr &itr,string &line, bool &endOfFile){
      string myWord("");
      while(*itr!=' ')
      {
            if(itr==line.end()){
                  endOfFile=true;
                  formatWord(myWord);
                  return myWord;
            }
            myWord+=*itr;
            ++itr;
      }
      // Searching for the next word;
      while(isalpha(*itr)==0)
            ++itr;
      formatWord(myWord);
      return myWord;
}

void main(){
      ifstream textFile("texto.txt");
      string linha("");
      bool end=false;
      while(getline(textFile, linha)){
            Sitr itr=linha.begin();
            while(!end){
                  string a=getWord(textFile,itr,linha,end);
            }
            end=false;
      }
}
"
0
What Security Threats Are You Missing?

Enhance your security with threat intelligence from the web. Get trending threat insights on hackers, exploits, and suspicious IP addresses delivered to your inbox with our free Cyber Daily.

 
LVL 2

Expert Comment

by:MeiaDose
ID: 9619234
The code above uses a function, that i called, getWord(...).

This function retrieves a word, with all characters tolower case [see formatWord(...)],  from a line of text.
It uses an iterator to get the characters and to find if it has reached the end of the line, and if so, it afects the enOfFile flag.

All you have to do is use the getline function to get a line from your text file, and while it returns true, you just have to call getWord(...) and store the word that is returned.

Tell me if this worked in your project.

Good luck,

MeiaDose




0
 

Author Comment

by:wassup84
ID: 9619520
neither of the codes look anything like the pseudo code or like nextWord(ifstream& ifs, string& word, boll& found) . I have to stick by those guidelines ( u know how it is with these proffs). and plus I cant add the other function, he said work with all the functions that are there, nothing more. :(    
0
 
LVL 49

Accepted Solution

by:
DanRollins earned 250 total points
ID: 9620451
It depends on your definition of a word.  In this code,
I assumed that any string of characters that is not a space
is part of a word.  isspace() considedrs taba sand CRlR LF as spaces.

If you want to exclude other characters (such as digits or <>".  -- whatever)
then replace calls to isspace() with calls to isUnwantedChar(c) and write that
function yourself.  Have that fn return true for characters that you don't want
to consider as part of a word.

=-=-=-=-=-=--==-=-=-=-=-=-=-=-=-=- actual, tested code follows
#include<iostream>
#include<fstream>
using namespace std;

void nextWord(ifstream& ifs, string& word, bool& found)
{
      char c;
      found= false;
      word="";

      ifs.get(c);
      while( isspace(c) && !ifs.eof() ) {  // get past leading spaces
            ifs.get(c);
      }
      if ( ifs.eof()) {
            return; // word="" and found is false
      }
      found= true;   // got at leaset one letter
      while ( ! ifs.eof() && !isspace(c) ) {  // collect up to next delimiter
            word += c;      
            ifs.get(c);
      }
}

//------------------------------- validation code
void main(void)
{
      ifstream fIn;
      fIn.open( "c:\\temp\\TheData.txt", ios_base::binary | ios_base::in );

      bool fFound= true;
      string sWord;

      while ( fFound ) {
            nextWord( fIn, sWord, fFound );
            cout << sWord.c_str() << endl;
      }
}
0
 

Expert Comment

by:Parchandri
ID: 9620944
It should be pretty easy to modify my code to a function as reqired, if you understand what's going on. I didn't intend to entirely do your homework for you.
0

Featured Post

Highfive + Dolby Voice = No More Audio Complaints!

Poor audio quality is one of the top reasons people don’t use video conferencing. Get the crispest, clearest audio powered by Dolby Voice in every meeting. Highfive and Dolby Voice deliver the best video conferencing and audio experience for every meeting and every room.

Join & Write a Comment

When writing generic code, using template meta-programming techniques, it is sometimes useful to know if a type is convertible to another type. A good example of when this might be is if you are writing diagnostic instrumentation for code to generat…
Templates For Beginners Or How To Encourage The Compiler To Work For You Introduction This tutorial is targeted at the reader who is, perhaps, familiar with the basics of C++ but would prefer a little slower introduction to the more ad…
The goal of the tutorial is to teach the user how to use functions in C++. The video will cover how to define functions, how to call functions and how to create functions prototypes. Microsoft Visual C++ 2010 Express will be used as a text editor an…
The viewer will be introduced to the member functions push_back and pop_back of the vector class. The video will teach the difference between the two as well as how to use each one along with its functionality.

708 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

12 Experts available now in Live!

Get 1:1 Help Now