Solved

URGENT HELP NEEDED ....INPUTTING WORDS FROM A FILE

Posted on 2003-10-24
7
345 Views
Last Modified: 2008-02-01
hi, i have this program. It is supposed to open a file, and output  word statistics. I have the hash table implemented, need help in setting up a function called nextWord,  which will input the words in to the table. Here is the pseduo-code for the code (which i think is confusing, but someone else might have an idea) http://k4shif.netfirms.com/program4/prog4.html . I am including in here the hashTable.h and hashtest.cpp (Please let me know if there are any errors with the hashTable). Thanks for your help in advance , i need to get this done real fast.

hashTable.h
=================================================================================================================
#ifndef _hashTable_h
#define _hashTable_h

#include <iostream>
#include <string>
#include <cassert>

using namespace std;

template<class KeyType, class DataType>
struct mypair
{   KeyType key;//words
    DataType data;//frequencies
};

template<class KeyType, class DataType>
class hashTable
{  
public:
       
           hashTable(long int tSize = 0);
        //Constructor function. Dynamically allocates the arrays, table
        //and status, with dimensions of the next prime number > tSize.
        //Initializes the status array to show all positions in the table
        //are empty and initializes size, tableSize, numSearches and
        //numProbes. Uses non-member function nextPrime.
       
        ~hashTable();
       
    void find(const KeyType& k, long int& i, bool& found);
        //Looks for key, k, in table. If found then sets i to the subscript
        //where the key is located and sets found to true. If not found
        //then sets i to the subscript where the key could be inserted
        //and sets found to false. Uses linear probing for collision
        //resolution with inc determined from the hash value as follows:
        //i = hash(...); inc = i; if (inc == 0) inc = tableSize - 1 ; This  
        //is a little better than always taking inc = 1. Uses non-member
        //function hash.
 
    void insert(const KeyType& k, const DataType& d);
        //Inserts a pair with key k and data d into the table. If a pair
        //with the same key part already occurs in the table then it prints
        //an error message and does not insert the pair. If the table is
        //now more than 75% full then it rehashes into a larger table. Uses
        //the public member function find and the private member function
        //rehash.
       
 //************************************************************************
 //The following functions provide simialr capabilties to iterators in STL.
 //They allow one to examine all elements of the hash table. You could use
 //these for example to output all elements of the hash table or to copy all
 //elements to a vector outside the hashTable class for sorting, etc.
       
    const mypair<KeyType, DataType>& element(long int i) const;
        //Returns the pair at subscript i in the table. Cannot be used
        //to assign a new value to the pair.
       
    DataType& data(long int i) const;
        //Returns a reference to the data at subscript i in the table.
        //Can be used to assign a new value to the data.
       
    long int begin() const;
        //Returns the subscript of the first element in the table; returns
        //the same value as end(), i.e. tableSize, if the table is empty.
       
    long int end() const;
        //Returns subscript just past end of table, i.e. tableSize.
           
    long int next(long int i) const;
        //Returns the subscript of the next element after i; if there is no
        //next element then returns the same as end(), i.e. tableSize.
      
      

//*************************************************************************
       
    float averageSearchLength() const;
        //returns the average length of all searches numProbes/numSearches.
   
       
       
private:
    mypair<KeyType, DataType>* table; //dynamically allocated array of pairs
    short int* status;   //dynamically allocated status array; contains
                         //staus info: 0=empty, 1=occupied
    long int size; //number of elements in table
    long int tableSize; //size of table array
    long int numSearches; //total number of searches done in looking up keys.
                          //Increment numSearches each time find called.
    long int numProbes;   //total number of probes done during all searches.
                          //Increment each time find looks at a spot.
                                 
    void rehash();
        //The rehash function remembers the current table, status and tableSize
        //by assigning them to local variables, oldTable, oldStatus and
        //oldTableSize. It then dynamically allocates new table and status
        //arrays of size the next larger prime than oldTableSize and sets
        //tableSize to the new size. It then goes through the old table
        //and rehashes each element into the new table using the insert
        //function. Uses non-member function nextPrime.
};

template<class KeyType, class DataType>
const mypair<KeyType, DataType>&
hashTable<KeyType, DataType>::element(long int i) const
{   return table[i];
}
   
template<class KeyType, class DataType>
DataType& hashTable<KeyType, DataType>::data(long int i) const
{   return table[i].data;
}  

template<class KeyType, class DataType>
long int hashTable<KeyType, DataType>::begin() const
{   long int i = 0;

    while(i < tableSize && status[i] != 1)
        i++;
    return i;
}

template<class KeyType, class DataType>
long int hashTable<KeyType, DataType>::end() const
{   return tableSize;
}

template<class KeyType, class DataType>
long int hashTable<KeyType, DataType>::next(long int i) const
{  
    i++;
    while(i < tableSize && status[i] != 1)
        i++;
    return i;
}

template<class KeyType, class DataType>
float hashTable<KeyType, DataType>::averageSearchLength()const
{
      assert(numSearches != 0);//assert that numSearches isnt 0
      return float(numProbes/numSearches);
}


template<class KeyType, class DataType>
hashTable<KeyType, DataType>::hashTable(long int tSize=0)
{
      tableSize=nextPrime(tSize);
      table = new mypair<KeyType, DataType>[tableSize];
      status=new short int [tableSize];
      for(int k=0;k<tableSize;k++)status[k]=0;
      size=0;
      numSearches=0;
      numProbes=0;

}

template<class KeyType, class DataType>
hashTable<KeyType, DataType>::~hashTable()
{
      delete [] table;
      delete [] status;
}

template<class KeyType, class DataType>
void hashTable<KeyType, DataType>::insert(const KeyType& k, const DataType& d)
{
      
                  bool found=false;
                  long int i=0;
                  
                  find( k,  i, found);
                  

                  if(found==false)
                  {      
                        table[i].data=d;
                        table[i].key=k;
                        status[i]=1;
                        size++;
                  }
            if(size > .75*(tableSize))rehash();
}

template<class KeyType, class DataType>
void hashTable<KeyType, DataType>::find(const KeyType& k, long int& i, bool& found)
{
      numSearches++;
      i=hash(k, tableSize);
      long int inc = i;

      cout<<"Now in find"<<endl;

      if(inc == 0)inc = tableSize -1;

      for( int g =0; g < tableSize; g++)
      {
            if(status[g] ==1)
            {
                  if(table[g].key == k)
                  {
                        found = true;
                        table[g].data++;
                        break;
                  }
            }
            numProbes++;
      }
      
      if(found == false)//now determine where I can put in in.
      {
            while(status[inc] != 0)
            {
                  inc++;
                  numProbes++;
                  if(inc == tableSize)inc=0;
            }
            i=inc;

      }
}

template<class KeyType, class DataType>
void hashTable<KeyType, DataType>::rehash()
{
      mypair<KeyType, DataType>* oldTable=table;
      short int* oldStatus = status;
      long int oldTableSize = tableSize;
      bool doesIt=false;

      table = new mypair<KeyType, DataType>[nextPrime(oldTableSize)];
      status = new short int[nextPrime(oldTableSize)];
      tableSize=nextPrime(oldTableSize);

      for(int h =0; h < tableSize;h++)status[h]=0;

      for(int j=0; j < oldTableSize;j++)
      {
            while(oldStatus[j] != 0)
            {
                  insert(oldTable[j].key, oldTable[j].data);
            }

      }

      delete[] oldTable;
      delete[] oldStatus;
}



   

//************************************************************************
long int hash(string s, int tableSize)
{   unsigned long int j;
    unsigned long int sum = 0;

    for (j = 0; j < s.length(); j++)
        sum = (sum << 5) + s[j];
       
    return sum % tableSize;
}

long int nextPrime(long int n)
{   long int i;
    long int p[18] = {      31,      61,     127,     251,     509,    1021,                        
                          2039,    4093,    8191,   16381,   32749,   65521,  
                        131071,  262139,  524287, 1048573, 2097143, 4194301
                     };

    for (i = 0; i < 18; i++)
       if (p[i] > n) return p[i];
       
    cerr<<"Next prime too large"<<endl;
    return p[17];
}

bool operator < (const mypair<string, int>& p,const mypair<string, int>& q)
{
      if(p.data < q.data)return true;

       if ( p.data > q.data)return false;

      else
            return (p.key < q.key);
}



//************************************************************************
#endif


=================================================================================================================


hashTest.cpp
=================================================================================================================
#include <iostream>
#include <string>
#include <vector>
#include <algorithm>
#include <fstream>
#include <cctype>
#include <cstdlib>
#include "hashTable.h"

using namespace std;


void nextWord(ifstream& ifs, string& word, bool& found);

int main()
{
      char fname[20];//array to store the name of the file
      ifstream infile1;//the infile stream
      string word;//this while store the word
      
      hashTable<string, int>ht(0);//start off to a table of size 31
      vector< mypair<string, int> >h;//the vector
      
      cout<< "Enter File Name:";
      cin>>fname;

      //after this point the whole nextWord crap occurs
      //and the hash table is formed


      //this bit of code to copy the hash table into the vector for sorting

      infile1.close();

      for(long int l = ht.begin(); l!= ht.end(); l = ht.next(l))
      {
            h.push_back(ht.element(l));
      }

      //once this is done we sort using the generic sort function of STL.

      sort(&h[0],&h[h.size()-1]);

      //and now we output
      cout<<"Average search length in hash table = "<<ht.averageSearchLength()<<endl;
      cout<<"Frequency"<<'\t'<<"Word"<<endl;
      for(unsigned long int u =0; u < h.size(); u++)
      {
            cout<<h[u].data<<'\t'<<h[u].key<<endl;
      }
      system("PAUSE");
      return 0;

}


=================================================================================================================
0
Comment
Question by:wassup84
7 Comments
 

Expert Comment

by:Parchandri
ID: 9618868
while (infile >> nextword) {
  h.insert_element(nextword); //or whatever
}
0
 

Expert Comment

by:Parchandri
ID: 9618876
oh, sorry, the above won't do, I didn't read full problem description.

#include <cctype>

char c;
string s = "";
while (infile1 >> c) {
  if (isalpha(c)) s += tolower(c);
  else if (s != "") {
     h.insert_element(s);
     s = "";
  }
}
0
 
LVL 2

Assisted Solution

by:MeiaDose
MeiaDose earned 250 total points
ID: 9619228
"
#include <string>
#include <iostream>
#include <fstream>

using namespace std;

typedef string::iterator Sitr;
typedef string::const_iterator CSitr;

void formatWord(string &word){
      for(Sitr itr=word.begin();itr!=word.end();++itr){
            if(isupper(*itr)){
                  char letter=*itr;
                  letter=tolower(letter);
                  Sitr nextItr=word.erase(itr);            
                  word.insert(nextItr,letter);

            }
      }
}

string getWord(ifstream& ifs, Sitr &itr,string &line, bool &endOfFile){
      string myWord("");
      while(*itr!=' ')
      {
            if(itr==line.end()){
                  endOfFile=true;
                  formatWord(myWord);
                  return myWord;
            }
            myWord+=*itr;
            ++itr;
      }
      // Searching for the next word;
      while(isalpha(*itr)==0)
            ++itr;
      formatWord(myWord);
      return myWord;
}

void main(){
      ifstream textFile("texto.txt");
      string linha("");
      bool end=false;
      while(getline(textFile, linha)){
            Sitr itr=linha.begin();
            while(!end){
                  string a=getWord(textFile,itr,linha,end);
            }
            end=false;
      }
}
"
0
Free Tool: Subnet Calculator

The subnet calculator helps you design networks by taking an IP address and network mask and returning information such as network, broadcast address, and host range.

One of a set of tools we're offering as a way of saying thank you for being a part of the community.

 
LVL 2

Expert Comment

by:MeiaDose
ID: 9619234
The code above uses a function, that i called, getWord(...).

This function retrieves a word, with all characters tolower case [see formatWord(...)],  from a line of text.
It uses an iterator to get the characters and to find if it has reached the end of the line, and if so, it afects the enOfFile flag.

All you have to do is use the getline function to get a line from your text file, and while it returns true, you just have to call getWord(...) and store the word that is returned.

Tell me if this worked in your project.

Good luck,

MeiaDose




0
 

Author Comment

by:wassup84
ID: 9619520
neither of the codes look anything like the pseudo code or like nextWord(ifstream& ifs, string& word, boll& found) . I have to stick by those guidelines ( u know how it is with these proffs). and plus I cant add the other function, he said work with all the functions that are there, nothing more. :(    
0
 
LVL 49

Accepted Solution

by:
DanRollins earned 250 total points
ID: 9620451
It depends on your definition of a word.  In this code,
I assumed that any string of characters that is not a space
is part of a word.  isspace() considedrs taba sand CRlR LF as spaces.

If you want to exclude other characters (such as digits or <>".  -- whatever)
then replace calls to isspace() with calls to isUnwantedChar(c) and write that
function yourself.  Have that fn return true for characters that you don't want
to consider as part of a word.

=-=-=-=-=-=--==-=-=-=-=-=-=-=-=-=- actual, tested code follows
#include<iostream>
#include<fstream>
using namespace std;

void nextWord(ifstream& ifs, string& word, bool& found)
{
      char c;
      found= false;
      word="";

      ifs.get(c);
      while( isspace(c) && !ifs.eof() ) {  // get past leading spaces
            ifs.get(c);
      }
      if ( ifs.eof()) {
            return; // word="" and found is false
      }
      found= true;   // got at leaset one letter
      while ( ! ifs.eof() && !isspace(c) ) {  // collect up to next delimiter
            word += c;      
            ifs.get(c);
      }
}

//------------------------------- validation code
void main(void)
{
      ifstream fIn;
      fIn.open( "c:\\temp\\TheData.txt", ios_base::binary | ios_base::in );

      bool fFound= true;
      string sWord;

      while ( fFound ) {
            nextWord( fIn, sWord, fFound );
            cout << sWord.c_str() << endl;
      }
}
0
 

Expert Comment

by:Parchandri
ID: 9620944
It should be pretty easy to modify my code to a function as reqired, if you understand what's going on. I didn't intend to entirely do your homework for you.
0

Featured Post

Free Tool: Site Down Detector

Helpful to verify reports of your own downtime, or to double check a downed website you are trying to access.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Article by: SunnyDark
This article's goal is to present you with an easy to use XML wrapper for C++ and also present some interesting techniques that you might use with MS C++. The reason I built this class is to ease the pain of using XML files with C++, since there is…
Many modern programming languages support the concept of a property -- a class member that combines characteristics of both a data member and a method.  These are sometimes called "smart fields" because you can add logic that is applied automaticall…
The goal of the video will be to teach the user the concept of local variables and scope. An example of a locally defined variable will be given as well as an explanation of what scope is in C++. The local variable and concept of scope will be relat…
The viewer will learn how to user default arguments when defining functions. This method of defining functions will be contrasted with the non-default-argument of defining functions.

829 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question