Solved

Pointers, arrays,string manipulations and word counts

Posted on 1999-06-30
Last Modified: 2013-12-14
Determine the frequency of the words that appear in the following paragraph. The paragraph has to be read from a file and the output has to be stored in a file too. The paragraph says the following:

    in these dying days of the dallas dynasty it is getting
harder and harder for the Cowboys defense to carry an
aging offense it was the second straight week Dallas had
come back only to fail in the end and again it was the
Dallas offense that failed getting into the end zone only
once and remaining the only NFL team without a rushing
touchdown that has been typical of the Cowboys troubles
they have just four touchdowns on their last twenty trips
inside their opponents twenty

So far I have written this, but it doesn't give me the output that I want. I think it has to do with some of the file I/O code or with the array dimensions. Here is my code:
#include<iostream.h>
#include<fstream.h>
#include<string.h>
main()
{
char s[9][200];
char  words[100][100] = {""},*temp;
int count[100] = {0};
fstream escribir,leer;
escribir.open("read_fil.doc",ios::out);
leer.open("readfile.doc",ios::in);
if(!escribir||!leer)
cout<<"dont open";
/*cout << "Enter lines of text:" << endl;
  for (int i = 0; i <=8; i++)
     {
      cin.getline(&s[i][0], 200);
     read<<&s[i][0]<<"\n";
     }*/


     for (int i = 0; i <=500; i++)
     {
       leer>>&s[i][0];//<<"\n";
     }
  for (i = 0; i <=500; i++) {
      temp = strtok(&s[i][0], ". \n");

      while (temp) {

       for (int j = 0; words[j][0] && strcmp(temp, &words[j][0]) != 0; j++)
          ;  // empty body

       ++count[j];

       if (!words[j][0])
          strcpy(&words[j][0], temp);

       temp = strtok(NULL, ". \n");
      }
   }

   cout.put('\n');

   for (int j = 0; words[j][0] != '\0' && j <= 99; j++)
      {
      cout << "\"" << &words[j][0] << "\" appeared " << count[j]
         << " time(s)" << endl;
      escribir<< "\"" << &words[j][0] << "\" appeared " << count[j]
         << " time(s)" << endl;
      }
   return 0;
}

What am I doing wrong?

0
Comment
Question by:milalik
31 Comments
 
LVL 7

Expert Comment

by:KangaRoo
ID: 1198830
some notes:
>>    read<<&s[i][0]<<"\n";
What is read? It is not declared anywhere.

>> for (i = 0; i <=500; i++) {
>>      temp = strtok(&s[i][0], ". \n");
where s[9][200] is declared....

>>char  words[100][100] = {""};
>> for (int j = 0; words[j][0] && strcmp(temp, &words[j][0]) != 0; j++) ;
I don't think words is initialized properly to provide the end-of-loop condition.... Not sure, I never use arrays like that (it's not the C++ way). The next statement will always increment some counter.....

0
 
LVL 7

Expert Comment

by:KangaRoo
ID: 1198831
Hmm that's ok it seems.
0
 
LVL 7

Expert Comment

by:KangaRoo
ID: 1198832
>> temp = strtok(&s[i][0], ". \n");
is done twice in the for statement

why don't you use strings to store the words?
0
 
LVL 2

Expert Comment

by:cpopin
ID: 1198833
>> char s[9][200];
...
>> for (i = 0; i <=500; i++) {
>>     temp = strtok(&s[i][0], ". \n");

First of all, your allocation of char s[9][200] is too small for the for loop, which takes i up to 500.
0
 
LVL 2

Expert Comment

by:VEngineer
ID: 1198834
I think there is a much simpler solution - I've done this before...

#include <iostream>
#include <fstream>
#include <string>
#include <map>

using namespace std;

int main()
  {
  map<string, int> occurrences;
  string current_word;

  ifstream fin("in.txt");
  while (fin >> current_word)
    {
    if(occurrences.find(current_word) == occurrences.end())
       // if the word is not in the map
       occurrences[current_word] = 1;
    else
       // word is in the map
       ++occurrences[current_word];
    }

  ofstream fout("out.txt");

  map<string, int>::iterator itr = occurrences.begin();

  while (itr != occurrences.end())
     {
     // print word-number pairs out to stream
     fout << itr->first << "\t\t\t" << flush;
     fout << itr->second << endl;
     ++itr;
     }

  // streams close automatically on destruction
  // of local objects (at end of function)

  return 0;
  }

and that's it... I love code reuse!
0
 
LVL 7

Expert Comment

by:KangaRoo
ID: 1198835
Ahh, much better :)
0
 
LVL 2

Expert Comment

by:VEngineer
ID: 1198836
The map is also called an "associative array"
It lets you index the array with any type you want, in this case a string, and the elements of the array are ints.  In the Standard Library map the array is really a balanced binary tree in the underlying implementation, so when you are inserting a new word, you are putting it in the tree and access is in O(log n) time - i.e. pretty fast.  The other advantage is the structure is dynamic, which means it is "growable" to any size, given you have enough memory.  Result - you don't have to specify any sizes - meaning you can work with any text file of any size.  The map or associative array is one of the most useful things to have as a programmer.

Remember "Good programmers know what to write. Great ones know what to reuse."  If you have any questions, let me know.

Some C++ implementations make you declare the map as:
   map<string, int, less<string> > occurrences;
I know borland 5.02 makes you do that, but VC5/6 and the latest g++ will work with the way I wrote it originally.
0
 

Author Comment

by:milalik
ID: 1198837
I can use the string, iostream and fstream libraries. This is due tomorrow. The proposed answer has about 20 errors in it. How am I supposed to work with it if I don't know what some of the errors are about?
0
 

Author Comment

by:milalik
ID: 1198838
VEngineer: I have Borland 4.52, and the code you gave me above produced too many errors. Please explain more about what map does. I really need help! Thanks.
0
 

 
LVL 2

Expert Comment

by:VEngineer
ID: 1198840
Ergh. Borland 4.52 might not support the standard library fully.  In other words, the map might not be defined in that version of the compiler, hence the errors.  I know another solution that doesn't use the map but is about the same length of code.  Let me look that up and get back to you.

For now, get rid of "using namespace std;" and use these includes:
#include <iostream.h>
#include <fstream.h>
#include <string.h>
#include <map.h>

replace map<string, int> with
map<string, int, less<string> >
(note the space between the two angle brackets)

let me know if that works and I'll get back to you on the alternate solution.


0
 
LVL 2

Expert Comment

by:VEngineer
ID: 1198841
oops - I meant to do that as a comment...
0
 
LVL 2

Expert Comment

by:VEngineer
ID: 1198842

map essentially stores pairs of data, a mapping from a set of keys to a set of values associated with those keys.

map <string, int>, or more generally map<key, value> means that you are storing pairs of keys and values, in this case strings and integers - like ("cow", 1) ("dog", 6) where the string is representing some word in the file and the int represents the number of occurrences.

access of the map looks like array access:

occurrences["cow"] = 1;      // a
++occurrences["cow"];        // b
int i = occurrences["dog"];

So when we encounter a string in the file, we check whether it exists in the map. If it doesn't, we add an entry by doing "a" above. If it does exist, we increment the integer corresponding to that string, as in "b" above.

To print out the map, we create an iterator corresponding to the map, essentially a wrapped pointer, and we iterate through the elements in the map, printing out their key and value as a pair (first, second), hence a list of string and integer pairs.
0
 

Author Comment

by:milalik
ID: 1198843
I can't use map.h, and Borland 4.52 still gives me errors. Do you understand what I want the program to do?
0
 
LVL 2

Expert Comment

by:VEngineer
ID: 1198844

An alternate solution without a map is less flexible and less efficient, but here's one way:

#include <iostream.h>
#include <fstream.h>
#include <string.h>

// bool is not defined in 4.52

enum bool {false = 0, true};

struct WORD
   {
   string text;   // each word has a text representation
   int number;    // each word has a number of occurrences
   };

const int MAX_WORDS = 1000;   // arbitrary constant

int main()
   {
   WORD occurrences[MAX_WORDS];     // array of WORDs
   int number_of_words = 0;

   bool found = false;
   string current_word;  // current word read from file
   
   ifstream fin("in.txt");
   while (fin >> current_word)
      {
      found = false;
      for (int i = 0; i < number_of_words; ++i)
         {
         if (occurrences[i].text == current_word)
            {
            // if word exists in array, increment number
            ++occurrences[i].number;
            found = true;
            break;
            }
         }
      if (!found)
         {
         // since current word is not in array, add it
         occurrences[number_of_words].text = current_word;
         occurrences[number_of_words].number = 1;
         ++number_of_words;
         }
      }
       
   // print it out  
   ofstream fout("out.txt");
   for (int i = 0; i < number_of_words; ++i)
      {
      fout << occurrences[i].text << "\t\t" << flush;
      fout << occurrences[i].number << endl;
      }

   return 0;
   }
   
and that's it without the map - not quite as elegant I suppose, but it'll work.
0
 

Author Comment

by:milalik
ID: 1198845
No structures or object-oriented stuff. Just simple arrays, pointers, and the string manipulation functions found in the iostream.h, string.h and fstream.h libraries.
0
 
LVL 2

Expert Comment

by:VEngineer
ID: 1198846
oops.. it should be cstring.h instead of string.h that you are including.. I haven't worked with 4.52 in a few years.
0
 

Author Comment

by:milalik
ID: 1198847
A friend and I came up with this sort of solution by luck! There is an array called words that we don't know the purpose of, but if we take it out, the output is less similar to what we want. Here is what we have:
#include<iostream.h>
#include<fstream.h>
#include<string.h>
main()
{
char s[9][200];
char  words[100][100] = {""},*temp;
int count[100] = {0};
fstream escribir,leer;
escribir.open("read_fil.doc",ios::out);
leer.open("readfile.doc",ios::in);
if(!escribir||!leer)
cout<<"dont open";
/*cout << "Enter lines of text:" << endl;
  for (int i = 0; i <=8; i++)
     {
      cin.getline(&s[i][0], 200);
     read<<&s[i][0]<<"\n";
     }*/


     for (int i = 0; i <=86; i++)
     {
       leer>>&s[i][0];//<<"\n";
     }
  for (i = 0; i <=86; i++) {
      temp = strtok(&s[i][0], " ");

      while (temp) {

       for (int j = 0; words[j][0] && strcmp(temp, &words[j][0]) != 0; j++)
          ;  // empty body

       ++count[j];

       if (!words[j][0])
          strcpy(&words[j][0], temp);

       temp = strtok(NULL, ". \n");
      }
   }

   cout.put('\n');

   for (int j = 0; words[j][0] != '\0' && j <= 99; j++)
      {
      cout << "\"" << &words[j][0] << "\" appeared " << count[j]
         << " time(s)" << endl;
      escribir<< "\"" << &words[j][0] << "\" appeared " << count[j]
         << " time(s)" << endl;
      }
   return 0;
}

We want to know what the words array does. Apart from this, when you run it, all the words appear in the output except the word "is". Why is this so?


0
 
LVL 2

Expert Comment

by:VEngineer
ID: 1198848
I think I've offered all I can for today..

In all the examples I gave you have to include <cstring.h> instead of <string.h> in Borland 4.52.

You say no OOP, no structs, no ANSI strings.  I say you have your hands tied behind your back and a blindfold on too.
When you decide to come out of the 1980s, let us know.  In the meantime, go ahead and replace the couts with printfs in that already disgustingly unreadable and unmaintainable code.  Play your luck for what it's worth because luck is something programmers cannot depend upon.
0
 

Author Comment

by:milalik
ID: 1198849
I quote from VEngineer:"You say no OOP, no structs, no ANSI strings.  I say you have your hands tied behind your back and a blindfold on too.
When you decide to come out of the 1980s, let us know.  In the meantime, go ahead and replace the couts with printfs in that already disgustingly unreadable and unmaintainable code.  Play your luck for what it's worth because luck is something programmers cannot depend upon."

I say: nice way of calling someone dumb.
I also say: I haven't been taught OOP, structs or ANSI strings (we haven't gotten there yet!), only pointers, arrays and the string manipulation functions found in the string.h library, like strtok, strcmp, strcpy, etc. So the person who assigned me this expects no use of OOP or structs.

I appreciate your help.

0
 

Author Comment

by:milalik
ID: 1198850
Please, someone help, because I'm stuck and don't know what to do next!
0
 

Author Comment

by:milalik
ID: 1198851
Adjusted points to 175
0
 
LVL 2

Expert Comment

by:VEngineer
ID: 1198852
If this is an assignment, I'm sorry I ever helped.  We are not supposed to give answers to assignments, but I figured that this problem shows up every once in a while when working with string processing.

I think you read a little too much into my comment.  But do let your _instructor_ know that we are in 1999, not 1985 - this isn't "Back to the Future."  Send him some email - chances are he is watching Three's Company or something.
0
 

Author Comment

by:milalik
ID: 1198853
I have come here before and people have helped by taking me step by step, and I ended up doing the program by myself. I didn't ask for an answer; I just posted code I had that is half working, to see if with your advice and expert guidance I could finish the program by myself, like I have done so many times. As for whether it is an assignment: yes, it is. They teach you the 1980s C++, as you call it, before they teach you the 1990s-2000s one. If you or someone else can still help, I will really appreciate it.
0
 

Author Comment

by:milalik
ID: 1198854
I have done this so far, but it doesn't work for the paragraph I posted a while ago. What I want to know is: what am I doing wrong?
#include <iostream.h>
#include <string.h>

const int SIZE = 80;

main()
{
   char text[3][SIZE], *temp, words[100][20] = {""};
   int count[100] = {0};
   
   cout << "Enter three lines of text:" << endl;
   
   for (int i = 0; i <= 2; i++)
      cin.getline(&text[i][0], SIZE);
     
   for (i = 0; i <= 2; i++) {
      temp = strtok(&text[i][0], ". \n");
     
      while (temp) {
         
         for (int j = 0; words[j][0] && strcmp(temp, &words[j][0]) != 0; j++)
            ;  // empty body
         
         ++count[j];
         
         if (!words[j][0])
            strcpy(&words[j][0], temp);
           
         temp = strtok(NULL, ". \n");
      }
   }
   
   cout.put('\n');
   
   for (int j = 0; words[j][0] != '\0' && j <= 99; j++)
      cout << "\"" << &words[j][0] << "\" appeared " << count[j]
           << " time(s)" << endl;

   return 0;
}




0
 
LVL 7

Expert Comment

by:KangaRoo
ID: 1198855
Don't take the 80's stuff as offensive, though I do agree with VEngineer about using modern C++. Its OOP features aren't in it for marketing but to be used, and so is the STL. You are probably right about your mentor not asking you to use the STL. Too bad, it will be harder to learn if you've learned to do things the C way.
0
 
LVL 9

Expert Comment

by:jasonclarke
ID: 1198856
Here is an updated version that points to some of the things you need to do (if you insist on using only C-style functionality): tidy it up, use functions to delegate responsibility, and make the code explicit. Many C books encourage terse code and assumptions about return types; ignore them and go for easy-to-read code.

I have changed the memory management a little, because I think doing it this way makes things a bit more explicit...

#include <iostream.h>
#include <string.h>

const int SIZE = 80;

void UpdateWordCount(char* thisword, char** words,
                               int* count)
{
    for (int i=0; i<100; i++)
    {
        if (words[i] == 0)
        {
            words[i] = new char[strlen(thisword)+1];
            strcpy(words[i],thisword);
            ++count[i];
            return;
        }
        if (strcmp(thisword, words[i]) == 0)
        {
            ++count[i];
            return;
        }
    }
    // Array overflow, should bomb out here...
}

main()
{
    char text[SIZE], *temp, **words;
    int count[100] = {0};

    words = new char*[100];
    memset(words,0,100*sizeof(char *));
   
    cout << "Enter three lines of text:" << endl;
   
    for (int i = 0; i <= 2; i++) {
      cin.getline(text, SIZE);
       
      temp = strtok(text, ". \n");
       
      while (temp) {
          UpdateWordCount(temp, (char **) words, count);
             
          temp = strtok(NULL, ". \n");
      }
   }
   
   cout.put('\n');
   
   for (int j = 0; j <= 99 && words[j] != 0; j++) {
      cout << "\"" << words[j] << "\" appeared " << count[j]
           << " time(s)" << endl;
       delete [] words[j];
   }

    delete [] words;

   return 0;
}

0
 
LVL 2

Accepted Solution

by:
VEngineer earned 510 total points
ID: 1198857

Breaking it out into functions is good, as jasonclarke suggests, but I will do it all in main here and leave any modifications to you. This is the simplest way I can think of: no memory from the free store/heap and minimal use of pointers. You should probably add error checking in case you exceed the array bounds:

#include <iostream.h>
#include <fstream.h>
#include <string.h>  // the C string lib for strcpy, etc..

enum bool {false = 0, true};    // for borland 4.52
const int MAX_STRING_LENGTH = 80;  // arbitrary
const int MAX_WORDS = 1000;        // arbitrary

int main() {
   char   words[MAX_WORDS][MAX_STRING_LENGTH];
   int    count[MAX_WORDS];  // two "parallel" arrays

   int number_of_words = 0;  // actual number of words

   ifstream fin("in.txt");
   char current_word[MAX_STRING_LENGTH];
   bool found = false;  // set when current_word is already in words

   while (fin >> current_word) {
      found = false;
      for (int i = 0; i < number_of_words; ++i) {
         if (strcmp(current_word, words[i]) == 0) {
            // if word exists in array, increment number
            ++count[i];
            found = true;
            break;
         }
      }
      if (!found)
         {
         // since current word is not in array, add it
         // allocate just enough memory for that string
         strcpy(words[number_of_words], current_word);
         count[number_of_words] = 1;
         ++number_of_words;
         }
      }
       
   // print it out
   ofstream fout("out.txt");
   for (int i = 0; i < number_of_words; ++i) {
      cout << words[i] << "\t\t" << flush;
      cout << count[i] << endl;
   }

   return 0;
}

Work off of jasonclarke's and my examples. I think my method of reading in the words is a little easier than the strtok and getline approach. The extraction operator (>>) is overloaded to work with character arrays in C++, so you might as well make your life a little easier.
0
 
LVL 2

Expert Comment

by:VEngineer
ID: 1198858
OOps - I forgot to initialize my count array to zero - but you probably know better than that... otherwise this is as clear as it gets.  No pointers, no address operator, no fuss, no muss.
0
 
LVL 2

Expert Comment

by:VEngineer
ID: 1198859
keep in mind that words[i] == &words[i][0]
and disregard that comment in the code that I had about allocating memory - I was working off of jasonclarkes example for a while and got sidetracked
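
To illustrate the equivalence mentioned above, here is a short sketch assuming a two-dimensional char array like the one in the posted solution; the function names and the small sizes are made up for the example. A row such as words[i] converts to a pointer to its first character, which is exactly what &words[i][0] spells out.

```cpp
#include <cstring>

const int MAX_WORDS = 4;            // illustrative sizes only
const int MAX_STRING_LENGTH = 16;

// Returns true when row `i`, used as a pointer, equals the address of its first character.
bool rowDecaysToFirstChar(char words[][MAX_STRING_LENGTH], int i)
{
    const char* viaRow = words[i];            // the row converts to char*
    const char* viaAddress = &words[i][0];    // explicit address of element 0
    return viaRow == viaAddress;
}

// Copies `text` into row `i` using the shorter spelling, truncating if needed.
void storeWord(char words[][MAX_STRING_LENGTH], int i, const char* text)
{
    std::strncpy(words[i], text, MAX_STRING_LENGTH - 1);
    words[i][MAX_STRING_LENGTH - 1] = '\0';   // strncpy does not terminate on truncation
}
```

So calls like strcpy(&words[j][0], temp) in the earlier listings can be written as strcpy(words[j], temp) with the same effect.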
0
 
LVL 2

Expert Comment

by:VEngineer
ID: 1198860
My final note for now: those cout statements near the end of my code should be fout. Remember to keep it as simple and readable as possible; keep it short and simple. Also, for heaven's sake, get rid of those "magic numbers" in the code and replace them with named constants. They are terrible for a person who is reading the code. The same goes for the stream names. Programming takes practice, and good practices should start early. I know when I grade programs for my classes (I'm a TA), I ream them if they have shoddy variable names. The last thing the world needs is programmers who name their variables after their kids and pets, and it's a sad fact because I've seen it happen in the real world. This isn't an insult or aggressiveness, just a general observation of the world.
0
