Solved

Read a text file and count unique word occurrences then display word count and words

Posted on 2004-08-01
Last Modified: 2010-04-15
I need to write a program that opens a text file and reads each string from it. Each word is compared against a list of unique words: if it is already on the list it is ignored, and if it is new it is added to the list and the unique word counter is incremented. I also want the comparison to ignore case and any punctuation that may be part of the string.
I know this should be pretty simple. I am trying to write it with two functions: one to read the text from my file, and a second, called from the first, to compare each word to the words already stored in the unique word array. However, I cannot figure out how to get from the read function, int insertWord(char list[][30], FILE*fp), to the int foundWord(char list[][30],char word[], int last) function and then drop back into the read function so that I can display the desired information. As you can see from my code, I am kind of stuck. Or do I have a gross conceptual error? Please help!

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_SIZE 30


int insertWord(char list[][MAX_SIZE],FILE*fp);
int  foundWord(char list[][MAX_SIZE],char word[], int last);

int main (void){

char list [1000][30];
char word[];
int wordcount;

foundword();


return 0;
}

int  insertWord (char list[][MAX_SIZE],FILE*fp){
 int n;

FILE*fp;
fp=fileopen("pgm6.txt","r");
if (fp==NULL){
    printf("****ERROR**** input file cannot be read);
    exit(1);
    }
foundWord();

fclose(fp);

}

int  foundWord(char list[][MAX_SIZE],char word[], int last);
int i:

for(i=0;i<last;i++)
    if (strcmp(list[i],word)==0) return1:
    return 0;
}



Question by:jholmes9186
9 Comments
 
LVL 55

Expert Comment

by:Jaime Olivares
ID: 11688261
Maybe you have to reorganize a bit:

int main (void)
{

char list [1000][MAX_SIZE];
char word[MAX_SIZE];
int wordcount;

/* open file here, in main */
FILE *fp;
fp = fopen("pgm6.txt", "r");
if (fp == NULL) {
    printf("****ERROR**** input file cannot be read\n");
    exit(1);
}

wordcount = InsertWords(list, fp); /* Invoke word insertion function; it must return the resulting word count */
fclose(fp);


return 0;
}

/* Just read opened file here */
int InsertWords(char list[][MAX_SIZE], FILE *fp)
{
    int count = 0;   /* word count */
    char buffer[MAX_SIZE];  /* current word buffer */

    while (!feof(fp)) {
          /* Read file, line by line, separate words and store each in buffer array. THIS IS YOUR JOB */

          if (!foundWord(list, buffer, count))   /* invoke foundWord to know if it is a repeated word */
                 strcpy(list[count++], buffer);    /* insert word in array */
    }
    return count;
}

int  foundWord(char list[][MAX_SIZE],char word[], int size)   /* it will be clearer to use a "size" argument */
{
int i;

for(i=0;i<size;i++)
    if (strcmp(list[i],word)==0) return 1;
return 0;
}

 

Author Comment

by:jholmes9186
ID: 11689010
I appreciate your taking the time to help. I tried to make the changes you suggested and received the following compiler errors:
(1) It says that exit(1) has no prototype.
(2) Where the program invokes foundWord to check for a repeated word, and immediately after that where strcpy is invoked, the compiler says that the function calls do not match their prototypes.

Also, to eliminate the case distinctions, can I use strcasecpy instead of strcpy?

Finally, how do I make the program realize that a word already on the list is the same as a subsequent word that is identical except for a punctuation mark in the string immediately before the '\0'?
 
LVL 55

Expert Comment

by:Jaime Olivares
ID: 11689072
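exit() is declared in <stdlib.h> (some DOS and Windows compilers also declare it in the non-standard "process.h"), so it should work as long as this line is at the beginning of your program:

#include <stdlib.h>

Prototypes are the declarations placed before the main() function. Since I have made some modifications to your functions, you have to change them to match:

int InsertWords(char list[][MAX_SIZE], FILE *fp);
int foundWord(char list[][MAX_SIZE],char word[], int size);

The calls must match the prototypes as well: foundWord() takes three arguments, so call it as foundWord(list, buffer, count), and strcpy() takes only two, as in strcpy(list[count++], buffer).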

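To compare without regard to case, use stricmp() instead of strcmp() in the foundWord() function implementation. Note that stricmp() is not part of standard C: it is provided by many DOS and Windows compilers, while POSIX systems provide strcasecmp() in <strings.h> instead.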
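Since stricmp() is not available everywhere, the comparison can also be written by hand. The following is only a minimal sketch; the helper name equalIgnoreCase is hypothetical and does not appear in the original program. It relies on tolower() from <ctype.h>.

```c
#include <ctype.h>

/* Hypothetical helper (not part of the original code):
   returns 1 if a and b are equal ignoring case, 0 otherwise. */
int equalIgnoreCase(const char *a, const char *b)
{
    while (*a && *b) {
        /* Cast to unsigned char: passing a negative char to tolower() is undefined. */
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;   /* equal only if both strings ended together */
}
```

Inside foundWord(), the test would then read if (equalIgnoreCase(list[i], word)) return 1; in place of the strcmp() comparison.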

For your last requirement, before invoking the foundWord() function you have to "trim" undesired characters from the "buffer" array, with something like:

int i;
for (i=0; buffer[i]; i++) {
   if (!isalpha((unsigned char)buffer[i])) {
         buffer[i] = 0;   /* end string here */
         break;
   }
}

To use the isalpha() function you must include the ctype.h header at the beginning:
#include <ctype.h>
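One limitation of the trimming loop is that it cuts the string at the first non-letter, so a word with a leading quote mark would become empty. As an alternative sketch, the word can be normalized in place by dropping every non-letter and lowercasing the rest, after which a plain strcmp() is enough. The helper name normalizeWord is hypothetical, and this approach assumes it is acceptable for "don't" to be stored as "dont".

```c
#include <ctype.h>

/* Hypothetical helper (not part of the original code):
   lowercases letters in place and drops every character
   that is not a letter. */
void normalizeWord(char *word)
{
    char *src = word;   /* next character to examine */
    char *dst = word;   /* next position to write a kept character */

    while (*src) {
        if (isalpha((unsigned char)*src))
            *dst++ = (char)tolower((unsigned char)*src);
        src++;
    }
    *dst = '\0';
}
```

A word that contains no letters at all ends up as an empty string, so the caller should skip empty results before adding them to the list.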
 

Author Comment

by:jholmes9186
ID: 11689260
Thanks again.  However I still must be missing something.  Here is my consolidated code.  The compiler (Codewarrior IDE) still does not like the function calls previously discussed.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>


#define MAX_SIZE 30

int foundWord(char list[][MAX_SIZE],char word[],int size);
int insertWord(char list[][MAX_SIZE],FILE*fp);

int main (void){

      char list [1000][30];
      char word[30]; /*Do I even need to declare this array? It doesn't look like it is used anywhere*/

      int wordcount;

      FILE*fp;
      fp=fopen("pgm6.txt","r");
      if (fp==NULL){
          printf("****ERROR**** input file cannot be read");
                exit(1);
      }

      wordcount = insertWord(list, fp);

      fclose(fp);


      return 0;
}


int insertWord(char list[][MAX_SIZE], FILE *fp){
   
    int count = 0,i;  
    char buffer[30];  

    while (!feof(fp)) {
   
    fp=fscanf(fp,"%s",list); /* Is this right? It won't compile and says something about "illegal implicit change of int to struct...?*/
          
          /* Read file, line by line, separate words and store each in buffer array. THIS IS YOUR JOB */
      
      
      for (i=0; buffer[i]; i++) {
               if (!isalpha(buffer[i])) {
               buffer[i] = 0;  
               break;
               }
      }
      

        if (!foundWord(list, buffer))   /* Offending function #1*/
            strcpy (list[count++], buffer, count);  /*Offending function #2*/       
    }
    return count;
}


int  foundWord(char list[][MAX_SIZE],char word[], int size){

      int i;

      for(i=0;i<size;i++)
          if (stricmp(list[i],word)==0) return1:
          return 0;
}

 
LVL 55

Expert Comment

by:Jaime Olivares
ID: 11689283
>   fp=fscanf(fp,"%s",list); /* Is this right? It won't compile and says something about "illegal implicit change of int to struct...?*/
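fp=  <--- here is the error. fscanf returns an int (the number of items read), so it must not be assigned to fp. It should also read into "buffer", not "list".
Note that fscanf(..."%s"...) reads one whitespace-delimited token at a time, so any punctuation stays attached to the word, and without a width limit (such as "%29s" for a 30-character buffer) a long token can overflow the buffer.

If you prefer to read line by line, use fgets instead:
Declare a line buffer at the beginning of your function:
char line[255];    /* I have assumed your lines are shorter than 255, could change */

Then, instead of fscanf, read an entire line with fgets:
fgets(line, sizeof line, fp);

Now line is a string (actually a character array) that you can process and split into words; store each word in the "buffer" array and invoke foundWord as explained.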
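One possible way to split a line read by fgets into words is strtok() from <string.h>. This is only a sketch: the helper name splitLine and the particular delimiter set are assumptions, not something from the original program.

```c
#include <string.h>

#define MAX_SIZE 30

/* Hypothetical helper (not part of the original code):
   splits line on whitespace and common punctuation, copying up to
   max words into words. strtok() modifies line in place.
   Returns the number of words stored. */
int splitLine(char *line, char words[][MAX_SIZE], int max)
{
    const char *delims = " \t\r\n.,;:!?\"()";
    int n = 0;
    char *tok = strtok(line, delims);

    while (tok != NULL && n < max) {
        strncpy(words[n], tok, MAX_SIZE - 1);
        words[n][MAX_SIZE - 1] = '\0';   /* strncpy may leave the copy unterminated */
        n++;
        tok = strtok(NULL, delims);
    }
    return n;
}
```

Each entry of words can then be passed to foundWord() in turn. Words longer than MAX_SIZE - 1 characters are truncated by this sketch.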

 

Author Comment

by:jholmes9186
ID: 11689360
I guess I need a little more info on the buffer and fgets and splitting into words.
 
LVL 55

Expert Comment

by:Jaime Olivares
ID: 11689372
Sorry, but I feel I am doing all your work. Splitting into words is the main job of your code, so you will have to read up on string manipulation in C.
 

Author Comment

by:jholmes9186
ID: 11689391
Now I am completely confused. I tried to run it using a text file on my computer and all I got was a brief flash of black and then nothing. Do I need to start over from scratch? Please help. Frustration level is rising.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>

#define MAX_SIZE 30

int foundWord(char list[][MAX_SIZE],char word[],int size);
int insertWord(char list[][MAX_SIZE],FILE*fp);

int main (void){

     char list [1000][30];

     int wordcount;

     FILE*fp;
     fp=fopen("pgm6.txt","r");
     if (fp==NULL){
         printf("****ERROR**** input file cannot be read");
              exit(1);
     }

     wordcount = insertWord(list, fp);

     fclose(fp);

     return 0;
}


int insertWord(char list[][MAX_SIZE], FILE *fp){
   
    int count = 0, i=0;  
    char buffer[30];  
   
   
    while (fscanf(fp,"%s",list[i])!=EOF)i++ ;
   
     for (i=0; buffer[i]; i++) {
             if (!isalpha(buffer[i])) {
                 buffer[i] = 0;  
                 break;
             }
     }
     

        if (!foundWord(list, buffer,count))  
            strcpy (list[count++], buffer);      
    }
    return count;
}


int  foundWord(char list[][MAX_SIZE],char word[], int size){

     int i;

     for(i=0;i<size;i++)
         if (stricmp(list[i],word)==0) return1:
         return 0;
}
 
LVL 55

Accepted Solution

by:
Jaime Olivares earned 500 total points
ID: 11689423
The console window closes as soon as the program ends, so I guess you have to pause before exiting:

main (...)
{

    .......

     fclose(fp);
     system("pause");     /* Windows-specific; or use getch() from the non-standard conio.h, or the standard getchar() */
     return 0;
}
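Pausing only keeps the window open; the last version posted still prints nothing, and its fscanf loop reads straight into list without ever filling buffer. The following is a hedged sketch of how the reading and comparing could fit together, not a definitive implementation. The name collectUniqueWords and the MAX_WORDS constant are assumptions; foundWord keeps the signature used in this thread, with const added to the word parameter.

```c
#include <stdio.h>
#include <string.h>
#include <ctype.h>

#define MAX_SIZE  30
#define MAX_WORDS 1000   /* assumed capacity, matching list[1000][30] */

/* Returns 1 if word is among the first size entries of list, 0 otherwise. */
static int foundWord(char list[][MAX_SIZE], const char *word, int size)
{
    int i;
    for (i = 0; i < size; i++)
        if (strcmp(list[i], word) == 0)
            return 1;
    return 0;
}

/* Hypothetical function: reads whitespace-delimited tokens from fp,
   keeps only their letters (lowercased), and stores each new word in list.
   Returns the number of unique words stored (at most max). */
int collectUniqueWords(FILE *fp, char list[][MAX_SIZE], int max)
{
    char buffer[MAX_SIZE];
    int count = 0;

    /* "%29s" leaves room for the terminating '\0' in a 30-char buffer;
       longer tokens are read in pieces. */
    while (count < max && fscanf(fp, "%29s", buffer) == 1) {
        char *src = buffer, *dst = buffer;

        while (*src) {
            if (isalpha((unsigned char)*src))
                *dst++ = (char)tolower((unsigned char)*src);
            src++;
        }
        *dst = '\0';

        /* Skip tokens with no letters, and words already on the list. */
        if (buffer[0] != '\0' && !foundWord(list, buffer, count))
            strcpy(list[count++], buffer);
    }
    return count;
}
```

In main(), after fopen() succeeds, wordcount = collectUniqueWords(fp, list, MAX_WORDS); followed by a printf of wordcount and a loop printing list[0] through list[wordcount - 1] would display the count and the words before the pause.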
