Solved

Read a text file and count unique word occurrences then display word count and words

Posted on 2004-08-01
Last Modified: 2010-04-15
I need to write a program that opens a text file and reads each string from it. The program compares each word to the list of unique words: if the word is already on the list it is ignored, but if it is new it is added to the list and the unique-word counter is incremented. In addition, I want it to compare words without regard to case variations or punctuation that may be part of the string.
I know this should be pretty simple. I am trying to write it with two functions: one to read the text from my file, and a nested function to compare each word to the existing words stored in the unique word array. However, I cannot figure out how to transition from the read function, int insertWord(char list[][30], FILE*fp), to the int foundWord(char list[][30],char word[], int last) function and then drop back into the read function so that I can display the desired information. As you can see from my code, I am kind of stuck. Or do I have a gross conceptual error? Please help!

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_SIZE 30


int insertWord(char list[][MAX_SIZE],FILE*fp);
int  foundWord(char list[][MAX_SIZE],char word[], int last);

int main (void){

char list [1000][30];
char word[];
int wordcount;

foundword();


return 0;
}

int  insertWord (char list[][MAX_SIZE],FILE*fp){
 int n;

FILE*fp;
fp=fileopen("pgm6.txt","r");
if (fp==NULL){
    printf("****ERROR**** input file cannot be read);
    exit(1);
    }
foundWord();

fclose(fp);

}

int  foundWord(char list[][MAX_SIZE],char word[], int last);
int i:

for(i=0;i<last;i++)
    if (strcmp(list[i],word)==0) return1:
    return 0;
}



Question by:jholmes9186
9 Comments
 
LVL 55

Expert Comment

by:Jaime Olivares
ID: 11688261
Maybe you have to reorganize a bit:

int main (void)
{

char list [1000][MAX_SIZE];
char word[MAX_SIZE];
int wordcount;

/* open file here, in the main function */
FILE*fp;
fp=fopen("pgm6.txt","r");
if (fp==NULL){
    printf("****ERROR**** input file cannot be read");
    exit(1);
}

wordcount = InsertWords(list, fp); /* Invoke word insertion function, this function must return resultant word count */
fclose(fp);


return 0;
}

/* Just read opened file here */
int InsertWords(char list[][MAX_SIZE], FILE *fp)
{
    int count = 0;   /* word count */
    char buffer[MAX_SIZE];  /* current word buffer */

    while (!feof(fp)) {
          /* Read file, line by line, separate words and store each in buffer array. THIS IS YOUR JOB */

          if (!foundWord(list, buffer, count))   /* invoke foundWord to know if it is a repeated word */
                 strcpy (list[count++], buffer);    /* insert word in array */
    }
    return count;
}

int  foundWord(char list[][MAX_SIZE],char word[], int size)   /* it will be clearer to use a "size" argument */
{
int i;

for(i=0;i<size;i++)
    if (strcmp(list[i],word)==0) return 1;
return 0;
}


Author Comment

by:jholmes9186
ID: 11689010
I appreciate your taking the time to help.  I tried to make the changes you suggested and have received the following compiler errors:
(1) It says that exit(1) has no prototype.
(2) Where the program invokes the word function to find out whether it is a repeated word, and immediately following where strcpy is invoked, the compiler says that the function calls do not match their prototypes.

Also, to eliminate the case distinctions, can I use strcasecpy instead of strcpy?

Finally, how do I make the program realize that a word already on the list is the same as a subsequent word which is exactly the same but may have a punctuation mark in the string immediately before the '\0'?
LVL 55

Expert Comment

by:Jaime Olivares
ID: 11689072
The exit() function is declared in <stdlib.h>, which your program already includes, so it should work. Some DOS and Windows compilers also declare it in the non-standard <process.h>, so you can try adding this line at the beginning of your program:

#include <process.h>

Prototypes are the lines before the main() function. Since I have made some modifications to your functions, you have to change them:

int InsertWords(char list[][MAX_SIZE], FILE *fp);
int foundWord(char list[][MAX_SIZE],char word[], int size);

To compare without regard to case, use the stricmp() comparison function instead of strcmp() in the foundWord() function implementation. Note that stricmp() is a compiler extension, not standard C; POSIX systems provide strcasecmp() in <strings.h> instead.
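
Since a case-insensitive comparison function is not part of standard C, one portable option is to write it with tolower() from <ctype.h>. This is only a sketch; the name equal_ignore_case is hypothetical and does not appear anywhere else in this thread.

```c
#include <ctype.h>

/* Hypothetical helper: returns 1 if a and b are equal ignoring case, 0 otherwise. */
int equal_ignore_case(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        /* Cast to unsigned char: passing a negative char to tolower() is undefined. */
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    /* Equal only if both strings ended at the same position. */
    return *a == '\0' && *b == '\0';
}
```

Inside foundWord() the test would then read: if (equal_ignore_case(list[i], word)) return 1;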

For your last requirement, before invoking the foundWord() function you have to "trim" undesired characters in the "buffer" array.
Something like:

int i;
for (i=0; buffer[i]; i++) {
   if (!isalpha((unsigned char)buffer[i])) {
         buffer[i] = 0;   /* end string here */
         break;
   }
}

To use the isalpha() function you must include the ctype.h header at the beginning:
#include <ctype.h>
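
Note that a loop which ends the string at the first non-letter would turn a word with a leading quotation mark into an empty string. A slightly more forgiving sketch follows, assuming you want to drop every non-letter and fold the rest to lowercase so that plain strcmp() is enough. The name normalize_word is hypothetical.

```c
#include <ctype.h>

/* Hypothetical helper: removes every non-letter from word in place and
   converts the remaining letters to lowercase. Returns the new length. */
int normalize_word(char word[])
{
    int src, dst = 0;

    for (src = 0; word[src] != '\0'; src++) {
        unsigned char c = (unsigned char)word[src];
        if (isalpha(c))
            word[dst++] = (char)tolower(c);   /* keep letters, lowercased */
        /* anything else (punctuation, digits) is skipped */
    }
    word[dst] = '\0';
    return dst;
}
```

With this approach "Hello," becomes "hello", and "don't" becomes "dont", which may or may not be what you want. A return value of 0 means the token held no letters at all and should be skipped.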

Author Comment

by:jholmes9186
ID: 11689260
Thanks again.  However I still must be missing something.  Here is my consolidated code.  The compiler (Codewarrior IDE) still does not like the function calls previously discussed.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>


#define MAX_SIZE 30

int foundWord(char list[][MAX_SIZE],char word[],int size);
int insertWord(char list[][MAX_SIZE],FILE*fp);

int main (void){

      char list [1000][30];
      char word[30]; /*Do I even need to declare this array? It doesn't look like it is used anywhere*/

      int wordcount;

      FILE*fp;
      fp=fopen("pgm6.txt","r");
      if (fp==NULL){
          printf("****ERROR**** input file cannot be read");
                exit(1);
      }

      wordcount = insertWord(list, fp);

      fclose(fp);


      return 0;
}


int insertWord(char list[][MAX_SIZE], FILE *fp){
   
    int count = 0,i;  
    char buffer[30];  

    while (!feof(fp)) {
   
    fp=fscanf(fp,"%s",list); /* Is this right? It won't compile and says something about "illegal implicit change of int to struct...?*/
          
          /* Read file, line by line, separate words and store each in buffer array. THIS IS YOUR JOB */
      
      
      for (i=0; buffer[i]; i++) {
               if (!isalpha(buffer[i])) {
               buffer[i] = 0;  
               break;
               }
      }
      

        if (!foundWord(list, buffer))   /* Offending function #1*/
            strcpy (list[count++], buffer, count);  /*Offending function #2*/       
    }
    return count;
}


int  foundWord(char list[][MAX_SIZE],char word[], int size){

      int i;

      for(i=0;i<size;i++)
          if (stricmp(list[i],word)==0) return1:
          return 0;
}
LVL 55

Expert Comment

by:Jaime Olivares
ID: 11689283
>   fp=fscanf(fp,"%s",list); /* Is this right? It won't compile and says something about "illegal implicit change of int to struct...?*/
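fp=  <--- here is the error. fscanf returns an int (the number of items converted), so there is no need to assign it to fp.
Also note that fscanf(..."%s"...) reads one whitespace-delimited token at a time, has no length limit unless you add a width such as "%29s", and should store into buffer rather than list.

The two "offending" calls fail because they do not match their prototypes: foundWord() needs the current word count as a third argument, and strcpy() takes only two arguments.

I recommend using fgets instead, so that you control how each line is split.
Declare a line buffer at the beginning of your function:
char line[255];    /* I have assumed your lines are shorter than 255, could change */

Then, instead of fscanf, read an entire line with fgets:
fgets(line, 255, fp);

Now line is a string (actually a character array) that you can process and split into words. Store each word in the "buffer" array and invoke foundWord as explained.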
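
As a rough sketch of the splitting step, assuming words are separated by whitespace and that strtok() from <string.h> is acceptable for the job. split_line and its parameters are hypothetical names, not part of the code in this thread.

```c
#include <string.h>

#define MAX_SIZE 30

/* Hypothetical helper: splits line (modified in place by strtok) into
   whitespace-separated words and copies up to max_words of them into words.
   Returns the number of words stored. */
int split_line(char line[], char words[][MAX_SIZE], int max_words)
{
    int n = 0;
    char *token = strtok(line, " \t\r\n");

    while (token != NULL && n < max_words) {
        /* Copy at most MAX_SIZE-1 characters and always terminate. */
        strncpy(words[n], token, MAX_SIZE - 1);
        words[n][MAX_SIZE - 1] = '\0';
        n++;
        token = strtok(NULL, " \t\r\n");
    }
    return n;
}
```

A reading loop would call fgets(line, sizeof line, fp) until it returns NULL, pass each line to split_line, and then run every resulting word through the trimming step and foundWord. Words longer than MAX_SIZE-1 characters are truncated in this sketch.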


Author Comment

by:jholmes9186
ID: 11689360
I guess I need a little more info on the buffer and fgets and splitting into words.
LVL 55

Expert Comment

by:Jaime Olivares
ID: 11689372
Sorry, but I feel I am doing all your work. Splitting into words is the main job of your code, so you will have to read up on string manipulation in C.

Author Comment

by:jholmes9186
ID: 11689391
Now I am completely confused.  I tried to run it using a text file on my computer and all I got was a brief flash of black and then nothing.  Do I need to start over from scratch?  Please help.  Frustration level is rising.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>

#define MAX_SIZE 30

int foundWord(char list[][MAX_SIZE],char word[],int size);
int insertWord(char list[][MAX_SIZE],FILE*fp);

int main (void){

     char list [1000][30];

     int wordcount;

     FILE*fp;
     fp=fopen("pgm6.txt","r");
     if (fp==NULL){
         printf("****ERROR**** input file cannot be read");
              exit(1);
     }

     wordcount = insertWord(list, fp);

     fclose(fp);

     return 0;
}


int insertWord(char list[][MAX_SIZE], FILE *fp){
   
    int count = 0, i=0;  
    char buffer[30];  
   
   
    while (fscanf(fp,"%s",list[i])!=EOF)i++ ;
   
     for (i=0; buffer[i]; i++) {
             if (!isalpha(buffer[i])) {
                 buffer[i] = 0;  
                 break;
             }
     }
     

        if (!foundWord(list, buffer,count))  
            strcpy (list[count++], buffer);      
    }
    return count;
}


int  foundWord(char list[][MAX_SIZE],char word[], int size){

     int i;

     for(i=0;i<size;i++)
         if (stricmp(list[i],word)==0) return1:
         return 0;
}
LVL 55

Accepted Solution

by:
Jaime Olivares earned 500 total points
ID: 11689423
I guess you have to pause before exiting, so that the console window stays open long enough to read:

main (...)
{

    .......

     fclose(fp);
     system("pause");     /* Windows only; or use this function declared in conio.h:    getch();  */
     return 0;
}
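
Pausing only keeps the window open; the program still has to store the words correctly and print something. Putting the pieces of this thread together, a minimal sketch of the counting loop might look like the following. It assumes that reading tokens with fscanf and "%29s" is acceptable, and it folds case and drops punctuation before comparing. count_unique_words and MAX_WORDS are hypothetical names.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define MAX_SIZE  30     /* longest word, including the terminating '\0' */
#define MAX_WORDS 1000   /* assumed capacity of the unique-word list */

/* Returns 1 if word is already among the first size entries of list. */
static int foundWord(char list[][MAX_SIZE], const char word[], int size)
{
    int i;
    for (i = 0; i < size; i++)
        if (strcmp(list[i], word) == 0)
            return 1;
    return 0;
}

/* Reads whitespace-separated tokens from fp, normalizes them, and stores
   each new word in list. Returns the number of unique words stored. */
int count_unique_words(FILE *fp, char list[][MAX_SIZE])
{
    char buffer[MAX_SIZE];
    int count = 0;

    /* "%29s" stops after MAX_SIZE-1 characters, so buffer cannot overflow. */
    while (count < MAX_WORDS && fscanf(fp, "%29s", buffer) == 1) {
        int src, dst = 0;

        /* Keep letters only, lowercased, so "The" and "the," compare equal. */
        for (src = 0; buffer[src] != '\0'; src++) {
            unsigned char c = (unsigned char)buffer[src];
            if (isalpha(c))
                buffer[dst++] = (char)tolower(c);
        }
        buffer[dst] = '\0';

        /* Skip tokens with no letters; add the word only if it is new. */
        if (dst > 0 && !foundWord(list, buffer, count))
            strcpy(list[count++], buffer);
    }
    return count;
}
```

In main(), after a successful fopen, you would call wordcount = count_unique_words(fp, list); and then print wordcount followed by list[0] through list[wordcount-1] with printf. A token longer than 29 characters is read in several pieces by this sketch, so it would be counted as more than one word.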
