Solved

Read a text file and count unique word occurrences then display word count and words

Posted on 2004-08-01
Last Modified: 2010-04-15
I need to write a program that opens a text file and reads each string from the file.  It compares each word to the list of unique words: if the word is already in the list it is ignored, but if it is a new word it is added to the list and the unique word counter is incremented.  In addition, I want it to compare words without regard to case variations or punctuation that may be part of the string.
I know this should be pretty simple.  I am trying to write it with two functions: one to read the text from my file, and a nested function to compare each word to the existing words stored in the unique word array.  However, I cannot figure out how to transition from the read function, int insertWord(char list[][30], FILE*fp), to the int foundWord(char list[][30],char word[], int last) function and then drop back into the read function so that I can display the desired information. As you can see from my code, I am kind of stuck. Or do I have a Gross Conceptual Error?  Please help!

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_SIZE 30


int insertWord(char list[][MAX_SIZE],FILE*fp);
int  foundWord(char list[][MAX_SIZE],char word[], int last);

int main (void){

char list [1000][30];
char word[];
int wordcount;

foundword();


return 0;
}

int  insertWord (char list[][MAX_SIZE],FILE*fp){
 int n;

FILE*fp;
fp=fileopen("pgm6.txt","r");
if (fp==NULL){
    printf("****ERROR**** input file cannot be read);
    exit(1);
    }
foundWord();

fclose(fp);

}

int  foundWord(char list[][MAX_SIZE],char word[], int last);
int i:

for(i=0;i<last;i++)
    if (strcmp(list[i],word)==0) return1:
    return 0;
}



Question by:jholmes9186
9 Comments
 
LVL 55

Expert Comment

by:Jaime Olivares
ID: 11688261
Maybe you have to reorganize a bit:

int main (void)
{

char list [1000][MAX_SIZE];
char word[MAX_SIZE];
int wordcount;

/* open the file here, in main */
FILE *fp;
fp = fopen("pgm6.txt", "r");
if (fp == NULL) {
    printf("****ERROR**** input file cannot be read\n");
    exit(1);
}

wordcount = InsertWords(list, fp); /* Invoke the word insertion function; it must return the resulting word count */
fclose(fp);


return 0;
}

/* Just read the already opened file here */
int InsertWords(char list[][MAX_SIZE], FILE *fp)
{
    int count = 0;            /* word count */
    char buffer[MAX_SIZE];    /* current word buffer */

    while (!feof(fp)) {
          /* Read file, line by line, separate words and store each in buffer array. THIS IS YOUR JOB */

          if (!foundWord(list, buffer, count)) {   /* invoke foundWord to know if it is a repeated word */
                 strcpy(list[count++], buffer);    /* insert word in array */
          }
    }
    return count;
}

int  foundWord(char list[][MAX_SIZE],char word[], int size)   /* it will be clearer to use a "size" argument */
{
int i;

for (i = 0; i < size; i++)
    if (strcmp(list[i], word) == 0) return 1;
return 0;
}

 

Author Comment

by:jholmes9186
ID: 11689010
I appreciate your taking the time to help.  I tried to make the changes you suggested and received the following compiler errors:
(1) It says that exit(1) has no prototype.
(2) Where the program invokes the word function to check for a repeated word, and immediately after that where strcpy is invoked, the compiler says that the function calls do not match their prototypes.

Also, to eliminate the case distinctions, can I use strcasecpy instead of strcpy?

Finally, how do I make the program realize that a word already on the list is the same as a subsequent word that is identical except for a punctuation mark in the string immediately before the '\0'?
 
LVL 55

Expert Comment

by:Jaime Olivares
ID: 11689072
The exit function is declared in "stdlib.h", so it should work once that header is included. Some DOS/Windows compilers also declare it in the non-standard "process.h", so you can try adding this line at the beginning of your program:

#include <process.h>

Prototypes are the lines before the main() function. Since I have made some modifications to your functions, you have to change them to match (C names are case-sensitive):

int InsertWords(char list[][MAX_SIZE], FILE *fp);
int foundWord(char list[][MAX_SIZE],char word[], int size);

To compare without regard to case, use the stricmp() comparison function instead of strcmp() in the foundWord() function implementation.
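
A note on portability: stricmp() is not part of standard C. Some compilers provide it as an extension, and POSIX systems offer strcasecmp() in <strings.h> instead. If neither is available, a minimal sketch of a hand-written comparison could look like this (the name compare_ignore_case is made up for illustration):

```c
#include <ctype.h>

/* Hypothetical helper: compare two strings ignoring case.
   Returns 0 when they are equal, otherwise a negative or positive
   value in the same spirit as strcmp(). */
int compare_ignore_case(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        /* Cast to unsigned char: tolower() is undefined for negative values. */
        int ca = tolower((unsigned char)*a);
        int cb = tolower((unsigned char)*b);
        if (ca != cb)
            return ca - cb;
        a++;
        b++;
    }
    /* At least one string has ended; the longer one compares greater. */
    return tolower((unsigned char)*a) - tolower((unsigned char)*b);
}
```

It can be dropped into foundWord() in place of strcmp() without changing the rest of the loop.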

For your last requirement, before invoking the foundWord() function you have to "trim" undesired characters from the "buffer" array,
something like

int i;
for (i=0; buffer[i]; i++) {
   if (!isalpha((unsigned char)buffer[i])) {
         buffer[i] = 0;   /* end string at the first non-letter */
         break;
   }
}

To use the isalpha() function you must include the ctype.h header at the beginning:
#include <ctype.h>
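
Note that the loop above cuts the word at the first non-letter, so "don't" becomes "don" and a word that starts with a quote mark becomes empty. As a rough alternative sketch, assuming you only want to strip punctuation from both ends of the word and fold case at the same time (normalize_word is a hypothetical name, not from the code above):

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: strip non-letters from both ends of word, in place,
   and convert what remains to lower case. */
void normalize_word(char *word)
{
    size_t start = 0;
    size_t end = strlen(word);
    size_t i;

    /* Skip leading characters that are not letters, e.g. an opening quote. */
    while (start < end && !isalpha((unsigned char)word[start]))
        start++;
    /* Drop trailing characters that are not letters, e.g. "." or ",". */
    while (end > start && !isalpha((unsigned char)word[end - 1]))
        end--;

    /* Shift the kept part to the front, lower-casing as we go. */
    for (i = 0; i < end - start; i++)
        word[i] = (char)tolower((unsigned char)word[start + i]);
    word[i] = '\0';
}
```

After this call, an all-punctuation string is left empty, so check buffer[0] before adding the word to the list.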

 

Author Comment

by:jholmes9186
ID: 11689260
Thanks again.  However I still must be missing something.  Here is my consolidated code.  The compiler (Codewarrior IDE) still does not like the function calls previously discussed.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>


#define MAX_SIZE 30

int foundWord(char list[][MAX_SIZE],char word[],int size);
int insertWord(char list[][MAX_SIZE],FILE*fp);

int main (void){

      char list [1000][30];
      char word[30]; /*Do I even need to declare this array? It doesn't look like it is used anywhere*/

      int wordcount;

      FILE*fp;
      fp=fopen("pgm6.txt","r");
      if (fp==NULL){
          printf("****ERROR**** input file cannot be read");
                exit(1);
      }

      wordcount = insertWord(list, fp);

      fclose(fp);


      return 0;
}


int insertWord(char list[][MAX_SIZE], FILE *fp){
   
    int count = 0,i;  
    char buffer[30];  

    while (!feof(fp)) {
   
    fp=fscanf(fp,"%s",list); /* Is this right? It won't compile and says something about "illegal implicit change of int to struct...?*/
          
          /* Read file, line by line, separate words and store each in buffer array. THIS IS YOUR JOB */
      
      
      for (i=0; buffer[i]; i++) {
               if (!isalpha(buffer[i])) {
               buffer[i] = 0;  
               break;
               }
      }
      

        if (!foundWord(list, buffer))   /* Offending function #1*/
            strcpy (list[count++], buffer, count);  /*Offending function #2*/       
    }
    return count;
}


int  foundWord(char list[][MAX_SIZE],char word[], int size){

      int i;

      for(i=0;i<size;i++)
          if (stricmp(list[i],word)==0) return1:
          return 0;
}
 
LVL 55

Expert Comment

by:Jaime Olivares
ID: 11689283
>   fp=fscanf(fp,"%s",list); /* Is this right? It won't compile and says something about "illegal implicit change of int to struct...?*/
fp=  <--- here is the error: fscanf returns an int (the number of items it read), so there is no need to assign it to fp
Also note that fscanf(..."%s"...) reads one whitespace-delimited word at a time and should store it in buffer, not in list. Without a width limit such as "%29s", a long word can overflow the 30-character buffer.

If you prefer to process the file line by line, use fgets instead:
Declare a line buffer at the beginning of your function:
char line[255];    /* I have assumed your lines are shorter than 255 characters; this could change */

Then, instead of fscanf, read an entire line with fgets
fgets(line, 255, fp);

Now line is a string (actually a character array) that you can process and split into words. Store each word in the "buffer" array and invoke foundWord as explained.
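
For the splitting step, one possible approach is strtok() from <string.h>, which cuts a string at the delimiter characters you give it. A minimal sketch, assuming words are separated by whitespace only; split_line and its parameters are hypothetical names:

```c
#include <string.h>

#define MAX_SIZE 30   /* same word length limit as in the thread */

/* Hypothetical helper: split line into words on whitespace and copy at most
   max_words of them into words[]. The contents of line are modified.
   Returns the number of words stored. */
int split_line(char *line, char words[][MAX_SIZE], int max_words)
{
    int n = 0;
    char *token = strtok(line, " \t\r\n");

    while (token != NULL && n < max_words) {
        /* Copy with truncation so an overlong word cannot overflow. */
        strncpy(words[n], token, MAX_SIZE - 1);
        words[n][MAX_SIZE - 1] = '\0';
        n++;
        token = strtok(NULL, " \t\r\n");
    }
    return n;
}
```

Keep in mind that strtok() overwrites the delimiters in the line it scans, which is fine here because line is refilled by the next fgets() call.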

 

Author Comment

by:jholmes9186
ID: 11689360
I guess I need a little more info on the buffer and fgets and splitting into words.
 
LVL 55

Expert Comment

by:Jaime Olivares
ID: 11689372
Sorry, but I feel I am doing all your work. Splitting into words is the main function of your code, so you will have to read up on string manipulation in C.
 

Author Comment

by:jholmes9186
ID: 11689391
Now I am completely confused.  I tried to run it using a text file on my computer and all I got was a brief flash of black and then nothing.  Do I need to start over from scratch?  Please help.  Frustration level is rising.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>

#define MAX_SIZE 30

int foundWord(char list[][MAX_SIZE],char word[],int size);
int insertWord(char list[][MAX_SIZE],FILE*fp);

int main (void){

     char list [1000][30];

     int wordcount;

     FILE*fp;
     fp=fopen("pgm6.txt","r");
     if (fp==NULL){
         printf("****ERROR**** input file cannot be read");
              exit(1);
     }

     wordcount = insertWord(list, fp);

     fclose(fp);

     return 0;
}


int insertWord(char list[][MAX_SIZE], FILE *fp){
   
    int count = 0, i=0;  
    char buffer[30];  
   
   
    while (fscanf(fp,"%s",list[i])!=EOF)i++ ;
   
     for (i=0; buffer[i]; i++) {
             if (!isalpha(buffer[i])) {
                 buffer[i] = 0;  
                 break;
             }
     }
     

        if (!foundWord(list, buffer,count))  
            strcpy (list[count++], buffer);      
    }
    return count;
}


int  foundWord(char list[][MAX_SIZE],char word[], int size){

     int i;

     for(i=0;i<size;i++)
         if (stricmp(list[i],word)==0) return1:
         return 0;
}
 
LVL 55

Accepted Solution

by:
Jaime Olivares earned 2000 total points
ID: 11689423
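The black flash is probably the console window opening and closing as soon as the program finishes, so pause before exiting. Note also that the program as posted never prints wordcount or the words, so add printf calls before the pause:

int main (void)
{

    .......

     fclose(fp);
     system("pause");     /* Windows only; or use getch(), a non-standard function declared in conio.h */
     return 0;
}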
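
Putting the pieces of this thread together, here is a hedged sketch of the whole word-collecting step. It is not the code from the posts above: it reads with fscanf and a "%29s" width limit instead of feof/fgets, lower-cases each word so that plain strcmp() can be used, and the names normalize, found_word and count_unique_words are chosen only for illustration.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define MAX_SIZE 30     /* longest stored word, including the '\0' */

/* Strip non-letters from both ends of word and lower-case it, in place. */
static void normalize(char *word)
{
    size_t start = 0, end = strlen(word), i;

    while (start < end && !isalpha((unsigned char)word[start]))
        start++;
    while (end > start && !isalpha((unsigned char)word[end - 1]))
        end--;
    for (i = 0; i < end - start; i++)
        word[i] = (char)tolower((unsigned char)word[start + i]);
    word[i] = '\0';
}

/* Return 1 if word is already among the first size entries of list. */
static int found_word(char list[][MAX_SIZE], const char *word, int size)
{
    int i;

    for (i = 0; i < size; i++)
        if (strcmp(list[i], word) == 0)
            return 1;
    return 0;
}

/* Read whitespace-separated words from fp and store each distinct one in
   list (at most max_words entries). Returns the number of unique words. */
int count_unique_words(FILE *fp, char list[][MAX_SIZE], int max_words)
{
    char buffer[MAX_SIZE];
    int count = 0;

    /* "%29s" reads one word and writes at most MAX_SIZE bytes into buffer.
       Looping on the return value avoids the feof() pitfall. */
    while (fscanf(fp, "%29s", buffer) == 1) {
        normalize(buffer);
        if (buffer[0] == '\0')
            continue;               /* token was punctuation only */
        if (count < max_words && !found_word(list, buffer, count))
            strcpy(list[count++], buffer);
    }
    return count;
}
```

In main you would open pgm6.txt with fopen(), call count_unique_words(fp, list, 1000), print the returned count, and then print list[0] through list[count-1] in a loop before closing the file. A word longer than 29 characters is read in several pieces with this format, so raise MAX_SIZE (and the width in the format string) if your text needs it.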
