Re: word count

Posted on 1998-11-24
Last Modified: 2010-04-15
Sorry about the mix-up with the points; I hope everything got sorted out.  I liked the code for word count.

My new question is this: after I open the file and run it through 'word count', is it possible to show each word and the number of times it appears in the file?

The reason I'm offering this many points is that I don't come here that often, and my point total continues to grow.  However, the answers are well worth it, and then some.
Question by:tester090797
LVL 12

Accepted Solution

by:
rwilson032697 earned 400 total points
ID: 1254621
Do something like this:

struct TWord {char *Word; int Count;};

struct TWord Words[1000]; // We'll imagine a maximum of 1000 different words.
int WordCount = 0; // number of entries of Words in use

As you identify each word do this (the word is in TheWord):

{
  int FoundIndex = -1; // index of the matching entry, or -1 if the word is new
  for (int i = 0; i < WordCount; i++)
  {
    if (strcmp(TheWord, Words[i].Word) == 0)
    {
      FoundIndex = i;
      break; // ie: out of the loop
    }
  }

  if (FoundIndex >= 0)
  {
    Words[FoundIndex].Count++;
  }
  else if (WordCount < 1000) // don't run off the end of the array
  {
    Words[WordCount].Word = TheWord; // Actually you would copy the string - you get the picture...
    Words[WordCount].Count = 1; // this is the first occurrence
    WordCount++;
  }
}

// When you have processed all words do this

for (int i =0; i < WordCount; i++)
{
  printf("%s occurred %d times.\n", Words[i].Word, Words[i].Count);
}
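
Pulling the pieces above together, a minimal sketch might look like the following. The function names CountWord and GetCount and the constant MAX_WORDS are made up for this illustration and are not part of the code above; the sketch also assumes each word arrives as a NUL-terminated string, and it makes its own copy of the word because the caller's buffer is normally reused for the next word.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_WORDS 1000  /* hypothetical name; same limit as Words[1000] above */

struct TWord { char *Word; int Count; };

static struct TWord Words[MAX_WORDS];
static int WordCount = 0;

/* Hypothetical helper: record one occurrence of TheWord.
   Returns the word's new count, or -1 if the table is full
   or memory runs out. */
int CountWord(const char *TheWord)
{
  assert(TheWord != NULL);

  /* Linear search: bump the counter if the word is already stored. */
  for (int i = 0; i < WordCount; i++)
  {
    if (strcmp(TheWord, Words[i].Word) == 0)
      return ++Words[i].Count;
  }

  if (WordCount >= MAX_WORDS)
    return -1;

  /* New word: keep a private copy of the string. */
  char *Copy = malloc(strlen(TheWord) + 1);
  if (Copy == NULL)
    return -1;
  strcpy(Copy, TheWord);

  Words[WordCount].Word = Copy;
  Words[WordCount].Count = 1;
  WordCount++;
  return 1;
}

/* Hypothetical helper: how many times has TheWord been seen (0 if never)? */
int GetCount(const char *TheWord)
{
  assert(TheWord != NULL);

  for (int i = 0; i < WordCount; i++)
  {
    if (strcmp(TheWord, Words[i].Word) == 0)
      return Words[i].Count;
  }
  return 0;
}
```

Call CountWord once for every word you pull out of the file, then loop over Words[0] to Words[WordCount - 1] to print the totals. Every call searches the whole table, so this is only reasonable for a modest number of distinct words.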

Cheers,
Raymond.
 
LVL 5

Expert Comment

by:scrapdog
ID: 1254622
#include<stdio.h>
#include<stdlib.h>
#include<ctype.h>
#include<string.h>

struct node {
  int count;
  char *word;
  struct node *next;
} *head = NULL;
typedef struct node node; // lets plain C use "node" as a type name, as the code below does


int file_exists(char *filename);

void addnewword(char *newword, int letter);
int main(void)
{
node *p, *t;
char source[80], newword[100];
int ch; // int, not char, because fgetc returns an int
int index, letter=0;
long count[64];
int wordflag = 0;
int quotemode = 0;
long words=0;
FILE *fp;

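fprintf(stderr, "\nEnter source file name: ");
if(fgets(source, sizeof source, stdin) == NULL) // fgets cannot overflow source the way gets can
exit(1);
source[strcspn(source, "\r\n")] = '\0'; // strip the newline that fgets keeps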

if(!file_exists(source))
{
fprintf(stderr, "\n%s does not exist.\n", source);
exit(1);
}
if((fp = fopen(source, "rb")) == NULL)
{
fprintf(stderr, "\nError opening %s.\n", source);
exit(1);
}

for(index = 21; index < 63; index++)
count[index] = 0;


while(1)
{
ch = fgetc(fp);
if(feof(fp))
break;
if(isalpha((unsigned char)ch)) {
  if(wordflag) {
    if(letter < 98) newword[++letter] = ch; // leave room for the terminating '\0'
  }
  else if(!quotemode) {wordflag=1; words++; letter=0; newword[0]=ch;}
  continue;
}
// any non-letter (including a newline) ends the current word
if(wordflag) { newword[++letter]='\0'; addnewword(newword,letter); wordflag = 0; }
if((ch >= 21) && (ch < 63)) count[ch]++;
if(ch==34) quotemode = (!quotemode);
}
// the file may end in the middle of a word
if(wordflag) { newword[++letter]='\0'; addnewword(newword,letter); }

printf("\nChar\t\tCount\n");
for(index = 21; index < 42; index++)  {
  printf("[%c]\t%ld    ", index, count[index]);
  printf("[%c]\t%ld\n", (index+21), count[index+21]); }
printf("Words:  %ld\n\n",words);

// print the words
for(p=head; p != NULL; p=p->next) printf("%s      %d\n",p->word,p->count);

// free the list
for(p=head, t=NULL; p != NULL; t=p, p=p->next)
  {
       free(p->word);
       if(t != NULL) free(t);
  }
if(t != NULL) free(t);



fclose(fp);
return(0);
}


int file_exists(char *filename)
{
FILE *fp;
if ((fp = fopen(filename, "r")) == NULL)
return 0;
else
{
fclose(fp);
return 1;
}
}

void addnewword(char *newword, int letter)
{
      node *p;
      node *prev=NULL;
      node *newnode;

      // look for the word in the list, remembering the last node visited
      for(p=head; p != NULL; prev=p, p=p->next) {
        if(strcmp(newword, p->word)==0) {p->count++; return;}
      }

      // not found: make a node holding its own copy of the word
      newnode=(node *)malloc(sizeof(node));
      if(newnode != NULL) newnode->word=(char *)malloc(letter+1);
      if(newnode == NULL || newnode->word == NULL) {
        fprintf(stderr, "\nOut of memory.\n");
        exit(1);
      }
      strcpy(newnode->word, newword);
      newnode->count=1;
      newnode->next=NULL;

      // append it to the end of the list, or make it the head of an empty list
      if(prev != NULL) prev->next=newnode;
      else head=newnode;
}

Here is an idea that builds on your previous program.  You might have to do a little debugging however.
 
LVL 5

Expert Comment

by:scrapdog
ID: 1254623
>I liked the code for word count.

The one I gave you?
 
LVL 1

Expert Comment

by:FuzzyLogic
ID: 1254624
Scrapdog - I think the best way is to use a trie data structure (a tree in which each node has up to 26 children, one per letter). Your code does the job in O(N^2), whereas a trie can do it in O(N*L), where L is the word length and N is the number of words.
With large files, the difference becomes significant.
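
For anyone who wants to try the trie idea, here is a rough sketch. It is not taken from any code in this thread: the names TrieNode, trie_new, trie_add, trie_count and trie_print are invented for the example, it folds upper and lower case together, and it assumes the letters 'a' to 'z' are contiguous in the character set (true for ASCII). Adding or looking up a word takes one step per letter, no matter how many different words are already stored.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

#define LETTERS 26

/* One trie node: a counter plus one child pointer per letter a..z. */
struct TrieNode {
  long count;                       /* times the word ending here was seen */
  struct TrieNode *child[LETTERS];  /* NULL where no stored word continues */
};

/* Allocate an empty node; returns NULL if memory runs out. */
struct TrieNode *trie_new(void)
{
  struct TrieNode *n = malloc(sizeof *n);
  if (n != NULL) {
    n->count = 0;
    for (int k = 0; k < LETTERS; k++) n->child[k] = NULL;
  }
  return n;
}

/* Add one occurrence of word. Returns the new count, or -1 if the
   word contains a non-letter or memory runs out. */
long trie_add(struct TrieNode *root, const char *word)
{
  assert(root != NULL && word != NULL);
  struct TrieNode *n = root;
  for (; *word != '\0'; word++) {
    int k = tolower((unsigned char)*word) - 'a';
    if (k < 0 || k >= LETTERS) return -1;
    if (n->child[k] == NULL && (n->child[k] = trie_new()) == NULL) return -1;
    n = n->child[k];  /* follow (or extend) the path one letter */
  }
  return ++n->count;
}

/* Return how often word was added (0 if never). */
long trie_count(const struct TrieNode *root, const char *word)
{
  assert(word != NULL);
  const struct TrieNode *n = root;
  for (; n != NULL && *word != '\0'; word++) {
    int k = tolower((unsigned char)*word) - 'a';
    if (k < 0 || k >= LETTERS) return 0;
    n = n->child[k];
  }
  return n != NULL ? n->count : 0;
}

/* Print every stored word and its count in alphabetical order.
   buf must have room for the longest word plus a terminator. */
void trie_print(const struct TrieNode *n, char *buf, int depth)
{
  if (n == NULL) return;
  if (n->count > 0) { buf[depth] = '\0'; printf("%s      %ld\n", buf, n->count); }
  for (int k = 0; k < LETTERS; k++) {
    buf[depth] = (char)('a' + k);
    trie_print(n->child[k], buf, depth + 1);
  }
}
```

To use it, create the root with trie_new(), call trie_add(root, word) for each word, and finish with trie_print(root, buf, 0) where buf is a char array longer than your longest word. The sketch never frees its nodes, and each node carries 26 pointers, so it trades memory for speed.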
 
LVL 5

Expert Comment

by:scrapdog
ID: 1254625
I was just making a change in a program tester submitted in another question.  I did not want to rewrite his/her entire program.

Yep, I know that using trees would be faster, but of course these are a little more difficult to code than a linked list.
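
For comparison with the linked list, a binary search tree version could be sketched roughly like this. The names tnode, tree_add, tree_print and tree_free are made up for the example. With words arriving in no particular order, each insert usually needs far fewer comparisons than walking a whole list, but if the words arrive already sorted the tree degenerates into a list, so treat it as a starting point only.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Like the linked-list node, but with two links instead of one. */
struct tnode {
  int count;
  char *word;
  struct tnode *left;   /* words that sort before this one */
  struct tnode *right;  /* words that sort after this one */
};

/* Add one occurrence of word to the tree rooted at *root.
   Returns the node for the word, or NULL if memory runs out. */
struct tnode *tree_add(struct tnode **root, const char *word)
{
  assert(root != NULL && word != NULL);

  /* Walk down, going left or right by comparison, until the
     word or an empty slot is found. */
  while (*root != NULL) {
    int cmp = strcmp(word, (*root)->word);
    if (cmp == 0) { (*root)->count++; return *root; }
    root = (cmp < 0) ? &(*root)->left : &(*root)->right;
  }

  struct tnode *n = malloc(sizeof *n);
  if (n == NULL) return NULL;
  n->word = malloc(strlen(word) + 1);
  if (n->word == NULL) { free(n); return NULL; }
  strcpy(n->word, word);
  n->count = 1;
  n->left = n->right = NULL;
  *root = n;  /* hook the new node into the empty slot */
  return n;
}

/* An in-order walk prints the words in sorted order. */
void tree_print(const struct tnode *t)
{
  if (t == NULL) return;
  tree_print(t->left);
  printf("%s      %d\n", t->word, t->count);
  tree_print(t->right);
}

/* Free every node together with its copy of the word. */
void tree_free(struct tnode *t)
{
  if (t == NULL) return;
  tree_free(t->left);
  tree_free(t->right);
  free(t->word);
  free(t);
}
```

Start with struct tnode *root = NULL; and call tree_add(&root, newword) wherever the list version calls addnewword. As a side effect, tree_print(root) lists the words alphabetically.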
