Solved

# Re: word count

Posted on 1998-11-24
Sorry about the mix-up with the points; I hope everything got sorted out.  I liked the code for word count.

My new question is this: after I open the file and run it through 'word count', is it possible to show each word and the number of times it appears in the file?

The reason I'm offering this many points is that I don't come here very often, and my point total continues to grow.  However, the answers are well worth it, and then some.
Question by:tester090797

LVL 12

Accepted Solution

rwilson032697 earned 400 total points
ID: 1254621
Do something like this:

struct TWord { char *Word; int Count; };

struct TWord Words[1000]; // We'll imagine a maximum of 1000 different words.
int WordCount = 0;        // Number of entries of Words in use.

As you identify each word do this (the word is in TheWord):

{
    int FoundIndex = -1; // index of the matching entry, or -1 if the word is new
    for (int i = 0; i < WordCount; i++)
    {
        if (strcmp(TheWord, Words[i].Word) == 0)
        {
            FoundIndex = i;
            break; // ie: out of the loop
        }
    }

    if (FoundIndex >= 0)
    {
        Words[FoundIndex].Count++;
    }
    else if (WordCount < 1000) // don't run past the end of the array
    {
        Words[WordCount].Word = TheWord; // Actually you would copy the string - you get the picture...
        Words[WordCount].Count = 1;      // this is the first occurrence
        WordCount++;
    }
}

// When you have processed all words do this

for (int i = 0; i < WordCount; i++)
{
    printf("%s occurred %d times.\n", Words[i].Word, Words[i].Count);
}

Cheers,
Raymond.
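
Raymond's outline can be assembled into a small helper, shown here as a minimal sketch rather than a definitive implementation. The names `AddWord`, `CountOf` and `MAX_WORDS` are not from the post; they are assumptions made for illustration, as is the choice to copy each word with `malloc` so the caller can reuse its buffer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_WORDS 1000 /* the same imagined limit as in the post */

struct TWord { char *Word; int Count; };

struct TWord Words[MAX_WORDS];
int WordCount = 0;

/* Record one occurrence of TheWord (hypothetical helper).
   Returns the word's new count, or -1 if the table is full
   or memory runs out. */
int AddWord(const char *TheWord)
{
    assert(TheWord != NULL);

    /* Linear search through the words seen so far. */
    for (int i = 0; i < WordCount; i++) {
        if (strcmp(TheWord, Words[i].Word) == 0)
            return ++Words[i].Count;
    }

    if (WordCount >= MAX_WORDS)
        return -1;

    /* Copy the string, as the post's comment suggests. */
    char *copy = (char *)malloc(strlen(TheWord) + 1);
    if (copy == NULL)
        return -1;
    strcpy(copy, TheWord);

    Words[WordCount].Word = copy;
    Words[WordCount].Count = 1; /* first occurrence */
    WordCount++;
    return 1;
}

/* Look up a word's count without changing the table; 0 if absent. */
int CountOf(const char *TheWord)
{
    for (int i = 0; i < WordCount; i++) {
        if (strcmp(TheWord, Words[i].Word) == 0)
            return Words[i].Count;
    }
    return 0;
}
```

Call `AddWord` once for each word as it is identified, then loop over `Words[0]` to `Words[WordCount - 1]` to print the totals. Every lookup scans the whole table, so this suits small files best.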

LVL 5

Expert Comment

ID: 1254622
#include<stdio.h>
#include<stdlib.h>
#include<ctype.h>
#include<string.h>

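typedef struct node {
    int count;
    char *word;
    struct node *next;
} node;

node *head = NULL; // first entry of the word list

int file_exists(char *filename);
void addnewword(char *newword, int letter);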

int main(void)
{
    node *p, *t;
    char source[80], newword[100];
    int ch; // int rather than char, so EOF can be told apart from data
    int index, letter = 0;
    long count[64];
    int wordflag = 0;
    int quotemode = 0;
    long words = 0;
    FILE *fp;

    fprintf(stderr, "\nEnter source file name: ");
    // fgets instead of gets: it cannot overflow the buffer
    if (fgets(source, sizeof source, stdin) == NULL)
        exit(1);
    source[strcspn(source, "\n")] = '\0'; // strip the trailing newline

    if (!file_exists(source))
    {
        fprintf(stderr, "\n%s does not exist.\n", source);
        exit(1);
    }
    if ((fp = fopen(source, "rb")) == NULL)
    {
        fprintf(stderr, "\nError opening %s.\n", source);
        exit(1);
    }

    for (index = 21; index < 63; index++)
        count[index] = 0;

    while ((ch = fgetc(fp)) != EOF)
    {
        if (isalpha(ch))
        {
            if (wordflag)
            {
                if (letter < 98) newword[++letter] = (char)ch; // leave room for '\0'
            }
            else if (!quotemode)
            {
                wordflag = 1; words++; letter = 0; newword[0] = (char)ch;
            }
        }
        else
        {
            // any non-letter ends the current word
            if (wordflag) { newword[++letter] = '\0'; addnewword(newword, letter); wordflag = 0; }
            if ((ch >= 21) && (ch < 63)) count[ch]++;
            if (ch == 34) quotemode = (!quotemode);
        }
    }
    // the file may end in the middle of a word
    if (wordflag) { newword[++letter] = '\0'; addnewword(newword, letter); }

    printf("\nChar\t\tCount\n");
    for (index = 21; index < 42; index++) {
        printf("[%c]\t%ld    ", index, count[index]);
        printf("[%c]\t%ld\n", (index + 21), count[index + 21]);
    }
    printf("Words:  %ld\n\n", words);

    // print the words
    for (p = head; p != NULL; p = p->next)
        printf("%s      %d\n", p->word, p->count);

    // free the list; remember the next node before freeing the current one
    for (p = head; p != NULL; p = t)
    {
        t = p->next;
        free(p->word);
        free(p);
    }

    fclose(fp);
    return 0;
}

int file_exists(char *filename)
{
FILE *fp;
if ((fp = fopen(filename, "r")) == NULL)
return 0;
else
{
fclose(fp);
return 1;
}
}

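// Add one occurrence of newword (letter characters long) to the list.
void addnewword(char *newword, int letter)
{
    node *p = head, *prev = NULL;
    node *newnode;
    char *newstring;

    // walk the whole list, including its last node
    while (p != NULL) {
        if (strcmp(newword, p->word) == 0) { p->count++; return; }
        prev = p; p = p->next;
    }

    // not found: copy the word into a new node at the end of the list
    newnode = (node *)malloc(sizeof(node));
    newstring = (char *)malloc(letter + 1);
    if (newnode == NULL || newstring == NULL) {
        fprintf(stderr, "\nOut of memory.\n");
        exit(1);
    }
    strcpy(newstring, newword);
    newnode->word = newstring;
    newnode->count = 1;
    newnode->next = NULL;
    if (prev != NULL) prev->next = newnode;
    else head = newnode; // the list was empty
}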

Here is an idea that builds on your previous program.  You might have to do a little debugging however.

LVL 5

Expert Comment

ID: 1254623
>I liked the code for word count.

The one I gave you?

LVL 1

Expert Comment

ID: 1254624
Scrapdog - I think the best way is to use a trie data structure (a tree in which each node has up to 26 children, one per letter). Your code needs O(N^2) string comparisons in the worst case, whereas a trie finds or inserts each word in O(L) steps, about O(N*L) overall, where L is the word length and N is the number of words.
With large files, the difference becomes significant.
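
A trie for this job might look like the following minimal sketch. The names `TrieNode`, `trie_new`, `trie_add`, `trie_count` and `trie_free` are hypothetical, and the sketch assumes that words contain only the letters a to z, that upper and lower case count as the same word, and that the character set is ASCII.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

#define ALPHABET 26

typedef struct TrieNode {
    struct TrieNode *child[ALPHABET]; /* one slot per letter a..z */
    long count;                       /* how many times a word ended here */
} TrieNode;

/* Allocate an empty node; returns NULL if memory runs out. */
TrieNode *trie_new(void)
{
    TrieNode *node = (TrieNode *)malloc(sizeof *node);
    if (node != NULL) {
        for (int i = 0; i < ALPHABET; i++)
            node->child[i] = NULL;
        node->count = 0;
    }
    return node;
}

/* Map a character to a child slot 0..25, or -1 if it is not a letter.
   Assumes a..z are contiguous, as they are in ASCII. */
static int slot_of(char c)
{
    int lower = tolower((unsigned char)c);
    return (lower >= 'a' && lower <= 'z') ? lower - 'a' : -1;
}

/* Add one occurrence of word. Returns the new count, or -1 on
   failure (out of memory, or a character that is not a letter). */
long trie_add(TrieNode *root, const char *word)
{
    assert(root != NULL && word != NULL);
    TrieNode *node = root;
    for (; *word != '\0'; word++) {
        int slot = slot_of(*word);
        if (slot < 0)
            return -1;
        if (node->child[slot] == NULL) {
            node->child[slot] = trie_new();
            if (node->child[slot] == NULL)
                return -1;
        }
        node = node->child[slot]; /* one step down per letter */
    }
    return ++node->count;
}

/* Return how often word was added; 0 if it never was. */
long trie_count(const TrieNode *root, const char *word)
{
    const TrieNode *node = root;
    for (; node != NULL && *word != '\0'; word++) {
        int slot = slot_of(*word);
        if (slot < 0)
            return 0;
        node = node->child[slot];
    }
    return node != NULL ? node->count : 0;
}

/* Free a node and everything below it. */
void trie_free(TrieNode *node)
{
    if (node == NULL)
        return;
    for (int i = 0; i < ALPHABET; i++)
        trie_free(node->child[i]);
    free(node);
}
```

Each call to `trie_add` walks one node per letter, so its cost depends on the word's length and not on how many distinct words are already stored. The price is memory: every node carries 26 pointers.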

LVL 5

Expert Comment

ID: 1254625
I was just making a change in a program tester submitted in another question.  I did not want to rewrite his/her entire program.

Yep, I know that using trees would be faster, but of course these are a little more difficult to code than a linked list.
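
For comparison, here is roughly how much extra code a plain binary search tree takes over the linked list. This is a minimal sketch under stated assumptions: the names `TreeNode`, `tree_add`, `tree_find`, `tree_print` and `tree_free` are hypothetical, and the tree is not balanced, so lookups are fast on average for words in random order but degrade toward a linked list if the words arrive already sorted.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct TreeNode {
    char *word;
    long count;
    struct TreeNode *left, *right; /* smaller words left, larger right */
} TreeNode;

/* Add one occurrence of word to the tree rooted at *root.
   Returns the new count, or -1 if memory runs out. */
long tree_add(TreeNode **root, const char *word)
{
    assert(root != NULL && word != NULL);

    /* Descend until the word or an empty link is found. */
    while (*root != NULL) {
        int cmp = strcmp(word, (*root)->word);
        if (cmp == 0)
            return ++(*root)->count;
        root = (cmp < 0) ? &(*root)->left : &(*root)->right;
    }

    TreeNode *node = (TreeNode *)malloc(sizeof *node);
    if (node == NULL)
        return -1;
    node->word = (char *)malloc(strlen(word) + 1);
    if (node->word == NULL) {
        free(node);
        return -1;
    }
    strcpy(node->word, word);
    node->count = 1;
    node->left = node->right = NULL;
    *root = node; /* hook the new node into the empty link */
    return 1;
}

/* Return the count stored for word; 0 if it is not in the tree. */
long tree_find(const TreeNode *node, const char *word)
{
    while (node != NULL) {
        int cmp = strcmp(word, node->word);
        if (cmp == 0)
            return node->count;
        node = (cmp < 0) ? node->left : node->right;
    }
    return 0;
}

/* In-order walk: prints the words in strcmp order. */
void tree_print(const TreeNode *node, FILE *out)
{
    if (node == NULL)
        return;
    tree_print(node->left, out);
    fprintf(out, "%s      %ld\n", node->word, node->count);
    tree_print(node->right, out);
}

/* Free every node and its word. */
void tree_free(TreeNode *node)
{
    if (node == NULL)
        return;
    tree_free(node->left);
    tree_free(node->right);
    free(node->word);
    free(node);
}
```

Start with `TreeNode *root = NULL;`, call `tree_add(&root, word)` for each word, and finish with `tree_print(root, stdout)`. A side benefit over the list is that the words come out in sorted order.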