Solved

PROGRAM THAT COUNTS THE WORDS OF A TXT FILE

Posted on 2004-09-09
Last Modified: 2010-04-15
Hi
I have been trying to write a program that counts the number of words in a text file. I know it is probably easy, but I have no expertise in this. I am a DBA and have no idea how to do it. Can anybody help me? Thank you so much.
Question by:caro1216
9 Comments
 
LVL 12

Accepted Solution

by:
stefan73 earned 25 total points
ID: 12016356
Hi caro1216,
The basic algorithm is like this: You have an "in word" and a "not in word" state. Every time the state changes from "not in word" to "in word", you increase the word counter.

Obviously, it's up to you how to define the conditions for a change between "in word" and "not in word". Would you count a word such as "hyper-sensitive" as one word or as two?
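
As a minimal sketch of the two-state idea, assuming a "word" is simply an unbroken run of isalpha() characters (the function name count_words and the choice to work on a string rather than a file are only for illustration):

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: counts words in a NUL-terminated string.
   Assumption: a "word" is a maximal run of isalpha() characters. */
long count_words(const char *text)
{
    long count = 0;
    int in_word = 0;                 /* state: 0 = "not in word", 1 = "in word" */

    for (size_t i = 0; text[i] != '\0'; i++) {
        /* Cast to unsigned char: isalpha() is undefined for negative values. */
        if (isalpha((unsigned char)text[i])) {
            if (!in_word) {          /* transition "not in word" -> "in word" */
                in_word = 1;
                count++;
            }
        } else {
            in_word = 0;             /* any other character ends the word */
        }
    }
    return count;
}
```

With this definition, count_words("This is a test.") gives 4 and count_words("hyper-sensitive") gives 2, because the hyphen is not alphabetic. Changing what counts as a word only means changing the test inside the loop.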

Check the C routine isalpha() from <ctype.h>. If you have your own definition of alphabetic characters (for example, one that includes foreign letters), a lookup table is a simple and fast solution.
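
A lookup table could look roughly like the sketch below. The names word_char, add_word_chars, init_word_table and count_words_table are made up for this example, and it assumes a single-byte character set, so one table entry per possible byte value is enough:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical lookup table: word_char[c] is 1 if byte c may appear in a word. */
static unsigned char word_char[256];

/* Marks every byte in `chars` as a word character. */
void add_word_chars(const char *chars)
{
    for (; *chars != '\0'; chars++)
        word_char[(unsigned char)*chars] = 1;
}

/* Resets the table so that only the ASCII letters are word characters. */
void init_word_table(void)
{
    memset(word_char, 0, sizeof word_char);
    add_word_chars("abcdefghijklmnopqrstuvwxyz");
    add_word_chars("ABCDEFGHIJKLMNOPQRSTUVWXYZ");
}

/* Counts words using the table instead of isalpha(). */
long count_words_table(const char *text)
{
    long count = 0;
    int in_word = 0;

    for (; *text != '\0'; text++) {
        int is_word = word_char[(unsigned char)*text];
        if (is_word && !in_word)     /* entering a new word */
            count++;
        in_word = is_word;
    }
    return count;
}
```

After init_word_table(), "hyper-sensitive" counts as two words; after an extra add_word_chars("-") it counts as one. Accented letters in a single-byte encoding could be added the same way, by passing their byte values to add_word_chars().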

Cheers!

Stefan
 

Author Comment

by:caro1216
ID: 12017306
Thank you.
 
LVL 45

Expert Comment

by:Kent Olsen
ID: 12020586

Hi Carol,

You're going to have to make a few decisions before you start counting.  Mostly, what constitutes a "word".  Is it alphabetic data only?  Mixed alphabetic and numeric?  Numeric data only?  How do you count hyphenated words?  Do you want to explicitly look for decimal data and count that as one word?

Lots of decisions......

But Stefan is right.  You can start by simply writing a small program to examine, count, and skip based on the first character of each sequence, then modify it if necessary.

It's so nice to be able to help someone else when I can't quite get my head wrapped around my own Database issue.  :)  Here's some code to get you started...


#include <ctype.h>
#include <stdio.h>

int main (void)
{
  int WordCount = 0;
  int cc;   /*  int, not char, so that EOF can be told apart from real data  */

  /*  move to the first alphabetic character in the file  */
  while ((cc = getchar ()) != EOF && !isalpha (cc))
    ;       /*  if we hit EOF here, there was no character worth looking at  */

  while (cc != EOF)
  {
    WordCount++;
    while (cc != EOF && isalpha (cc))   /*  skip past this word  */
      cc = getchar ();
    while (cc != EOF && !isalpha (cc))  /*  skip to the next word  */
      cc = getchar ();
  }
  printf ("%d words found.\n", WordCount);
  return 0;
}


Kent
 
LVL 23

Expert Comment

by:brettmjohnson
ID: 12020815
The easiest way to do it is to let someone else do it:

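long numWords = 0;
FILE *fp = popen("wc -w myfile.txt", "r");  /* popen() is POSIX, declared in <stdio.h> */
if (fp) {
  fscanf(fp, "%ld", &numWords);  /* wc prints the count first, then the file name */
  pclose(fp);
}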


The next easiest is to use appropriate standard library functions to do the work,
specifically strtok():

#include <stdio.h>
#include <string.h>

int main(int argc, char *argv[])
{
  FILE * fd;
  long numWords = 0;
  static char buff[4096]; /* if a line is longer than the buffer, the word at the boundary may be cut in two */

  /* open the file named on the command line */
  if (argc < 2 || !(fd = fopen(argv[1], "r")))
    return 1;

  /* read the file line-by-line, tokenize each line, count number of words on each line */
  while (fgets(buff, sizeof(buff), fd)) {
    char *word;
    static const char *sep = " \t\r\n\"()!?,.;:/[]{}+=@<>#*&^|`~";
    for (word = strtok(buff, sep); word; word = strtok(NULL, sep))
       numWords++;
  }

  fclose(fd);
  printf("%ld words found.\n", numWords);
  return 0;
}



 
LVL 45

Assisted Solution

by:Kent Olsen
Kent Olsen earned 25 total points
ID: 12022483

Good call Brett,

There's still a fair amount of special casing to do.


0000   -- integer (word)
00.3    -- real     (word)
.3       -- real      (word)
. 3      -- int        (word)
.3.3    -- ???

This is a test.       4 words
"This is a test"     ?? words
'This is a test'      ?? words
This is Brett's test ?? words (Possible apostrophe, possible unterminated quote.)

One-Word-Or-Four?  (Counting hyphens depends on the context.)

12-4=8                    (I'll buy the argument that this is three words.)
10-4                        (2 words if algebra, 1 word if said as an "OK" or "Roger")

...                           (How do you count an ellipsis?  Or do you?)
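
One possible way to settle the apostrophe and hyphen cases, shown only as a sketch of one policy among many: treat ' and - as part of a word when there is a letter directly on both sides, and as a separator otherwise. The names is_joiner and count_words_joined are invented for this example, and digits are deliberately not counted at all here.

```c
#include <ctype.h>
#include <stddef.h>

/* Returns nonzero if text[i] is an apostrophe or hyphen with a letter
   directly on both sides, e.g. the ' in "Brett's". */
static int is_joiner(const char *text, size_t i)
{
    if (text[i] != '\'' && text[i] != '-')
        return 0;
    /* text[i + 1] is safe to read: at worst it is the terminating NUL. */
    return i > 0
        && isalpha((unsigned char)text[i - 1])
        && isalpha((unsigned char)text[i + 1]);
}

/* Counts words; a word is a run of letters and "joiner" characters. */
long count_words_joined(const char *text)
{
    long count = 0;
    int in_word = 0;

    for (size_t i = 0; text[i] != '\0'; i++) {
        int is_word = isalpha((unsigned char)text[i]) || is_joiner(text, i);
        if (is_word && !in_word)     /* entering a new word */
            count++;
        in_word = is_word;
    }
    return count;
}
```

Under this policy "This is Brett's test" and 'This is a test' both come out as 4 words, and One-Word-Or-Four? comes out as 1. Numbers such as 12-4=8 would need a rule of their own.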


Here Caro asked a simple question and on my first venture back to the boards in a while I've managed to give him/her too much to think about....

Kent

 
LVL 11

Expert Comment

by:cjjclifford
ID: 12025362
How about the standard Unix program "wc"? It counts lines, words, and bytes (or characters, with -m).

# to print all three counts (lines, words, bytes)
wc filename

# to count words
wc -w filename
 

Author Comment

by:caro1216
ID: 12026444
wow.
Thank you, everybody. That is helping me a lot. No UNIX for now, thanks; that will be my next assignment, and I am going to keep an eye on it.