downatone


Reading in word at a time from plain text file?

Hi the title basically sums up what I'm trying to do.
I am currently doing this:

FILE *inputPtr;
char *token[999];
int a = 0;

inputPtr = fopen ("newfile.txt" , "r");

while((fgets (str , 40 , inputPtr)) != NULL)
{
  token[a] = strtok(str," ");
  printf("Current word to be checked is %s\n",token[a]);
  a++;
}

From what I've read, fgets will read in one line at a time from a text file (I've indicated it can be up to 40 chars long), and then token[a] will point to each word, with words separated by a space (" "). So if my text file were to look like this:

Hello and welcome to the place where the answers are found and solutions are saught after

It would put:
token[0] = Hello
token[1] = and
token[2] = welcome

Anyway you get the idea. This is what I am TRYING to get as my result, but am not.
It's currently jumping from one word in the file to another random one...
Any solutions would be MUCH appreciated.
By the way (in case it matters), I'm programming in C on Linux, using gcc as the compiler.
cheers
David
 
prady_21

I think this program will work for you, so give it a try.
I prefer not to use strtok.
First of all, your program gave fgets a size of only 40 characters, but the line in your file is longer than that.




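#include <stdio.h>
#include <string.h>

#define MAX_WORDS 50
#define MAX_LINE  500

int main(void)
{
    char line[MAX_LINE];              /* buffer that fgets fills */
    char array[MAX_WORDS][MAX_LINE];  /* words found on the current line */
    char *str, *index;
    int num, i;
    size_t len;
    FILE *inputPtr;

    inputPtr = fopen("newfile.txt", "r");
    if (inputPtr == NULL) {
        perror("newfile.txt");
        return 1;
    }

    while (fgets(line, sizeof line, inputPtr) != NULL)
    {
        line[strcspn(line, "\n")] = '\0';   /* drop the trailing newline */
        str = line;
        num = 0;
        while (num < MAX_WORDS - 1 && (index = strchr(str, ' ')) != NULL) {
            len = (size_t)(index - str);
            if (len > 0) {                  /* skip empty fields from repeated spaces */
                memcpy(array[num], str, len);
                array[num][len] = '\0';     /* terminate the copied word */
                num++;
            }
            str = index + 1;                /* always step past the space */
        }
        if (*str != '\0') {                 /* last word on the line */
            strcpy(array[num], str);
            num++;
        }
        for (i = 0; i < num; i++) {
            printf("%s\n", array[i]);
        }
    }
    fclose(inputPtr);
    return 0;
}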
downatone

ASKER

Thanks for the prompt reply prady.
Warning this message might end up being long, but I'll try to keep it short!
The code you gave pretty much does what I want, but it's given a little bit of strange behaviour here and there. I don't know how to explain this better than to let you run my exact code (I can't find a way to attach the code, so I'll paste the core of what I'm trying to do):

#ifndef SPELLCHECKER_H

#define SPELLCHECKER_H

/* Maximal length for a label or input field */
#define MAX_LENGTH 40
#define MAX_READ 400


/* Node structure for a label or input field */
typedef struct Field
{
      /* Text of the label, or value of an input field */
      /* Extra 1 for the null-terminator */
      char text[MAX_LENGTH+1];

      /* Number of Words in specific dictionary */
      int NumOfWords;
      
      /* Length of each word */
      int len;

      /* Double linked list for easy access */
      /* You can use simple linked list if you are not comfortable with double linked list */
      struct Field *next;
      struct Field *prev;

} Field;

typedef struct Lib
{

      /* Linux dictionary located @ (/usr/share/dict/words) */
      Field *linux_dictionary;

      /* Personal dictionary of words that will be written to */
      Field *personal_dictionary;

      Field *personal;

      Field *linuxWords;

      /* Point to the input field currently being edited */
      Field *current;

} Lib;

#endif

/* --------------END OF PROGRAM_NAME.H ----------------*/

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "spellchecker.h"

int main (int argc, char* argv[])

{

/* ------------------------------------- STEP 0 ----------------------------------------------- */
   FILE *linuxWordsPtr;
   FILE *inputPtr;
   FILE *outPtr;

   char str[MAX_LENGTH], *token[MAX_READ];
   int a = 0, b;
   long i=0;

   char *index, *string;
   char array[50][500];
   int num,len,z,length;

   Field *new;
   Lib *XP;
   XP = (Lib*)calloc(1,sizeof(Lib));

   linuxWordsPtr = fopen ("/usr/share/dict/words", "r");
   inputPtr = fopen (argv[1], "r");
   outPtr = fopen ("newfile.txt" , "w");



   if(argv==NULL)
   {
     printf("Error ---- You must enter a valid argument");
   }

   else{

     //if file @ command line is empty return error !!! but does not currently work !!!
     if (inputPtr == NULL)
     {
       printf("Error opening file to spell check -- File may not exist or be empty\n");
       return 0;
     }

/* ------------------------------------- STEP 1 ----------------------------------------------- */

     //ptr = NULL;

     XP->linuxWords = NULL;
     //this loop takes the linux dictionary and puts it in a linked list (dictionary.linux_dictionary)
     while((fgets (str , 40 , linuxWordsPtr)) != NULL)
     {
       //token[a] = strtok(str," ");
       //printf("Current word in linux dict is: %s\n", token[a]);
       //dictionary.linux_dictionary->NumOfWords = dictionary.linux_dictionary->NumOfWords + 1;
       //printf("Current word is: %s\n", str);

       new = (Field*)calloc(1,sizeof(Field));

       if(XP->linuxWords==NULL)
       {
       XP->linuxWords = new;
       XP->current = new;
       new->prev = NULL;
       new->next = NULL;
       strcpy(new->text, str);
       new->len = strlen(new->text);
       }

       else
       {
       new->next = NULL;
       new->prev = XP->current;
       strcpy(new->text, str);
       new->len = strlen(new->text);
       XP->current->next = new;
       XP->current = new;
       }

       //printf("Current word in linked list is: %s\n", new->text);

     }//end of fgets while loop


    //traverse the list!!!
    /*XP->current = XP->linuxWords;
    while(XP->current->next!=NULL)
    {
      printf("Word is: %s\n",XP->current->text);
      XP->current = XP->current->next;
    }*/


    while((fgets (string , 400 , inputPtr)) != NULL)
    {
      num = 0;
      while(index = strchr(string,' '))
      {
        len = index - string;
        if ( len > 0 )
      {
          strncpy(array[num],string,len);
          length=strlen(array[num]);
          array[num][length+1] = '\0';
          string = index+1;
          num++;
        }
      }
      strncpy(array[num],str,len+1);
      num++;

      for(z=0; z<num; z++)
      {
        printf("%s ",array[z]);
      }

     }//end of big fgets, while loop, reading in file to be checked

     }

     return 0;

}//end of main


Now, in here I think that the linked list is for some reason interfering with the code you gave me.
Any further thoughts?
cheers
David
(apologies for just having to throw a bunch of code up)
Ignore that last comment, I find it hard to follow (the comment, that is!). Basically I'm trying to take a txt file, break it up into single words, and then put each word into a linked list...
The code that you provided, prady, didn't seem to work too well with what I was trying to do, whereas strtok() can be done on one line... It was also not accurate every time; I was getting a lot of strange results with it. Anyway, I've already changed a wee bit of the code above. If I can't figure it out within the next hour, I'll post where I'm at. Apologies for the 'waveryness' (not a word) of these postings!
cheers
David
The problem with your original program is that you read a line (or 40 characters), call strtok once, read another line, call strtok on that one, and so on. So you're only finding the first token in each block that you've read.

In order to use strtok properly, you need to keep calling it on the current string before moving to the next string.  Hence you need a nested loop.  Read the man page for strtok on how to do this (or ask for more info).

Finally, are you sure that 40 characters per line is enough?  If it's not, you risk getting words broken in the middle.

Gary
Gary,
Ok, checked the man pages no real help on the nested if statements, oi this makes me feel dumb!
Ok so my code has been simplified to try to resolve this problem...I'll worry about putting it into a linked list later.
So heres the current code:

token[a] = "initialize";

while((fgets (string2 , 400 , inputPtr)) != NULL)
{
  while(token[a]!=NULL)
  {
    token[a] = strtok(string2, " ");
    printf("current word is %s\n",token[a]);
    a++;
  }
}

So string2 is an essay-style string, and fgets takes the first line... right? Then strtok() should take each word and put it into token[a], one word at a time (a++ increments). But my print statement keeps printing my first word. I'm assuming I've set the loop up wrong.
Any thoughts, or preferably code to help fix this?
Thanks a lot for all the help everyone, much appreciated.
cheers
David
As I recall, strtok takes a null pointer after the first call for each string.  Thus move your current call on strtok to just before the second while, and add another call on strtok, using NULL as the first parameter, and put it after the a++.

Gary
Well, yes: after the first call to strtok on a string, pass NULL as the first argument so that strtok carries on with the next token of the same string. See the man page for strtok.


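  fgets(buffer, sizeof buffer, stdin);  /* assumes buffer is a char array; gets() cannot limit input length */
  i = 0;
  token = strtok(buffer, " \n");        /* first call: pass the string */
  while( token != NULL ) {
      strcpy(word[i], token);           /* copy the word into its own slot */
      i++;
      token = strtok(NULL, " \n");      /* later calls: pass NULL */
  }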

hope this helps
I think this should suffice.
When you use fgets, make sure the size you specify is greater than the length of the input line.
One more thing: if this doesn't work, post the code in the C or C++ section, where I think you will be answered soon.

:)
Good point, thanks for all the help. It's still not really working, so I'll post it in the C section (can't really remember why I put it here, my bad!)
cheers
David
No comment has been added lately, so it's time to clean up this TA.
I will leave the following recommendation for this question in the Cleanup topic area:

PAQ with points refunded

Please leave any comments here within the next seven days.
PLEASE DO NOT ACCEPT THIS COMMENT AS AN ANSWER!

jmcg
EE Cleanup Volunteer
ASKER CERTIFIED SOLUTION
Computer101