Solved

ispunct function: removing punctuation marks from a text file

Posted on 2003-03-12
Last Modified: 2012-08-13
Hi. I have a little search program that searches a text file for a particular word. I'd like to add a function that scans the file and removes all punctuation marks before the search is performed. I'm guessing I should use the ispunct function, but I can't get it working properly. My code is below.
Cheers,
Triona

#include <stdio.h>
#include <string.h>
#include <ctype.h>

int main()
{
  FILE *essays;
  FILE *results;
  char line[250];
  char Search[80];
  const char *delimiters = {"[]"};
  char *token=NULL;
  char* pStr=NULL;
  int count = 0, line_count=0;
  char *test = "";
  int punct = 0;
 
 
 
  /* Open file to be searched and file to be written to */

  if ((essays = fopen("essays.txt", "r")) == NULL)
  {
      printf("Unable to open the input file");
      return 0;
  }

  if ((results = fopen("results.txt","w")) ==NULL)
  {
       printf("Unable to open output file");
       return 0;
  }
 
  /* Get string to be searched for */

  printf("Enter string to be searched for:\n");
  scanf("%79s", Search);  /* Search is already a char *; the width keeps input within the 80-byte buffer */
     

               
  while (!feof(essays))
  {
    /* Read in the next line of the text file */
    if (fgets(line, 250, essays) != NULL)
    {
      line_count++;

      /* Break up line into tokens */
      if ((test = strtok(line, delimiters)) != NULL)
      {
        /* Search for string */
        if (strstr(test, Search) != NULL)
        {
          count++;
          printf("Entry %d on line %d: %s\n", count, line_count, test);
          fputs(test, results);
          fputs("\n", results);
        }

        /* Move on to next token in the line */
        while ((test = strtok((char *)NULL, delimiters)) != NULL)
        {
          if (strstr(test, Search) != NULL)
          {
            count++;
            printf("Entry %d on line %d %s\n", count, line_count, test);
            fputs(test, results);
            fputs("\n", results);
          }
        }
      } /* End of if (test ... */
    } /* End of entry if */
  }
   printf("The string %s occurs %d times\n", Search, count);
   fclose(results);
   fclose(essays);
   
    return 0;
}


Question by:Triona
12 Comments
 
LVL 6

Expert Comment

by:gj62
ID: 8120216
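After you get your token, you can remove punctuation as follows:

char test[] = "this.is.a.test!!!";
char *p;

p = test;
while (*p)
{
  /* Cast to unsigned char: passing a negative char to ispunct is undefined */
  if (ispunct((unsigned char)*p))
  {
    /* Shift the rest of the string, including the terminating '\0', left by one */
    memmove(p, p + 1, strlen(p));
  }
  else
    p++;
}

which will leave test = "thisisatest"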
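
The memmove loop above shifts the tail of the string once per punctuation character. As a possible alternative, here is a minimal sketch that does the same removal in a single pass with a read pointer and a write pointer. The name strip_punct is hypothetical and is not something from the original program.

```c
#include <ctype.h>

/* Remove every punctuation character from s, in place, in one pass.
   strip_punct is a hypothetical helper name used only for illustration. */
void strip_punct(char *s)
{
    char *dst = s;                      /* next position to write to */

    for (; *s != '\0'; s++) {
        /* The cast avoids undefined behaviour for negative char values */
        if (!ispunct((unsigned char)*s))
            *dst++ = *s;                /* keep everything that is not punctuation */
    }
    *dst = '\0';                        /* terminate the shortened string */
}
```

Calling strip_punct on a buffer holding "this.is.a.test!!!" should leave "thisisatest" in it, the same result as the memmove version.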
0
 
LVL 11

Expert Comment

by:KurtVon
ID: 8120390
Wouldn't it make more sense to test for punctuation in the search itself?  After all, what about a situation like

no,spaces,here

If you yank the punctuation, you get

nospaceshere

so searching for "spaces" as a word will fail. If it succeeds, it will also mess up and indicate a hit on "spaceship," which doesn't sound like what you want to do.
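
One way to test for punctuation in the search itself, sketched here under the assumption that a "word" is a run of letters and digits: accept a strstr hit only if the characters on both sides are not alphanumeric. The function name find_whole_word is made up for this example.

```c
#include <ctype.h>
#include <string.h>

/* Return a pointer to the first occurrence of word in text that has no
   letter or digit directly before or after it, or NULL if there is none.
   find_whole_word is a hypothetical name, not part of the original code. */
const char *find_whole_word(const char *text, const char *word)
{
    size_t len = strlen(word);
    const char *hit = text;

    if (len == 0)
        return NULL;

    while ((hit = strstr(hit, word)) != NULL) {
        int left_ok  = (hit == text) || !isalnum((unsigned char)hit[-1]);
        int right_ok = !isalnum((unsigned char)hit[len]);  /* '\0' also counts as a boundary */
        if (left_ok && right_ok)
            return hit;
        hit++;                          /* partial match only, keep looking */
    }
    return NULL;
}
```

With this, "spaces" is found in "no,spaces,here" without editing the line, and is not reported inside "spaceship" or "nospaceshere".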
0
 
LVL 6

Expert Comment

by:gj62
ID: 8120438
He's only testing for [ and ] as his delimiters - I assumed they were word delimiters.

If not, he can either make the delimiters all the punctuation, or maybe he should be replacing punctuation with delimiters BEFORE strtok, as follows:

char *p;
if (fgets(line, 250, essays) != NULL)
{
  p=line;
  while(*p)
  {
    if (ispunct((unsigned char)*p))
    {
      *p='[';  /*or any delimiter you test for*/
    }
    p++;
  }
   
now strtok...
rest of code here...
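
For the other option mentioned above, making all the punctuation delimiters, a rough sketch could look like this. The delimiter list is an assumption (whitespace plus common ASCII punctuation) and count_matching_tokens is a hypothetical name; adjust both to suit the real file.

```c
#include <string.h>

/* Count the tokens in line that contain search.
   strtok writes '\0' over delimiters, so line is modified.
   count_matching_tokens is a hypothetical helper for illustration. */
int count_matching_tokens(char *line, const char *search)
{
    /* Assumed delimiter set: whitespace plus common ASCII punctuation */
    static const char delims[] = " \t\r\n.,;:!?\"'()[]{}-";
    int count = 0;
    char *tok;

    for (tok = strtok(line, delims); tok != NULL; tok = strtok(NULL, delims)) {
        if (strstr(tok, search) != NULL)
            count++;
    }
    return count;
}
```

Because the punctuation is consumed by strtok itself, no separate replacement loop is needed before tokenizing.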
0

 

Author Comment

by:Triona
ID: 8120516
Thanks, but I still want to keep the words separate; I just want to remove punctuation like full stops, commas etc. How do I go about that?
0
 
LVL 11

Accepted Solution

by:
KurtVon earned 120 total points
ID: 8120529
In that case, use the solution proposed by gj62 and change the memmove call to

*p = ' ';
0
 
LVL 6

Expert Comment

by:gj62
ID: 8120546
Uh, not the memmove solution, but the simple replace solution, right?   e.g.

char *p;
if (fgets(line, 250, essays) != NULL)
{
 p=line;
 while(*p)
 {
   if (ispunct((unsigned char)*p))
   {
     *p=' ';  /*replace all punctuation with spaces...*/
   }
   p++;
 }

You could end up with more than one space in a row. Do you want just one space?
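
If only one space is wanted, here is a sketch that replaces punctuation with spaces and collapses any run of spaces and punctuation into a single space as it goes. The name punct_to_single_space is hypothetical. It works in place because the result is never longer than the input.

```c
#include <ctype.h>

/* Replace punctuation with spaces and squeeze each run of spaces and
   punctuation down to one space. Other characters, including '\n',
   are copied unchanged. punct_to_single_space is a hypothetical name. */
void punct_to_single_space(char *s)
{
    char *dst = s;                      /* next position to write to */
    int prev_space = 0;                 /* was the last character written a space? */

    for (; *s != '\0'; s++) {
        unsigned char c = (unsigned char)*s;
        if (ispunct(c) || c == ' ') {
            if (!prev_space)
                *dst++ = ' ';
            prev_space = 1;
        } else {
            *dst++ = *s;
            prev_space = 0;
        }
    }
    *dst = '\0';
}
```

Applied to "no,spaces,here" this should give "no spaces here", so the words stay separate for the later search. Note that trailing punctuation leaves one trailing space.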
0
 
LVL 6

Expert Comment

by:gj62
ID: 8120574
Gee, kinda thought that would be for me :-( <grin>
0
 
LVL 11

Expert Comment

by:KurtVon
ID: 8120575
Um, I think gj62 should be getting those points.

I was just trying to be helpful outside the question.
0
 
LVL 6

Expert Comment

by:gj62
ID: 8120578
Oh, never mind, it is a B - you can have it Kurt <grin>
0
 

Author Comment

by:Triona
ID: 8120609
Sorry gj62, I accepted KurtVon's comment before I received yours (and because you assumed I was a 'HE'!!)
0
 
LVL 46

Expert Comment

by:Kent Olsen
ID: 8120632
And if you don't want your token edited, you can do the comparison yourself:

// Search for "s2" in "s1" ignoring embedded punctuation
// returns address within s1 where s2 was found, else NULL

char *strpcmp (char *s1, char *s2)
{
  char *p1, *p2;

  while (*s1)
  {
    if (*s1 == *s2)
    {
      p1 = s1+1;
      p2 = s2+1;
      while (*p2)
      {
        if (*p1 == *p2)  // if match, including identical punctuation
        {
          p1++;
          p2++;
        }
        else if (ispunct ((unsigned char)*p1)) // punct in object string
          p1++;
        else break;
      }
      if (*p2 == 0) // end of p2 reached, match found
        return s1;
    }
    s1++;
  }
  return 0;
}
0
 
LVL 46

Expert Comment

by:Kent Olsen
ID: 8120654
Man, you guys are quick today....
0
