Solved

ispunct function: removing punctuation marks from a text file

Posted on 2003-03-12
Last Modified: 2012-08-13
Hi. I have a small search program that searches a text file for a particular word. I'd like to add a function that scans the file and removes all punctuation marks before the search is performed. I'm guessing I should use the ispunct function, but I can't get it working properly. My code is below.
Cheers,
Triona

#include <stdio.h>
#include <string.h>
#include <ctype.h>

int main()
{
  FILE *essays;
  FILE *results;
  char line[250];
  char Search[80];
  const char *delimiters = {"[]"};
  char *token=NULL;
  char* pStr=NULL;
  int count = 0, line_count=0;
  char *test = "";
  int punct = 0;
 
 
 
  /* Open file to be searched and file to be written to */

  if ((essays = fopen("essays.txt", "r")) == NULL)
  {
      printf("Unable to open the input file\n");
      return 1;   /* non-zero exit status signals failure */
  }

  if ((results = fopen("results.txt", "w")) == NULL)
  {
      printf("Unable to open output file\n");
      fclose(essays);
      return 1;
  }
 
  /* Get string to be searched for */

  printf("Enter string to be searched for:\n");
  scanf("%79s", Search);   /* Search already decays to char *; the width keeps input inside the 80-byte buffer */

  /* Read the file one line at a time; fgets returns NULL at end of file */
  while (fgets(line, sizeof line, essays) != NULL)
  {
      line_count++;

      /* Break the line into tokens and search each one */
      for (test = strtok(line, delimiters); test != NULL;
           test = strtok(NULL, delimiters))
      {
          if (strstr(test, Search) != NULL)
          {
              count++;
              printf("Entry %d on line %d: %s\n", count, line_count, test);
              fputs(test, results);
              fputs("\n", results);
          }
      }
  }
   printf("The string %s occurs %d times\n", Search, count);
   fclose(results);
   fclose(essays);
   
    return 0;
}


Question by:Triona
12 Comments
 
LVL 6

Expert Comment

by:gj62
ID: 8120216
After you get your token, you can remove punctuation as follows:

 char test[]="this.is.a.test!!!";
 char *p;

 p=test;
 while(*p)
 {
   if (ispunct((unsigned char)*p))
   {
     /* shift the rest of the string, including the terminating '\0', left by one */
     memmove(p,p+1,strlen(p));
   }
   else
     p++;
 }

which will leave test = "thisisatest"
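
The memmove loop above re-copies the tail of the string once per punctuation character. A minimal one-pass sketch of the same idea, using separate read and write positions, might look like this. The helper name strip_punct is an assumption for illustration and does not appear in the thread.

```c
#include <ctype.h>

/* Remove every punctuation character from s, in place, in a single pass.
   strip_punct is a hypothetical helper name, not part of the original code. */
void strip_punct(char *s)
{
    char *dst = s;   /* next position to write a kept character */

    for (const char *src = s; *src != '\0'; src++) {
        /* The cast matters: ispunct needs a value representable as unsigned char. */
        if (!ispunct((unsigned char)*src))
            *dst++ = *src;
    }
    *dst = '\0';     /* terminate the shortened string */
}
```

Called on the buffer "this.is.a.test!!!" it leaves "thisisatest", the same result as the memmove version.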
LVL 11

Expert Comment

by:KurtVon
ID: 8120390
Wouldn't it make more sense to test for punctuation in the search itself?  After all, what about a situation like

no,spaces,here

If you yank the punctuation, you get

nospaceshere

so searching for "spaces" as a whole word will fail. And if the search is loose enough to succeed anyway (a plain substring match), it will also report a hit on "spaceship," which doesn't sound like what you want.
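
One way to test for punctuation in the search itself is to accept a strstr hit only when it is bounded on both sides by a non-alphanumeric character or by the start or end of the string. The sketch below assumes that definition of a "word"; find_word is a hypothetical helper, not something from the thread.

```c
#include <ctype.h>
#include <string.h>

/* Return a pointer to the first occurrence of word in text that is not
   part of a longer word, or NULL if there is none.
   find_word is a hypothetical helper name used only for illustration. */
const char *find_word(const char *text, const char *word)
{
    size_t len = strlen(word);
    if (len == 0)
        return NULL;

    for (const char *hit = strstr(text, word); hit != NULL;
         hit = strstr(hit + 1, word)) {
        /* A boundary is the start or end of the string, or any
           non-alphanumeric character (space, comma, full stop, ...). */
        int left_ok  = (hit == text) || !isalnum((unsigned char)hit[-1]);
        int right_ok = !isalnum((unsigned char)hit[len]);
        if (left_ok && right_ok)
            return hit;
    }
    return NULL;
}
```

With this check, "spaces" is found in "no,spaces,here" but not in "spaceship" or "nospaceshere", and the text itself is never modified.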
LVL 6

Expert Comment

by:gj62
ID: 8120438
He's only testing for [ and ] as his delimiters - I assumed they were word delimiters.

If not, he can either make all the punctuation characters delimiters, or replace punctuation with a delimiter BEFORE calling strtok, as follows:

char *p;
if (fgets(line, 250, essays) != NULL)
{
  p=line;
  while(*p)
  {
    if(ispunct((unsigned char)*p))
    {
      *p='[';  /*or any delimiter you test for*/
    }
    p++;
  }

  /* now call strtok; the rest of the code goes here */
}
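
For the first option, making every punctuation character a delimiter, the delimiter string does not have to be typed out by hand. A rough sketch, assuming the caller supplies a buffer of at least UCHAR_MAX + 1 bytes; build_punct_delims is a hypothetical name.

```c
#include <ctype.h>
#include <limits.h>
#include <stddef.h>

/* Fill buf with every character the current locale classifies as
   punctuation, followed by a terminating '\0', and return how many
   were stored. buf must hold at least UCHAR_MAX + 1 bytes.
   build_punct_delims is a hypothetical helper, not from the thread. */
size_t build_punct_delims(char *buf)
{
    size_t n = 0;

    /* Start at 1: '\0' can never be stored inside a C string. */
    for (int c = 1; c <= UCHAR_MAX; c++) {
        if (ispunct(c))
            buf[n++] = (char)c;
    }
    buf[n] = '\0';
    return n;
}
```

The resulting string can be passed as the second argument of strtok in place of "[]". A space is not punctuation, so append " \t\n" to the buffer if whitespace should split tokens as well.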

Author Comment

by:Triona
ID: 8120516
Thanks, but I still want to keep the words separate; I just want to remove punctuation such as full stops and commas. How do I go about that?
LVL 11

Accepted Solution

by:
KurtVon earned 120 total points
ID: 8120529
In that case use the solution proposed by gj62 and change the memmove call to

*p = ' ';
LVL 6

Expert Comment

by:gj62
ID: 8120546
Uh, not the memmove solution, but the simple replace solution, right? E.g.

char *p;
if (fgets(line, 250, essays) != NULL)
{
  p=line;
  while(*p)
  {
    if(ispunct((unsigned char)*p))
    {
      *p=' ';  /*replace all punctuation with spaces...*/
    }
    p++;
  }
}

You could end up with more than 1 space in a row - do you want just 1 space?
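
Runs of spaces only matter if something later treats each space as a separate boundary. If the line is tokenised with strtok on whitespace afterwards, consecutive delimiters collapse into one separator, so no extra clean-up is needed. A small sketch under that assumption; count_hits is a hypothetical helper, and it modifies the line it is given.

```c
#include <ctype.h>
#include <string.h>

/* Replace punctuation in line with spaces, then count the
   whitespace-separated words that contain search.
   The line is modified in place. count_hits is a hypothetical
   helper name used only for illustration. */
int count_hits(char *line, const char *search)
{
    int hits = 0;

    for (char *p = line; *p != '\0'; p++) {
        if (ispunct((unsigned char)*p))
            *p = ' ';
    }

    /* strtok treats a run of delimiters as a single separator, so
       consecutive spaces never produce empty tokens. */
    for (char *tok = strtok(line, " \t\r\n"); tok != NULL;
         tok = strtok(NULL, " \t\r\n")) {
        if (strstr(tok, search) != NULL)
            hits++;
    }
    return hits;
}
```

Note that [ and ] are themselves punctuation, so after this replacement the original "[]" delimiter set would no longer find anything to split on; the whitespace delimiters take over that job.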
LVL 6

Expert Comment

by:gj62
ID: 8120574
Gee, kinda thought that would be for me :-( <grin>
LVL 11

Expert Comment

by:KurtVon
ID: 8120575
Um, I think gj62 should be getting those points.

I was just trying to be helpful outside the question.
LVL 6

Expert Comment

by:gj62
ID: 8120578
Oh, never mind, it is a B - you can have it Kurt <grin>

Author Comment

by:Triona
ID: 8120609
Sorry gj62 - I accepted KurtVon's comments before I received yours (and because you assumed I was a 'HE'!!)
LVL 46

Expert Comment

by:Kent Olsen
ID: 8120632
And if you don't want your token edited, you can do the comparison yourself:

// Search for "s2" in "s1", ignoring punctuation embedded in s1
// returns address within s1 where s2 was found, else NULL

char *strpcmp (char *s1, char *s2)
{
  char *p1, *p2;

  while (*s1)
  {
    if (*s1 == *s2)  // first characters match, try the rest
    {
      p1 = s1+1;
      p2 = s2+1;
      while (*p2)
      {
        if (*p1 == *p2)  // if match, including identical punctuation
        {
          p1++;
          p2++;
        }
        else if (ispunct ((unsigned char)*p1)) // punct in object string, skip it
          p1++;
        else break;
      }
      if (*p2 == 0) // end of p2 reached, match found
        return s1;
    }
    s1++;
  }
  return 0;  // not found
}
LVL 46

Expert Comment

by:Kent Olsen
ID: 8120654
Man, you guys are quick today....