ispunct function: removing punctuation marks from a text file

Hi. I have a little search program for searching a text file for a particular word. I'd like to add a function to scan the file to find and remove all punctuation marks from the file before the search is performed. I'm guessing I use the ispunct function, but I can't get it working properly. My code is below.
Cheers,
Triona

#include <stdio.h>
#include <string.h>
#include <ctype.h>

int main()
{
  FILE *essays;
  FILE *results;
  char line[250];
  char Search[80];
  const char *delimiters = {"[]"};
  char *token=NULL;
  char* pStr=NULL;
  int count = 0, line_count=0;
  char *test = "";
  int punct = 0;
 
 
 
  /* Open file to be searched and file to be written to */

  if ((essays = fopen("essays.txt", "r")) == NULL)
  {
      printf("Unable to open the input file");
      return 0;
  }

  if ((results = fopen("results.txt","w")) ==NULL)
  {
       printf("Unable to open output file");
       return 0;
  }
 
  /* Get string to be searched for */

  printf("Enter string to be searched for:\n");
  scanf("%79s", Search);  /* Search is already a pointer; limit input to the buffer size */
     

               
  while (!feof(essays))
  {
    /* Read in the next line of the text file */
    if (fgets(line, 250, essays) != NULL)
    {
      line_count++;

      /* Break up line into tokens */
      if ((test = strtok(line, delimiters)) != NULL)
      {
        /* Search for string */
        if (strstr(test, Search) != NULL)
        {
          count++;
          printf("Entry %d on line %d: %s\n", count, line_count, test);
          fputs(test, results);
          fputs("\n", results);
        }

        /* Move on to next token in the line */
        while ((test = strtok((char *)NULL, delimiters)) != NULL)
        {
          if (strstr(test, Search) != NULL)
          {
            count++;
            printf("Entry %d on line %d %s\n", count, line_count, test);
            fputs(test, results);
            fputs("\n", results);
          }
        }
      } // End of if (test ...
    } // End of entry if
  }
   printf("The string %s occurs %d times\n", Search, count);
   fclose(results);
   fclose(essays);
   
    return 0;
}



gj62 commented:
After you get your token, you can remove punctuation as follows:

 char test[]="this.is.a.test!!!";
 char *p;

p=test;
while(*p)
{
  if (ispunct((unsigned char)*p))
  {
    memmove(p,p+1,strlen(p));
  }
  else
    p++;

}

which will leave test = "thisisatest"

KurtVon commented:
Wouldn't it make more sense to test for punctuation in the search itself?  After all, what about a situation like

no,spaces,here

If you yank the punctuation, you get

nospaceshere

so searching for "spaces" as a word will fail. If it succeeds, it will also mess up and indicate a hit with "spaceship", which doesn't sound like what you want to do.
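
One way to test for punctuation in the search itself is to accept a match only when it is not flanked by a letter or digit. This is a rough sketch under that assumption; contains_word is a hypothetical helper name, not something from the original program:

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if word occurs in text as a whole word,
   i.e. not immediately preceded or followed by a letter or digit. */
int contains_word(const char *text, const char *word)
{
    size_t len = strlen(word);
    const char *hit = text;

    if (len == 0)
        return 0;

    while ((hit = strstr(hit, word)) != NULL) {
        int start_ok = (hit == text) || !isalnum((unsigned char)hit[-1]);
        int end_ok = !isalnum((unsigned char)hit[len]);  /* '\0' counts as a boundary */

        if (start_ok && end_ok)
            return 1;
        hit++;  /* partial match only: keep looking further along */
    }
    return 0;
}
```

With this check, "spaces" is found in "no,spaces,here" but not in "nospaceshere" or "spaceship", and the line itself is left unmodified.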

gj62 commented:
He's only testing for [ and ] as his delimiters - I assumed they were word delimiters.

If not, he can either make the delimiters all the punctuation, or maybe he should be replacing punctuation with delimiters BEFORE strtok, as follows:

char *p;
if (fgets(line, 250, essays) != NULL)
{
  p=line;
  while(*p)
  {
    if(ispunct((unsigned char)*p))
    {
      *p='[';  /*or any delimiter you test for*/
    }
    p++;
  }

  /* now strtok... */
  /* rest of code here... */
}

Triona (author) commented:
Thanks, but I still want to keep the words separate; I just want to remove punctuation like full stops, commas, etc. How do I go about that?

KurtVon commented:
In that case, use the solution proposed by gj62 and change the memmove call to

*p = ' ';

gj62 commented:
Uh, not the memmove solution, but the simple replace solution, right?   e.g.

char *p;
if (fgets(line, 250, essays) != NULL)
{
  p=line;
  while(*p)
  {
    if(ispunct((unsigned char)*p))
    {
      *p=' ';  /*replace all punctuation with spaces...*/
    }
    p++;
  }
}

You could end up with more than one space in a row - do you want just one space?
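
If a single space is what you want, one possible approach is to collapse each run of punctuation and spaces while replacing. This is just a sketch; punct_to_single_space is a hypothetical helper name, and it assumes that other whitespace (such as the newline fgets leaves in the buffer) should be kept as-is:

```c
#include <ctype.h>

/* Hypothetical helper: turns punctuation into spaces and collapses each run
   of punctuation and/or spaces into one space, in place. */
void punct_to_single_space(char *s)
{
    char *dst = s;
    int prev_space = 0;  /* 1 if the last character written was a space */

    for (; *s != '\0'; s++) {
        unsigned char c = (unsigned char)*s;

        if (ispunct(c) || c == ' ') {
            if (!prev_space)
                *dst++ = ' ';
            prev_space = 1;
        } else {
            *dst++ = *s;
            prev_space = 0;
        }
    }
    *dst = '\0';
}
```

For example, "no,spaces,here" becomes "no spaces here". A line ending in punctuation is left with one trailing space, which does not affect a strstr search.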

gj62 commented:
Gee, kinda thought that would be for me :-( <grin>

KurtVon commented:
Um, I think gj62 should be getting those points.

I was just trying to be helpful outside the question.

gj62 commented:
Oh, never mind, it is a B - you can have it Kurt <grin>

Triona (author) commented:
Sorry gj62 - I accepted KurtVon's comments before I received yours (and because you assumed I was a 'HE'!!)

Kent Olsen (Data Warehouse Architect / DBA) commented:
And if you don't want your token edited, you can compare them yourself:

// Search for "s2" in "s1" ignoring embedded punctuation
// returns address within s1 where s2 was found, else NULL

char *strpcmp (char *s1, char *s2)
{
  char *p1, *p2;

  while (*s1)
  {
    if (*s1 == *s2)
    {
      p1 = s1+1;
      p2 = s2+1;
      while (*p2)
      {
        if (*p1 == *p2)  // if match, including identical punctuation
        {
          p1++;
          p2++;
        }
        else if (ispunct ((unsigned char)*p1)) // punct in object string
          p1++;
        else break;
      }
      if (*p2 == 0) // end of p2 reached, match found
        return s1;
    }
    s1++;
  }
  return 0;
}

Kent Olsen (Data Warehouse Architect / DBA) commented:
Man, you guys are quick today....

Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.