[Last Call] Learn how to a build a cloud-first strategyRegister Now

x
  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 509
  • Last Modified:

name matching algorithm

Hi !!!

Does anyone know about the matching algorithm which can handle this situation:

for example..we saecrh for "smi" (surname)
and when we type "smi" and press enter, the program looks into the datfile (surname.dat) and returns
the list which begined with "smi***" or in between ***smi*** or end uped with "***smi"
nysmi
nismithy
smi
smid
smith
smithe
smithy
smithye
wysmi

etc..

or we type  "smith"
the matching algorithm generates the list of matches as following:

smith
smithey
smithy
smithye
willsmith
willsmithyoung
youngsmith


or  type "will"

youngwillsmith
will
willsmith
----------------------

if this algorithm doesn't exist , could anyone implement the simple code which overcome this problem above

many thanks,

Korsila
0
korsila
Asked:
korsila
  • 16
  • 7
  • 2
  • +3
1 Solution
 
AxterCommented:
Is this homework?
0
 
smiskCommented:
/*
** How's this?
*/
#include <stdio.h>
#include <strings.h>

/*
** Returns 1 if the *target contains *src
*/
int matches(char *src, char *target) {
    int i;

    for(i = 0; i < strlen(target); i++)
        if(!strncmp(&target[i], src, strlen(src))) return(1);

    return(0);
}

int main(int argc, char *argv[]) {

    char *surname = "smi";
    char data[4][15] = { "smith", "willsmith", "bobby", "willsmithson" };
    int i;

    /*
    ** Loop through the array printing matches
    */
    for(i = 0; i < 4; i++)
        if(matches(surname, data[i]))
            printf("%s\n", data[i]);

    return(1);
}
0
 
korsilaAuthor Commented:
aXTER,
Not a homeowrk, it's something I am interested in most of my work..!!!but as you might not know am just a beginer of C or other programming language apart from HTML...

cheers,
KORSILA
0
Industry Leaders: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 
smiskCommented:
Korsila,

  Yeah I looked around and you asked some questions similar to this (in SQL, etc.) so I figured it wasn't homework.  Try to read through and understand the code I gave.  Also, the loop in "matches" does too many iterations.  See if you can figure out how to change it to speed things up (and avoid possible memory problems).

Thanks,
Steve
0
 
SapaCommented:
Use standard strstr function:

NAME
       strstr - locate a substring

SYNOPSIS
       #include <string.h>

       char *strstr(const char *haystack, const char *needle);

DESCRIPTION
       The  strstr()  function  finds the first occurrence of the
       substring needle in the string haystack.  The  terminating
       `\0' characters are not compared.

RETURN VALUE
       The  strstr()  function returns a pointer to the beginning
       of the substring, or NULL if the substring is not found.

0
 
korsilaAuthor Commented:
Smisk,

can you code and test it with datafile...(surname.dat)

surname.dat
             
              barra
              baramy
              nabarra
              nybarramy
               nysmi
               nismithy
               smi
               smid
               smith
               smithe
               smithy
               smithye
               will
               willsmith
               willsmithyoung
               wysmi
               youngsmith
               youngwillsmith
   
-------------------------------

so the output should look like this:

enter name to macth: smith ---(example)---chick eneter

return name mathed with above requirement..!!!!

can you do this....

cheers,
Korsila

0
 
dhanaspace1Commented:
hi,
   u can not get more easy than this.....
 i strongly belive, it _must_ help u.
sooooooooooo, i may expect 100 pts.

i will work, perfect.



# include <stdio.h>  
# include <stdlib.h>    
# include <string.h>
 
# define TRUE 1
# define FALSE 0
  main (int c, char **argv)
{
  char t[128] = { 0x00, };
  char one[10] = { 0x00, }, two[10] =
  {
  0x00,};
  if (c != 3)
   
    {
      printf ("\nUsage: expression matcher\n\n");
      return 0;
    }
  /* call like */

  /* fp = fopen ("./surname.dat","r")
  *  fgets (...) ;  read each name
  *  then
  *    mask_match (name,"smi**" or "**smi**")...
  *    on success 1 on fail 0
  *    */
  printf ("\n[%d]\n", mask_match (argv[1], argv[2], 1));
  return 0;
}
int
do_match (char *str, char *regexp)
{
  char *p;
  for (p = regexp; *p && *str;)
   
    {
      switch (*p)
     
     {
     case '?':
       str++;
       p++;
       break;
     case '*':
       p++;
       if (!*p)
         return TRUE;
       while (*str)
         
         {
           while (*str && (toupper (*p) != toupper (*str)))
          str++;
           if (do_match (str, p))
          return TRUE;
           if (!*str)
          return FALSE;
           str++;
         }
       return FALSE;
     default:
       if (toupper (*str) != toupper (*p))
         
         {
           return FALSE;
         }
       str++;
       p++;
       break;
     }
    }
  if (!*p && !*str)
    return TRUE;
  if (!*p && (str[0] == '.') && (str[1] == 0))
    return (TRUE);
  if (!*str && *p == '?')
   
    {
      while (*p == '?')
     p++;
      return (!*p);
    }
  if (!*str && (*p == '*') && (p[1] == '\0'))
    return TRUE;
  return FALSE;
}
int
mask_match (char *str, char *regexp, int trans2)
{
  char *p = NULL;
  char pattern[1024] = { 0x00, }, string[1024] =
  {
  0x00,};
  int matched;
  if (str == NULL || regexp == NULL)
    return TRUE;
  strncpy (pattern, regexp, sizeof (pattern) - 1);
  for (p = pattern; *p; p++)
   
    {
      while (*p == '*' && (p[1] == '?' || p[1] == '*'))
     
     {
       memmove (p + 1, p + 2, strlen (p + 2));
       p[strlen (p + 2) + 1] = 0;
     }
    }
  if (!strcasecmp (pattern, "*"))
    return (TRUE);
  strncpy (string, str, sizeof (string) - 1);
  matched = do_match (string, pattern);
  return matched;
}


0
 
AxterCommented:
Hi (dhanaspace1), welcome to EE.

All of the experts here, for the most part have learn from other experts as to the proper etiquette for posting answer.

An answer should not be posted as an answer, if other experts have previously posted possible answers as comments, and/or have already made contributions to the question.

There are many experts who never post answers as answer.  Instead, they post their answers as comments.

If you read the following link, you'll see why this is the preferred method for many of our valued experts, including myself.

http://www.experts-exchange.com/jsp/cmtyQuestAnswer.jsp

Hi (korsila):
Feel free to click the [Reject Answer] button near (Answer-poster's) response, even if it seems like a good answer.
Doing so will increase your chance of obtaining additional input from other experts.  Later, you can click the [Select Comment as Answer] button on any response.
0
 
korsilaAuthor Commented:
First of all, I think Axter is right...
secondly,  the code that dhanaspace1 has posted it here didn't work...I got something instead of correct outpput of name macthes after running the code:
 "Usage: expression matcher"

so if you could figure out what it was and fix it for me..that would be ok though..!!
am glad load of experts are paying attendtion with my question and trying to help me out..!!!
thanks load..!!

Korsil;a
0
 
korsilaAuthor Commented:
First of all, I think Axter is right...
secondly,  the code that dhanaspace1 has posted it here didn't work...I got something instead of correct outpput of name macthes after running the code:
 "Usage: expression matcher"

so if you could figure out what it was and fix it for me..that would be ok though..!!
am glad load of experts are paying attendtion with my question and trying to help me out..!!!
thanks load..!!

Korsil;a
0
 
korsilaAuthor Commented:
make sure you got subroutine called "position()" which will handle the job above...just wanna make it tidy..!!!
Korsila

0
 
korsilaAuthor Commented:
so 4 steps in "posiiton ()" subroutine -if you could :
1. exact match  , smit--->smit
2. beginning match, smit--> smity, smith, smithy
3. middle match, smit-->willsmither, yungsmitty
4. ending match, smit-->willsmit, yungsmit
 

hope that's not bad..!!!

thanks,
Korsila
0
 
smiskCommented:
/* Revisited... :) */
#include <stdio.h>
#include <fcntl.h>
#include <strings.h>

/*
** Returns 1 if the *target contains *src
*/
int matches(char *src, char *target) {
    int i;

    for(i = 0; i < strlen(target); i++)
        if(!strncmp(&target[i], src, strlen(src))) return(1);

    return(0);
}

int main(int argc, char *argv[]) {
    char surname[12];
    /* char data[4][15] = { "smith", "willsmith", "bobby", "willsmithson" }; */
    char data[80];
    int i;
    FILE *fp;

    if(argc == 2) {

        /*
        ** Read the surname to test
        */
        printf("Enter a surname [max 12 chars] (example : smi) : ");
        fgets(surname, 12, stdin);
        surname[strlen(surname) - 1] = '\0';

        /*
        ** Loop through the datafile printing matches
        */
        fp = fopen(argv[1], "r");
        if(fp != NULL) {
            while(fgets(data, 80, fp))
                if(matches(surname, data))
                    printf("%s", data);

            return(0);
        }
        else {
            printf("Error accessing %s\n", argv[1]);
            return(1);
        }
        fclose(fp);

    }
    else {
        printf("Usage : %s <datafile>\n", argv[0]);
        return(1);
    }
}
0
 
ebosscherCommented:
what was wrong with using Sapa's answer and using strstr?

if(strstr(szLookIn, szFind) != NULL)
{
  /* we found it */
}
0
 
korsilaAuthor Commented:
Ebosscher,
I need some help with the code , not some comments without it otherwise I CANN'T TEST how it works...!!!!as I am not that expert, but I would love to learn...provide me the code and I WILL run it if it works you got the poitns and I got help that's fair is n't it...?

Korsila

0
 
korsilaAuthor Commented:
smisk,
I WIL HAVE A GO WITH YOUR CODE AND WILL let you know as soon as I get it run/done...

cheers,
Korsila
0
 
korsilaAuthor Commented:
Is it ok for anyone if I would accept Smisk as the best answer here..as I got it working with my datafile and I reckon i would accept his asnwer as the good one..!!!

anybody is happy with that....? I Wwill wait for "dhanaspace1" for updating new codes and "Sapa" for implementing the code not commenting algorithm..!!!

Korsila
0
 
smiskCommented:
Korsila,

  You can basically swap out my "matches()" function with a call to "strstr()" which I had no idea existed (you learn something new every day! :)).

Try doing this switch :

fp = fopen(argv[1], "r");
       if(fp != NULL) {
           while(fgets(data, 80, fp))
               if(matches(surname, data))
                   printf("%s", data);

Instead becomes :

fp = fopen(argv[1], "r");
       if(fp != NULL) {
           while(fgets(data, 80, fp))
               if(strstr(data, surname))
                   printf("%s", data);

If that works for you (it does for me), use it and remove the function declaration of "matches()" from the program.  Note that the parameters are reversed in the call to strstr()...

Thanks,
Steve
0
 
korsilaAuthor Commented:
Dear Smis or Steve,

here is what I have done to swap your matches() to strstr()

/* Revisited... :) */
#include <stdio.h>
#include <fcntl.h>
#include <strings.h>

/*
** Returns 1 if the *target contains *src
*/
int strstr(char *src, char *target) {
   int i;

   for(i = 0; i < strlen(target); i++)
       if(!strncmp(&target[i], src, strlen(src))) return(1);

   return(0);
}

int main(int argc, char *argv[]) {
   char surname[12];
   /* char data[4][15] = { "smith", "willsmith", "bobby", "willsmithson" }; */
   char data[80];
   int i;
   FILE *fp;

   if(argc == 2) {

       /*
       ** Read the surname to test
       */
       printf("Enter a surname [max 12 chars] (example : smi) : ");
       fgets(surname, 12, stdin);
       surname[strlen(surname) - 1] = '\0';

       /*
       ** Loop through the datafile printing matches
       */
       fp = fopen(argv[1], "r");
       if(fp != NULL) {
           while(fgets(data, 80, fp))
               if(strstr(data, surname))
                   printf("%s", data);

           return(0);
       }
       else {
           printf("Error accessing %s\n", argv[1]);
           return(1);
       }
       fclose(fp);

   }
   else {
       printf("Usage : %s <datafile>\n", argv[0]);
       return(1);
   }
}


however, it got error when I compiled the code:
line 10: Inconsistent type declaration: "strstr"
line 10: Inconsistent parameter list declaration for "strstr"


can you provide your new update code (whole program)
0
 
smiskCommented:
Korsila,

Nononono...  The whole point of using strstr() is that it is already a function defined in the library so you don't need to define it.  Try this :

/* Revisited again... :) */
#include <stdio.h>
#include <fcntl.h>
#include <strings.h>

int main(int argc, char *argv[]) {
  char surname[12];
  char data[80];
  int i;
  FILE *fp;

  if(argc == 2) {

      /*
      ** Read the surname to test
      */
      printf("Enter a surname [max 12 chars] (example : smi) : ");
      fgets(surname, 12, stdin);
      surname[strlen(surname) - 1] = '\0';

      /*
      ** Loop through the datafile printing matches
      */
      fp = fopen(argv[1], "r");
      if(fp != NULL) {
          while(fgets(data, 80, fp))
              if(strstr(data, surname))
                  printf("%s", data);

          return(0);
      }
      else {
          printf("Error accessing %s\n", argv[1]);
          return(1);
      }
      fclose(fp);

  }
  else {
      printf("Usage : %s <datafile>\n", argv[0]);
      return(1);
  }
}
0
 
korsilaAuthor Commented:
Smisk,
that's better...but whenn I test with a large datfile which copntained aprroximately 5000 names..the program returns nothing...

or this is the cause of problem:
while(fgets(data, 80, fp))
you set up only 80 names or something..????
can you make it unlimited...both input name and name in datafile...
can I change these 2 line or even get rid of them:
char surname[12];
 char data[80];

many thanks,
Korsila
 
0
 
korsilaAuthor Commented:
Smisk,
i have tried to change these 2 lines:
char surname[12];
char data[80];

to
char surname[120];
char data[500];

and while(fgets(data, 500, fp))
fgets(surname, 120, stdin);

however, the program still return nothing after i enter name like "smith" or "smi" from a large datfile whichj contained more than 5000 names...

what we should do next?
I thuink to get rid of the unlimited length of input name and names in datafile is sa solution..any suggestion ..we are almost there

korsila
0
 
korsilaAuthor Commented:
aha ha I have just found the real problem...

becus the input name is "lower case" but the names in datfile are upper case...!!!that's why the program couldn't generate any output...!!!

but is it going to work with a large datfile..??? please answer me..what i should do to set up an unlimited length of input name and datafile...(or maximum)
last question i hope..!!
many thanks,
Korsila
0
 
smiskCommented:
korsila,

  Hmm...  It works fine with a 75,000 line recordset for me.  Does the program print an error message or anything?  Keep in mind that this search IS case sensitive so "SMI" will not match "smi"!

Maybe you're on to something though.  Here's an explanation of those variables :

/*
** This is a character array of 12 characters that will
** hold the input.  This means that the input can only
** be 11 characters long at a maximum (one for the null).
** If you want to hold longer names, extend this as well
** as the "fgets(surname, 12, stdin);" to reflect the new
** length.
*/
char surname[12];

/*
** This is the maximum length of a line to read from
** the file.  It really shouldn't be a problem unless
** you have really long lines in the file...
*/
char line[80];

0
 
korsilaAuthor Commented:
aha ha I have just found the real problem...

becus the input name is "lower case" but the names in datfile are upper case...!!!that's why the program couldn't generate any output...!!!

but is it going to work with a large datfile..??? please answer me..what i should do to set up an unlimited length of input name and datafile...(or maximum)
last question i hope..!!
many thanks,
Korsila
0
 
korsilaAuthor Commented:
Smisk,
yeah , you are absolutely right , just a case sensitive though, I wil accept your answer anyway...!!!
weldone and a million thank for your great help...

Korsila
0
 
korsilaAuthor Commented:
glad to get your brilliant help..!!!
0
 
smiskCommented:
No problem...
0

Featured Post

Concerto's Cloud Advisory Services

Want to avoid the missteps to gaining all the benefits of the cloud? Learn more about the different assessment options from our Cloud Advisory team.

  • 16
  • 7
  • 2
  • +3
Tackle projects and never again get stuck behind a technical roadblock.
Join Now