Sequential Search, Continued: for Mr ozymandias, Other Experts Welcome

Hi again,

1) I just showed the work to my supervisor. Well, he was happy about it, thanks to you.

Now I need to extend the code in order to allow it to read
from a file automatically, i.e. I specify the path of the file in the code (there may be more than one file to be read). The result should also include the name of the file from which it obtained the data.

for example output : from file C:\TravelPlan1.txt
                              Match:graphics is defined..
                              Match:

2) The supervisor just commented that searching for a single word, i.e. graphics, is not enough. He said that I should extend it to search for terms such as "Computer Graphics",

which means user input = "computer graphics" (I am not restricted to only this term).

lets say the file contains

Computer Graphics is defined as a pictorial computer output produced on a display screen, plotter, or printer # Computer Graphics is purely delimited by science #  Mobile Agents are defined as autonomous, intelligent programs that move through a network, searching for and interacting with services on the user's behalf. #
# Computer Graphicsis defined as science # Graphics is as ahmad # Mobile Agent can be defined as an agent that transports itself and its execution state through the net #

if user enters "Mobile Agent"

results should be :

From File XYZ.txt
Search results:

1) Mobile Agent can be  defined as an agent that transports itself and its execution state through the net.

2) Mobile Agents are defined as autonomous, intelligent programs that move through a network, searching for and interacting with services on the user's behalf.

From File TravelPLan.txt
Search Results:
1) ..
2)..





cancer_66Asked:
TimYatesCommented:
This isn't really a Java problem... it's more of a text data mining problem...

I'd suggest looking here:

http://citeseer.nj.nec.com/grobelnik98efficient.html

For some cool documents about data mining, and hierarchical text searching algorithms :-)
0
cancer_66Author Commented:
Waiting for a reply from ozymandias. Anyone else who wants to help is welcome.
0
cancer_66Author Commented:
Guys, this is the code that ozymandias helped me with.
Now I need to extend it as I explained above. I guess ozymandias is busy, so if anyone else would like to help me, please do.

import java.util.Vector;
import java.io.*;

public class DefinitionChecker{

    String keyword;

    String[] token1 = new String[]{" is "," are "," was "," be "," can be "};
    String[] token2 = new String[]{" defined "," described "," delimited "};
    String[] token3 = new String[]{" as "," by "};
    String sentences[];

    public static void main(String[] args){
         // Both the keyword (args[0]) and the file (args[1]) are required.
         if (args.length < 2){
              System.out.println("USAGE : DefinitionChecker keyword file");
              System.exit(1);
         }
         DefinitionChecker dc = new DefinitionChecker(args);
    }

    public DefinitionChecker(String[] args){
         try{
              sentences = getArrayFromFile(new File(args[1]));
         }catch(IOException ioe){
              System.out.println(ioe);
              return; // nothing to search if the file could not be read
         }
         keyword = args[0].toLowerCase();
         for (int i = 0; i < sentences.length; i++){
              int pos = 0;
              //System.out.println("Checking : " + sentences[i]);
              if (sentences[i].toLowerCase().indexOf(keyword) > -1 && containsToken(token2,sentences[i].toLowerCase(),0) > -1){
                   if ((pos = containsToken(token1,sentences[i].toLowerCase(),pos)) != -1){
                        if ((pos = containsToken(token2,sentences[i].toLowerCase(),pos)) != -1){
                             if ((pos = containsToken(token3,sentences[i].toLowerCase(),pos)) != -1){
                                  System.out.println("\t\tMATCH : " + sentences[i]);
                             }else{
                                  //System.out.println("\t\t" + sentences[i] + " has no token3.");
                             }
                        }else{
                             //System.out.println("\t\t" + sentences[i] + " has no token2.");
                        }
                   }else{
                        //System.out.println("\t\t" + sentences[i] + " has no token1.");
                   }
              }else{
                   //System.out.println("\t\t" + sentences[i] + " has no keyword or has no token2.");
              }
         }
    }

    private int containsToken(String[] tokens, String s, int pos){
         int tPosition = -1;
         for (int i = 0; i < tokens.length && tPosition == -1; i++){
              tPosition = s.indexOf(tokens[i],pos);
         }
         return tPosition;
    }

    private String[] getArrayFromFile(File f) throws IOException{
         FileReader reader = new FileReader(f);
         Vector sentences = new Vector();
         char[] cbuf = new char[1];
         String delimiter = "#";
         String sentence = "";
         String c = "";
         while (reader.read(cbuf) != -1){
              c = new String(cbuf);
              if (c.equals(delimiter)){
                   sentences.add(sentence);
                   sentence = "";
              }else{
                   sentence += c;
              }
         }
         reader.close(); // release the file handle once the whole file has been read
         String[] sentenceArray = new String[sentences.size()];
         sentenceArray = (String[])sentences.toArray(sentenceArray);
         return sentenceArray;
    }
}
0
cancer_66Author Commented:
Guys i really need help.
0
cancer_66Author Commented:
Guys, why do I feel I am being ignored over here? I don't think it's a difficult problem. Won't anyone help?
0
ozymandiasCommented:
Sorry for the delay.
I had to go into hospital on Monday morning and I only just got back.
0
cancer_66Author Commented:
OK, no problem. The main thing is that I got some response. I am glad you are back.

Well, I hope everything is fine with you?
0
ozymandiasCommented:
Just to clarify.

1) You don't want to specify the file(s) to be searched on the command line; you want to hard code them in the program.

2) You want to search multiple files and have the file from which each match is retrieved shown along with the output.

This is pretty simple as long as you are sure about point 1) above.

Basically you would create an array of files and loop through each file calling the existing code. Then you modify the output to include the filename where appropriate.

If you can confirm that this is what you want I can modify the code very quickly and easily for you.
0
cancer_66Author Commented:
1) Hmm.. yeah, that's what I was told by my supervisor: the files from which I am going to retrieve the sentences should be hard coded. Keep in mind I might have multiple files to look in. For now let's assume 3 files, but there might be more.

2) Yes, as I explained above, the file from which each match is retrieved should be shown with the output.

When this is done there is one more thing, if that's fine with you.. :)
0
cancer_66Author Commented:
Yeah, and the most important thing! Now the user input is no longer a single word such as "graphics"; it should be a term like "Computer Graphics", "mobile agent" and so on.

0
ozymandiasCommented:
OK. That's fine.

In terms of our program so far there is no difference between searching for "graphics" and "Computer Graphics".
0
cancer_66Author Commented:
thats good news then. thanks.
0
ozymandiasCommented:
OK. I now have three files called abc.txt, xyz.txt and other.txt, the names of which I have hardcoded into the program. Below is a copy of each of the files and the new code.

To run it you must make sure that the txt files are in the same directory as the program or you will have to change the file names to include a full path so the program can find them.

Cheers.
0
ozymandiasCommented:
abc.txt
========

Graphics defined t t t is t as#Graphics was defined by the Science Acadamy in london#Graphics is the an important subject in the computer strand#Graphics is described as the art of drawing used in mathematics and engineering.#Mobile Agents are defined as autonomous, intelligent programs that move through a network, searching for and interacting with services on the user's behalf.#

xyz.txt
=======

graphics defined is by nonsense#Graphics is delimited by science#Graphics is defined as science#Mobile agents can sometimes be described as intelligent agents.#Graphics be defined as the science of calculating by diagrams.#

other.txt
=========
Graphics are described as picturesPictures are described as Graphics#Mobile Agents are often described as robots or bots.#Graphics is defined as a pictorial computer output produced on a display screen plotter or printer#Graphics is purely delimited by science.#
0
ozymandiasCommented:
//new code

import java.util.Vector;
import java.io.*;

public class DefinitionChecker{

     String keyword;
     String[] files = new String[]{"abc.txt","xyz.txt","other.txt"};

     String[] token1 = new String[]{" is "," are "," was "," be "," can be "};
     String[] token2 = new String[]{" defined "," described "," delimited "};
     String[] token3 = new String[]{" as "," by "};
     String sentences[];

     public static void main(String[] args){
          if (args.length < 1){
               System.out.println("USAGE : DefinitionChecker keyword");
               System.exit(1); // without this, args[0] below throws an exception
          }
          DefinitionChecker dc = new DefinitionChecker(args);
     }

     public DefinitionChecker(String args[]){

          for (int f = 0; f < files.length; f++){
               boolean fileMentioned = false;
               File file = null;
               try{
                    file = new File(files[f]);
                    sentences = getArrayFromFile(file);
               }catch(IOException ioe){
                    System.out.println(ioe);
                    continue; // skip this file instead of re-searching stale (or null) sentences
               }
               keyword = args[0].toLowerCase();
               for (int i = 0; i < sentences.length; i++){
                    int pos = 0;
                    //System.out.println("Checking : " + sentences[i]);
                    if (sentences[i].toLowerCase().indexOf(keyword) > -1 && containsToken(token2,sentences[i].toLowerCase(),0) > -1){
                         if ((pos = containsToken(token1,sentences[i].toLowerCase(),pos)) != -1){
                              if ((pos = containsToken(token2,sentences[i].toLowerCase(),pos)) != -1){
                                   if ((pos = containsToken(token3,sentences[i].toLowerCase(),pos)) != -1){
                                        if (!fileMentioned){
                                             System.out.println("Matches found in " + file);
                                             fileMentioned = true;
                                        }
                                        System.out.println("\t\tMATCH : " + sentences[i]);
                                   }else{
                                        //System.out.println("\t\t" + sentences[i] + " has no token3.");
                                   }
                              }else{
                                   //System.out.println("\t\t" + sentences[i] + " has no token2.");
                              }
                         }else{
                              //System.out.println("\t\t" + sentences[i] + " has no token1.");
                         }
                    }else{
                         //System.out.println("\t\t" + sentences[i] + " has no keyword or has no token2.");
                    }
               }
          }
     }

     private int containsToken(String[] tokens, String s, int pos){
          int tPosition = -1;
          for (int i = 0; i < tokens.length && tPosition == -1; i++){
               tPosition = s.indexOf(tokens[i],pos);
          }
          return tPosition;
     }

     private String[] getArrayFromFile(File f) throws IOException{
          FileReader reader = new FileReader(f);
          Vector sentences = new Vector();
          char[] cbuf = new char[1];
          String delimiter = "#";
          String sentence = "";
          String c = "";
          while (reader.read(cbuf) != -1){
               c = new String(cbuf);
               if (c.equals(delimiter)){
                    sentences.add(sentence);
                    sentence = "";
               }else{
                    sentence += c;
               }
          }
          reader.close(); // release the file handle once the whole file has been read
          String[] sentenceArray = new String[sentences.size()];
          sentenceArray = (String[])sentences.toArray(sentenceArray);
          return sentenceArray;
     }
}
0
cancer_66Author Commented:
ok. ill try it now.

thanks
0
cancer_66Author Commented:
Hmm.. there is a problem.

You see, even if the user enters "Computer Grap" or "Computer Graphoc"

the result is still printed??

This is how it should be: only if the user enters "Computer Graphics" with the correct spelling should it print the results.



0
cancer_66Author Commented:
You see, even if I type

java DefinitionChecker a

it also prints the result?

It doesn't check for the whole term.



0
cancer_66Author Commented:
I think the problem is in how the argument is compared: even if it is only one letter, it is compared against the whole sentence, and if it equals any letter in the sentence then that sentence is printed.

0
cancer_66Author Commented:
try this as a test file

Computer Graphics is defined as a pictorial computer output produced on a display screen, plotter, or printer # Computer Graphics is purely delimited by science #  Mobile Agents are defined as autonomous, intelligent programs that move through a network, searching for and interacting with services on the user's behalf. #
# Computer Graphicsis defined as science # Graphics is as ahmad # Mobile Agent can be defined as an agent that transports itself and its execution state through the net #

Only when the user enters "Computer Graphics" or "Mobile Agent" should results be printed; otherwise they shouldn't.

0
ozymandiasCommented:
How could that work ?
How do I decide what is a correct searchable term and what isn't ?
Unless you are going to provide a complete list of all the words or terms that a user can search for, or a minimum length in characters, or a list of all the words the user is not allowed to search for etc........
0
cancer_66Author Commented:
i think the input should be tokenised

computer,graphics

as well as the sentences,

and then they are compared
0
ozymandiasCommented:
>>only when the user enters "Computer Graphics" or "Mobile Agent" results should be print otherwise they shouldnt.

Are you saying that the only terms that the user should ever be allowed to search for are "Computer Graphics" and "Mobile Agent" ?
0
ozymandiasCommented:
>>i think the input should be tokenised
>>
>>computer,graphics

Yes but what does that mean ?

computer,graphics = computer followed by graphics
computer and graphics = both words anywhere in the same sentence
computer or graphics = either word anywhere in the same sentence



0
cancer_66Author Commented:
lets say we have the sentence.

Computer Graphics is purely defined as a field in science

we take the input from the user = Computer Graphics

tokenize it so we have

Computer
Graphics

We read the file; whenever we meet "#" we put the sentence in an array, then tokenize it,

so we would have

Computer
Graphics
is
purely
defined
as
a
field
in
science

First check for the MainMarker[]={"defined","described","delimited"}

if found in sentence THEN

we secondly compare

first token of U.Input with first token of sentence
Computer compared with Computer (match)
then
Graphics compared with Graphics (match)

then (take one element from Token1[] and compare it with a token from sentence)

is compared with is (match)

then (take another from Token2[])
defined compared with purely (no match)

When there is no match we keep on comparing with the pattern "defined", which is from Token2, till we find a match in the sentence; if we don't, nothing is printed.

then (again compare with element "defined", since in the last attempt there was no match)
defined compared with defined (match)

then (from Token3[], compared with the remaining sentence tokens)
as compared with as (match)

since User input as well as the pattern from Token1[] and Token2[] and Token3[] matched now print results.
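
The steps above can be sketched roughly as follows. This is only an illustration under some assumptions, not the code from this thread: the class name TokenMatcher and its methods are made up for the example, the search term is required to sit at the very start of the sentence (as in the walk-through), the token lists are reduced to single words (so "can be" is covered by "be"), and words are compared exactly, which means "Agents" would not match "Agent" and trailing punctuation is not stripped.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: word-by-word matching as described in the post above.
final class TokenMatcher {
    // Single-word versions of the thread's token1/token2/token3 arrays (assumption).
    static final List<String> TOKEN1 = Arrays.asList("is", "are", "was", "be");
    static final List<String> TOKEN2 = Arrays.asList("defined", "described", "delimited");
    static final List<String> TOKEN3 = Arrays.asList("as", "by");

    // Lower-case the text and split it on runs of whitespace.
    static String[] tokenize(String text) {
        String trimmed = text.trim().toLowerCase(Locale.ROOT);
        return trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    // Index of the first word at or after 'from' that is one of the candidates, or -1.
    static int indexOfAny(String[] words, List<String> candidates, int from) {
        for (int i = from; i < words.length; i++) {
            if (candidates.contains(words[i])) {
                return i;
            }
        }
        return -1;
    }

    static boolean matches(String term, String sentence) {
        String[] termWords = tokenize(term);
        String[] words = tokenize(sentence);
        if (termWords.length == 0 || words.length < termWords.length) {
            return false;
        }
        // Step 1: every word of the user input must equal the leading words of the sentence.
        for (int i = 0; i < termWords.length; i++) {
            if (!termWords[i].equals(words[i])) {
                return false;
            }
        }
        // Steps 2-4: a token1 word, then a token2 word, then a token3 word, in that order.
        int pos = indexOfAny(words, TOKEN1, termWords.length);
        if (pos == -1) return false;
        pos = indexOfAny(words, TOKEN2, pos + 1);
        if (pos == -1) return false;
        return indexOfAny(words, TOKEN3, pos + 1) != -1;
    }

    public static void main(String[] args) {
        String sentence = "Computer Graphics is purely defined as a field in science";
        System.out.println(matches("Computer Graphics", sentence));
        System.out.println(matches("computer graphc", sentence));
    }
}
```

Because whole words are compared, a misspelled term such as "computer graphc" or a single letter such as "a" no longer matches this sentence.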

0
cancer_66Author Commented:
No, no, the user should be allowed to search for any term; he is not restricted to "Mobile Agent" or "Computer Graphics".

Computer followed by Graphics, for the time being.
0
cancer_66Author Commented:
To clarify things further: the whole point of the search is to retrieve specific information.

In our example the user is looking for the definition of a specific term such as computer graphics, mobile agent, Experts Exchange, and so on...

0
ozymandiasCommented:
>>Computer Followed by Graphics, for the time being.

If that's the case then why bother with all this messy tokenization of the sentence? Either the sentence contains "computer graphics" or it doesn't. I don't see the value in comparing it word by word.

>>no no the user should be allowed to search for any term he is not restricted to "Mobile Agent" or "Computer Graphics"

So, therefore "a" is a valid search term and so is "computar griphocs", unless you want to build a spell checker into the program.

>>to clearify things futher. the whole point of the search is to retrive specific information.

Yes, and the definition of specific information is any sentence in any of the files that contains both the user-specified search word(s) (e.g. "computer graphics") and one of the combinations of three token words (such as "is defined as").

That is pretty much exactly what it does now.
0
cancer_66Author Commented:
Note that the following sentence won't be retrieved:

Computer Graphics defined X Y is Z as science

U.input = computer graphics

tokenize it

computer
graphics

Sentence is tokenized

computer
graphics
defined
X
Y
is
Z
as
science

first element of U.input with first token of file
Computer           with             Computer (match)

second element of U.input with second token of file
Graphics           with             Graphics (match)

then first element from Token1[] with third elem. of file
is                  with                defined (no match)

since no match, keep comparing with "is" till we find a match

then still first element of Token1[] with fourth elem. of file
is                  with                 X (no match)

then still first element of Token1[] with fifth elem. of file
is                  with                 Y (no match)

then still first element of Token1[] with sixth elem. of file
is                  with                 is (match)

since there is a match, go to Token2[], take the first element and compare.

then first element of Token2[] with seventh elem. of file
defined             with           Z (no match)

since no match, keep on comparing with "defined" till a match is found.

then still first element of Token2[] with eighth elem. of file
defined             with               as (no match)

then still first element of Token2[] with ninth elem. of file
defined             with                 science (no match)

therefore nothing is printed!


Hope it's clear where I am heading.
0
cancer_66Author Commented:
Well, I am really sorry, but this is how the supervisor told me it should be done.

0
cancer_66Author Commented:
See, if the user enters "comptear graphise"

there should be any results, since all our sentences start with "computer graphics",

and this can only be done by tokenizing the sentence and the user input as well as the patterns,

unless there is another way which I don't know of ~
0
cancer_66Author Commented:
i meant "there shouldnt be any .."
0
cancer_66Author Commented:
i have to leave now. its 1:30am over here. ill be online tomorrow.

please help me out

thanks
0
ozymandiasCommented:
>>see if the user enters "comptear graphise"
>>there shouldn't be any results since all out sentences startes with "computer graphics"

Yes. And using the program as written so far there aren't any.

>>note that the following sentence wont be retrived
>>Computer Graphic defined X Y is Z as science

No. It isn't retrieved because it doesn't match.
The tokens defined, is and as appear in the wrong order.

As far as I can see the program does exactly what it is supposed to, but your "supervisor" is not happy about the way it does it, which seems crazy because the way your "supervisor" is suggesting is a lot less efficient, unless there is some other thing it's supposed to do that you haven't mentioned.
0
cancer_66Author Commented:
mr ozymandias i just tried the following

java DefinitionChecker computer graphc

and it still found matches.

i need to do it with the tokeniser. well yes this is just a part of my project. its not the whole thing.

will you help me ?
0
ozymandiasCommented:
cancer_66, of course it found matches.

It found all the sentences with the word computer in them.

When you run

    java DefinitionChecker computer graphc

computer becomes args[0], graphc becomes args[1].

Our program searches for args[0].

If you want it to search for computer graphc you either have to run

    java DefinitionChecker "computer graphc"

so that args[0] becomes computer graphc, or you have to change the code so that all the values in args[] are concatenated into a single search string.

I can do it either way.
0
ozymandiasCommented:
import java.util.Vector;
import java.io.*;

public class DefinitionChecker{

      String keyword;
      String[] files = new String[]{"abc.txt","xyz.txt","other.txt"};

      String[] token1 = new String[]{" is "," are "," was "," be "," can be "};
      String[] token2 = new String[]{" defined "," described "," delimited "};
      String[] token3 = new String[]{" as "," by "};
      String sentences[];

      public static void main(String[] args){
            if (args.length < 1){
                  System.out.println("USAGE : DefinitionChecker keyword");
                  System.exit(1);
            }
            String s = "";
            for (int i = 0;i < args.length;i++){
                  s = s + args[i] + " ";
            }
            s = s.trim();
            DefinitionChecker dc = new DefinitionChecker(s);
      }

      public DefinitionChecker(String s){

            for (int f = 0; f < files.length; f++){
                  boolean fileMentioned = false;
                  File file = null;
                  try{
                        file = new File(files[f]);
                        sentences = getArrayFromFile(file);
                  }catch(IOException ioe){
                        System.out.println(ioe);
                        continue; // skip this file instead of re-searching stale (or null) sentences
                  }
                  keyword = s.toLowerCase();
                  for (int i = 0; i < sentences.length; i++){
                        int pos = 0;
                        //System.out.println("Checking : " + sentences[i]);
                        if (sentences[i].toLowerCase().indexOf(keyword) > -1 && containsToken(token2,sentences[i].toLowerCase(),0) > -1){
                              if ((pos = containsToken(token1,sentences[i].toLowerCase(),pos)) != -1){
                                    if ((pos = containsToken(token2,sentences[i].toLowerCase(),pos)) != -1){
                                          if ((pos = containsToken(token3,sentences[i].toLowerCase(),pos)) != -1){
                                                if (!fileMentioned){
                                                      System.out.println("Matches found in " + file);
                                                      fileMentioned = true;
                                                }
                                                System.out.println("\t\tMATCH : " + sentences[i]);
                                          }else{
                                                //System.out.println("\t\t" + sentences[i] + " has no token3.");
                                          }
                                    }else{
                                          //System.out.println("\t\t" + sentences[i] + " has no token2.");
                                    }
                              }else{
                                    //System.out.println("\t\t" + sentences[i] + " has no token1.");
                              }
                        }else{
                              //System.out.println("\t\t" + sentences[i] + " has no keyword or has no token2.");
                        }
                  }
            }
      }

      private int containsToken(String[] tokens, String s, int pos){
            int tPosition = -1;
            for (int i = 0; i < tokens.length && tPosition == -1; i++){
                  tPosition = s.indexOf(tokens[i],pos);
            }
            return tPosition;
      }

      private String[] getArrayFromFile(File f) throws IOException{
            FileReader reader = new FileReader(f);
            Vector sentences = new Vector();
            char[] cbuf = new char[1];
            String delimiter = "#";
            String sentence = "";
            String c = "";
            while (reader.read(cbuf) != -1){
                  c = new String(cbuf);
                  if (c.equals(delimiter)){
                        sentences.add(sentence);
                        sentence = "";
                  }else{
                        sentence += c;
                  }
            }
            reader.close(); // release the file handle once the whole file has been read
            String[] sentenceArray = new String[sentences.size()];
            sentenceArray = (String[])sentences.toArray(sentenceArray);
            return sentenceArray;
      }
}
0
cancer_66Author Commented:
OK, I am with you,

but the problem is that it's comparing character by character.

Therefore when I enter

java DefinitionChecker "a"

it finds matches, since the letter "a" exists in the sentence, which is irrelevant.
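
One possible way to stop single letters and fragments from matching, without tokenizing every sentence, is to require the term to appear as whole words. The sketch below is an assumption about how that could be done and is not part of the program posted in this thread; WholeWordSearch and containsTerm are hypothetical names. It uses the standard java.util.regex classes, with \b marking a word boundary and Pattern.quote protecting any special characters in the user's input.

```java
import java.util.regex.Pattern;

// Hypothetical helper: checks that a term occurs in a sentence as whole words.
final class WholeWordSearch {
    static boolean containsTerm(String sentence, String term) {
        String trimmed = term.trim();
        if (trimmed.isEmpty()) {
            return false;
        }
        // Build \bword1\s+word2...\b so that each word must match completely.
        String[] words = trimmed.split("\\s+");
        StringBuilder regex = new StringBuilder("\\b");
        for (int i = 0; i < words.length; i++) {
            if (i > 0) {
                regex.append("\\s+");
            }
            regex.append(Pattern.quote(words[i]));
        }
        regex.append("\\b");
        return Pattern.compile(regex.toString(), Pattern.CASE_INSENSITIVE)
                      .matcher(sentence)
                      .find();
    }

    public static void main(String[] args) {
        String sentence = "Computer Graphics is purely delimited by science";
        System.out.println(containsTerm(sentence, "computer graphics"));
        System.out.println(containsTerm(sentence, "computer grap"));
        System.out.println(containsTerm(sentence, "a"));
    }
}
```

With this check "computer grap" and "a" find nothing in the sentence above. Note that "a" would still match any sentence containing the article "a" as a separate word, so a minimum length or a list of ignored words may be needed as well.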
0
cancer_66Author Commented:
Is it difficult to do it the way my supervisor wants it? The problem is that I need to make him happy, and he wants me to do it in that manner.

The only problem with this code is that it is comparing character by character.

For example when I enter

java DefinitionChecker z

there are no results, since my sentences do not have the letter "z", but if they did, they would have been printed.
0
cancer_66Author Commented:
It would have been exactly as I wanted if the above problem wasn't there :(
0
cancer_66Author Commented:
are u there ?
0
cancer_66Author Commented:
are you going to help me with this
0
cancer_66Author Commented:
I've got a question: how can I put all the patterns in one place,
such as

String[] token1 = new String[]{" is "," are "," was "," be "," can be "};
String[] token2 = new String[]{" defined "," described "," delimited "};
String[] token3 = new String[]{" as "," by "};

Now in my project I am going to have a lot of patterns.

Therefore the supervisor suggested putting all the patterns in one file, compiling it, then importing it in the main program..

Now how is that done?
0
ozymandiasCommented:
Sorry...was busy with some other stuff.

OK. You have two outstanding issues here.

1. The programme works OK but you don't like the fact that it will search for things like "a" and "z". The best thing to do would be to set a minimum length on the search term. Let's say it must be three or more characters. That is very simple to do.

2. The set of patterns is likely to become much more complex than it is now. OK, that is fairly simple to achieve too. Currently we have three arrays of values. These could be stored in three files called Token1.txt, Token2.txt and Token3.txt. In the same way that we read through the current text files and make them into arrays of sentences we could easily read through text files containing the token words and make them into arrays.

You have to bear in mind however that reading text files (i.e. disk IO) is quite time intensive and the files would have to be read every time the program was run. Unless there are an awful lot of these token words I would consider trying to keep them in the program.
0
cancer_66Author Commented:
no problem.

1) That's a good idea. Yeah, I think that would make sense. So now I have to talk to the supervisor and decide on the minimum length. OK, let's assume it's 3 characters.

2) Well no, that would be a very inefficient way to deal with it. What he meant is to have all the patterns in one source file, which I compile and then import

as "import Patterns;"
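
A compiled pattern holder of this kind might look like the sketch below. It is only a guess at what the supervisor has in mind: the class name Patterns comes from the post above, while the field names and the PatternsDemo class are made up for the example. One thing worth knowing is that Java does not allow importing a class from the default (unnamed) package, so "import Patterns;" will not compile; if Patterns.java sits in the same directory (same package) as the main program, the class can simply be used by name with no import at all.

```java
// Patterns.java (hypothetical): all token arrays kept in one compiled class.
final class Patterns {
    private Patterns() {
        // constants only, never instantiated
    }

    // Values copied from the token arrays used earlier in this thread.
    static final String[] TOKEN1 = {" is ", " are ", " was ", " be ", " can be "};
    static final String[] TOKEN2 = {" defined ", " described ", " delimited "};
    static final String[] TOKEN3 = {" as ", " by "};
}

// Any class in the same package can refer to the arrays directly, without an import.
final class PatternsDemo {
    public static void main(String[] args) {
        for (String token : Patterns.TOKEN2) {
            System.out.println("marker:" + token);
        }
    }
}
```

In the main program the fields token1, token2 and token3 could then be initialised from Patterns.TOKEN1 and so on, and adding a new pattern only means editing and recompiling Patterns.java.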

0
cancer_66Author Commented:
whenever you are free can you answer my question.

0
ozymandiasCommented:
OK for item 1 I have revised the main() method as below :

     public static void main(String[] args){
          if (args.length < 1){
               System.out.println("USAGE : DefinitionChecker keyword");
               System.exit(1);
          }
          String s = "";
          for (int i = 0;i < args.length;i++){
               s = s + args[i] + " ";
          }
          s = s.trim();
          if (s.length() < 3){
               System.out.println("Input Error : The keyword(s) supplied must have a combined length of 3 characters or more.");
               System.exit(1);
          }
          DefinitionChecker dc = new DefinitionChecker(s);
     }
0
ozymandiasCommented:
For item 2 we need to think about the format of the file that will store the patterns.

One option would be something like this :

is
are
was
be
can be
#
defined
described
delimited
#
as
by
#

You then have a function that reads the file line by line.
Each line is put into the token1 array until we hit a #, then we start filling the token2 array until we hit a #, and then we start filling the token3 array until we hit a #, which means we are at the end of the file.
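
A minimal sketch of such a reader is shown below, under a few assumptions: the names PatternFileParser and parse are invented for the example, the lines are passed in as a list so the logic can be tried without a file on disk, and each token is padded with a space on both sides because the matching code earlier in this thread compares against padded tokens such as " is ".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: splits the lines of a patterns file into groups ended by "#".
final class PatternFileParser {
    static List<String[]> parse(List<String> lines) {
        List<String[]> groups = new ArrayList<>();
        List<String> current = new ArrayList<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue; // ignore blank lines
            }
            if (line.equals("#")) {
                // A "#" line closes the current group (token1, then token2, then token3).
                groups.add(current.toArray(new String[0]));
                current.clear();
            } else {
                // Pad with spaces so that only whole words match (assumption, see above).
                current.add(" " + line + " ");
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "is", "are", "was", "be", "can be", "#",
                "defined", "described", "delimited", "#",
                "as", "by", "#");
        List<String[]> groups = parse(lines);
        for (String[] group : groups) {
            System.out.println(Arrays.toString(group));
        }
    }
}
```

To use it with a real file, the lines could be obtained with java.nio.file.Files.readAllLines(Paths.get("patterns.txt")), and the three returned arrays assigned to token1, token2 and token3 in that order.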
0
ozymandiasCommented:
OK. Here is some new code.

First is the contents of a file called patterns.txt :

is
are
was
be
can be
#
defined
described
delimited
#
as
by
#


and next the new code that uses it :
0
ozymandiasCommented:
import java.util.Vector;
import java.io.*;

public class DefinitionChecker{

      String keyword;
      String[] files = new String[]{"abc.txt","xyz.txt","other.txt"};
      String patternFile = "patterns.txt";

      String[] token1;
      String[] token2;
      String[] token3;

      String sentences[];

      public static void main(String[] args){
            if (args.length < 1){
                  System.out.println("USAGE : DefinitionChecker keyword");
                  System.exit(1);
            }
            String s = "";
            for (int i = 0;i < args.length;i++){
                  s = s + args[i] + " ";
            }
            s = s.trim();
            if (s.length() < 3){
                  System.out.println("Input Error : The keyword(s) supplied must have a combined length of 3 characters or more.");
                  System.exit(1);
            }
            DefinitionChecker dc = new DefinitionChecker(s);
      }

      public DefinitionChecker(String s){

            for (int f = 0; f < files.length; f++){
                  boolean fileMentioned = false;
                  File file = null;
                  try{
                        file = new File(files[f]);
                        sentences = getArrayFromFile(file);
                        importPatterns(patternFile);
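                  }catch(IOException ioe){
                        System.out.println(ioe);
                        // Skip this file; otherwise sentences (or the token arrays) may be null or left over from the previous file.
                        continue;
                  }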
                  keyword = s.toLowerCase();
                  for (int i = 0; i < sentences.length; i++){
                        int pos = 0;
                        //System.out.println("Checking : " + sentences[i]);
                        if (sentences[i].toLowerCase().indexOf(keyword) > -1 && containsToken(token2,sentences[i].toLowerCase(),0) > -1){
                              if ((pos = containsToken(token1,sentences[i].toLowerCase(),pos)) != -1){
                                    if ((pos = containsToken(token2,sentences[i].toLowerCase(),pos)) != -1){
                                          if ((pos = containsToken(token3,sentences[i].toLowerCase(),pos)) != -1){
                                                if (!fileMentioned){
                                                      System.out.println("Matches found in " + file);
                                                      fileMentioned = true;
                                                }
                                                System.out.println("\t\tMATCH : " + sentences[i]);
                                          }else{
                                                //System.out.println("\t\t" + sentences[i] + " has no token3.");
                                          }
                                    }else{
                                          //System.out.println("\t\t" + sentences[i] + " has no token2.");
                                    }
                              }else{
                                    //System.out.println("\t\t" + sentences[i] + " has no token1.");
                              }
                        }else{
                              //System.out.println("\t\t" + sentences[i] + " has no keyword or has no token2.");
                        }
                  }
            }
      }

      private int containsToken(String[] tokens, String s, int pos){
            int tPosition = -1;
            for (int i = 0; i < tokens.length && tPosition == -1; i++){
                  tPosition = s.indexOf(tokens[i],pos);
            }
            return tPosition;
      }

      private String[] getArrayFromFile(File f) throws IOException{
            FileReader reader = new FileReader(f);
            Vector sentences = new Vector();
            char[] cbuf = new char[1];
            String delimiter = "#";
            String sentence = "";
            String c = "";
            while (reader.read(cbuf) != -1){
                  c = new String(cbuf);
                  if (c.equals(delimiter)){
                        sentences.add(sentence);
                        sentence = "";
                  }else{
                        sentence += c;
                  }
            }
            reader.close();
            String[] sentenceArray = new String[sentences.size()];
            sentenceArray = (String[])sentences.toArray(sentenceArray);
            return sentenceArray;
      }

      private void importPatterns(String filename) throws IOException{
            File file = new File(filename);
            BufferedReader reader = new BufferedReader(new FileReader(file));
            String line;
            Vector v1 = new Vector();
            Vector v2 = new Vector();
            Vector v3 = new Vector();
            int hashCount = 0;
            while ((line = reader.readLine()) != null && hashCount < 3){
                  if (line.equals("#")){
                        hashCount++;
                  }else{
                        switch (hashCount){
                              case 0:
                                    v1.add(line);
                                    break;
                              case 1:
                                    v2.add(line);
                                    break;
                              case 2:
                                    v3.add(line);
                                    break;
                        }
                  }
            }
            reader.close();
            token1 = new String[v1.size()];
            token1 = (String[])v1.toArray(token1);
            token2 = new String[v2.size()];
            token2 = (String[])v2.toArray(token2);
            token3 = new String[v3.size()];
            token3 = (String[])v3.toArray(token3);
      }
}
0
cancer_66Author Commented:
hello.

iam really sorry. i didnt reply i got busy with my family. thanks for answering

ill just test it
0
cancer_66Author Commented:
well for the second question. the problem is that the supervisor told me that he wants to keep the business of having 3 arrays token1[],token2[],token3[]. but he said that he is thinking of keeping them in a different file. lets say it has some sort of method. and then after compiling it i just have to import the file to the main program as "import patterns"

but ill have to clarify more with him. iam not sure if u understood what iam trying to say,
0
cancer_66Author Commented:
hmm is it possible to put "private void importPatterns(String filename)" in a different file and let say i have to compile (importPattern.java) so that it does all the processing of the patterns and puts them in 3 arrays

token1[],token2[],token3[]

which is then imported (import importPatterns;) to the main program which is "DefinitionChecker.java"

iam really sorry am trying my best to clarify things.
0
cancer_66Author Commented:
1)two comments i added above

2)can i restrict the user to enter a Term "Mobile Agents" i.e the input must consist of two words with lengths not less then 3characters?

answer me whenever you are free ill be waiting,
0
cancer_66Author Commented:
(ill be gone for 2hours but ill be back soon)
whenever u can answer my questions

thanks
0
ozymandiasCommented:
1)I can understand wanting to keep the token data in a separate file (e.g. patterns.txt), but it makes absolutely no sense to have the importPatterns function in another file. Basically, you are talking about writing a whole separate class.

Let's say you create a new class in a new file called PatternBuilder.java which creates PatternBuilder.class.

To get your patterns you would then have to create an instance of PatternBuilder and call some method to get the token arrays. You would probably have to make 3 separate calls to get all 3 arrays. That makes no sense at all.

The other possibility is that you are creating the PatternBuilder class because you want to keep all the patterns hardcoded (i.e. not read from patterns.txt) but you want to keep all that clutter out of the main program. That makes some sense, but you would have to clarify that this is exactly what you are trying to do.
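
As a rough illustration of the hardcoded option, a separate class might look something like the sketch below. The name PatternBuilder comes from the discussion above; the getter names and the small main method are made up for illustration, and the token values are simply the ones from the patterns.txt example earlier in this thread. It shows the "three separate calls" shape rather than a finished design.

```java
/*
* PatternBuilder.java (hypothetical sketch)
*
*/

class PatternBuilder{

     // Hardcoded token lists, copied from the patterns.txt example in this thread.
     private static final String[] TOKEN1 = new String[]{"is","are","was","be","can be"};
     private static final String[] TOKEN2 = new String[]{"defined","described","delimited"};
     private static final String[] TOKEN3 = new String[]{"as","by"};

     // Each getter hands back a copy so callers cannot alter the hardcoded lists.
     public String[] getToken1(){ return copy(TOKEN1); }
     public String[] getToken2(){ return copy(TOKEN2); }
     public String[] getToken3(){ return copy(TOKEN3); }

     private static String[] copy(String[] source){
          String[] result = new String[source.length];
          System.arraycopy(source, 0, result, 0, source.length);
          return result;
     }

     // Small demonstration: the caller needs one call per array.
     public static void main(String[] args){
          PatternBuilder pb = new PatternBuilder();
          String[] token1 = pb.getToken1();
          String[] token2 = pb.getToken2();
          String[] token3 = pb.getToken3();
          System.out.println("token1 has " + token1.length + " entries");
          System.out.println("token2 has " + token2.length + " entries");
          System.out.println("token3 has " + token3.length + " entries");
     }
}
```

If both source files sit in the same directory and neither declares a package, the main program can use this class without any import statement.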
0
ozymandiasCommented:
2) You can restrict the user to any type of input you want. You are saying that the search pattern must be :

    xxx yyy

i.e. two groups of a minimum of three characters separated by a space.

Does this mean that the user can never search for a single word like "graphics", or is it OK to search for one word as long as it has 3 characters or more?
0
cancer_66Author Commented:
1) ok what do you suggest doing. basically yes i want all the patterns hardcoded rather than reading them from file..etc. i did suggest to the supervisor doing exactly what you have provided (keeping all the patterns in a file, reading them ..etc) but he said that its not an efficient way to do it. he talked about having a new class which builds the patterns where i simply have to import it in my main program (import PatternBuilder;)

can you please explain the possible solution. it wasnt really clear for me.

2)yes the supervisor argued that i should be searching for Terms rather than a single word and yes the search pattern must be:

xxx yyy


0
cancer_66Author Commented:
answer me whenever you are free please.

thanks
0
ozymandiasCommented:
OK. I am at work today, but I will try to look into this in the next few hours.
0
cancer_66Author Commented:
no problem. thanks alot for your help.
0
ozymandiasCommented:
OK. There are now two classes.

The first is still DefinitionChecker but simplified.
The new class is called PatternMatcher.
This contains all the patterns in its arrays and has the functions for searching for those patterns in strings passed to it.
0
ozymandiasCommented:
/*
* DefinitionChecker.java
*
*/

import java.util.Vector;
import java.io.*;

public class DefinitionChecker{

     String keyword;
     String[] files = new String[]{"abc.txt","xyz.txt","other.txt"};
     String sentences[];
     PatternMatcher pm = new PatternMatcher();

     public static void main(String[] args){
          if (args.length < 1){
               System.out.println("USAGE : DefinitionChecker keyword");
               System.exit(1);
          }
          String s = "";
          for (int i = 0;i < args.length;i++){
               s = s + args[i] + " ";
          }
          s = s.trim();
          if (s.length() < 3){
               System.out.println("Input Error : The keyword(s) supplied must have a combined length of 3 characters or more.");
               System.exit(1);
          }
          DefinitionChecker dc = new DefinitionChecker(s);
     }

     public DefinitionChecker(String s){

          for (int f = 0; f < files.length; f++){
               boolean fileMentioned = false;
               File file = null;
               try{
                    file = new File(files[f]);
                    sentences = getArrayFromFile(file);
               }catch(IOException ioe){
                    System.out.println(ioe);
                    // Skip this file; otherwise sentences may be null or left over from the previous file.
                    continue;
               }
               keyword = s.toLowerCase();
               for (int i = 0; i < sentences.length; i++){
                    if (sentences[i].toLowerCase().indexOf(keyword) > -1 && pm.containsPattern(sentences[i])){
                         if (!fileMentioned){
                              System.out.println("Matches found in " + file);
                              fileMentioned = true;
                         }
                         System.out.println("\t\tMATCH : " + sentences[i]);
                    }
               }
          }
     }

     private String[] getArrayFromFile(File f) throws IOException{
          FileReader reader = new FileReader(f);
          Vector sentences = new Vector();
          char[] cbuf = new char[1];
          String delimiter = "#";
          String sentence = "";
          String c = "";
          while (reader.read(cbuf) != -1){
               c = new String(cbuf);
               if (c.equals(delimiter)){
                    sentences.add(sentence);
                    sentence = "";
               }else{
                    sentence += c;
               }
          }
          reader.close();
          String[] sentenceArray = new String[sentences.size()];
          sentenceArray = (String[])sentences.toArray(sentenceArray);
          return sentenceArray;
     }
}
0
ozymandiasCommented:
/*
* PatternMatcher.java
*
*/

public class PatternMatcher{

     String[] token1 = new String[]{"is","was","are","be","can be"};
     String[] token2 = new String[]{"described","defined","delimited"};
     String[] token3 = new String[]{"as","by"};

     public PatternMatcher(){

     }

     private int containsToken(String[] tokens, String s, int pos){
          int tPosition = -1;
          for (int i = 0; i < tokens.length && tPosition == -1; i++){
               tPosition = s.indexOf(tokens[i],pos);
          }
          return tPosition;
     }

     public boolean containsPattern(String s){
          int pos = 0;
          if ((pos = containsToken(token1,s.toLowerCase(),pos)) != -1){
               if ((pos = containsToken(token2,s.toLowerCase(),pos)) != -1){
                    if ((pos = containsToken(token3,s.toLowerCase(),pos)) != -1){
                         return true;
                    }else{
                         //System.out.println(s + " contains no token3");
                    }
               }else{
                    //System.out.println(s + " contains no token2");
               }
          }else{
               //System.out.println(s + " contains no token1");
          }
          return false;
     }

}
0
cancer_66Author Commented:
thanks alot in advance. unfortunatly iam not home in order to test the program

1)however can you just explain how can i run it ?
2)does the DefinitionChecker take a search pattern of

xxx yyy ?


0
ozymandiasCommented:
1) You just compile DefinitionChecker.java and that will automatically compile PatternMatcher.java as well.

Then run the program in exactly the same way :

java DefinitionChecker keyword(s)

2) I haven't done that yet because I ran out of time. It should only take a couple of minutes though, so I may have time to post it later.
0
ozymandiasCommented:
Here is the revised main method of the DefinitionChecker class. It now makes sure that each component word of the search phrase is 3 characters or more.

However, it will still allow 1 word on its own.
Is that OK, or do you want to force the user to provide at least two words of three characters ?


     public static void main(String[] args){
          if (args.length < 1){
               System.out.println("USAGE : DefinitionChecker keyword(s)");
               System.exit(1);
          }
          String s = "";
          for (int i = 0;i < args.length;i++){
               if (args[i].length() < 3){
                    System.out.println("Input Error : " + args[i] + "\nAll component words of the search must be three characters or more.");
                    System.exit(1);
               }
               s = s + args[i] + " ";
          }
          s = s.trim();
          if (s.length() < 3){
               System.out.println("Input Error : The keyword(s) supplied must have a combined length of 3 characters or more.");
               System.exit(1);
          }
          DefinitionChecker dc = new DefinitionChecker(s);
     }
0
cancer_66Author Commented:
ok bro. ill just test the code now and give you the comments. thanks a million,
0
cancer_66Author Commented:
1)hmm ok it works fine so far. ill test it more and give you the feedback.

2)yes please it should allow one word on its own. Terms are only accepted.

thanks
0
cancer_66Author Commented:
its 2am here.ill  talk to you tommorrow.

thanks for all the help
0
cancer_66Author Commented:
hello
please answer me whenever u r free
0
cancer_66Author Commented:
there is a mistake in question (2)

2)yes please it should not allow one word on its own. Terms are only accepted.
0
cancer_66Author Commented:
3) lets say i wanted to add the pattern "is the"

i.e computer graphics is the field of computer science

therefore i should:-
 
add "is" to token1[]
add "the" to token2[] (this is also the mainmarker)
i have nothing to add in token3[]

hmm so would it work?

thanks

0
ozymandiasCommented:
OK.

I have revised the main method again so now it wants a minimum of two words.

     public static void main(String[] args){
          if (args.length < 2){
               System.out.println("USAGE : DefinitionChecker keywords\nA minimum of two words must be provided to make a valid search term.");
               System.exit(1);
          }
          String s = "";
          for (int i = 0;i < args.length;i++){
               if (args[i].length() < 3){
                    System.out.println("Input Error : " + args[i] + "\nAll component words of the search must be three characters or more.");
                    System.exit(1);
               }
               s = s + args[i] + " ";
          }
          s = s.trim();
          if (s.length() < 3){
               System.out.println("Input Error : The keyword(s) supplied must have a combined length of 3 characters or more.");
               System.exit(1);
          }
          DefinitionChecker dc = new DefinitionChecker(s);
     }
0
ozymandiasCommented:
3) At the moment the PatternMatcher will only find three word patterns, but it could be modified to find two word patterns as well.
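
One way the matcher could be generalised so that a pattern has any number of stages (two for "is the", three for "is defined as") is to keep the token lists in an array of arrays and walk through them in order. The sketch below is only an assumption about how such a change might look; the class and method names are invented for illustration. Like containsToken, it does a plain substring search, so "is" would also be found inside a word such as "this".

```java
/*
* OrderedPatternSketch.java (hypothetical sketch)
*
*/

class OrderedPatternSketch{

     // Returns the index just past the earliest-ending token found at or after 'from', or -1 if none is found.
     static int endOfFirstToken(String[] tokens, String text, int from){
          int bestEnd = -1;
          for (int i = 0; i < tokens.length; i++){
               int at = text.indexOf(tokens[i], from);
               if (at != -1 && (bestEnd == -1 || at + tokens[i].length() < bestEnd)){
                    bestEnd = at + tokens[i].length();
               }
          }
          return bestEnd;
     }

     // True if one token from each list occurs in the sentence, in list order.
     static boolean containsInOrder(String[][] lists, String sentence){
          String text = sentence.toLowerCase();
          int pos = 0;
          for (int i = 0; i < lists.length; i++){
               pos = endOfFirstToken(lists[i], text, pos);
               if (pos == -1){
                    return false;
               }
          }
          return true;
     }

     public static void main(String[] args){
          String[][] isThe = new String[][]{ {"is"}, {"the"} };
          String[][] definedAs = new String[][]{
               {"is","was","are","be","can be"},
               {"described","defined","delimited"},
               {"as","by"}
          };
          String sentence = "Computer graphics is the field of computer science";
          System.out.println("is the        : " + containsInOrder(isThe, sentence));
          System.out.println("is defined as : " + containsInOrder(definedAs, sentence));
     }
}
```

With this layout a two-word pattern needs no empty third list; it is simply an array with two entries.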
0
cancer_66Author Commented:
1)thanks alot. ill just test the new main.
2)i think its better to modify the PatternMatcher, is it difficult to do so?
0
ozymandiasCommented:
1)OK
2) Modifying the PatternMatcher to look for different patterns is quite easy, however before we make any more changes I would suggest that you think about all the different patterns and types of pattern you might want to search for so that we can make one lot of changes and optimize the PatternMatcher to make it sufficiently flexible to cope with all your needs.
0
cancer_66Author Commented:
1)ok i totally agree with you. therefore for this i have to wait till tomorrow, so that i could meet with my supervisor and discuss all the possibilities.
0
ozymandiasCommented:
OK.
0
cancer_66Author Commented:
2)lets say i want to add another option to the user to choose between

a)Sequential search (current code-already completed)
b)Strict Sequential search

now what i mean by strict sequential is that all the words of a pattern, i.e. "is defined as", "was described as" ..etc,
have to come one after another.

example

i)graphics is defined as a field in cs (printed in strict sequential)
ii)graphics is purely defined as a field in cs (not printed; notice the words of the pattern "is defined as" do not come one after another)

hmm hope i explained it properly?



0
cancer_66Author Commented:
answer me.whenevr u r free please
0
cancer_66Author Commented:
ill be waiting for your reply
0
ozymandiasCommented:
Done.

I have modified the code to do the above.
You can now specify a -s argument to the program.

e.g. if I run :

    java DefinitionChecker mobile agents


I get :

Matches found in abc.txt
                MATCH : Mobile Agents are defined as autonomous, intelligent programs that move through a network, searching for and interacting with services on the user's behalf.
Matches found in xyz.txt
                MATCH : Mobile agents can sometimes be described as intelligent agents.
Matches found in other.txt
                MATCH : Mobile Agents are often described as robots or bots.

However, if I run :

    java DefinitionChecker mobile agents -s

I get :

Matches found in abc.txt
                MATCH : Mobile Agents are defined as autonomous, intelligent programs that move through a network, searching for and interacting with services on the user's behalf.
Matches found in xyz.txt
                MATCH : Mobile agents can sometimes be described as intelligent agents.
0
ozymandiasCommented:
The -s argument can be used anywhere, i.e.

    java DefinitionChecker mobile agents -s

will work, and so will

    java DefinitionChecker -s mobile agents
0
ozymandiasCommented:
/*
* DefinitionChecker.java
*
*/

import java.util.Vector;
import java.io.*;

public class DefinitionChecker{

     String keyword;
     String[] files = new String[]{"abc.txt","xyz.txt","other.txt"};
     String sentences[];
     PatternMatcher pm = new PatternMatcher();

     public static void main(String[] args){

          boolean strict = false;
          int wordCount = 0;

          String s = "";
          for (int i = 0;i < args.length;i++){
               // -s switches on strict sequential matching and is not part of the search term.
               if (args[i].equalsIgnoreCase("-s")){
                    strict = true;
                    continue;
               }
               if (args[i].length() < 3){
                    System.out.println("Input Error : " + args[i] + "\nAll component words of the search must be three characters or more.");
                    System.exit(1);
               }
               s = s + args[i] + " ";
               wordCount++;
          }
          // Count only the search words, so that "-s" plus a single word is still rejected.
          if (wordCount < 2){
               System.out.println("USAGE : DefinitionChecker keywords\nA minimum of two words must be provided to make a valid search term.");
               System.exit(1);
          }
          s = s.trim();
          DefinitionChecker dc = new DefinitionChecker(s,strict);
     }

     public DefinitionChecker(String s, boolean strict){

          for (int f = 0; f < files.length; f++){
               boolean fileMentioned = false;
               File file = null;
               try{
                    file = new File(files[f]);
                    sentences = getArrayFromFile(file);
               }catch(IOException ioe){
                    System.out.println(ioe);
                    // Skip this file; otherwise sentences may be null or left over from the previous file.
                    continue;
               }
               keyword = s.toLowerCase();
               for (int i = 0; i < sentences.length; i++){
                    if (sentences[i].toLowerCase().indexOf(keyword) > -1 && pm.containsPattern(sentences[i],strict)){
                         if (!fileMentioned){
                              System.out.println("Matches found in " + file);
                              fileMentioned = true;
                         }
                         System.out.println("\t\tMATCH : " + sentences[i]);
                    }
               }
          }
     }

     private String[] getArrayFromFile(File f) throws IOException{
          FileReader reader = new FileReader(f);
          Vector sentences = new Vector();
          char[] cbuf = new char[1];
          String delimiter = "#";
          String sentence = "";
          String c = "";
          while (reader.read(cbuf) != -1){
               c = new String(cbuf);
               if (c.equals(delimiter)){
                    sentences.add(sentence);
                    sentence = "";
               }else{
                    sentence += c;
               }
          }
          reader.close();
          String[] sentenceArray = new String[sentences.size()];
          sentenceArray = (String[])sentences.toArray(sentenceArray);
          return sentenceArray;
     }
}
0
ozymandiasCommented:
/*
* PatternMatcher.java
*
*/

public class PatternMatcher{

     String[] token1 = new String[]{"is","was","are","be","can be"};
     String[] token2 = new String[]{"described","defined","delimited"};
     String[] token3 = new String[]{"as","by"};

     public PatternMatcher(){

     }

     private int containsToken(String[] tokens, String s, int pos){
          int tPosition = -1;
          for (int i = 0; i < tokens.length && tPosition == -1; i++){
               tPosition = s.indexOf(tokens[i],pos);
          }
          return tPosition;
     }

     public boolean containsPattern(String s, boolean strict){

          if (strict){
               String[] words = s.split(" ");
               for (int i = 0; i < words.length - 2; i++){
                    if (containsToken(token1,words[i],0) > -1 && containsToken(token2,words[i+1],0) > -1 && containsToken(token3,words[i+2],0) > -1){
                         return true;
                    }
               }
          }else{
               int pos = 0;
               if ((pos = containsToken(token1,s.toLowerCase(),pos)) != -1){
                    if ((pos = containsToken(token2,s.toLowerCase(),pos)) != -1){
                         if ((pos = containsToken(token3,s.toLowerCase(),pos)) != -1){
                              return true;
                         }else{
                              //System.out.println(s + " contains no token3");
                         }
                    }else{
                         //System.out.println(s + " contains no token2");
                    }
               }else{
                    //System.out.println(s + " contains no token1");
               }
          }
          return false;
     }

}
0
cancer_66Author Commented:
Thanks a alot ozymandias i appricate your help. ill test it right away.

0
cancer_66Author Commented:
hmm i get an error
C:\aglets\public\Expert Exchange\DefinitionChecker7\PatternMatcher.java:27: cannot resolve symbol
symbol  : method split  (java.lang.String)
location: class java.lang.String
              String[] words = s.split(" ");
                                ^
1 error

0
cancer_66Author Commented:
sorry they are acually two.

C:\aglets\public\Expert Exchange\DefinitionChecker7\DefinitionChecker7.java:46: containsPattern(java.lang.String,boolean) in PatternMatcher cannot be applied to (java.lang.String)
                   if (sentences[i].toLowerCase().indexOf(keyword) > -1 && pm.containsPattern(sentences[i])){
                                                                             ^
C:\aglets\public\Expert Exchange\DefinitionChecker7\PatternMatcher.java:27: cannot resolve symbol
symbol  : method split  (java.lang.String)
location: class java.lang.String
              String[] words = s.split(" ");
                                ^
2 errors
0
cancer_66Author Commented:
sorry again.

its only one error. the one i posted first?
0
cancer_66Author Commented:
note iam using jdk1.3.1

0
cancer_66Author Commented:
ill be waiting 4 ur answer.
0
ozymandiasCommented:
Sorry, I forgot about the JDK version.
You cannot use the split method.
Here is the same code using a StringTokenizer.

/*
* PatternMatcher.java
*
*/

import java.util.StringTokenizer;

public class PatternMatcher{

     String[] token1 = new String[]{"is","was","are","be","can be"};
     String[] token2 = new String[]{"described","defined","delimited"};
     String[] token3 = new String[]{"as","by"};

     public PatternMatcher(){

     }

     private int containsToken(String[] tokens, String s, int pos){
          int tPosition = -1;
          for (int i = 0; i < tokens.length && tPosition == -1; i++){
               tPosition = s.indexOf(tokens[i],pos);
          }
          return tPosition;
     }

     public boolean containsPattern(String s, boolean strict){

          if (strict){
               // Strict mode: a token from each list must appear in three consecutive words.
               int token = 0;
               // Lower-case first so that capitalised words match, as in the non-strict branch.
               StringTokenizer st = new StringTokenizer(s.toLowerCase());
               String[] words = new String[st.countTokens()];
               while (st.hasMoreTokens()){
                    words[token++] = st.nextToken();
               }
               for (int i = 0; i < words.length - 2; i++){
                    if (containsToken(token1,words[i],0) > -1 && containsToken(token2,words[i+1],0) > -1 && containsToken(token3,words[i+2],0) > -1){
                         return true;
                    }
               }
          }else{
               int pos = 0;
               if ((pos = containsToken(token1,s.toLowerCase(),pos)) != -1){
                    if ((pos = containsToken(token2,s.toLowerCase(),pos)) != -1){
                         if ((pos = containsToken(token3,s.toLowerCase(),pos)) != -1){
                              return true;
                         }else{
                              //System.out.println(s + " contains no token3");
                         }
                    }else{
                         //System.out.println(s + " contains no token2");
                    }
               }else{
                    //System.out.println(s + " contains no token1");
               }
          }
          return false;
     }

}
0
cancer_66Author Commented:
sorry i  wasnt at my seat.ill just test it right away.

why cant i use the split method is it because of the JDK version.
0
cancer_66Author Commented:
perfect its working fine. so far ill test it more. :)

thanks alot.
0
cancer_66Author Commented:
ill talk to u tomorrow. thanks 4 everything
0
ozymandiasCommented:
Yes. The split method of string was only introduced in JDK 1.4.
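
For anyone comparing the two approaches, the short sketch below puts them side by side. The class and method names are invented for illustration, and the split variant only compiles on JDK 1.4 or later. One difference worth knowing: split(" ") keeps an empty string between two consecutive spaces, whereas StringTokenizer skips runs of whitespace.

```java
import java.util.StringTokenizer;

/*
* SplitVersusTokenizer.java (illustrative sketch)
*
*/

class SplitVersusTokenizer{

     // Works on JDK 1.3: collect whitespace-separated words with a StringTokenizer.
     static String[] wordsWithTokenizer(String s){
          StringTokenizer st = new StringTokenizer(s);
          String[] words = new String[st.countTokens()];
          int i = 0;
          while (st.hasMoreTokens()){
               words[i++] = st.nextToken();
          }
          return words;
     }

     // Needs JDK 1.4 or later: String.split takes a regular expression.
     static String[] wordsWithSplit(String s){
          return s.split(" ");
     }

     public static void main(String[] args){
          // Note the two spaces between "is" and "defined".
          String s = "graphics is  defined as";
          System.out.println("StringTokenizer : " + wordsWithTokenizer(s).length + " words");
          System.out.println("split           : " + wordsWithSplit(s).length + " words");
     }
}
```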
0
cancer_66Author Commented:
hello. again.

1)i showed the code "DefintionChecker" + "PatternMatcher" to my supervisor. he was happy with the output.

but he had one comment. he asked if i could just compile the "PatternMatcher" and import it in "DefinitionChecker" as "import PatternMatcher;"? he wants it that way.

2)i also discussed the modification of the PatternMatcher in order to look for different patterns. he said to keep it on hold till i meet him next, because he is not sure himself.

but he did say something

1)like having a data structure to indicate how many lists we need. i.e Token1[],..etc
2)Data structure for the lists itself i.e Token1[] ={"is"} , Token2 ={"the"}

i think basically he wants to have different patterns of lists.

for example:-

DefintionPatterns

Pattern1:- consists of Token1[],Token2[]
Pattern2:- consists of Token3[],Token4[],Token5[]
Pattern3:- consists of Token6[],Token7[],Token8[],Token9[]
.
.
.

while
Token1[]:- consists of {"is"}
Token2[]:- consists of {"the"}

Token3[]:- consists of {"is","was","can be"}
Token4[]:- consists of {"defined","described","delimited"}
Token5[]:- consists of {"as","by","such as"}

Token6[]
.
.
.
(iam just giving you an idea of what he told me) so what do you think?




0
cancer_66Author Commented:
whenever you have the time answer me. ill be waiting

thanks
0
ozymandiasCommented:
1) You have already compiled PatternMatcher and you don't need to import it. It works without an import statement. Why would he want to import something when it works without importing? In order to import it you would have to put it in a named package.

2) I can see how this could be done. You would have to create a couple of new classes.

a) WordList : this would just be a list of words,

e.g.  
    is, was, can be
or
    described defined delimited

b) WordPattern : this would be a set of word lists in a particular order.

A PatternMatcher could then contain a set of word patterns, all of which it checks.

I will produce a prototype and post it later if I have time.
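
In the meantime, here is a very rough sketch of how those classes could fit together. It is not the prototype itself: the WordList constructor taking a delimited string and a delimiter mirrors the calls quoted later in this thread, but every other member name, and the matching logic, is an assumption made for illustration. Matching is a plain substring search, in order, over the lower-cased sentence.

```java
import java.util.StringTokenizer;

// Hypothetical sketch only. A list of alternative words or phrases, e.g. "is,was,can be".
class WordList{
     private final String[] words;

     WordList(String list, String delimiter){
          StringTokenizer st = new StringTokenizer(list, delimiter);
          words = new String[st.countTokens()];
          int i = 0;
          while (st.hasMoreTokens()){
               words[i++] = st.nextToken().trim().toLowerCase();
          }
     }

     // Returns the index just past the earliest-ending word found at or after 'from', or -1.
     int endOfMatch(String text, int from){
          int bestEnd = -1;
          for (int i = 0; i < words.length; i++){
               int at = text.indexOf(words[i], from);
               if (at != -1 && (bestEnd == -1 || at + words[i].length() < bestEnd)){
                    bestEnd = at + words[i].length();
               }
          }
          return bestEnd;
     }
}

// A set of word lists that must be found in the given order.
class WordPattern{
     private final WordList[] lists;

     WordPattern(WordList[] lists){
          this.lists = lists;
     }

     boolean matches(String sentence){
          String text = sentence.toLowerCase();
          int pos = 0;
          for (int i = 0; i < lists.length && pos != -1; i++){
               pos = lists[i].endOfMatch(text, pos);
          }
          return pos != -1;
     }
}

// Holds several patterns and reports whether any one of them matches.
class PatternMatcher{
     private final WordPattern[] patterns;

     PatternMatcher(WordPattern[] patterns){
          this.patterns = patterns;
     }

     boolean containsPattern(String sentence){
          for (int i = 0; i < patterns.length; i++){
               if (patterns[i].matches(sentence)){
                    return true;
               }
          }
          return false;
     }

     public static void main(String[] args){
          WordPattern isThe = new WordPattern(new WordList[]{
               new WordList("is", ","), new WordList("the", ",")});
          WordPattern definedAs = new WordPattern(new WordList[]{
               new WordList("is,was,can be", ","),
               new WordList("described,defined,delimited", ","),
               new WordList("as,by", ",")});
          PatternMatcher pm = new PatternMatcher(new WordPattern[]{isThe, definedAs});
          System.out.println(pm.containsPattern("Mobile agents is the future of e-commerce"));
          System.out.println(pm.containsPattern("Mobile agents travel between hosts"));
     }
}
```

Adding a new pattern then means building one more WordPattern from its lists and including it in the array passed to the matcher.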
0
cancer_66Author Commented:
1)hmm well thats the way he wants it. even though i dont see the need for that.

basically iam going to use the "Sequential Search + strict sequential" along with my Mobile Agent software(Aglets). which will have to Travel Along a number of computers on a LAN do some MAtching and finaly display the message of the user screen.

if i wanted to do what he as told me what changes should i do ?

2)yeah it might be something like that. ok thanks. whenever u r free. ill wait
0
cancer_66Author Commented:
3)
0
ozymandiasCommented:
OK.

I have a couple of questions.

1) Is this part of a school project? Is this "supervisor" your boss or your teacher or what? Are you supposed to be learning Java? I don't mind you using this for learning, but I don't want you to get into trouble if you are supposed to be doing this yourself.

2) Adding this extra code to have multiple complex patterns is quite a big piece of work. I don't mind doing it because it's quite interesting, but you will have to think about awarding some more points soon.
0
ozymandiasCommented:
One last thing....

this code is now getting quite big and complex so could you please email me your email address and I will email you the code and source files etc rather than pasting them here, because it is making the thread very long.
0
cancer_66Author Commented:
1)m_alkhamis@hotmail.com
2)no my supervisor is my teacher at college. this one of the parts of my project. there wont be any problems. thanks
3)about the points dont worry. ill award you points and open up another thread, whatever suits you.

4)can you please help me out with some documentation, its make my life easier in understanding. thats if you dont mind. please
0
ozymandiasCommented:
1) OK. I will mail you soon.
2) OK.
3) Thanks.
4) Documentation for what ? Are you saying you want me to add comments to the code and also stuff suitable for javadoc ?

Let me know.
0
ozymandiasCommented:
I have mailed you the new code and files.
0
cancer_66Author Commented:
sorry i wasnt at my seat. iam back now ill just check the mail. and give you the feedback

4)yes i mean comments. so its easier for me to follow. if u dont mind.
please
0
cancer_66Author Commented:
hmm ok i tested it its working perfectly. so far. ill test it with more texts.

ill try adding more patterns like "is the" "consists of" ..etc

thanks for your help.

answer me whenever u r free.
0
cancer_66Author Commented:
1)i managed to add two new patterns "can be defined as","can be defined as"

but could not add the pattern "is the"?

i did the following

private static WordList list4 = new WordList("foo,bar,buzz,is",",");
private static WordList list5 = new WordList("token,the",",");

i added the sentence :- mobile agents is the future of e-commerce

but didnt work? maybe iam doing something wrong

0
cancer_66Author Commented:
ok it worked. i did exactly what i told u up there.

actually when i checked the sub-directory "definitions", it contained "PatternMatcher.class" etc., and inside the sub-directory "definitions" there was another sub-directory called "definitions" again?

so i deleted all the classes and the duplicate sub-directory, and compiled "definitionchecker.java". it worked.
0
ozymandiasCommented:
OK.

I will add some comments to the code and send you the updates.

Let me know if there are any problems in your tests.
0
cancer_66Author Commented:
no problems so far, thanks alot.

0
ozymandiasCommented:
Code with comments sent as requested.

I think I have answered this question pretty fully now.

Please award points and we can continue this in another thread if necessary.
0

cancer_66Author Commented:
ok . ill award points and open a new thread. incase i need help.
0
cancer_66Author Commented:
ok one last question before i end this thread and open a new one.

1)whenever i want to add a new pattern which consists of two words i should add it in Token4,Token5
otherwise Token1,2,3

correct?
-----------------------------------------------------
correct way of adding pattern?

private static WordList list4 = new WordList("foo,bar,buzz,is",",");

or

private static WordList list4 = new WordList("foo,bar,buzz","is",",");

why is the "," at the end ?

i know i ask stupid questions sorry
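
Going by the WordList(String s, String delimiter) constructor in the code posted later in this thread, the trailing "," looks to be the delimiter handed to a StringTokenizer, not an extra word. On that reading the first form is the one that compiles, while the second passes three arguments to a two-argument constructor. A two-word pattern such as "is the" would need one list per word position: "is" among the alternatives in the first list and "the" among those in the second. The sketch below uses a hypothetical helper, splitWords, that mirrors what that constructor does:

```java
import java.util.StringTokenizer;

// Illustration only: splitWords is a hypothetical helper that mirrors what the
// WordList(String, String) constructor posted in this thread does internally.
class WordListDelimiterSketch {

    static String[] splitWords(String s, String delimiter) {
        StringTokenizer st = new StringTokenizer(s, delimiter);
        String[] words = new String[st.countTokens()];
        int i = 0;
        while (st.hasMoreTokens()) {
            words[i++] = st.nextToken();
        }
        return words;
    }

    public static void main(String[] args) {
        // The second argument is the delimiter, not another word.
        String[] first = splitWords("foo,bar,buzz,is", ",");
        String[] second = splitWords("token,the", ",");
        // prints: 4 alternatives for the first word
        System.out.println(first.length + " alternatives for the first word");
        // prints: 2 alternatives for the second word
        System.out.println(second.length + " alternatives for the second word");
    }
}
```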
0
cancer_66Author Commented:
i opened up a new thread called (Search 2:- For Mr ozymandias )

thanks for your help
0
cancer_66Author Commented:
answer me when u r free.
0
cancer_66Author Commented:
you can answer my question in the new thread . ill award points here ,
0
cancer_66Author Commented:
thanks a lot for everything. you deserve more than excellent
0
ozymandiasCommented:
For completeness and PAQ value the entire code will be posted here :

/*
* DefinitionChecker.java
*
*/

import java.util.Vector;
import java.io.*;
import definitions.PatternMatcher;

public class DefinitionChecker{

      String keyword;
      String[] files = new String[]{"abc.txt","xyz.txt","other.txt"};
      String sentences[];
      PatternMatcher pm = new PatternMatcher();

      public static void main(String[] args){

            boolean strict = false;

            // first make sure that we have at least two arguments
            if (args.length < 2){
                  System.out.println("USAGE : DefinitionChecker [-s] keywords\nA minimum of two words must be provided to make a valid search term.");
                  System.exit(1);
            }
            String s = "";
            // now lets check what the arguments are
            for (int i = 0;i < args.length;i++){
                  // if any of them are -s then we are in strict mode
                  if (args[i].equalsIgnoreCase("-s")){
                        strict = true;
                        continue;
                  }
                  // make sure they are all 3 characters or longer
                  if (args[i].length() < 3){
                        System.out.println("Input Error : " + args[i] + "\nAll component words of the search must be three characters or more.");
                        System.exit(1);
                  }
                  // concatenate the arguments into one search string
                  s = s + args[i] + " ";
            }
            s = s.trim();
            // finally instantiate a DefinitionChecker and pass it the string and tell it whether to be strict or not
            DefinitionChecker dc = new DefinitionChecker(s,strict);
      }

      /**
      *
      * Constructor for the DefinitionChecker
      *
      */

      public DefinitionChecker(String s, boolean strict){

            // loop through each file in the list of files
            for (int f = 0; f < files.length; f++){
                  boolean fileMentioned = false;
                  File file = null;
                  try{
                        // get all the sentences
                        file = new File(files[f]);
                        sentences = getArrayFromFile(file);
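                  }catch(IOException ioe){
                        // if the file cannot be read, report it and move on to the next file
                        // (otherwise sentences would be null or left over from the previous file)
                        System.out.println(ioe);
                        continue;
                  }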
                  keyword = s.toLowerCase();
                  // loop through all the sentences
                  for (int i = 0; i < sentences.length; i++){
                        // if any sentence contains the keyword and matches any of the patterns specified in the PatternMatcher
                        if (sentences[i].toLowerCase().indexOf(keyword) > -1 && pm.matches(sentences[i],strict)){
                              // if this is the first match found in this file
                              if (!fileMentioned){
                                    // output the file name
                                    System.out.println("Matches found in " + file);
                                    fileMentioned = true;
                              }
                              // say we found a match
                              System.out.println("\t\tMATCH : " + sentences[i]);
                        }
                  }
            }
      }

      /**
      *
      * GetArrayFromFile
      *
      * This function reads a specified file and breaks the contents into
      * an array of strings (sentences) using the # character as a delimiter
      *
      */

      private String[] getArrayFromFile(File f) throws IOException{
            FileReader reader = new FileReader(f);
            Vector sentences = new Vector();
            char[] cbuf = new char[1];
            String delimiter = "#";
            String sentence = "";
            String c = "";
            // read the file character by character
            while (reader.read(cbuf) != -1){
                  c = new String(cbuf);
                  // if the character is a delimiter (#)
                  if (c.equals(delimiter)){
                        // add the sentence to the Vector and start a new blank sentence
                        sentences.add(sentence);
                        sentence = "";
                  }else{
                        // otherwise just add the character to the current sentence string
                        sentence += c;
                  }
            }
            reader.close();
            String[] sentenceArray = new String[sentences.size()];
            // convert the Vector to an array and return it
            sentenceArray = (String[])sentences.toArray(sentenceArray);
            return sentenceArray;
      }
}
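
Note that getArrayFromFile above only stores a sentence when it reaches a #, so any text after the last delimiter is dropped, and empty fragments (for example between "# #") are kept as sentences. A possible variant is sketched below. It is not part of the code mailed in this thread: the class name, the method name readSentences and the choice to trim and skip blank fragments are all assumptions made for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical alternative to getArrayFromFile; names are illustrative only.
class SentenceSplitterSketch {

    // Reads everything from the reader and splits it on '#'.
    // Text after the last '#' is kept and blank fragments are dropped.
    static List<String> readSentences(Reader in) throws IOException {
        List<String> sentences = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        // try-with-resources closes the reader even if read() throws
        try (BufferedReader reader = new BufferedReader(in)) {
            int ch;
            while ((ch = reader.read()) != -1) {
                if (ch == '#') {
                    addIfNotBlank(sentences, current);
                    current.setLength(0); // start a new sentence
                } else {
                    current.append((char) ch);
                }
            }
        }
        addIfNotBlank(sentences, current); // whatever followed the last '#'
        return sentences;
    }

    private static void addIfNotBlank(List<String> sentences, StringBuilder text) {
        String trimmed = text.toString().trim();
        if (!trimmed.isEmpty()) {
            sentences.add(trimmed);
        }
    }

    public static void main(String[] args) throws IOException {
        String data = "Computer Graphics is defined as science # # Graphics is as ahmad";
        for (String s : readSentences(new StringReader(data))) {
            System.out.println("[" + s + "]");
        }
    }
}
```

To read a real file, a java.io.FileReader could be passed in place of the StringReader used in main.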
0
ozymandiasCommented:
/*
* PatternMatcher.java
*
*/

package definitions;

import java.util.StringTokenizer;
import java.util.Vector;

public class PatternMatcher{

      /*
      *
      * These are some static WordLists which can be used to create
      * the WordPatterns that this PatternMatcher will use
      *
      */
      private static WordList list1 = new WordList("is,was,are,be",",");
      private static WordList list2 = new WordList("described,defined,delimited",",");
      private static WordList list3 = new WordList("as,by",",");
      private static WordList list4 = new WordList("foo,bar,buzz",",");
      private static WordList list5 = new WordList("token",",");

      private Vector patterns;

      /**
      *
      * Constructor for the PatternMatcher. This adds the
      * WordPatterns to the PatternMatcher's list of patterns
      * ready for matching.
      *
      */
      public PatternMatcher(){
            // create the vector to store our WordPatterns
            patterns = new Vector();

            // create a WordPattern
            WordPattern pattern1 = new WordPattern();
            // add the appropriate WordLists
            pattern1.addList(list1);
            pattern1.addList(list2);
            pattern1.addList(list3);
            // add the WordPattern to the vector
            patterns.add(pattern1);

            // create another WordPattern
            WordPattern pattern2 = new WordPattern();
            // add the appropriate WordLists
            pattern2.addList(list4);
            pattern2.addList(list5);
            // add the WordPattern to the vector
            patterns.add(pattern2);

      }

      /**
      *
      * This is just a function for adding WordPatterns
      * to the PatternMatcher. It's not used currently
      * but it will probably come in handy.
      */
      public void addPattern(WordPattern pattern){
            patterns.add(pattern);
      }

      /**
      *
      * This is the key function on the PatternMatcher. It is
      * passed a String (sentence) and information on "strictness".
      * It then cycles through all its patterns seeing if any of them
      * are found in the sentence.
      *
      */
      public boolean matches(String s, boolean strict){

            // loop through all the WordPatterns checking to see if
            // any of them match the sentence.
            for (int i = 0; i < patterns.size();i++){
                  WordPattern wp = (WordPattern)patterns.elementAt(i);
                  if (wp.containsPattern(s,strict)){
                        return true;
                  }
            }
            return false;
      }

}

/**
*
* This class contains the core of the "comparison logic". Each WordPattern
* contains one or more word lists which it uses in sequence to do a word by
* word comparison with the sentence provided.
*
*/
class WordPattern{

      private Vector lists;

      /**
      *
      * This constructor takes an array of WordLists
      * and uses them to populate its own Vector
      * of WordLists
      */
      public WordPattern(WordList[] wl){
            lists = new Vector();
            for (int i = 0; i < wl.length; i++){
                  lists.add(wl[i]);
            }
      }

      /**
      *
      * This constructor simply initialises a blank Vector
      * to be used to store the WordLists which can be added
      * using the addList() method
      */
      public WordPattern(){
            lists = new Vector();
      }

      /**
      *
      * This function adds a WordList to the Word Pattern
      *
      */
      public void addList(WordList list){
            lists.add(list);
      }

      /**
      *
      * This function does all the real work. It breaks the supplied
      * String into its component words and then compares them either
      * strictly or not, to the words in the WordLists.
      *
      */
      public boolean containsPattern(String s, boolean strict){
            int token = 0;
            // create a StringTokeniser from the sentence
            StringTokenizer st = new StringTokenizer(s);
            // Create an array to hold the words
            String[] words = new String[st.countTokens()];
            // iterate through the Tokenizer adding the words to the array
            while (st.hasMoreTokens()){
                  words[token++] = st.nextToken();
            }
            // if there are fewer words than lists then the sentence cannot
            // possibly contain a full pattern, so return false
            if (words.length < lists.size()){
                  return false;
            }
            // this counter will hold the number of words matched
            int count = 0;
            // this counter will hold the number of words matched contiguously (i.e. in strict sequence)
            int sequence = 0;
            // this value will tell us whether the previous word was a match
            boolean inSequence = false;
            // simultaneously loop through the array of words and the Vector
            // of WordLists, starting by comparing the first word with the first WordList
            for (int l = 0, w = 0; ((l < lists.size()) && (w < words.length));){
                  WordList wordlist = (WordList)lists.elementAt(l);
                  String word = words[w];
                  // if the wordlist contains the word then we can move to the next word
                  // and to the next wordlist
                  if (wordlist.containsWord(word)){
                        l++;
                        w++;
                        count++;
                        // if we are in sequence (i.e. the previous word was a match)
                        // then we increment the number of sequential words found
                        if (inSequence){
                              sequence++;
                        }
                        // set the value to indicate that this word was matched
                        inSequence = true;
                  }else{
                        // if the wordlist does not contain the word then we move to the next word
                        // but we do not move to the next wordlist
                        w++;
                        // set the value to indicate that we are no longer in strict sequence
                        inSequence = false;
                  }
            }
            // if the number of words matched is the same as the number of lists
            // then we have a match
            if (count == lists.size()){
                  if (strict){
                        // if we are in strict mode then the number of sequentially matched words should
                        // be 1 less than the number of matched words
                        if(sequence == count-1){
                              return true;
                        }else{
                              return false;
                        }
                  }else{
                        return true;
                  }
            }else{
                  return false;
            }
      }

      /**
      *
      * This function returns the length of the longest word list.
      * It's not used at the moment but may be useful
      *
      */
      public int maxListLength(){
            int length = 0;
            for (int l = 0; l < lists.size(); l ++){
                  if (((WordList)lists.elementAt(l)).numWords() > length){
                        length = ((WordList)lists.elementAt(l)).numWords();
                  }
            }
            return length;
      }

}

/**
*
* This class holds an array of strings (words) which
* can be combined in a WordPattern with other WordLists
*
*/
class WordList{

      private String[] words;

      /**
      *
      * This constructor takes a string and a delimiter string
      * and then uses a StringTokenizer to break the string into
      * an array of words
      */
      public WordList(String s, String delimiter){
            int token = 0;
            StringTokenizer st = new StringTokenizer(s,delimiter);
            words = new String[st.countTokens()];
            while (st.hasMoreTokens()){
                  words[token++] = st.nextToken();
            }
      }

      /**
      *
      * This is just an accessor function that lets you get the words
      * held in the list. Not used at the moment, but probably useful
      * for debugging.
      */
      public String[] getWords(){
            return words;
      }

      /**
      *
      * This is just an accessor function that lets you get the number of
      * words held in the list. Not used at the moment, but probably useful
      * for debugging.
      */
      public int numWords(){
            return words.length;
      }

      /**
      *
      * This function takes a string (word) and checks to
      * see if it matches any of the words in its list.
      */
      public boolean containsWord(String s){
            for (int i = 0; i < words.length; i++){
                  if (s.trim().equalsIgnoreCase(words[i])){
                        return true;
                  }
            }
            return false;
      }

      /**
      *
      * This is just an accessor function that prints out the words
      * held in the list. Not used at the moment, but probably useful
      * for debugging.
      */
      public void print(){
            for (int i = 0; i < words.length; i++){
                  System.out.println(words[i]);
            }
      }
}
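
One thing to be aware of in strict mode: containsPattern above makes a single pass over the words and never goes back to the first WordList after a partial match, so a sentence such as "it is what is defined as science" is rejected with -s even though "is defined as" appears as consecutive words. If that matters, strict matching could instead try the whole pattern at every starting position. The sketch below shows that alternative. It is a different technique from the posted code, and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: a sliding-window take on strict (contiguous) matching.
class StrictWindowSketch {

    // True if some run of consecutive words matches the pattern
    // position by position (word i of the run is in list i).
    static boolean containsContiguous(String sentence, List<List<String>> pattern) {
        String[] words = sentence.trim().split("\\s+");
        for (int start = 0; start + pattern.size() <= words.length; start++) {
            boolean all = true;
            for (int i = 0; i < pattern.size() && all; i++) {
                all = containsIgnoreCase(pattern.get(i), words[start + i]);
            }
            if (all) {
                return true;
            }
        }
        return false;
    }

    private static boolean containsIgnoreCase(List<String> list, String word) {
        for (String candidate : list) {
            if (candidate.equalsIgnoreCase(word)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<List<String>> definedAs = Arrays.asList(
                Arrays.asList("is", "was", "are", "be"),
                Arrays.asList("described", "defined", "delimited"),
                Arrays.asList("as", "by"));
        // prints: true
        System.out.println(containsContiguous("it is what is defined as science", definedAs));
        // prints: false
        System.out.println(containsContiguous("it is clearly defined as science", definedAs));
    }
}
```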
0