Solved

Sequential Search, continued: for Mr ozymandias, other experts welcome

Posted on 2003-02-24
129
Medium Priority
385 Views
Last Modified: 2008-02-01
hi again,

1) I just showed the work to my supervisor, and he was happy with it. Thanks to you.

Now I need to extend the code so that it reads from files automatically, i.e. I specify the path of each file in the code (there may be more than one file to read). The result should also include the name of the file from which the data was obtained.

for example output : from file C:\TravelPlan1.txt
                              Match:graphics is defined..
                              Match:

2) The supervisor commented that searching for a single word, e.g. graphics, is not enough. He said I should extend it to search for terms such as "Computer Graphics",

which means user input = "computer graphics" (I am not restricted to this term only).

lets say the file contains

Computer Graphics is defined as a pictorial computer output produced on a display screen, plotter, or printer # Computer Graphics is purely delimited by science #  Mobile Agents are defined as autonomous, intelligent programs that move through a network, searching for and interacting with services on the user's behalf. #
# Computer Graphicsis defined as science # Graphics is as ahmad # Mobile Agent can be defined as an agent that transports itself and its execution state through the net #

if user enters "Mobile Agent"

results should be :

From File XYZ.txt
Search results:

1) Mobile Agent can be  defined as an agent that transports itself and its execution state through the net.

2) Mobile Agents are defined as autonomous, intelligent programs that move through a network, searching for and interacting with services on the user's behalf.

From File TravelPLan.txt
Search Results:
1) ..
2)..





0
Comment
Question by:cancer_66
129 Comments
 

Author Comment

by:cancer_66
ID: 8007787
lets say the file contains

Computer Graphics is defined as a pictorial computer output produced on a display screen, plotter, or printer # Computer Graphics is purely delimited by science #  Mobile Agents are defined as autonomous, intelligent programs that move through a network, searching for and interacting with services on the user's behalf. #
# Computer Graphicsis defined as science # Graphics is as ahmad # Mobile Agent can be defined as an agent that transports itself and its execution state through the net #

if user enters "Mobile Agent"

results should be :

From File XYZ.txt
Search results:

1) Mobile Agent can be  defined as an agent that transports itself and its execution state through the net.

2) Mobile Agents are defined as autonomous, intelligent programs that move through a network, searching for and interacting with services on the user's behalf.

From File TravelPLan.txt
Search Results:
1) ..
2)..


0
 
LVL 35

Expert Comment

by:TimYates
ID: 8008710
This isn't really a java problem...it's more of a Text Data Mining problem...

I'd suggest looking here:

http://citeseer.nj.nec.com/grobelnik98efficient.html

For some cool documents about data mining and hierarchical text searching algorithms :-)
0
 

Author Comment

by:cancer_66
ID: 8010074
Waiting for a reply from ozymandias. Anyone else who wants to help is welcome.
0

 

Author Comment

by:cancer_66
ID: 8014671
Guys, this is the code that ozymandias helped me with.
Now I need to extend it as I explained above. I guess ozymandias is busy, so if anyone else would like to help me, please do.

import java.util.Vector;
import java.io.*;

public class DefinitionChecker{

    String keyword;

    String[] token1 = new String[]{" is "," are "," was "," be "," can be "};
    String[] token2 = new String[]{" defined "," described "," delimited "};
    String[] token3 = new String[]{" as "," by "};
    String sentences[];

    public static void main(String[] args){
         if (args.length < 2){
              System.out.println("USAGE : DefinitionChecker keyword file");
              return; // both arguments are required; args[1] is read in the constructor
         }
         DefinitionChecker dc = new DefinitionChecker(args);
    }

    public DefinitionChecker(String[] args){
         try{
              sentences = getArrayFromFile(new File(args[1]));
         }catch(IOException ioe){
              System.out.println(ioe);
              return; // nothing to search if the file could not be read
         }
         keyword = args[0].toLowerCase();
         for (int i = 0; i < sentences.length; i++){
              int pos = 0;
              //System.out.println("Checking : " + sentences[i]);
              if (sentences[i].toLowerCase().indexOf(keyword) > -1 && containsToken(token2,sentences[i].toLowerCase(),0) > -1){
                   if ((pos = containsToken(token1,sentences[i].toLowerCase(),pos)) != -1){
                        if ((pos = containsToken(token2,sentences[i].toLowerCase(),pos)) != -1){
                             if ((pos = containsToken(token3,sentences[i].toLowerCase(),pos)) != -1){
                                  System.out.println("\t\tMATCH : " + sentences[i]);
                             }else{
                                  //System.out.println("\t\t" + sentences[i] + " has no token3.");
                             }
                        }else{
                             //System.out.println("\t\t" + sentences[i] + " has no token2.");
                        }
                   }else{
                        //System.out.println("\t\t" + sentences[i] + " has no token1.");
                   }
              }else{
                   //System.out.println("\t\t" + sentences[i] + " has no keyword or has no token2.");
              }
         }
    }

    private int containsToken(String[] tokens, String s, int pos){
         int tPosition = -1;
         for (int i = 0; i < tokens.length && tPosition == -1; i++){
              tPosition = s.indexOf(tokens[i],pos);
         }
         return tPosition;
    }

    private String[] getArrayFromFile(File f) throws IOException{
         FileReader reader = new FileReader(f);
         Vector sentences = new Vector();
         char[] cbuf = new char[1];
         String delimiter = "#";
         String sentence = "";
         String c = "";
         while (reader.read(cbuf) != -1){
              c = new String(cbuf);
              if (c.equals(delimiter)){
                   sentences.add(sentence);
                   sentence = "";
              }else{
                   sentence += c;
              }
         }
         reader.close(); // release the file handle once everything has been read
         String[] sentenceArray = new String[sentences.size()];
         sentenceArray = (String[])sentences.toArray(sentenceArray);
         return sentenceArray;
    }
}
0
 

Author Comment

by:cancer_66
ID: 8015637
Guys i really need help.
0
 

Author Comment

by:cancer_66
ID: 8015642
Guys i really need help.
0
 

Author Comment

by:cancer_66
ID: 8015643
Guys i really need help.
0
 

Author Comment

by:cancer_66
ID: 8016440
Guys, why do I feel I am being ignored here? I don't think it's a difficult problem. Won't anyone help?
0
 
LVL 15

Expert Comment

by:ozymandias
ID: 8018721
Sorry for the delay.
I had to go into hospital on Monday morning and I only just got back.
0
 

Author Comment

by:cancer_66
ID: 8018741
OK, no problem. The main thing is that I got a response. I am glad you are back.

I hope everything is fine with you?
0
 
LVL 15

Expert Comment

by:ozymandias
ID: 8018747
Just to clarify.

1) You don't want to specify the file(s) to be searched when running the program; you want to hard-code them in the program.

2) You want to search multiple files and have the file from which each match is retrieved shown along with the output.

This is pretty simple as long as you are sure about point 1) above.

Basically you would create an array of files and loop through each file, calling the existing code. Then you modify the output to include the filename where appropriate.

If you can confirm that this is what you want I can modify the code very quickly and easily for you.
0
 

Author Comment

by:cancer_66
ID: 8018784
1) Yes, that is what my supervisor told me: the files from which I am going to retrieve the sentences should be hard-coded. Keep in mind that I might have multiple files to look in. For now let's assume 3 files, but there might be more.

2) Yes, as I explained above, the file from which each match is retrieved should be shown with the output.

When this is done there is one more thing, if that's fine with you..:)
0
 

Author Comment

by:cancer_66
ID: 8018811
Yes, and the most important thing! The user input is no longer a single word such as "graphics"; it should be a term like "Computer Graphics", "mobile agent" and so on.

0
 
LVL 15

Expert Comment

by:ozymandias
ID: 8018981
OK. That's fine.

In terms of our program so far there is no difference between searching for "graphics" and "Computer Graphics".
0
 

Author Comment

by:cancer_66
ID: 8019058
thats good news then. thanks.
0
 
LVL 15

Expert Comment

by:ozymandias
ID: 8019106
OK. I now have three files called abc.txt, xyz.txt and other.txt, the names of which I have hardcoded into the program. Below is a copy of each of the files and the new code.

To run it you must make sure that the txt files are in the same directory as the program or you will have to change the file names to include a full path so the program can find them.

Cheers.
0
 
LVL 15

Expert Comment

by:ozymandias
ID: 8019116
abc.txt
========

Graphics defined t t t is t as#Graphics was defined by the Science Acadamy in london#Graphics is the an important subject in the computer strand#Graphics is described as the art of drawing used in mathematics and engineering.#Mobile Agents are defined as autonomous, intelligent programs that move through a network, searching for and interacting with services on the user's behalf.#

xyz.txt
=======

graphics defined is by nonsense#Graphics is delimited by science#Graphics is defined as science#Mobile agents can sometimes be described as intelligent agents.#Graphics be defined as the science of calculating by diagrams.#

other.txt
=========
Graphics are described as picturesPictures are described as Graphics#Mobile Agents are often described as robots or bots.#Graphics is defined as a pictorial computer output produced on a display screen plotter or printer#Graphics is purely delimited by science.#
0
 
LVL 15

Expert Comment

by:ozymandias
ID: 8019119
//new code

import java.util.Vector;
import java.io.*;

public class DefinitionChecker{

     String keyword;
     String[] files = new String[]{"abc.txt","xyz.txt","other.txt"};

     String[] token1 = new String[]{" is "," are "," was "," be "," can be "};
     String[] token2 = new String[]{" defined "," described "," delimited "};
     String[] token3 = new String[]{" as "," by "};
     String sentences[];

     public static void main(String[] args){
          if (args.length < 1){
               System.out.println("USAGE : DefinitionChecker keyword");
               return; // without a keyword, args[0] in the constructor would throw
          }
          DefinitionChecker dc = new DefinitionChecker(args);
     }

     public DefinitionChecker(String args[]){

          for (int f = 0; f < files.length; f++){
               boolean fileMentioned = false;
               File file = null;
               try{
                    file = new File(files[f]);
                    sentences = getArrayFromFile(file);
               }catch(IOException ioe){
                    System.out.println(ioe);
                    continue; // skip this file; sentences would be null or left over from the previous file
               }
               keyword = args[0].toLowerCase();
               for (int i = 0; i < sentences.length; i++){
                    int pos = 0;
                    //System.out.println("Checking : " + sentences[i]);
                    if (sentences[i].toLowerCase().indexOf(keyword) > -1 && containsToken(token2,sentences[i].toLowerCase(),0) > -1){
                         if ((pos = containsToken(token1,sentences[i].toLowerCase(),pos)) != -1){
                              if ((pos = containsToken(token2,sentences[i].toLowerCase(),pos)) != -1){
                                   if ((pos = containsToken(token3,sentences[i].toLowerCase(),pos)) != -1){
                                        if (!fileMentioned){
                                             System.out.println("Matches found in " + file);
                                             fileMentioned = true;
                                        }
                                        System.out.println("\t\tMATCH : " + sentences[i]);
                                   }else{
                                        //System.out.println("\t\t" + sentences[i] + " has no token3.");
                                   }
                              }else{
                                   //System.out.println("\t\t" + sentences[i] + " has no token2.");
                              }
                         }else{
                              //System.out.println("\t\t" + sentences[i] + " has no token1.");
                         }
                    }else{
                         //System.out.println("\t\t" + sentences[i] + " has no keyword or has no token2.");
                    }
               }
          }
     }

     private int containsToken(String[] tokens, String s, int pos){
          int tPosition = -1;
          for (int i = 0; i < tokens.length && tPosition == -1; i++){
               tPosition = s.indexOf(tokens[i],pos);
          }
          return tPosition;
     }

     private String[] getArrayFromFile(File f) throws IOException{
          FileReader reader = new FileReader(f);
          Vector sentences = new Vector();
          char[] cbuf = new char[1];
          String delimiter = "#";
          String sentence = "";
          String c = "";
          while (reader.read(cbuf) != -1){
               c = new String(cbuf);
               if (c.equals(delimiter)){
                    sentences.add(sentence);
                    sentence = "";
               }else{
                    sentence += c;
               }
          }
          reader.close(); // release the file handle once everything has been read
          String[] sentenceArray = new String[sentences.size()];
          sentenceArray = (String[])sentences.toArray(sentenceArray);
          return sentenceArray;
     }
}
0
 

Author Comment

by:cancer_66
ID: 8019131
ok. ill try it now.

thanks
0
 

Author Comment

by:cancer_66
ID: 8019214
hmm..there is a problem

You see, even if the user enters "Computer Grap" or "Computer Graphoc",

the result is still printed??

This should be the case: only if the user enters "Computer Graphics" with the correct spelling should it print the results.



0
 

Author Comment

by:cancer_66
ID: 8019233
you see even if i type

java DefinitionChecker a

it also prints the result?

it doesnt check for the whole term.



0
 

Author Comment

by:cancer_66
ID: 8019282
I think the problem is in how the argument is compared. Even if it is a single letter, it is compared against the whole sentence, and if it equals any letter in the sentence, the sentence is printed.

0
 

Author Comment

by:cancer_66
ID: 8019330
try this as a test file

Computer Graphics is defined as a pictorial computer output produced on a display screen, plotter, or printer # Computer Graphics is purely delimited by science #  Mobile Agents are defined as autonomous, intelligent programs that move through a network, searching for and interacting with services on the user's behalf. #
# Computer Graphicsis defined as science # Graphics is as ahmad # Mobile Agent can be defined as an agent that transports itself and its execution state through the net #

Only when the user enters "Computer Graphics" or "Mobile Agent" should results be printed; otherwise they shouldn't.

0
 
LVL 15

Expert Comment

by:ozymandias
ID: 8019361
How could that work ?
How do I decide what is a correct searchable term and what isn't ?
Unless you are going to provide a complete list of all the words or terms that a user can search for, or a minimum length in characters, or a list of all the words the user is not allowed to search for etc........
0
 

Author Comment

by:cancer_66
ID: 8019439
i think the input should be tokenised

computer,graphics

as well as the sentences,

and then they are compared
0
 
LVL 15

Expert Comment

by:ozymandias
ID: 8019442
>>only when the user enters "Computer Graphics" or "Mobile Agent" results should be print otherwise they shouldnt.

Are you saying that the only terms that the user should ever be allowed to search for are "Computer Graphics" and "Mobile Agent" ?
0
 
LVL 15

Expert Comment

by:ozymandias
ID: 8019459
>>i think the input should be tokenised
>>
>>computer,graphics

Yes but what does that mean ?

computer,graphics = computer followed by graphics
computer and graphics = anywhere in the same sentence
computer or graphics = anywhere in the same sentence



0
 

Author Comment

by:cancer_66
ID: 8019572
lets say we have the sentence.

Computer Graphics is purely defined as a field in science

we take the input from the user = Computer Graphics

tokenize it so we have

Computer
Graphics

we read the file; whenever we meet "#" we put the sentence in an array and then tokenize it

so we would have

Computer
Graphics
is
purely
Defined
as
a
field
in
Science

firstly check for the MainMarker[]={"defined","described","delimited"}

if found in sentence THEN

we secondly compare

first token of U.Input with first token of sentence
Computer compared with Computer (match)
then
Graphics compared with Graphics (match)

then (take one element from Token1[] and compare it with a token from sentence)

is compared with is (match)

then (take another from Token2[])
defined compared with purely (no match)

when there is no match we keep on comparing with the pattern "defined", which is from Token2, until we find a match in the sentence; if we don't, nothing is printed

then (again compare with element "defined", since in the last attempt there was no match)
defined compared with defined (match)

then (an element from Token3[] is compared with the remaining sentence tokens)
as compared with as (match)

since the user input as well as the patterns from Token1[], Token2[] and Token3[] matched, now print the results.
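
A minimal sketch of the word-by-word comparison described above, not code from this thread. The class and method names are hypothetical. It assumes the search term may appear anywhere in the sentence as consecutive whole words, and it treats "can be" as covered by the single word "be". Note that strict whole-word equality means "Mobile Agent" would not match a sentence that says "Mobile Agents", so plurals would need extra handling.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of DefinitionChecker.
class TokenDefinitionMatcher {

    // Single-word versions of the three pattern groups ("can be" is covered by "be").
    static final List<String> TOKEN1 = Arrays.asList("is", "are", "was", "be");
    static final List<String> TOKEN2 = Arrays.asList("defined", "described", "delimited");
    static final List<String> TOKEN3 = Arrays.asList("as", "by");

    // Lower-case the text and split it into words on anything that is not a letter or digit.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase().split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Index just past the first run of words equal to the term, or -1 if absent.
    static int indexAfterTerm(List<String> words, List<String> term) {
        for (int i = 0; i + term.size() <= words.size(); i++) {
            if (words.subList(i, i + term.size()).equals(term)) {
                return i + term.size();
            }
        }
        return -1;
    }

    // Index just past the first word at or after 'from' that is one of the candidates, or -1.
    static int indexAfterAny(List<String> words, List<String> candidates, int from) {
        for (int i = from; i < words.size(); i++) {
            if (candidates.contains(words.get(i))) {
                return i + 1;
            }
        }
        return -1;
    }

    // True when the term is followed, in order, by a token1, a token2 and a token3 word.
    static boolean matches(String term, String sentence) {
        List<String> termTokens = tokenize(term);
        if (termTokens.isEmpty()) {
            return false;
        }
        List<String> words = tokenize(sentence);
        int pos = indexAfterTerm(words, termTokens);
        if (pos < 0) return false;
        pos = indexAfterAny(words, TOKEN1, pos);
        if (pos < 0) return false;
        pos = indexAfterAny(words, TOKEN2, pos);
        if (pos < 0) return false;
        return indexAfterAny(words, TOKEN3, pos) >= 0;
    }

    public static void main(String[] args) {
        String sentence = "Computer Graphics is purely defined as a field in science";
        System.out.println(matches("Computer Graphics", sentence));
        System.out.println(matches("computer graphc", sentence));
    }
}
```

Because whole words are compared, a misspelt or partial term such as "computer graphc" finds nothing, at the cost of splitting every sentence into a word list first.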

0
 

Author Comment

by:cancer_66
ID: 8019617
No, no, the user should be allowed to search for any term; he is not restricted to "Mobile Agent" or "Computer Graphics".

Computer followed by Graphics, for the time being.
0
 

Author Comment

by:cancer_66
ID: 8019711
To clarify things further: the whole point of the search is to retrieve specific information.

In our example the user is looking for the definition of a specific term such as computer graphics, mobile agent, Experts Exchange, and so on...

0
 
LVL 15

Expert Comment

by:ozymandias
ID: 8019926
>>Computer Followed by Graphics, for the time being.

If that's the case then why bother with all this messy tokenization of the sentence? Either the sentence contains "computer graphics" or it doesn't. I don't see the value in comparing it word by word.

>>no no the user should be allowed to search for any term he is not restricted to "Mobile Agent" or "Computer Graphics"

So, therefore, "a" is a valid search term and so is "computar griphocs", unless you want to build a spell checker into the program.

>>to clearify things futher. the whole point of the search is to retrive specific information.

Yes, and the definition of specific information is any sentence in any of the files that contains both the user-specified search word(s) (e.g. "computer graphics") and one of the combinations of three token words (such as "is defined as").

That is pretty much exactly what it does now.
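
For comparison, here is a small sketch of the difference between a substring test (what indexOf does in the program) and a whole-word test. The class and method names are hypothetical, and the whole-word version uses java.util.regex word boundaries, which is a different technique from the code in this thread. It assumes the search term starts and ends with a letter or digit.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of DefinitionChecker.
class WholeWordSearch {

    // Substring test, equivalent to the indexOf check used in the program.
    static boolean containsSubstring(String sentence, String term) {
        return sentence.toLowerCase().indexOf(term.toLowerCase()) > -1;
    }

    // Whole-word test: the term must not run into a neighbouring letter or digit.
    static boolean containsWholeTerm(String sentence, String term) {
        Pattern p = Pattern.compile("\\b" + Pattern.quote(term) + "\\b",
                                    Pattern.CASE_INSENSITIVE);
        return p.matcher(sentence).find();
    }

    public static void main(String[] args) {
        String sentence = "Computer Graphics is defined as science";
        System.out.println(containsSubstring(sentence, "computer grap"));
        System.out.println(containsWholeTerm(sentence, "computer grap"));
        System.out.println(containsWholeTerm(sentence, "computer graphics"));
    }
}
```

With the whole-word test, "a" and "computer grap" no longer match, while a misspelling such as "computar griphocs" is still simply a term that finds nothing.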
0
 

Author Comment

by:cancer_66
ID: 8019968
note that the following sentence won't be retrieved

Computer Graphics defined X Y is Z as science

U.input = computer graphics

tokenize it

computer
graphics

Sentence is tokenized

computer
graphics
defined
X
Y
is
Z
as
science

first element of U.input with first token of file
Computer           with             Computer (match)

second element of U.input with second token of file
Graphics           with             Graphics (match)

then first element from Token1[] with third element file

is                  with                defined (no match)

then
since no match keep comparing with "is" till we find a match

then still first element of Token1[] with fourth elem.file
is                  with                 X (no match)

then still first element of Token1[] with fifth elem.file
is                  with                 Y (no match)

then still first element of Token1[] with sixth elem.file
is                  with                 is (match)

since there is a match go to Token2[] take first element and compare.

then first element of Token2[] with seventh elem.file
defined             with           Z (no match)

since no match keep on comparing with "defined" till match is found.

then still first element of Token2[] with eighth elem.file
defined             with               as (no match)

then still first element of Token2[] with ninth elem.file
defined             with                 science (no match)

therefore nothing is printed!


hope it's clear where I am heading
0
 

Author Comment

by:cancer_66
ID: 8020013
well iam really sorry but this is how the supervisor told me it should be done.

0
 

Author Comment

by:cancer_66
ID: 8020033
see if the user enters "comptear graphise"

there should be any results since all out sentences startes with "computer graphics"

and this can only be done by tokenizing the sentence and user input as well as patterns.

unless there is another way which i dont know ~
0
 

Author Comment

by:cancer_66
ID: 8020150
i meant "there shouldnt be any .."
0
 

Author Comment

by:cancer_66
ID: 8020273
i have to leave now. its 1:30am over here. ill be online tomorrow.

please help me out

thanks
0
 
LVL 15

Expert Comment

by:ozymandias
ID: 8023788
>>see if the user enters "comptear graphise"
>>there shouldn't be any results since all out sentences startes with "computer graphics"

Yes. And using the program as written so far there aren't any.

>>note that the following sentence wont be retrived
>>Computer Graphic defined X Y is Z as science

No. It isn't retrieved because it doesn't match.
The tokens defined, is and as appear in the wrong order.
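
To illustrate the ordering rule, here is a stripped-down sketch of the same indexOf chain the program uses, with the keyword check left out. The class name and the hasPatternInOrder method are hypothetical; containsToken follows the logic of the method posted in this thread.

```java
// Hypothetical demo class; containsToken mirrors the method posted in the thread.
class TokenOrderDemo {

    static final String[] TOKEN1 = {" is ", " are ", " was ", " be ", " can be "};
    static final String[] TOKEN2 = {" defined ", " described ", " delimited "};
    static final String[] TOKEN3 = {" as ", " by "};

    // Position of the first token (in array order) found at or after pos, or -1.
    static int containsToken(String[] tokens, String s, int pos) {
        int tPosition = -1;
        for (int i = 0; i < tokens.length && tPosition == -1; i++) {
            tPosition = s.indexOf(tokens[i], pos);
        }
        return tPosition;
    }

    // True only if a token1 is followed by a token2, which is followed by a token3.
    static boolean hasPatternInOrder(String sentence) {
        String s = sentence.toLowerCase();
        int pos = containsToken(TOKEN1, s, 0);
        if (pos == -1) return false;
        pos = containsToken(TOKEN2, s, pos);
        if (pos == -1) return false;
        return containsToken(TOKEN3, s, pos) != -1;
    }

    public static void main(String[] args) {
        // " is " is found, but " defined " only occurs before it, so the chain stops.
        System.out.println(hasPatternInOrder("Computer Graphics defined X Y is Z as science"));
        System.out.println(hasPatternInOrder("Graphics is defined as science"));
    }
}
```

Each search starts from the position of the previous hit, which is why "defined ... is ... as" is rejected while "is defined as" is accepted.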

As far as I can see the program does exactly what it is supposed to, but your "supervisor" is not happy about the way it does it, which seems crazy because the way your "supervisor" is suggesting is a lot less efficient, unless there is some other thing it's supposed to do that you haven't mentioned.
0
 

Author Comment

by:cancer_66
ID: 8024032
mr ozymandias i just tried the following

java DefinitionChecker computer graphc

and it still found matches.

i need to do it with the tokeniser. well yes this is just a part of my project. its not the whole thing.

will you help me ?
0
 
LVL 15

Expert Comment

by:ozymandias
ID: 8024080
cancer_66, of course it found matches.

It found all the sentences with the word computer in them.

When you run

    java DefinitionChecker computer graphc

computer becomes args[0], graphc becomes args[1].

Our program searches for args[0].

If you want it to search for computer graphc you either have to run

    java DefinitionChecker "computer graphc"

so that args[0] becomes computer graphc or you have to change the code so that all the values in args[] are concatenated into a single search string.

I can do it either way.
0
 
LVL 15

Expert Comment

by:ozymandias
ID: 8024095
import java.util.Vector;
import java.io.*;

public class DefinitionChecker{

      String keyword;
      String[] files = new String[]{"abc.txt","xyz.txt","other.txt"};

      String[] token1 = new String[]{" is "," are "," was "," be "," can be "};
      String[] token2 = new String[]{" defined "," described "," delimited "};
      String[] token3 = new String[]{" as "," by "};
      String sentences[];

      public static void main(String[] args){
            if (args.length < 1){
                  System.out.println("USAGE : DefinitionChecker keyword");
                  System.exit(1);
            }
            String s = "";
            for (int i = 0;i < args.length;i++){
                  s = s + args[i] + " ";
            }
            s = s.trim();
            DefinitionChecker dc = new DefinitionChecker(s);
      }

      public DefinitionChecker(String s){

            for (int f = 0; f < files.length; f++){
                  boolean fileMentioned = false;
                  File file = null;
                  try{
                        file = new File(files[f]);
                        sentences = getArrayFromFile(file);
                  }catch(IOException ioe){
                        System.out.println(ioe);
                        continue; // skip this file; sentences would be null or left over from the previous file
                  }
                  keyword = s.toLowerCase();
                  for (int i = 0; i < sentences.length; i++){
                        int pos = 0;
                        //System.out.println("Checking : " + sentences[i]);
                        if (sentences[i].toLowerCase().indexOf(keyword) > -1 && containsToken(token2,sentences[i].toLowerCase(),0) > -1){
                              if ((pos = containsToken(token1,sentences[i].toLowerCase(),pos)) != -1){
                                    if ((pos = containsToken(token2,sentences[i].toLowerCase(),pos)) != -1){
                                          if ((pos = containsToken(token3,sentences[i].toLowerCase(),pos)) != -1){
                                                if (!fileMentioned){
                                                      System.out.println("Matches found in " + file);
                                                      fileMentioned = true;
                                                }
                                                System.out.println("\t\tMATCH : " + sentences[i]);
                                          }else{
                                                //System.out.println("\t\t" + sentences[i] + " has no token3.");
                                          }
                                    }else{
                                          //System.out.println("\t\t" + sentences[i] + " has no token2.");
                                    }
                              }else{
                                    //System.out.println("\t\t" + sentences[i] + " has no token1.");
                              }
                        }else{
                              //System.out.println("\t\t" + sentences[i] + " has no keyword or has no token2.");
                        }
                  }
            }
      }

      private int containsToken(String[] tokens, String s, int pos){
            int tPosition = -1;
            for (int i = 0; i < tokens.length && tPosition == -1; i++){
                  tPosition = s.indexOf(tokens[i],pos);
            }
            return tPosition;
      }

      private String[] getArrayFromFile(File f) throws IOException{
            FileReader reader = new FileReader(f);
            Vector sentences = new Vector();
            char[] cbuf = new char[1];
            String delimiter = "#";
            String sentence = "";
            String c = "";
            while (reader.read(cbuf) != -1){
                  c = new String(cbuf);
                  if (c.equals(delimiter)){
                        sentences.add(sentence);
                        sentence = "";
                  }else{
                        sentence += c;
                  }
            }
            reader.close(); // release the file handle once everything has been read
            String[] sentenceArray = new String[sentences.size()];
            sentenceArray = (String[])sentences.toArray(sentenceArray);
            return sentenceArray;
      }
}
0
 

Author Comment

by:cancer_66
ID: 8024119
ok iam with you

but the problem is that it's comparing character by character

therefore when I enter

java DefinitionChecker "a"

it finds matches since the letter "a" exists in the sentence, which is irrelevant
0
 

Author Comment

by:cancer_66
ID: 8024138
It is difficult to do it the way my supervisor wants it. The problem is that I need to make him happy, and he wants me to do it in that manner.

the only problem with this code is that it is comparing character by character.

for example when I enter

java DefinitionChecker z

there are no results since my sentences do not contain the letter "z", but if they did, they would have been printed
0
 

Author Comment

by:cancer_66
ID: 8024308
It would have been exactly as I wanted if the above problem wasn't there :(
0
 

Author Comment

by:cancer_66
ID: 8024417
are u there ?
0
 

Author Comment

by:cancer_66
ID: 8024730
are you going to help me with this
0
 

Author Comment

by:cancer_66
ID: 8025113
I've got a question: how can I put all the patterns in one place,
such as

String[] token1 = new String[]{" is "," are "," was "," be "," can be "};
String[] token2 = new String[]{" defined "," described "," delimited "};
String[] token3 = new String[]{" as "," by "};

Now in my project I am going to have a lot of patterns,

therefore the supervisor suggested putting all the patterns in one file, compiling it and then importing it into the main program.

Now how is that done?
0
 
LVL 15

Expert Comment

by:ozymandias
ID: 8025152
Sorry...was busy with some other stuff.

OK. You have two outstanding issues here.

1. The programme works OK but you don't like the fact that it will search for things like "a" and "z". The best thing to do would be to set a minimum length on the search term. Let's say it must be three or more characters. That is very simple to do.

2. The real set of patterns is likely to be much more complex than the current one. OK, that is fairly simple to achieve too. Currently we have three arrays of values. These could be stored in three files called Token1.txt, Token2.txt and Token3.txt. In the same way that we read through the current text files and make them into arrays of sentences we could easily read through text files containing the token words and make them into arrays.

You have to bear in mind however that reading text files (i.e. disk IO) is quite time intensive and the files would have to be read every time the program was run. Unless there are an awful lot of these token words I would consider trying to keep them in the program.
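
One way to keep the token words in the program while still collecting them in one place is a separate class of constants. This is only a sketch: the Patterns class name and its field names are assumptions for illustration, and in a real project the class could live in its own file, Patterns.java, compiled alongside DefinitionChecker.

```java
// Hypothetical Patterns class holding the token groups as constants.
final class Patterns {
    static final String[] TOKEN1 = {" is ", " are ", " was ", " be ", " can be "};
    static final String[] TOKEN2 = {" defined ", " described ", " delimited "};
    static final String[] TOKEN3 = {" as ", " by "};

    private Patterns() {} // constants only, never instantiated
}

// Hypothetical caller showing how the main program would reach the arrays.
class PatternsDemo {
    public static void main(String[] args) {
        // A class in the same package needs no import statement.
        System.out.println(Patterns.TOKEN1.length + " token1 patterns");
        System.out.println(Patterns.TOKEN2.length + " token2 patterns");
        System.out.println(Patterns.TOKEN3.length + " token3 patterns");
    }
}
```

The patterns are then compiled once with the program, so nothing is read from disk at run time, but adding a pattern means recompiling.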
0
 

Author Comment

by:cancer_66
ID: 8025243
no problem.

1) That's a good idea. Yes, I think that would make sense. So now I have to talk to the supervisor and decide on the minimum length. OK, let's assume it's 3 characters.

2) Well, no, that would be a very inefficient way to deal with it. What he meant is to have all the patterns in one file, which I have to compile and then import

as "import Patterns;"

0
 

Author Comment

by:cancer_66
ID: 8026223
whenever you are free can you answer my question.

0
 
LVL 15

Expert Comment

by:ozymandias
ID: 8026756
OK for item 1 I have revised the main() method as below :

     public static void main(String[] args){
          if (args.length < 1){
               System.out.println("USAGE : DefinitionChecker keyword");
               System.exit(1);
          }
          String s = "";
          for (int i = 0;i < args.length;i++){
               s = s + args[i] + " ";
          }
          s = s.trim();
          if (s.length() < 3){
               System.out.println("Input Error : The keyword(s) supplied must have a combined length of 3 characters or more.");
               System.exit(1);
          }
          DefinitionChecker dc = new DefinitionChecker(s);
     }
0
 
LVL 15

Expert Comment

by:ozymandias
ID: 8026798
For item 2 we need to think about the format of the file that will store the patterns.

One option would be something like this :

is
are
was
be
can be
#
defined
described
delimited
#
as
by
#

You then have a function that reads the file line by line.
Each line is put into the token1 array until we hit a #, then we start filling the token2 array until we hit a #, and then we fill the token3 array until we hit a #, which means we are at the end of the file.
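
A rough sketch of such a reader, assuming the file format shown above. The class and method names are hypothetical and this is not the importPatterns method used in the program. Each word is padded with a space on both sides, on the assumption that the arrays should look like the hard-coded ones (" is ", " defined " and so on).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical reader for the "#"-separated pattern file format.
class PatternFileReader {

    // Returns one String[] per group; a line holding only "#" closes the current group.
    static List<String[]> readGroups(Reader source) throws IOException {
        List<String[]> groups = new ArrayList<>();
        List<String> current = new ArrayList<>();
        BufferedReader in = new BufferedReader(source);
        String line;
        while ((line = in.readLine()) != null) {
            String word = line.trim();
            if (word.equals("#")) {
                groups.add(current.toArray(new String[0]));
                current = new ArrayList<>();
            } else if (!word.isEmpty()) {
                current.add(" " + word + " "); // pad like the hard-coded tokens
            }
        }
        in.close();
        return groups;
    }

    // Convenience wrapper for text already held in memory.
    static List<String[]> parseText(String text) {
        try {
            return readGroups(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String text = "is\nare\nwas\nbe\ncan be\n#\ndefined\ndescribed\ndelimited\n#\nas\nby\n#\n";
        List<String[]> groups = parseText(text);
        System.out.println(groups.size() + " groups read");
    }
}
```

To read the real file, pass new FileReader("patterns.txt") to readGroups and assign groups.get(0), groups.get(1) and groups.get(2) to token1, token2 and token3.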
 
LVL 15

Expert Comment

by:ozymandias
ID: 8027033
OK. Here is some new code.

First is the contents of a file called patterns.txt :

is
are
was
be
can be
#
defined
described
delimited
#
as
by
#


and next the new code that uses it :
 
LVL 15

Expert Comment

by:ozymandias
ID: 8027035
import java.util.Vector;
import java.io.*;

public class DefinitionChecker{

      String keyword;
      String[] files = new String[]{"abc.txt","xyz.txt","other.txt"};
      String patternFile = "patterns.txt";

      String[] token1;
      String[] token2;
      String[] token3;

      String sentences[];

      public static void main(String[] args){
            if (args.length < 1){
                  System.out.println("USAGE : DefinitionChecker keyword");
                  System.exit(1);
            }
            String s = "";
            for (int i = 0;i < args.length;i++){
                  s = s + args[i] + " ";
            }
            s = s.trim();
            if (s.length() < 3){
                  System.out.println("Input Error : The keyword(s) supplied must have a combined length of 3 characters or more.");
                  System.exit(1);
            }
            DefinitionChecker dc = new DefinitionChecker(s);
      }

      public DefinitionChecker(String s){

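            keyword = s.toLowerCase();
            try{
                  // The patterns only need to be loaded once, not once per file.
                  importPatterns(patternFile);
            }catch(IOException ioe){
                  System.out.println(ioe);
                  return;
            }
            for (int f = 0; f < files.length; f++){
                  boolean fileMentioned = false;
                  File file = new File(files[f]);
                  try{
                        sentences = getArrayFromFile(file);
                  }catch(IOException ioe){
                        // Skip a file that cannot be read instead of failing on a null array below.
                        System.out.println(ioe);
                        continue;
                  }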
                  for (int i = 0; i < sentences.length; i++){
                        int pos = 0;
                        //System.out.println("Checking : " + sentences[i]);
                        if (sentences[i].toLowerCase().indexOf(keyword) > -1 && containsToken(token2,sentences[i].toLowerCase(),0) > -1){
                              if ((pos = containsToken(token1,sentences[i].toLowerCase(),pos)) != -1){
                                    if ((pos = containsToken(token2,sentences[i].toLowerCase(),pos)) != -1){
                                          if ((pos = containsToken(token3,sentences[i].toLowerCase(),pos)) != -1){
                                                if (!fileMentioned){
                                                      System.out.println("Matches found in " + file);
                                                      fileMentioned = true;
                                                }
                                                System.out.println("\t\tMATCH : " + sentences[i]);
                                          }else{
                                                //System.out.println("\t\t" + sentences[i] + " has no token3.");
                                          }
                                    }else{
                                          //System.out.println("\t\t" + sentences[i] + " has no token2.");
                                    }
                              }else{
                                    //System.out.println("\t\t" + sentences[i] + " has no token1.");
                              }
                        }else{
                              //System.out.println("\t\t" + sentences[i] + " has no keyword or has no token2.");
                        }
                  }
            }
      }

      private int containsToken(String[] tokens, String s, int pos){
            int tPosition = -1;
            for (int i = 0; i < tokens.length && tPosition == -1; i++){
                  tPosition = s.indexOf(tokens[i],pos);
            }
            return tPosition;
      }

      private String[] getArrayFromFile(File f) throws IOException{
            FileReader reader = new FileReader(f);
            Vector sentences = new Vector();
            char[] cbuf = new char[1];
            String delimiter = "#";
            String sentence = "";
            String c = "";
            while (reader.read(cbuf) != -1){
                  c = new String(cbuf);
                  if (c.equals(delimiter)){
                        sentences.add(sentence);
                        sentence = "";
                  }else{
                        sentence += c;
                  }
            }
            reader.close();
            String[] sentenceArray = new String[sentences.size()];
            sentenceArray = (String[])sentences.toArray(sentenceArray);
            return sentenceArray;
      }

      private void importPatterns(String filename) throws IOException{
            File file = new File(filename);
            BufferedReader reader = new BufferedReader(new FileReader(file));
            String line;
            Vector v1 = new Vector();
            Vector v2 = new Vector();
            Vector v3 = new Vector();
            int hashCount = 0;
            while ((line = reader.readLine()) != null && hashCount < 3){
                  if (line.equals("#")){
                        hashCount++;
                  }else{
                        switch (hashCount){
                              case 0:
                                    v1.add(line);
                                    break;
                              case 1:
                                    v2.add(line);
                                    break;
                              case 2:
                                    v3.add(line);
                                    break;
                        }
                  }
            }
            reader.close();
            token1 = new String[v1.size()];
            token1 = (String[])v1.toArray(token1);
            token2 = new String[v2.size()];
            token2 = (String[])v2.toArray(token2);
            token3 = new String[v3.size()];
            token3 = (String[])v3.toArray(token3);
      }
}
 

Author Comment

by:cancer_66
ID: 8031228
hello.

iam really sorry. i didnt reply i got busy with my family. thanks for answering

ill just test it
 

Author Comment

by:cancer_66
ID: 8031255
Well, for the second question, the problem is that the supervisor told me he wants to keep the three arrays token1[], token2[], token3[], but he is thinking of keeping them in a different file. Let's say it has some sort of method, and then after compiling it I just have to import the file into the main program as "import patterns".

But I'll have to clarify more with him. I'm not sure if you understood what I'm trying to say.
 

Author Comment

by:cancer_66
ID: 8031342
Hmm, is it possible to put "private void importPatterns(String filename)" in a different file, let's say one I have to compile (importPattern.java), so that it does all the processing of the patterns and puts them in 3 arrays

token1[],token2[],token3[]

which are then imported (import importPatterns;) into the main program, which is "DefinitionChecker.java"?

I'm really sorry, I'm trying my best to clarify things.
 

Author Comment

by:cancer_66
ID: 8031346
1)two comments i added above

2) Can I restrict the user to entering a term such as "Mobile Agents", i.e. the input must consist of two words, each with a length of not less than 3 characters?

answer me whenever you are free ill be waiting,
 

Author Comment

by:cancer_66
ID: 8031383
(ill be gone for 2hours but ill be back soon)
whenever u can answer my questions

thanks
 
LVL 15

Expert Comment

by:ozymandias
ID: 8031941
1)I can understand wanting to keep the token data in a separate file (e.g. patterns.txt), but it makes absolutely no sense to have the importPatterns function in another file. Basically, you are talking about writing a whole separate class.

Let's say you create a new class in a new file called PatternBuilder.java which creates PatternBuilder.class.

To get your patterns you would then have to create an instance of PatternBuilder and call some method to get the token arrays. You would probably have to make 3 separate calls to get all 3 arrays. That makes no sense at all.

The other possibility is that you are creating the PatternBuilder class because you want to keep all the patterns hardcoded (i.e. not read from patterns.txt) but you want to keep all that clutter out of the main program. That makes some sense, but you would have to clarify that this is exactly what you are trying to do.
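
For illustration only, here is a minimal sketch of what such a holder class might look like. The class name PatternBuilder comes from the discussion above; the getter names getToken1, getToken2 and getToken3 are made up for this example, and the word lists are the ones from patterns.txt. It also shows the three separate calls mentioned above.

```java
// Sketch only: keeps the hardcoded token lists out of the main program.
// getToken1/getToken2/getToken3 are hypothetical names, not code from this thread.
class PatternBuilder {

    private static final String[] TOKEN1 = {"is", "are", "was", "be", "can be"};
    private static final String[] TOKEN2 = {"defined", "described", "delimited"};
    private static final String[] TOKEN3 = {"as", "by"};

    // Each getter returns a copy so a caller cannot alter the shared lists.
    String[] getToken1() { return (String[]) TOKEN1.clone(); }
    String[] getToken2() { return (String[]) TOKEN2.clone(); }
    String[] getToken3() { return (String[]) TOKEN3.clone(); }
}

class PatternBuilderDemo {
    public static void main(String[] args) {
        // Both classes sit in the same directory (the default package),
        // so no import statement is needed to use PatternBuilder here.
        PatternBuilder pb = new PatternBuilder();
        String[] token1 = pb.getToken1();
        String[] token2 = pb.getToken2();
        String[] token3 = pb.getToken3();
        System.out.println(token1.length + " / " + token2.length + " / " + token3.length);
    }
}
```

Whether this is worth a separate class depends on how much other code needs the lists; if only one class searches with them, keeping the arrays next to the search code is simpler.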
 
LVL 15

Expert Comment

by:ozymandias
ID: 8031957
2) You can restrict the user to any type of input you want. You are saying that the search pattern must be :

    xxx yyy

i.e. two groups of a minimum of three characters separated by a space.

Does this mean that the user cannot ever search for a single word like "graphics", or is it OK to search for 1 word as long as it is at least 3 characters long?
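
As a rough sketch of that rule, the check could be written as a small helper. The class name TermValidator, the method isValidTerm and the minWords parameter are assumptions made for this example; whether a single word is allowed then comes down to passing 1 or 2 as minWords. StringTokenizer is used so the code also works on older JDKs.

```java
import java.util.StringTokenizer;

// Sketch only: TermValidator and isValidTerm are hypothetical names.
class TermValidator {

    // True when the input has at least minWords whitespace-separated words
    // and every word is at least three characters long.
    static boolean isValidTerm(String input, int minWords) {
        StringTokenizer st = new StringTokenizer(input);
        if (st.countTokens() < minWords) {
            return false;
        }
        while (st.hasMoreTokens()) {
            if (st.nextToken().length() < 3) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isValidTerm("mobile agents", 2)); // two words, both long enough
        System.out.println(isValidTerm("graphics", 2));      // only one word
        System.out.println(isValidTerm("an agent", 2));      // "an" is too short
    }
}
```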
 

Author Comment

by:cancer_66
ID: 8032574
1) OK, what do you suggest doing? Basically yes, I want all the patterns hardcoded rather than reading them from a file etc. I did suggest to the supervisor doing exactly what you have provided (keeping all the patterns in a file, reading them, etc.) but he said that it is not an efficient way to do it. He talked about having a new class which builds the patterns, which I simply have to import in my main program (import PatternBuilder;)

can you please explain the possible solution. it wasnt really clear for me.

2)yes the supervisor argued that i should be searching for Terms rather than a single word and yes the search pattern must be:

xxx yyy


 

Author Comment

by:cancer_66
ID: 8032767
answer me whenever you are free please.

thanks
 
LVL 15

Expert Comment

by:ozymandias
ID: 8032792
OK. I am at work today, but I will try to look into this in the next few hours.
 

Author Comment

by:cancer_66
ID: 8033570
no problem. thanks alot for your help.
 
LVL 15

Expert Comment

by:ozymandias
ID: 8034602
OK. There are now two classes.

The first is still DefinitionChecker but simplified.
The new class is called PatternMatcher.
This contains all the patterns in its arrays and has the functions for searching for those patterns in strings passed to it.
 
LVL 15

Expert Comment

by:ozymandias
ID: 8034624
/*
* DefinitionChecker.java
*
*/

import java.util.Vector;
import java.io.*;

public class DefinitionChecker{

     String keyword;
     String[] files = new String[]{"abc.txt","xyz.txt","other.txt"};
     String sentences[];
     PatternMatcher pm = new PatternMatcher();

     public static void main(String[] args){
          if (args.length < 1){
               System.out.println("USAGE : DefinitionChecker keyword");
               System.exit(1);
          }
          String s = "";
          for (int i = 0;i < args.length;i++){
               s = s + args[i] + " ";
          }
          s = s.trim();
          if (s.length() < 3){
               System.out.println("Input Error : The keyword(s) supplied must have a combined length of 3 characters or more.");
               System.exit(1);
          }
          DefinitionChecker dc = new DefinitionChecker(s);
     }

     public DefinitionChecker(String s){

          for (int f = 0; f < files.length; f++){
               boolean fileMentioned = false;
               File file = null;
               try{
                    file = new File(files[f]);
                    sentences = getArrayFromFile(file);
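               }catch(IOException ioe){
                    // Skip a file that cannot be read; otherwise sentences would be null (or left over from the previous file) below.
                    System.out.println(ioe);
                    continue;
               }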
               keyword = s.toLowerCase();
               for (int i = 0; i < sentences.length; i++){
                    if (sentences[i].toLowerCase().indexOf(keyword) > -1 && pm.containsPattern(sentences[i])){
                         if (!fileMentioned){
                              System.out.println("Matches found in " + file);
                              fileMentioned = true;
                         }
                         System.out.println("\t\tMATCH : " + sentences[i]);
                    }
               }
          }
     }

     private String[] getArrayFromFile(File f) throws IOException{
          FileReader reader = new FileReader(f);
          Vector sentences = new Vector();
          char[] cbuf = new char[1];
          String delimiter = "#";
          String sentence = "";
          String c = "";
          while (reader.read(cbuf) != -1){
               c = new String(cbuf);
               if (c.equals(delimiter)){
                    sentences.add(sentence);
                    sentence = "";
               }else{
                    sentence += c;
               }
          }
          reader.close();
          String[] sentenceArray = new String[sentences.size()];
          sentenceArray = (String[])sentences.toArray(sentenceArray);
          return sentenceArray;
     }
}
 
LVL 15

Expert Comment

by:ozymandias
ID: 8034631
/*
* PatternMatcher.java
*
*/

public class PatternMatcher{

     String[] token1 = new String[]{"is","was","are","be","can be"};
     String[] token2 = new String[]{"described","defined","delimited"};
     String[] token3 = new String[]{"as","by"};

     public PatternMatcher(){

     }

     private int containsToken(String[] tokens, String s, int pos){
          int tPosition = -1;
          for (int i = 0; i < tokens.length && tPosition == -1; i++){
               tPosition = s.indexOf(tokens[i],pos);
          }
          return tPosition;
     }

     public boolean containsPattern(String s){
          int pos = 0;
          if ((pos = containsToken(token1,s.toLowerCase(),pos)) != -1){
               if ((pos = containsToken(token2,s.toLowerCase(),pos)) != -1){
                    if ((pos = containsToken(token3,s.toLowerCase(),pos)) != -1){
                         return true;
                    }else{
                         //System.out.println(s + " contains no token3");
                    }
               }else{
                    //System.out.println(s + " contains no token2");
               }
          }else{
               //System.out.println(s + " contains no token1");
          }
          return false;
     }

}
 

Author Comment

by:cancer_66
ID: 8035120
thanks alot in advance. unfortunatly iam not home in order to test the program

1)however can you just explain how can i run it ?
2)does the DefinitionChecker take a search pattern of

xxx yyy ?


 
LVL 15

Expert Comment

by:ozymandias
ID: 8035787
1) You just compile DefinitionChecker.java and that will automatically compile PatternMatcher.java as well.

Then run the program in exactly the same way :

java DefinitionChecker keyword(s)

2) I haven't done that yet because I ran out of time. It should only take a couple of minutes though, so I may have time to post it later.
 
LVL 15

Expert Comment

by:ozymandias
ID: 8035838
Here is the revised main method of the DefinitionChecker class. It now makes sure that each component word of the search phrase is 3 characters or more.

However, it will still allow 1 word on its own.
Is that OK, or do you want to force the user to provide at least two words of three characters ?


     public static void main(String[] args){
          if (args.length < 1){
               System.out.println("USAGE : DefinitionChecker keyword(s)");
               System.exit(1);
          }
          String s = "";
          for (int i = 0;i < args.length;i++){
               if (args[i].length() < 3){
                    System.out.println("Input Error : " + args[i] + "\nAll component words of the search must be three characters or more.");
                    System.exit(1);
               }
               s = s + args[i] + " ";
          }
          s = s.trim();
          if (s.length() < 3){
               System.out.println("Input Error : The keyword(s) supplied must have a combined length of 3 characters or more.");
               System.exit(1);
          }
          DefinitionChecker dc = new DefinitionChecker(s);
     }
 

Author Comment

by:cancer_66
ID: 8036699
ok bro. ill just test the code now and give you the comments. thanks a million,
 

Author Comment

by:cancer_66
ID: 8036780
1)hmm ok it works fine so far. ill test it more and give you the feedback.

2)yes please it should allow one word on its own. Terms are only accepted.

thanks
 

Author Comment

by:cancer_66
ID: 8037133
its 2am here.ill  talk to you tommorrow.

thanks for all the help
 

Author Comment

by:cancer_66
ID: 8039632
hello
please answer me whenever u r free
 

Author Comment

by:cancer_66
ID: 8039654
there is a mistake in question (2)

2)yes please it should not allow one word on its own. Terms are only accepted.
 

Author Comment

by:cancer_66
ID: 8039705
3) Let's say I wanted to add the pattern "is the"

i.e computer graphics is the field of computer science

therefore i should:-
 
add "is" to token1[]
add "the" to token2[] (this is also the mainmarker)
i have nothing to add in token3[]

hmm so would it work?

thanks

 
LVL 15

Expert Comment

by:ozymandias
ID: 8040541
OK.

I have revised the main method again so now it wants a minimum of two words.

     public static void main(String[] args){
          if (args.length < 2){
               System.out.println("USAGE : DefinitionChecker keywords\nA minimum of two words must be provided to make a valid search term.");
               System.exit(1);
          }
          String s = "";
          for (int i = 0;i < args.length;i++){
               if (args[i].length() < 3){
                    System.out.println("Input Error : " + args[i] + "\nAll component words of the search must be three characters or more.");
                    System.exit(1);
               }
               s = s + args[i] + " ";
          }
          s = s.trim();
          if (s.length() < 3){
               System.out.println("Input Error : The keyword(s) supplied must have a combined length of 3 characters or more.");
               System.exit(1);
          }
          DefinitionChecker dc = new DefinitionChecker(s);
     }
 
LVL 15

Expert Comment

by:ozymandias
ID: 8040550
3) At the moment the PatternMatcher will only find three word patterns, but it could be modified to find two word patterns as well.
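
To see why simply leaving token3 empty does not work with the posted code: containsToken returns -1 for an empty array, so the three-step check can never succeed. The sketch below copies containsToken as posted and adds a hypothetical containsPattern variant that walks any number of token lists and skips empty ones. The class name EmptyListDemo and the String[][] parameter are assumptions for this example, not the eventual PatternMatcher design.

```java
// Sketch only: shows the empty-list problem and one possible way around it.
class EmptyListDemo {

    // Same logic as containsToken in the posted PatternMatcher.
    static int containsToken(String[] tokens, String s, int pos) {
        int tPosition = -1;
        for (int i = 0; i < tokens.length && tPosition == -1; i++) {
            tPosition = s.indexOf(tokens[i], pos);
        }
        return tPosition;
    }

    // Hypothetical variant: checks the lists in order, each one starting at the
    // position where the previous one matched, and ignores lists that are empty.
    static boolean containsPattern(String[][] lists, String s) {
        String lower = s.toLowerCase();
        int pos = 0;
        for (int i = 0; i < lists.length; i++) {
            if (lists[i].length == 0) {
                continue; // nothing required for this slot
            }
            pos = containsToken(lists[i], lower, pos);
            if (pos == -1) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String sentence = "Computer graphics is the field of computer science";
        String[][] isThe = { {"is"}, {"the"}, {} };
        System.out.println(containsToken(new String[0], sentence, 0)); // -1: an empty list never matches
        System.out.println(containsPattern(isThe, sentence));          // true: the empty list is skipped
    }
}
```

Like the posted code, this still uses plain substring matching, so "is" will also be found inside a word such as "this".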
 

Author Comment

by:cancer_66
ID: 8040754
1)thanks alot. ill just test the new main.
2)i think its better to modify the PatternMatcher, is it difficult to do so?
 
LVL 15

Expert Comment

by:ozymandias
ID: 8040802
1)OK
2) Modifying the PatternMatcher to look for different patterns is quite easy, however before we make any more changes I would suggest that you think about all the different patterns and types of pattern you might want to search for so that we can make one lot of changes and optimize the PatternMatcher to make it sufficiently flexible to cope with all your needs.
 

Author Comment

by:cancer_66
ID: 8040812
1)ok i totally agree with you. therefore for this i have to wait till tomorrow, so that i could meet with my supervisor and discuss all the possibilities.
 
LVL 15

Expert Comment

by:ozymandias
ID: 8040847
OK.
 

Author Comment

by:cancer_66
ID: 8040907
2)lets say i want to add another option to the user to choose between

a)Sequential search (current code-already completed)
b)Strict Sequential search

What I mean by strict sequential is that the words of a pattern, i.e. "is defined as", "was described such as", etc., have to come directly one after another.

Example:

i) graphics is defined as a field in cs (printed in strict sequential)
ii) graphics is purely defined as a field in cs (not printed; notice the words of the pattern "is defined as" do not come one after another)

hmm hope i explained it properly?



 

Author Comment

by:cancer_66
ID: 8041248
answer me.whenevr u r free please
 

Author Comment

by:cancer_66
ID: 8042669
ill be waiting for your reply
 
LVL 15

Expert Comment

by:ozymandias
ID: 8042746
Done.

I have modified the code to do the above.
You can now specify a -s argument to the program.

e.g. if I run :

    java DefinitionChecker mobile agents


I get :

Matches found in abc.txt
                MATCH : Mobile Agents are defined as autonomous, intelligent programs that move through a network, searching for and interacting with
services on the user's behalf.
Matches found in xyz.txt
                MATCH : Mobile agents can sometimes be described as intelligent agents.
Matches found in other.txt
                MATCH : Mobile Agents are often described as robots or bots.

However, if I run :

    java DefinitionChecker mobile agents -s

I get :

Matches found in abc.txt
                MATCH : Mobile Agents are defined as autonomous, intelligent programs that move through a network, searching for and interacting with
services on the user's behalf.
Matches found in xyz.txt
                MATCH : Mobile agents can sometimes be described as intelligent agents.
 
LVL 15

Expert Comment

by:ozymandias
ID: 8042757
The -s argument can be used anywhere, i.e.

    java DefinitionChecker mobile agents -s

will work, and so will

    java DefinitionChecker -s mobile agents
 
LVL 15

Expert Comment

by:ozymandias
ID: 8042769
/*
* DefinitionChecker.java
*
*/

import java.util.Vector;
import java.io.*;

public class DefinitionChecker{

     String keyword;
     String[] files = new String[]{"abc.txt","xyz.txt","other.txt"};
     String sentences[];
     PatternMatcher pm = new PatternMatcher();

     public static void main(String[] args){

          boolean strict = false;
          int wordCount = 0;   // counts search words only; the -s flag is not a word

          String s = "";
          for (int i = 0;i < args.length;i++){
               if (args[i].equalsIgnoreCase("-s")){
                    strict = true;
                    continue;
               }
               if (args[i].length() < 3){
                    System.out.println("Input Error : " + args[i] + "\nAll component words of the search must be three characters or more.");
                    System.exit(1);
               }
               s = s + args[i] + " ";
               wordCount++;
          }
          // Check the word count after the loop so that "-s word" is not accepted as two words.
          if (wordCount < 2){
               System.out.println("USAGE : DefinitionChecker [-s] keywords\nA minimum of two words must be provided to make a valid search term.");
               System.exit(1);
          }
          s = s.trim();
          DefinitionChecker dc = new DefinitionChecker(s,strict);
     }

     public DefinitionChecker(String s, boolean strict){

          for (int f = 0; f < files.length; f++){
               boolean fileMentioned = false;
               File file = null;
               try{
                    file = new File(files[f]);
                    sentences = getArrayFromFile(file);
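               }catch(IOException ioe){
                    // Skip a file that cannot be read; otherwise sentences would be null (or left over from the previous file) below.
                    System.out.println(ioe);
                    continue;
               }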
               keyword = s.toLowerCase();
               for (int i = 0; i < sentences.length; i++){
                    if (sentences[i].toLowerCase().indexOf(keyword) > -1 && pm.containsPattern(sentences[i],strict)){
                         if (!fileMentioned){
                              System.out.println("Matches found in " + file);
                              fileMentioned = true;
                         }
                         System.out.println("\t\tMATCH : " + sentences[i]);
                    }
               }
          }
     }

     private String[] getArrayFromFile(File f) throws IOException{
          FileReader reader = new FileReader(f);
          Vector sentences = new Vector();
          char[] cbuf = new char[1];
          String delimiter = "#";
          String sentence = "";
          String c = "";
          while (reader.read(cbuf) != -1){
               c = new String(cbuf);
               if (c.equals(delimiter)){
                    sentences.add(sentence);
                    sentence = "";
               }else{
                    sentence += c;
               }
          }
          reader.close();
          String[] sentenceArray = new String[sentences.size()];
          sentenceArray = (String[])sentences.toArray(sentenceArray);
          return sentenceArray;
     }
}
 
LVL 15

Expert Comment

by:ozymandias
ID: 8042774
/*
* PatternMatcher.java
*
*/

public class PatternMatcher{

     String[] token1 = new String[]{"is","was","are","be","can be"};
     String[] token2 = new String[]{"described","defined","delimited"};
     String[] token3 = new String[]{"as","by"};

     public PatternMatcher(){

     }

     private int containsToken(String[] tokens, String s, int pos){
          int tPosition = -1;
          for (int i = 0; i < tokens.length && tPosition == -1; i++){
               tPosition = s.indexOf(tokens[i],pos);
          }
          return tPosition;
     }

     public boolean containsPattern(String s, boolean strict){

          if (strict){
               String[] words = s.split(" ");
               for (int i = 0; i < words.length - 2; i++){
                    if (containsToken(token1,words[i],0) > -1 && containsToken(token2,words[i+1],0) > -1 && containsToken(token3,words[i+2],0) > -1){
                         return true;
                    }
               }
          }else{
               int pos = 0;
               if ((pos = containsToken(token1,s.toLowerCase(),pos)) != -1){
                    if ((pos = containsToken(token2,s.toLowerCase(),pos)) != -1){
                         if ((pos = containsToken(token3,s.toLowerCase(),pos)) != -1){
                              return true;
                         }else{
                              //System.out.println(s + " contains no token3");
                         }
                    }else{
                         //System.out.println(s + " contains no token2");
                    }
               }else{
                    //System.out.println(s + " contains no token1");
               }
          }
          return false;
     }

}
 

Author Comment

by:cancer_66
ID: 8042802
Thanks a lot ozymandias, I appreciate your help. I'll test it right away.

 

Author Comment

by:cancer_66
ID: 8042882
hmm i get an error
C:\aglets\public\Expert Exchange\DefinitionChecker7\PatternMatcher.java:27: cannot resolve symbol
symbol  : method split  (java.lang.String)
location: class java.lang.String
              String[] words = s.split(" ");
                                ^
1 error

 

Author Comment

by:cancer_66
ID: 8042892
sorry, there are actually two.

C:\aglets\public\Expert Exchange\DefinitionChecker7\DefinitionChecker7.java:46: containsPattern(java.lang.String,boolean) in PatternMatcher cannot be applied to (java.lang.String)
                   if (sentences[i].toLowerCase().indexOf(keyword) > -1 && pm.containsPattern(sentences[i])){
                                                                             ^
C:\aglets\public\Expert Exchange\DefinitionChecker7\PatternMatcher.java:27: cannot resolve symbol
symbol  : method split  (java.lang.String)
location: class java.lang.String
              String[] words = s.split(" ");
                                ^
2 errors
 

Author Comment

by:cancer_66
ID: 8042916
sorry again.

its only one error. the one i posted first?
 

Author Comment

by:cancer_66
ID: 8043059
note iam using jdk1.3.1

 

Author Comment

by:cancer_66
ID: 8043094
ill be waiting 4 ur answer.
 
LVL 15

Expert Comment

by:ozymandias
ID: 8043130
Sorry, I forgot about the JDK version.
You cannot use the split method.
Here is the same code using a StringTokenizer.

/*
* PatternMatcher.java
*
*/

import java.util.StringTokenizer;

public class PatternMatcher{

     String[] token1 = new String[]{"is","was","are","be","can be"};
     String[] token2 = new String[]{"described","defined","delimited"};
     String[] token3 = new String[]{"as","by"};

     public PatternMatcher(){

     }

     private int containsToken(String[] tokens, String s, int pos){
          int tPosition = -1;
          for (int i = 0; i < tokens.length && tPosition == -1; i++){
               tPosition = s.indexOf(tokens[i],pos);
          }
          return tPosition;
     }

     public boolean containsPattern(String s, boolean strict){

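          if (strict){
               // Strict mode: token1, token2 and token3 must appear in three consecutive words.
               // Lowercase first so matching is case-insensitive, as in the non-strict branch.
               int token = 0;
               StringTokenizer st = new StringTokenizer(s.toLowerCase());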
               String[] words = new String[st.countTokens()];
               while (st.hasMoreTokens()){
                    words[token++] = st.nextToken();
               }
               for (int i = 0; i < words.length - 2; i++){
                    if (containsToken(token1,words[i],0) > -1 && containsToken(token2,words[i+1],0) > -1 && containsToken(token3,words[i+2],0) > -1){
                         return true;
                    }
               }
          }else{
               int pos = 0;
               if ((pos = containsToken(token1,s.toLowerCase(),pos)) != -1){
                    if ((pos = containsToken(token2,s.toLowerCase(),pos)) != -1){
                         if ((pos = containsToken(token3,s.toLowerCase(),pos)) != -1){
                              return true;
                         }else{
                              //System.out.println(s + " contains no token3");
                         }
                    }else{
                         //System.out.println(s + " contains no token2");
                    }
               }else{
                    //System.out.println(s + " contains no token1");
               }
          }
          return false;
     }

}
 

Author Comment

by:cancer_66
ID: 8043382
sorry i  wasnt at my seat.ill just test it right away.

Why can't I use the split method? Is it because of the JDK version?
 

Author Comment

by:cancer_66
ID: 8043484
perfect its working fine. so far ill test it more. :)

thanks alot.
 

Author Comment

by:cancer_66
ID: 8043904
ill talk to u tomorrow. thanks 4 everything
 
LVL 15

Expert Comment

by:ozymandias
ID: 8043985
Yes. The split method of string was only introduced in JDK 1.4.
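
One difference worth knowing if you later move to JDK 1.4 or newer: the two approaches do not treat repeated spaces the same way. The sketch below is only an illustration (the class name TokenizeDemo and the helper words are made up for this example, and the split call needs JDK 1.4+). StringTokenizer skips runs of whitespace, while split(" ") produces an empty string between two adjacent spaces, which would shift the word positions in the strict check.

```java
import java.util.StringTokenizer;

// Sketch only: compares StringTokenizer with String.split(" ") on a double space.
class TokenizeDemo {

    // JDK 1.3-compatible way of breaking a sentence into words.
    static String[] words(String s) {
        StringTokenizer st = new StringTokenizer(s);
        String[] result = new String[st.countTokens()];
        for (int i = 0; i < result.length; i++) {
            result[i] = st.nextToken();
        }
        return result;
    }

    public static void main(String[] args) {
        String s = "is  defined as";   // note the double space
        System.out.println(words(s).length);      // 3 words
        System.out.println(s.split(" ").length);  // 4 elements, one of them an empty string
    }
}
```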
 

Author Comment

by:cancer_66
ID: 8047099
hello. again.

1)i showed the code "DefintionChecker" + "PatternMatcher" to my supervisor. he was happy with the output.

But he had one comment, which is: he asked if I could just compile the "PatternMatcher" and import it in "DefinitionChecker" as "import PatternMatcher;". He wants it that way.

2) I also discussed the modification of the PatternMatcher in order to look for different patterns. He said to keep it on hold till I meet him next, because he is not sure himself.

but he did say something

1)like having a data structure to indicate how many lists we need. i.e Token1[],..etc
2)Data structure for the lists itself i.e Token1[] ={"is"} , Token2 ={"the"}

i think basically he wants to have different patterns of lists.

for example:-

DefintionPatterns

Pattern1:- consists of Token1[],Token2[]
Pattern2:- consists of Token3[],Token4[],Token5[]
Pattern3:- consists of Token6[],Token7[],Token8[],Token9[]
.
.
.

while
Token1[]:- consists of {"is"}
Token2[]:- consists of {"the"}

Token3[]:- consists of {"is","was","can be"}
Token4[]:- consists of {"defined","described","delimited"}
Token5[]:- consists of {"as","by","such as", "as"}

Token6[]
.
.
.
(iam just giving you an idea of what he told me) so what do you think?




 

Author Comment

by:cancer_66
ID: 8047118
whenever you have the time answer me. ill be waiting

thanks
 
LVL 15

Expert Comment

by:ozymandias
ID: 8047185
1) You have already compiled PatternMatcher and you don't need to import it. It works without an import statement because both classes are in the same directory (the default package). Why would he want to import something when it works without importing? In order to import it you would have to put it in a named package.

2) I can see how this could be done. You would have to create a couple of new classes.

a) WordList : this would just be a list of words,

e.g.  
    is, was, can be
or
    described defined delimited

b) WordPattern : this would be a set of word lists in a particular order.

A PatternMatcher could then contain a set of word patterns, all of which it checks.

I will produce a prototype and post it later if I have time.
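
As a rough illustration of the idea (not the prototype itself), a word pattern can be thought of as an ordered list of word lists, where a sentence matches if it contains one word from each list in that order. The class name PatternSketch, the method containsInOrder and the constant IS_DEFINED_AS are hypothetical names chosen for this sketch, and it uses generics and String.split, so it assumes a newer JDK than the one discussed in this thread.

```java
import java.util.Arrays;
import java.util.List;

public class PatternSketch {

    // A "word list" is a set of alternatives for one slot.
    // A "word pattern" is several word lists in the order they must appear.
    static final List<List<String>> IS_DEFINED_AS = Arrays.asList(
        Arrays.asList("is", "was", "are"),
        Arrays.asList("described", "defined", "delimited"),
        Arrays.asList("as", "by"));

    // Returns true if the sentence contains one word from each list,
    // in order. Other words may appear in between (non-strict matching).
    static boolean containsInOrder(String sentence, List<List<String>> pattern) {
        int slot = 0;
        String[] words = sentence.toLowerCase().split("\\s+");
        for (int i = 0; i < words.length; i++) {
            if (slot < pattern.size() && pattern.get(slot).contains(words[i])) {
                slot++;
            }
        }
        return slot == pattern.size();
    }

    public static void main(String[] args) {
        System.out.println(containsInOrder(
            "Computer Graphics is defined as a pictorial computer output", IS_DEFINED_AS));
        System.out.println(containsInOrder("Graphics is as ahmad", IS_DEFINED_AS));
    }
}
```

The first call prints true and the second prints false, because "as" arrives before any word from the second list has been seen.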
0
 

Author Comment

by:cancer_66
ID: 8047233
1)hmm well thats the way he wants it. even though i dont see the need for that.

basically i am going to use the "Sequential Search + strict sequential" along with my Mobile Agent software (Aglets), which will have to travel along a number of computers on a LAN, do some matching and finally display the message on the user's screen.

if i wanted to do what he has told me, what changes should i make?

2)yeah it might be something like that. ok thanks. whenever u r free. ill wait
0
 

Author Comment

by:cancer_66
ID: 8047248
3)
0
 
LVL 15

Expert Comment

by:ozymandias
ID: 8047316
OK.

I have a couple of questions.

1) Is this part of a school project? Is this "supervisor" your boss or your teacher or what? Are you supposed to be learning Java? I don't mind you using this for learning but I don't want you to get into trouble if you are supposed to be doing this yourself.

2) Adding this extra code to have multiple complex patterns is quite a big piece of work. I don't mind doing it because it's quite interesting, but you will have to think about awarding some more points soon.
0
 
LVL 15

Expert Comment

by:ozymandias
ID: 8047489
One last thing....

this code is now getting quite big and complex so could you please email me your email address and I will email you the code and source files etc rather than pasting them here, because it is making the thread very long.
0
 

Author Comment

by:cancer_66
ID: 8047584
1)m_alkhamis@hotmail.com
2)no my supervisor is my teacher at college. this one of the parts of my project. there wont be any problems. thanks
3)about the points dont worry. ill award you points and open up another thread, whatever suits you.

4)can you please help me out with some documentation? it would make my life easier in understanding the code. that's if you dont mind. please
0
 
LVL 15

Expert Comment

by:ozymandias
ID: 8047651
1) OK. I will mail you soon.
2) OK.
3) Thanks.
4) Documentation for what ? Are you saying you want me to add comments to the code and also stuff suitable for javadoc ?

Let me know.
0
 
LVL 15

Expert Comment

by:ozymandias
ID: 8047728
I have mailed you the new code and files.
0
 

Author Comment

by:cancer_66
ID: 8047855
sorry i wasnt at my seat. iam back now ill just check the mail. and give you the feedback

4)yes i mean comments. so its easier for me to follow. if u dont mind.
please
0
 

Author Comment

by:cancer_66
ID: 8048005
hmm ok i tested it its working perfectly. so far. ill test it with more texts.

ill try adding more patterns like "is the" "consists of" ..etc

thanks for your help.

answer me whenevr u r free.
0
 

Author Comment

by:cancer_66
ID: 8048068
1)i managed to add a two new patterns "can be defined as","can be defined as"

but could not add the pattern "is the"?

i added the sentence :- mobile agents is the future of e-commerce

but didnt work? maybe iam doing something wrong
0
 

Author Comment

by:cancer_66
ID: 8048081
1)i managed to add a two new patterns "can be defined as","can be defined as"

but could not add the pattern "is the"?

i did the following

private static WordList list4 = new WordList("foo,bar,buzz,is",",");
private static WordList list5 = new WordList("token,the",",");

i added the sentence :- mobile agents is the future of e-commerce

but didnt work? maybe iam doing something wrong

0
 

Author Comment

by:cancer_66
ID: 8048118
ok it worked . i did exactly what i told u up there.

actually when i checked the sub-directory "definitions"
it contained "PatternMatcher.class,..etc" and inside the sub-directory "definitions" there was another sub-directory called "definitions" again?

so i deleted all the classes and the duplicate sub-directory, and compiled "DefinitionChecker.java". it worked.
0
 
LVL 15

Expert Comment

by:ozymandias
ID: 8048243
OK.

I will add some comments to the code and send you the updates.

Let me know if there are any problems in your tests.
0
 

Author Comment

by:cancer_66
ID: 8048287
no problems so far, thanks alot.

0
 
LVL 15

Accepted Solution

by:
ozymandias earned 2000 total points
ID: 8048506
Code with comments sent as requested.

I think I have answered this question pretty fully now.

Please award points and we can continue this in another thread if necessary.
0
 

Author Comment

by:cancer_66
ID: 8048527
ok . ill award points and open a new thread. incase i need help.
0
 

Author Comment

by:cancer_66
ID: 8048570
ok one last question before i end this thread and open a new one.

1)whenever i want to add a new pattern which consists of two words i should add it in Token4,Token5
otherwise Token1,2,3

correct?
-----------------------------------------------------
correct way of adding pattern?

private static WordList list4 = new WordList("foo,bar,buzz,is",",");

or

private static WordList list4 = new WordList("foo,bar,buzz","is",",");

why is the "," at the end ?

i know i ask stupid questions sorry
0
 

Author Comment

by:cancer_66
ID: 8048605
i opened up a new thread called (Search 2:- For Mr ozymandias )

thanks for your help
0
 

Author Comment

by:cancer_66
ID: 8048688
answer me when u r free.
0
 

Author Comment

by:cancer_66
ID: 8048723
you can answer my question in the new thread . ill award points here ,
0
 

Author Comment

by:cancer_66
ID: 8048731
thanks a lot for everything. you deserve more than excellent
0
 
LVL 15

Expert Comment

by:ozymandias
ID: 8048896
For completeness and PAQ value the entire code will be posted here :

/*
* DefinitionChecker.java
*
*/

import java.util.Vector;
import java.io.*;
import definitions.PatternMatcher;

public class DefinitionChecker{

      String keyword;
      String[] files = new String[]{"abc.txt","xyz.txt","other.txt"};
      String sentences[];
      PatternMatcher pm = new PatternMatcher();

      public static void main(String[] args){

            boolean strict = false;

            // first make sure that we have at least two arguments
            if (args.length < 2){
                  System.out.println("USAGE : DefinitionChecker keywords\nA minimum of two words must be provided to make a valid search term.");
                  System.exit(1);
            }
            String s = "";
            // now lets check what the arguments are
            for (int i = 0;i < args.length;i++){
                  // if any of them are -s then we are in strict mode
                  if (args[i].equalsIgnoreCase("-s")){
                        strict = true;
                        continue;
                  }
                  // make sure they are all 3 characters or longer
                  if (args[i].length() < 3){
                        System.out.println("Input Error : " + args[i] + "\nAll component words of the search must be three characters or more.");
                        System.exit(1);
                  }
                  // concatenate the arguments into one search string
                  s = s + args[i] + " ";
            }
            s = s.trim();
            // finally instantiate a DefinitionChecker and pass it the string and tell it whether to be strict or not
            DefinitionChecker dc = new DefinitionChecker(s,strict);
      }

      /**
      *
      * Constructor for the DefinitionChecker
      *
      */

      public DefinitionChecker(String s, boolean strict){

            // loop through each file in the list of files
            for (int f = 0; f < files.length; f++){
                  boolean fileMentioned = false;
                  File file = null;
                  try{
                        // get all the sentences
                        file = new File(files[f]);
                        sentences = getArrayFromFile(file);
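                  }catch(IOException ioe){
                        // report the problem and skip this file, otherwise the loop below
                        // would run on a null array or on the previous file's sentences
                        System.out.println(ioe);
                        continue;
                  }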
                  keyword = s.toLowerCase();
                  // loop through all the sentences
                  for (int i = 0; i < sentences.length; i++){
                        // if any sentence contains the keyword and matches any of the patterns specified in the PatternMatcher
                        if (sentences[i].toLowerCase().indexOf(keyword) > -1 && pm.matches(sentences[i],strict)){
                              // if this is the first match found in this file
                              if (!fileMentioned){
                                    // output the file name
                                    System.out.println("Matches found in " + file);
                                    fileMentioned = true;
                              }
                              // say we found a match
                              System.out.println("\t\tMATCH : " + sentences[i]);
                        }
                  }
            }
      }

      /**
      *
      * GetArrayFromFile
      *
      * This function reads a specified file and breaks the contents into
      * an array of strings (sentences) using the # character as a delimiter
      *
      */

      private String[] getArrayFromFile(File f) throws IOException{
            FileReader reader = new FileReader(f);
            Vector sentences = new Vector();
            char[] cbuf = new char[1];
            String delimiter = "#";
            String sentence = "";
            String c = "";
            // read the file character by character
            while (reader.read(cbuf) != -1){
                  c = new String(cbuf);
                  // if the character is a delimiter (#)
                  if (c.equals(delimiter)){
                        // add the sentence to the Vector and start a new blank sentence
                        sentences.add(sentence);
                        sentence = "";
                  }else{
                        // otherwise just add the character to the current sentence string
                        sentence += c;
                  }
            }
            reader.close();
            String[] sentenceArray = new String[sentences.size()];
            // convert the Vector to an array and return it
            sentenceArray = (String[])sentences.toArray(sentenceArray);
            return sentenceArray;
      }
}
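
A side note on getArrayFromFile: as written, a sentence is only stored when a # is read, so any text after the last # in a file is dropped, and blank fragments between two # characters are stored as empty sentences. The sketch below shows one way the splitting step could handle both cases. It is a variation, not the code that was mailed out; the names SentenceSplitter and splitSentences are hypothetical, it works on a String that has already been read from the file, and it uses StringBuilder and generics, so it assumes a newer JDK than the one in this thread.

```java
import java.util.ArrayList;
import java.util.List;

public class SentenceSplitter {

    // Splits text on '#'. Text after the final '#' is kept, and fragments
    // that are empty or contain only whitespace are skipped.
    static List<String> splitSentences(String text) {
        List<String> sentences = new ArrayList<String>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '#') {
                addIfNotBlank(sentences, current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        // keep whatever follows the last delimiter
        addIfNotBlank(sentences, current.toString());
        return sentences;
    }

    private static void addIfNotBlank(List<String> sentences, String fragment) {
        String trimmed = fragment.trim();
        if (trimmed.length() > 0) {
            sentences.add(trimmed);
        }
    }

    public static void main(String[] args) {
        String text = "Graphics is defined as science # \n# Mobile Agent can be defined as an agent";
        for (String sentence : splitSentences(text)) {
            System.out.println(sentence);
        }
    }
}
```

Appending to a StringBuilder also avoids creating a new String for every character, which matters once the files get large.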
0
 
LVL 15

Expert Comment

by:ozymandias
ID: 8048898
/*
* PatternMatcher.java
*
*/

package definitions;

import java.util.StringTokenizer;
import java.util.Vector;

public class PatternMatcher{

      /*
      *
      * These are some static WordLists which can be used to create
      * the WordPatterns that this PatternMatcher will use
      *
      */
      private static WordList list1 = new WordList("is,was,are,be",",");
      private static WordList list2 = new WordList("described,defined,delimited",",");
      private static WordList list3 = new WordList("as,by",",");
      private static WordList list4 = new WordList("foo,bar,buzz",",");
      private static WordList list5 = new WordList("token",",");

      private Vector patterns;

      /**
      *
      * Constructor for the PatternMatcher. This adds the
      * WordPatterns to the PatternMatcher's list of patterns
      * ready for matching.
      *
      */
      public PatternMatcher(){
            // create the vector to store our WordPatterns
            patterns = new Vector();

            // create a WordPattern
            WordPattern pattern1 = new WordPattern();
            // add the appropriate WordLists
            pattern1.addList(list1);
            pattern1.addList(list2);
            pattern1.addList(list3);
            // add the WordPattern to the vector
            patterns.add(pattern1);

            // create another WordPattern
            WordPattern pattern2 = new WordPattern();
            // add the appropriate WordLists
            pattern2.addList(list4);
            pattern2.addList(list5);
            // add the WordPattern to the vector
            patterns.add(pattern2);

      }

      /**
      *
      * This is just a function for adding WordPatterns
      * to the PatternMatcher. It's not used currently
      * but it will probably come in handy.
      */
      public void addPattern(WordPattern pattern){
            patterns.add(pattern);
      }

      /**
      *
      * This is the key function on the PatternMatcher. It is
      * passed a String (sentence) and information on "strictness".
      * It then cycles through all its patterns seeing if any of them
      * are found in the sentence.
      *
      */
      public boolean matches(String s, boolean strict){

            // loop through all the WordPatterns checking to see if
            // any of them match the sentence.
            for (int i = 0; i < patterns.size();i++){
                  WordPattern wp = (WordPattern)patterns.elementAt(i);
                  if (wp.containsPattern(s,strict)){
                        return true;
                  }
            }
            return false;
      }

}

/**
*
* This class contains the core of the "comparison logic". Each WordPattern
* contains one or more word lists which it uses in sequence to do a word by
* word comparison with the sentence provided.
*
*/
class WordPattern{

      private Vector lists;

      /**
      *
      * This constructor takes an array of WordLists
      * and uses them to populate its own Vector
      * of WordLists
      */
      public WordPattern(WordList[] wl){
            lists = new Vector();
            for (int i = 0; i < wl.length; i++){
                  lists.add(wl[i]);
            }
      }

      /**
      *
      * This constructor simply initialises a blank Vector
      * to be used to store the WordLists which can be added
      * using the addList() method
      */
      public WordPattern(){
            lists = new Vector();
      }

      /**
      *
      * This function adds a WordList to the Word Pattern
      *
      */
      public void addList(WordList list){
            lists.add(list);
      }

      /**
      *
      * This function does all the real work. It breaks the supplied
      * String into its component words and then compares them, either
      * strictly or not, to the words in the WordLists.
      *
      */
      public boolean containsPattern(String s, boolean strict){
            int token = 0;
            // create a StringTokeniser from the sentence
            StringTokenizer st = new StringTokenizer(s);
            // Create an array to hold the words
            String[] words = new String[st.countTokens()];
            // iterate through the Tokenizer adding the words to the array
            while (st.hasMoreTokens()){
                  words[token++] = st.nextToken();
            }
            // if there are fewer words than lists then the sentence cannot
            // possibly contain a full pattern, so return false
            if (words.length < lists.size()){
                  return false;
            }
            // this counter will hold the number of words matched
            int count = 0;
            // this counter will hold the number of words matched contiguously (i.e. in strict sequence)
            int sequence = 0;
            // this value will tell us whether the previous word was a match
            boolean inSequence = false;
            // simultaneously loop through the array of words and the Vector
            // of WordLists, starting by comparing the first word with the first WordList
            for (int l = 0, w = 0; ((l < lists.size()) && (w < words.length));){
                  WordList wordlist = (WordList)lists.elementAt(l);
                  String word = words[w];
                  // if the wordlist contains the word then we can move to the next word
                  // and to the next wordlist
                  if (wordlist.containsWord(word)){
                        l++;
                        w++;
                        count++;
                        // if we are in sequence (i.e. the previous word was a match)
                        // then we increment the number of sequential words found
                        if (inSequence){
                              sequence++;
                        }
                        // set the value to indicate that this word was matched
                        inSequence = true;
                  }else{
                        // if the wordlist does not contain the word then we move to the next word
                        // but we do not move to the next wordlist
                        w++;
                        // set the value to indicate that we are no longer in strict sequence
                        inSequence = false;
                  }
            }
            // if the number of words matched is the same as the number of lists
            // then we have a match
            if (count == lists.size()){
                  if (strict){
                        // if we are in strict mode then the number of sequentially matched words should
                        // be 1 less than the number of matched words
                        if(sequence == count-1){
                              return true;
                        }else{
                              return false;
                        }
                  }else{
                        return true;
                  }
            }else{
                  return false;
            }
      }

      /**
      *
      * This function returns the length of the longest word list.
      * It's not used at the moment but may be useful
      *
      */
      public int maxListLength(){
            int length = 0;
            for (int l = 0; l < lists.size(); l ++){
                  if (((WordList)lists.elementAt(l)).numWords() > length){
                        length = ((WordList)lists.elementAt(l)).numWords();
                  }
            }
            return length;
      }

}

/**
*
* This class holds an array of strings (words) which
* can be combined in a WordPattern with other WordLists
*
*/
class WordList{

      private String[] words;

      /**
      *
      * This constructor takes a string and a delimiter string
      * and then uses a StringTokenizer to break the string into
      * an array of words
      */
      public WordList(String s, String delimiter){
            int token = 0;
            StringTokenizer st = new StringTokenizer(s,delimiter);
            words = new String[st.countTokens()];
            while (st.hasMoreTokens()){
                  words[token++] = st.nextToken();
            }
      }

      /**
      *
      * This is just an accessor function that lets you get the words
      * held in the list. Not used at the moment, but probably useful
      * for debugging.
      */
      public String[] getWords(){
            return words;
      }

      /**
      *
      * This is just an accessor function that lets you get the number of
      * words held in the list. Not used at the moment, but probably useful
      * for debugging.
      */
      public int numWords(){
            return words.length;
      }

      /**
      *
      * This function takes a string (word) and checks to
      * see if it matches any of the words in its list.
      */
      public boolean containsWord(String s){
            for (int i = 0; i < words.length; i++){
                  if (s.trim().equalsIgnoreCase(words[i])){
                        return true;
                  }
            }
            return false;
      }

      /**
      *
      * This is just an accessor function that prints out the words
      * held in the list. Not used at the moment, but probably useful
      * for debugging.
      */
      public void print(){
            for (int i = 0; i < words.length; i++){
                  System.out.println(words[i]);
            }
      }
}
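
On the earlier question about the "," at the end of new WordList("foo,bar,buzz,is",","): the second argument is the delimiter that separates the entries inside the first string, so that one-string form is the one that fits the constructor above. The three-argument form would not compile against this class, since no such constructor is defined. Also note that containsPattern cuts the sentence at whitespace, so a list entry containing a space, such as "can be", can never equal a single sentence word. A phrase like "can be defined as" needs one WordList per word. The sketch below demonstrates both points with a stand-alone class; WordListDemo, toEntries and anyWordInList are hypothetical names that only mimic the logic of WordList and containsPattern.

```java
import java.util.StringTokenizer;

public class WordListDemo {

    // Same idea as the WordList constructor: the second argument names
    // the character(s) that separate the entries in the first argument.
    static String[] toEntries(String s, String delimiter) {
        StringTokenizer st = new StringTokenizer(s, delimiter);
        String[] entries = new String[st.countTokens()];
        int i = 0;
        while (st.hasMoreTokens()) {
            entries[i++] = st.nextToken();
        }
        return entries;
    }

    // Same idea as containsPattern plus containsWord: the sentence is cut
    // at whitespace and each single word is compared with the list entries.
    static boolean anyWordInList(String sentence, String[] entries) {
        StringTokenizer st = new StringTokenizer(sentence);
        while (st.hasMoreTokens()) {
            String word = st.nextToken();
            for (int i = 0; i < entries.length; i++) {
                if (word.trim().equalsIgnoreCase(entries[i])) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String sentence = "Mobile Agent can be defined as an agent";
        // four entries: foo, bar, buzz, is
        System.out.println(toEntries("foo,bar,buzz,is", ",").length);
        // false: no single word of the sentence equals "can be"
        System.out.println(anyWordInList(sentence, toEntries("can be", ",")));
        // true: "can" on its own is a word of the sentence
        System.out.println(anyWordInList(sentence, toEntries("can", ",")));
    }
}
```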
0
