Solved

Java Dictionary Search

Posted on 2007-11-28
11
3,191 Views
Last Modified: 2013-11-23
Hi,

I am trying to develop a piece of Java to search a dictionary for words that best-fit a charachter input stream. I found a piece of code on ee that I have included with this post that looks like it may be a fairly close fit. The problem I'm having with is that it finds things when it shouldn't!

Say the dictionary file only contained the following words:
absonant
absonous
absorb
absorbable
absorbate
absorbed
absorbefacient
absorbency
absorbent
absorber
absorbing
absorption

If the input stream was "ABSO", it should put all of those words into an array as possible options.
It would get the next character "R" and drop "absoNant" and  "absoNous", next character "B" and drop "absorPtion".
Eventually having to make a desicion on a "best-fit" word.
Problem is that i'd like it to come back with a best-fit word rather than nothing - like taking a guess.
If the next character after "B" was "X" making "ABSORBX" - it's not in the list, so perhaps getting the next character allows the guess to be made - next character after "X" is "D", "ABSORBXD" - it makes a decision to selec "ABSORBED" as a best-fit.
 
First problem I have with the code below is that
- enter "ABSO" - it returns not found (correct I think - the word is not in the dictionary, but words begin with it??)
- enter "ABSOX" - it returns found (I don't think its supposed to - word not in dictionary and no words begin with it.)

If anyone can help me overcome these problems, i'd very much appreciate the help!!

Thanks in advance!

Cheers,

Phil.

public class Wordsearch {
 
 
    public static void main(String args[]) throws IOException {
 
         ArrayList words = loadFile("dictionary.txt"); // Load the word in arrayList
 
         Collections.sort(words); //sort ArrayList required for binarySearch
 
         BufferedReader stdin = new BufferedReader(new InputStreamReader(System.in));
         while (true) {
            System.out.print("Enter Word (q to quit) ");
            System.out.print("\n");
            String option = stdin.readLine();
 
            if (option.equalsIgnoreCase("q")) System.exit(-1);
            else {
 
               if (Arrays.binarySearch(words.toArray(),option.toLowerCase())== -1)
                   System.out.println("Word is not found in dictionary.");
               else
                   System.out.println("Word Found");
            }
 
         }
 
    }
 
   private static ArrayList loadFile(String fileName) throws IOException{
        String word;
 
        File file  = new File(fileName);
        BufferedReader bfreader = new BufferedReader(new InputStreamReader(new FileInputStream(file)));
        ArrayList wordList = new ArrayList();
        while ((word=bfreader.readLine()) != null){
         wordList.add(word.toLowerCase());
 
        }
 
        return wordList;
 
   }
}

Open in new window

0
Comment
Question by:phil8258
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 7
  • 3
11 Comments
 
LVL 86

Expert Comment

by:CEHJ
ID: 20365040
That code is using exact (case-insensitive) matching. You probably need something like fuzzy matching algos
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 20365049
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 20365055
Java Lucene uses Levenshtein
0
Get 15 Days FREE Full-Featured Trial

Benefit from a mission critical IT monitoring with Monitis Premium or get it FREE for your entry level monitoring needs.
-Over 200,000 users
-More than 300,000 websites monitored
-Used in 197 countries
-Recommended by 98% of users

 
LVL 26

Expert Comment

by:ksivananth
ID: 20365084
>>enter "ABSOX" - it returns found (I don't think its supposed to - word not in dictionary and no words begin with it.)
>>

try below,

if (Arrays.binarySearch(words.toArray(),option.toLowerCase()) < 0)

the binary serach returns < 0 if it doesn't found but it is not -1 always!
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 20365266
Change

>>Arrays.binarySearch(words.toArray()

to

Collections.binarySearch(words

as there's no reason to convert
0
 

Author Comment

by:phil8258
ID: 20367714
Ok, thanks. Those changes work pretty well.

The second part of the problem is now that I want to store a list of words
So that if you search the dictionary for the input string "absorbe", you get a list containing:
absorbed, absorbefacient, absorbency, absorbent, absorber

Anyone any idea how this could be done?
A snippet of code would be really helpful to get me kick started.

Cheers.
 
0
 
LVL 86

Accepted Solution

by:
CEHJ earned 500 total points
ID: 20367850
Just need to do something like

String target = "absorbe";
ArrayList found = new ArrayList();
for(int i = 0;i < wordList.size();i++) {
    String word = (String)wordList.get(i).toLowerCase();
    if (word.substring(target) > -1) {
        found.add(word);
    }
}

Open in new window

0
 

Author Comment

by:phil8258
ID: 20368044
Thanks for that prompt.
line 5 is having a little difficulty.
"The method substring(int) in the type String is not applicable for the arguments (String)"
any idea how to fix it?
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 20368069
Sorry!


if (word.indexOf(target) > -1) {
        found.add(word);
}

Open in new window

0
 

Author Comment

by:phil8258
ID: 20368118
sorted. thanks :O)
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 20368327
:-)
0

Featured Post

Free Tool: Port Scanner

Check which ports are open to the outside world. Helps make sure that your firewall rules are working as intended.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Java contains several comparison operators (e.g., <, <=, >, >=, ==, !=) that allow you to compare primitive values. However, these operators cannot be used to compare the contents of objects. Interface Comparable is used to allow objects of a cl…
This was posted to the Netbeans forum a Feb, 2010 and I also sent it to Verisign. Who didn't help much in my struggles to get my application signed. ------------------------- Start The idea here is to target your cell phones with the correct…
Viewers will learn one way to get user input in Java. Introduce the Scanner object: Declare the variable that stores the user input: An example prompting the user for input: Methods you need to invoke in order to properly get  user input:
Viewers will learn about the regular for loop in Java and how to use it. Definition: Break the for loop down into 3 parts: Syntax when using for loops: Example using a for loop:

728 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question