mohammedf

Find the occurrences of a word using Java

Hi

I have a file, e.g. an HTML file, an XML file, etc.
I want a Java method that counts the number of occurrences of a given term,
e.g.:

public int method(String filename, String term)
return numberOfOcc

The term may be more than one word.
The term may span more than one line (at most two lines).
Character case should be ignored.
CEHJ

Is this classwork?
mohammedf (ASKER)

Nope.

First, read the file line by line:

http://helpdesk.objects.com.au/java/how-do-i-read-a-text-file-line-by-line

Then, for each line, use the String indexOf() method to search for instances of the term.
indexOf or contains?
ASKER CERTIFIED SOLUTION
Mick Barry

objects, thanks, it works.
But why did 'contains' not work correctly?

Because contains() only tells you whether the term appears in the line; you need to check whether it appears more than once in the line.
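To illustrate the difference, here is a minimal sketch (the class and method names are made up for this example, not taken from the thread). contains() answers yes or no, while repeated indexOf() calls, each resuming after the previous match, give a count:

```java
public class CountInLine {

    // Hypothetical helper: counts non-overlapping occurrences of term in line.
    static int countOccurrences(String line, String term) {
        if (term.isEmpty()) {
            return 0; // nothing sensible to count
        }
        int count = 0;
        int from = 0;
        int index;
        // Each indexOf call resumes just past the previous match.
        while ((index = line.indexOf(term, from)) != -1) {
            count++;
            from = index + term.length();
        }
        return count;
    }

    public static void main(String[] args) {
        String line = "<p>java and more java</p>";
        System.out.println(line.contains("java"));          // prints true
        System.out.println(countOccurrences(line, "java")); // prints 2
    }
}
```

With contains() alone, a line holding the term twice would still add only 1 to the total.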
objects, the program goes into an infinite loop, and it seems it never breaks out of the while loop?
Here is the code:

import java.io.BufferedReader;
import java.io.FileReader;

    public int calculateTermFreq(String part, String file) throws Exception {
        int cnt = 0;
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader br = new BufferedReader(new FileReader("files//" + file))) {
            String current;
            while ((current = br.readLine()) != null) {
                int start = 0;
                // scan the line repeatedly, resuming after each match
                while (start >= 0) {
                    int index = current.indexOf(part, start);
                    if (index == -1) {
                        start = -1; // no more matches on this line
                    } else {
                        // found the term
                        cnt++;
                        start = index + part.length();
                    }
                }
            }
        } catch (Exception e) {
            System.err.println("Error: " + e.getMessage());
        }

        return cnt;
    }


Looks fine. Try adding some debug logging to determine where it is getting stuck.
What does your text file look like?
It is an HTML file.
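One thing worth ruling out while debugging (this is a guess, the thread does not say what the term was): a scanning loop of this shape never terminates if the search term is an empty string. indexOf("") "matches" at the start position itself, so the start index never moves forward. A small sketch, with hypothetical names, that shows the behaviour and guards against it:

```java
public class EmptyTermCheck {

    // Hypothetical helper: the same kind of scanning loop, but an empty term is rejected up front.
    static int countInLine(String line, String term) {
        if (term == null || term.isEmpty()) {
            throw new IllegalArgumentException("term must not be empty");
        }
        int count = 0;
        int start = 0;
        while (start >= 0) {
            int index = line.indexOf(term, start);
            if (index == -1) {
                start = -1; // no more matches, leave the loop
            } else {
                count++;
                start = index + term.length(); // always advances, since term is non-empty
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // An empty string is "found" at the start position, capped at the line length.
        System.out.println("abc".indexOf("", 1));  // prints 1
        System.out.println("abc".indexOf("", 10)); // prints 3
        System.out.println(countInLine("abcabc", "bc")); // prints 2
    }
}
```

Printing the term and the current start value inside the loop would confirm quickly whether this is what is happening.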

Neither HTML files nor XML files are line-oriented, so it doesn't make much sense to read them line by line. Also, the phrase you're looking for could span more than one 'line'. Fortunately, most HTML files at least are fairly small, so it's best to read the whole file into a string and search that. The line problems then disappear. See

http://www.technojeeves.com/joomla/index.php/free/93-file-to-string-in-java
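A rough sketch of that whole-file approach follows. It is not the code from the linked page; the class and method names are invented here, and it assumes the file is UTF-8 and that any run of whitespace (including a line break) should count as a single space, which is one way to handle a term split across two lines and to ignore case:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Locale;

public class TermCounter {

    // Lower-cases the text and collapses every whitespace run (line breaks included) to one space.
    static String normalize(String s) {
        return s.toLowerCase(Locale.ROOT).replaceAll("\\s+", " ").trim();
    }

    // Counts non-overlapping occurrences of term in text after normalizing both.
    static int countOccurrences(String text, String term) {
        String haystack = normalize(text);
        String needle = normalize(term);
        if (needle.isEmpty()) {
            return 0; // avoids a loop that never advances
        }
        int count = 0;
        int index = haystack.indexOf(needle);
        while (index != -1) {
            count++;
            index = haystack.indexOf(needle, index + needle.length());
        }
        return count;
    }

    // Reads the whole file into one string (UTF-8 is an assumption) and counts the term.
    static int countInFile(String filename, String term) throws IOException {
        byte[] bytes = Files.readAllBytes(Paths.get(filename));
        return countOccurrences(new String(bytes, StandardCharsets.UTF_8), term);
    }
}
```

Note that this searches the raw markup, so a term that also appears inside tags or attributes is counted too, and a term interrupted by a tag is not found. If only the visible text matters, the file would need to be parsed first.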