Java String Pattern Search and Count

jwphillips80 asked
Medium Priority
2,325 Views
Last Modified: 2012-08-13
Hello everyone. I'm trying to search an .htm file and count how many times each of several strings occurs in it.

For some reason, every time I run the compiled version, it doesn't count the number of times the string is found.
I've modified this code for my application, but even the variant I found on the web isn't working correctly. Please tell me what I'm doing wrong.

import java.io.*;
import java.net.*;


public class fastsearch
{

      // entry point
      //
      
      public static void main(String[] arguments)
      {
            System.out.println ("hello");

            fastsearch p = null;
            p = new fastsearch();
            p.Test();
      }

      public fastsearch()
      {
      }

      public void Test()
      {
            System.out.println ("Test");
            String szfile = lecture("USERCTRL.htm",false);

            long dwStart = System.currentTimeMillis();

            testcase(szfile,"my pattern",100);
            testcase(szfile,"IPropertyBag",100);
            testcase(szfile,"WebClient",100);
            testcase(szfile,"developer",100);
            testcase(szfile,"the",100);

            long dwEnd = System.currentTimeMillis();

            long dwDuration = dwEnd - dwStart;
            System.out.println ("duration = " + dwDuration + " milliseconds");

      }


      public void testcase(String s, String pattern, int nbtimes)
      {
            for (int k = 0; k < nbtimes; k++)
            {
                  int nb = 0;
                  int i,j;
                  i = 0;
                  while ( (j = s.indexOf(pattern,i))> -1 )
                  {
                        i = j + pattern.length();
                        nb++;
                  }

      /*            if (k==0)
                        System.out.println (nb + " occurences");*/
            }
      }


      public String lecture (String nom, boolean bKeepEOL)
      {

            String szTemp="";

            // build a EOL string
            byte[] myEOL= { 0x0D, 0x0A };
            String szEOL=new String(myEOL);

            InputStream is=null;

            try
            {
                  // open the file by first extracting the full access path
                  String s=new String(System.getProperty("user.dir")); // system property "user.dir"
                  
                  // for DEBUG
                  System.out.println ("System property user dir = " + s);

                  // then convert the MS-DOS backslashes to Unix slashes and pass
                  // the full URL: protocol="file:"  + separator="//"
                  // + host=<void> + separator="/"
                  // + file="c:/jdk1.1.4/code/compta"
                  // + a separator + an arbitrary name (arst.html) that the URL constructor
                  // discards automatically and replaces with the file to open in the actual URL.
                  //URL mon_url=new URL("file:///c:/jdk1.1.4/code/compta/arst.html");
                  URL mon_url=new URL("file:///"+s.replace('\\','/')+"/USERCTRL.html"); // s.replace('\\','/')

                  is = new URL(mon_url, nom).openStream();
                  // for DEBUG
                  //if (is!=null) System.out.println ("Ouverture du fichier \""+nom+"\"\n");

                  // here the stream is actually read, one whole line at a time
                  BufferedReader d = new BufferedReader(new InputStreamReader(is));

                  while (d.ready()) // keep reading while there are records left to read
                  {
                        // line 1 = entity description, line 2 = its name
                        szTemp += d.readLine();
                        if (bKeepEOL)
                        {
                              szTemp += szEOL;
                        }
                  }

            }
            catch(Exception e)
            {
                  e.printStackTrace();
            }

            try
            {
                  if (is != null)
                  is.close();
            }
            catch(Exception e)
            {
            }

            return szTemp;

      } // end point

}


CERTIFIED EXPERT
Top Expert 2016

Commented:
I don't understand testcase. Why is there an 'nbtimes' value, and what is it for? Why is this

>>
/*          if (k==0)
                    System.out.println (nb + " occurences");*/
>>


commented out?
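
For illustration, the counting loop from testcase can be pulled out into a method that returns the count, so the caller can print it once. This is only a minimal sketch, not the original author's code: the class name OccurrenceCounter and the method countOccurrences are hypothetical, matches are assumed to be non-overlapping (as in the posted loop), and the empty-pattern guard is an addition.

```java
class OccurrenceCounter {

    // Counts non-overlapping occurrences of pattern in text (hypothetical helper).
    static int countOccurrences(String text, String pattern) {
        if (pattern.isEmpty()) {
            return 0; // an empty pattern would never advance the search position
        }
        int count = 0;
        int from = 0;
        int found;
        while ((found = text.indexOf(pattern, from)) > -1) {
            from = found + pattern.length(); // continue after this match
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String sample = "the cat and the hat";
        System.out.println(countOccurrences(sample, "the") + " occurrences");
    }
}
```

Returning the count instead of printing inside the loop keeps the timing loop (nbtimes) separate from the reporting.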

Author

Commented:
Sorry, that shouldn't be commented out. The code is something I saw online; I've changed some of the strings so that it searches my own files. The nbtimes variable is the number of times the loop rechecks the file.

My understanding of this is fairly limited, so that's why I'm here.

John

Author

Commented:
I would actually take any code that does this as I'm trying to streamline some testing of files that we have here at the office.
CERTIFIED EXPERT
Top Expert 2016

Commented:
StreamTokenizer is what you need really. See

http://www.cs.hut.fi/Docs/Eckel/TIJ2ed/code/c11/WordCount.java
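
As a rough idea of how StreamTokenizer could be applied to this question, here is a minimal sketch that counts one word, ignoring case. It is not the linked WordCount.java: the class name WordOccurrences and the method countWord are hypothetical, and the syntax table is reset on the assumption that the input is HTML, because the tokenizer's default settings treat '/' as a comment character and quotes as string delimiters.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;

class WordOccurrences {

    // Counts tokens equal to word, ignoring case (hypothetical helper).
    static int countWord(Reader in, String word) {
        StreamTokenizer st = new StreamTokenizer(in);
        st.resetSyntax();            // make every character "ordinary" first
        st.wordChars('a', 'z');      // letters and digits form words
        st.wordChars('A', 'Z');
        st.wordChars('0', '9');
        st.whitespaceChars(0, ' ');  // control characters and space separate tokens
        int count = 0;
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.ttype == StreamTokenizer.TT_WORD && st.sval.equalsIgnoreCase(word)) {
                    count++;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        Reader in = new StringReader("<p>Hello, hello world</p>");
        System.out.println(countWord(in, "hello") + " occurrences");
    }
}
```

To read a file instead of a string, a BufferedReader wrapped around a FileReader can be passed as the Reader.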
Here is a simple program that can do what you want. You'll have to modify it to work for your program.

import java.io.*;

public class countString
{
      public countString()
      {
            File inputFile = new File("test.txt");
            System.out.println("The word Hello is in the file test.txt "+stringCounter(inputFile,"hello")+" times");
      }

      public int stringCounter(File fileName,String searchString)
      {
            int counter=0;

            // try-with-resources closes the reader even if reading fails
            try (BufferedReader input = new BufferedReader(new FileReader(fileName)))
            {
                  String line;

                  while ((line = input.readLine()) != null)
                  {
                        // fills array with individual words
                        String temp[] = line.split(" ");

                        for(int cnt=0;cnt<temp.length;cnt++)
                        {
                              /* case-sensitive code
                              if(temp[cnt].equals(searchString))
                                    counter++;
                              */

                              // case-insensitive
                              if(temp[cnt].equalsIgnoreCase(searchString))
                                    counter++;
                        }
                  }
            }
            catch(IOException e)
            {
                  // report the failure instead of silently returning 0
                  e.printStackTrace();
            }

            return counter;
      }

      public static void main(String args[])
      {
            new countString();
      }
}

Cheers,
Ricky

Note: you should use StringTokenizer instead of the split call I used, so that it picks up words that have special characters attached to them. I didn't want to give you the complete answer, as this is similar to a lot of homework problems.
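
To show what that change might look like, here is a minimal sketch using java.util.StringTokenizer with an explicit delimiter set. The class name TokenCounter, the method countWord, and the DELIMITERS string are all hypothetical; the right set of delimiters depends on the files being tested. With split(" "), a line such as "Hello, world! (hello)" yields the tokens "Hello," and "(hello)", neither of which equals "hello".

```java
import java.util.StringTokenizer;

class TokenCounter {

    // Assumed delimiter set: whitespace plus common punctuation and markup characters.
    static final String DELIMITERS = " \t\r\n.,;:!?()[]{}<>\"'/=";

    // Counts tokens in line equal to word, ignoring case (hypothetical helper).
    static int countWord(String line, String word) {
        int count = 0;
        StringTokenizer tokens = new StringTokenizer(line, DELIMITERS);
        while (tokens.hasMoreTokens()) {
            if (tokens.nextToken().equalsIgnoreCase(word)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countWord("Hello, world! (hello)", "hello") + " occurrences");
    }
}
```

Calling countWord on each line read from the file and summing the results gives the total for the file.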

Author

Commented:
Thanks.  I modified the code, got the parsing I wanted, and can now speed up some of my testing processes.

J