Solved

Remove all punctuation!!!

Posted on 2004-05-01
15
680 Views
Last Modified: 2006-11-17
Hi,

 I have the following, very simple class that seperates strings in to words.  I then use these words as part of an SQL statement to look up their meaning.  I really need to remove all the punctuation so that the SQL statement runs.  And also so words are looked up without such characters as "£*!. at the end.  Can anyone show me how to write a simple method to do this please.

import java.util.StringTokenizer;

public class SeperateWords {
    public DBConnection dbConnection=null;
    /** Creates a new instance of SeperateWords */
    public SeperateWords(String chat)
    {
        StringTokenizer words = new StringTokenizer(chat);
        String[] Chat = new String[words.countTokens()];
       
        int i=0;

       while (words.hasMoreTokens())
       {
           Chat[i] = words.nextToken().toString();
           i++;
       }
       
       
       
       dbConnection = new DBConnection(Chat);

       
    }
}

Thanks

Garth
0
Comment
Question by:garth15
15 Comments
 
LVL 7

Accepted Solution

by:
maheshexp earned 500 total points
ID: 10970039
      while (words.hasMoreTokens())
       {
           String word1 = words.nextToken();
            word1 = word1.replaceAll(".","");
            word1 = word1.replaceAll("!","");
           /* other characters to be replaced */

           Chat[i] = words1;
           i++;
       }
       
0
 
LVL 7

Expert Comment

by:maheshexp
ID: 10970042
0
 
LVL 7

Expert Comment

by:maheshexp
ID: 10970051
String str1 = "hai.how are you, my friend?";
            String[] arr = str1.split(".?, ");
            StringTokenizer st = new StringTokenizer(str1,".?!, ");
            while(st.hasMoreTokens()){
                  System.out.println(st.nextToken());
            }
0
Live: Real-Time Solutions, Start Here

Receive instant 1:1 support from technology experts, using our real-time conversation and whiteboard interface. Your first 5 minutes are always free.

 
LVL 7

Expert Comment

by:maheshexp
ID: 10970052
String str1 = "hai.how are you, my friend?";
StringTokenizer st = new StringTokenizer(str1,".?!, ");

while(st.hasMoreTokens()){
      System.out.println(st.nextToken());
}
0
 
LVL 7

Expert Comment

by:maheshexp
ID: 10970058
in the 3rd post remove this lie  String[] arr = str1.split(".?, ");

u can also split using Regular Expressions
            String str1 = "hai.how are you, my friend?";
            String[] arr = str1.split("[.?, ]");
            for (int i = 0; i < arr.length; i++) {
                  System.out.println(arr[i]);
            }
0
 
LVL 7

Expert Comment

by:maheshexp
ID: 10970065
hope u got it
0
 

Author Comment

by:garth15
ID: 10971121
Is there any way to remove all punctuation based on ASCII numbers so that all non alpha-numeric characters are removed from each word?  I had already looked at the string tokenizer way but my statement was massive!!  Also I have problems with the ' character as it thinks it begins or ends a string literal.  Any suggestions?
0
 
LVL 7

Expert Comment

by:maheshexp
ID: 10971227
> based on ASCII numbers so that all non
what do u mean by this
0
 
LVL 7

Expert Comment

by:maheshexp
ID: 10971232
does you sentance have \' literal....
0
 
LVL 30

Expert Comment

by:Mayank S
ID: 10971248
>> all non alpha-numeric characters are removed

Try this:

public String removeChars ( String sSource )
{
  StringBuffer sbTemp = new StringBuffer ( sSource ) ;

  for ( int i = sbTemp.length () - 1 ; i >= 0 ; i -- )
    if ( ! Character.isLetterOrDigit ( sbTemp.charAt ( i ) ) )
      sbTemp.deleteCharAt ( i ) ; // end if, for

  return sbTemp.toString () ;

}

Pass the word to it. It should return a word containing only alphabets/ digits, with the other characters removed.
0
 
LVL 7

Expert Comment

by:maheshexp
ID: 10971316
               String pattern = "(\\p{Alpha}*)(\\p{Punct}*)(\\p{Digit}*)";
            
            String text = "hello??";
            
            String[] sp = text.split(pattern);
            Pattern pat = Pattern.compile(pattern);
            Matcher match = pat.matcher(text);
            System.out.println(match.matches());
            System.out.println(match.groupCount());
            
            if(match.matches())
            for(int i = 0; i <= match.groupCount(); i++){
                  System.out.println( i + ":" + match.group(i));
            }
0
 
LVL 92

Expert Comment

by:objects
ID: 10974040
if you use a PreparedStatement to do your query then there is no need to remove punctuation.
0

Featured Post

Live: Real-Time Solutions, Start Here

Receive instant 1:1 support from technology experts, using our real-time conversation and whiteboard interface. Your first 5 minutes are always free.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

This was posted to the Netbeans forum a Feb, 2010 and I also sent it to Verisign. Who didn't help much in my struggles to get my application signed. ------------------------- Start The idea here is to target your cell phones with the correct…
Basic understanding on "OO- Object Orientation" is needed for designing a logical solution to solve a problem. Basic OOAD is a prerequisite for a coder to ensure that they follow the basic design of OO. This would help developers to understand the b…
Video by: Michael
Viewers learn about how to reduce the potential repetitiveness of coding in main by developing methods to perform specific tasks for their program. Additionally, objects are introduced for the purpose of learning how to call methods in Java. Define …
This tutorial explains how to use the VisualVM tool for the Java platform application. This video goes into detail on the Threads, Sampler, and Profiler tabs.

776 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question