Solved

Remove all punctuation!!!

Posted on 2004-05-01
Last Modified: 2006-11-17
Hi,

 I have the following very simple class that separates strings into words. I then use these words as part of an SQL statement to look up their meaning. I really need to remove all the punctuation so that the SQL statement runs, and also so that words are looked up without characters such as "£*!. at the end. Can anyone show me how to write a simple method to do this, please?

import java.util.StringTokenizer;

public class SeperateWords {
    public DBConnection dbConnection = null;

    /** Creates a new instance of SeperateWords */
    public SeperateWords(String chat)
    {
        StringTokenizer words = new StringTokenizer(chat);
        String[] Chat = new String[words.countTokens()];

        int i = 0;

        while (words.hasMoreTokens())
        {
            Chat[i] = words.nextToken();
            i++;
        }

        dbConnection = new DBConnection(Chat);
    }
}

Thanks

Garth
Question by:garth15
15 Comments
 

Accepted Solution

by:
maheshexp earned 500 total points
ID: 10970039
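       while (words.hasMoreTokens())
       {
           String word1 = words.nextToken();
           // replaceAll takes a regular expression, so "." must be escaped;
           // an unescaped "." matches every character and would empty the word
           word1 = word1.replaceAll("\\.", "");
           word1 = word1.replaceAll("!", "");
           /* other characters to be replaced */

           Chat[i] = word1;
           i++;
       }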
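
One thing to watch with this approach: String.replaceAll treats its first argument as a regular expression, so characters such as . * and ? have a special meaning there. The following is only a sketch, assuming a hypothetical helper named stripLiterals (not part of the original class), that quotes each unwanted character with Pattern.quote so it is always matched literally:

```java
import java.util.regex.Pattern;

final class LiteralStripper {
    // Hypothetical helper: removes each character of `unwanted` from `word`,
    // treating every character literally rather than as regex syntax.
    static String stripLiterals(String word, String unwanted) {
        String result = word;
        for (int i = 0; i < unwanted.length(); i++) {
            String literal = Pattern.quote(String.valueOf(unwanted.charAt(i)));
            result = result.replaceAll(literal, "");
        }
        return result;
    }

    public static void main(String[] args) {
        // Quoted characters are removed one by one
        System.out.println(stripLiterals("hello!?.", ".!?*"));   // hello
        // An unquoted "." matches every character, leaving an empty string
        System.out.println("hello.".replaceAll(".", "").isEmpty()); // true
    }
}
```

This still needs a list of every character to remove, so it only suits a small, known set of punctuation marks.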
       

Expert Comment

by:maheshexp
ID: 10970042

Expert Comment

by:maheshexp
ID: 10970051
String str1 = "hai.how are you, my friend?";
            String[] arr = str1.split(".?, ");
            StringTokenizer st = new StringTokenizer(str1,".?!, ");
            while(st.hasMoreTokens()){
                  System.out.println(st.nextToken());
            }

Expert Comment

by:maheshexp
ID: 10970052
String str1 = "hai.how are you, my friend?";
StringTokenizer st = new StringTokenizer(str1,".?!, ");

while(st.hasMoreTokens()){
      System.out.println(st.nextToken());
}

Expert Comment

by:maheshexp
ID: 10970058
In the 3rd post, remove this line:  String[] arr = str1.split(".?, ");

You can also split using regular expressions:
            String str1 = "hai.how are you, my friend?";
            String[] arr = str1.split("[.?, ]");
            for (int i = 0; i < arr.length; i++) {
                  System.out.println(arr[i]);
            }
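
Note that a character class without a quantifier splits on each delimiter separately, so two delimiters in a row (such as the ", " in the sample sentence) produce an empty string in the result. A small sketch of the difference, assuming a hypothetical helper named splitWords that adds a + quantifier to treat a run of delimiters as one:

```java
import java.util.Arrays;

final class SplitDemo {
    // Hypothetical helper: splits on runs of punctuation/space characters so
    // that adjacent delimiters (such as ", ") do not produce empty strings.
    static String[] splitWords(String sentence) {
        return sentence.split("[.?!, ]+");
    }

    public static void main(String[] args) {
        String str1 = "hai.how are you, my friend?";
        // One delimiter at a time: an empty string appears between "," and " "
        System.out.println(Arrays.toString(str1.split("[.?, ]")));
        // [hai, how, are, you, , my, friend]
        // Runs of delimiters collapsed into one
        System.out.println(Arrays.toString(splitWords(str1)));
        // [hai, how, are, you, my, friend]
    }
}
```

A sentence that starts with a delimiter would still give an empty first element, so that case may need a separate check.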

Expert Comment

by:maheshexp
ID: 10970065
Hope you got it.

Author Comment

by:garth15
ID: 10971121
Is there any way to remove all punctuation based on ASCII codes, so that all non-alphanumeric characters are removed from each word? I had already looked at the StringTokenizer approach, but my statement was massive!! I also have problems with the ' character, as it is treated as the beginning or end of a string literal. Any suggestions?

Expert Comment

by:maheshexp
ID: 10971227
> based on ASCII numbers so that all non
What do you mean by this?

Expert Comment

by:maheshexp
ID: 10971232
Does your sentence have a \' literal?

Expert Comment

by:mayankeagle
ID: 10971248
>> all non alpha-numeric characters are removed

Try this:

public String removeChars ( String sSource )
{
  StringBuffer sbTemp = new StringBuffer ( sSource ) ;

  for ( int i = sbTemp.length () - 1 ; i >= 0 ; i -- )
    if ( ! Character.isLetterOrDigit ( sbTemp.charAt ( i ) ) )
      sbTemp.deleteCharAt ( i ) ; // end if, for

  return sbTemp.toString () ;

}

Pass the word to it. It should return a word containing only letters and digits, with all other characters removed.
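
The same idea can probably be written as a single regular expression. This is only a sketch with hypothetical method names: the first variant keeps ASCII letters and digits only, while the second uses Unicode categories and should behave roughly like the Character.isLetterOrDigit loop above.

```java
final class PunctuationStripper {
    // ASCII-only: keeps a-z, A-Z and 0-9 and removes everything else,
    // including apostrophes and accented letters.
    static String stripNonAlphanumeric(String word) {
        return word.replaceAll("[^A-Za-z0-9]", "");
    }

    // Unicode-aware: keeps any letter (\p{L}) or decimal digit (\p{Nd}).
    static String stripNonLetterOrDigit(String word) {
        return word.replaceAll("[^\\p{L}\\p{Nd}]", "");
    }

    public static void main(String[] args) {
        System.out.println(stripNonAlphanumeric("don't!"));            // dont
        System.out.println(stripNonAlphanumeric("\"\u00a3*!.hello"));  // hello
        System.out.println(stripNonAlphanumeric("caf\u00e9?"));        // caf
        System.out.println(stripNonLetterOrDigit("caf\u00e9?").length()); // 4
    }
}
```

Which variant fits depends on whether the words in the database can contain non-ASCII letters.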

Expert Comment

by:maheshexp
ID: 10971316
import java.util.regex.Matcher;
import java.util.regex.Pattern;

            // Groups: 1 = letters, 2 = punctuation, 3 = digits
            String pattern = "(\\p{Alpha}*)(\\p{Punct}*)(\\p{Digit}*)";

            String text = "hello??";

            Pattern pat = Pattern.compile(pattern);
            Matcher match = pat.matcher(text);
            System.out.println(match.matches());
            System.out.println(match.groupCount());

            // Group 0 is the whole match; group 1 holds the word without punctuation
            if (match.matches()) {
                  for (int i = 0; i <= match.groupCount(); i++) {
                        System.out.println(i + ":" + match.group(i));
                  }
            }

Expert Comment

by:objects
ID: 10974040
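If you use a PreparedStatement to do your query, then there is no need to remove punctuation to make the SQL run, because the word is passed as a parameter rather than pasted into the statement text.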
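
A rough sketch of what that could look like. The table name dictionary, the column names word and meaning, and the lookupMeaning method are all assumptions made for illustration, since the original DBConnection class is not shown:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

final class WordLookup {
    // Hypothetical table and column names, used only for illustration.
    static final String LOOKUP_SQL =
            "SELECT meaning FROM dictionary WHERE word = ?";

    // Returns the meaning stored for `word`, or null if no row matches.
    // The word is bound as a parameter, so characters such as ' are never
    // interpreted as part of the SQL text.
    static String lookupMeaning(Connection connection, String word)
            throws SQLException {
        try (PreparedStatement statement = connection.prepareStatement(LOOKUP_SQL)) {
            statement.setString(1, word);
            try (ResultSet rows = statement.executeQuery()) {
                return rows.next() ? rows.getString("meaning") : null;
            }
        }
    }
}
```

Stripping punctuation may still be worthwhile so that "hello!" finds the entry for "hello", but it is then no longer needed just to keep the statement valid.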