Solved

Java regular expression for company name

Posted on 2008-10-10
Last Modified: 2010-08-05
I am trying to create a regular expression for use in Java that takes a company name, removes extraneous information, and leaves me with the core elements of the name. For example, it would:

- Exclude words like "co", "inc", "company", "incorporated" from the end of the string
- Ignore words like "the", "a", etc. at the beginning of the string
- Ignore punctuation marks such as commas or periods

Any help is appreciated as I am relatively new to regular expressions in Java.

-- Matt
Question by:aaron_karp
4 Comments
 
LVL 16

Accepted Solution

by:
Bryan Butler earned 500 total points
ID: 22691035
Do you mean that you want to search a string for any words not in (co, inc, company, incorporated, the, a, ",", ".", <etc.>)?

I might be able to give you some code if you can provide more details, such as whether it's reading user input, a file, or a database. If you just want the regular expressions for each of these, that would be:

[^\.,;<etc>]  where the "^" causes the regex to match everything except the characters following it; the "\." is the period, escaped because "." is a 'special character' (inside brackets the escape is optional but harmless), while the ";", ",", and other punctuation are fine alone. So this would match everything that isn't punctuation.
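As a rough illustration of the punctuation step, here is a minimal Java sketch. The class and method names are made up for this example, and the set of punctuation marks is only the ".", ",", and ";" mentioned above. It shows two equivalent routes: deleting the punctuation characters directly, or keeping only what the negated class matches.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PunctuationDemo {
    // Deletes periods, commas, and semicolons; add other marks inside the brackets as needed.
    static String stripPunctuation(String name) {
        return name.replaceAll("[.,;]", "");
    }

    // Same result via the negated class: keep only the runs of characters it matches.
    static String keepNonPunctuation(String name) {
        Matcher m = Pattern.compile("[^.,;]+").matcher(name);
        StringBuilder sb = new StringBuilder();
        while (m.find()) {
            sb.append(m.group());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(stripPunctuation("Acme, Inc."));   // prints: Acme Inc
        System.out.println(keepNonPunctuation("Acme, Inc.")); // prints: Acme Inc
    }
}
```

For a replaceAll-based cleanup, the first form is usually the more direct of the two.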

Then you would have to negate each of the words, which would return everything except the "bad words".

^Co
^the
^inc
^<etc>
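One thing to keep in mind when trying these in java.util.regex: "^" only negates when it is the first character inside square brackets. Outside brackets it anchors the match to the start of the input. So one way to drop the words is to match them and replace them with an empty string. The sketch below assumes the word lists from the question; the class and method names are hypothetical, and it is case-sensitive as written.

```java
class WordStripDemo {
    // Removes a leading article and a trailing company suffix.
    // The word lists are assumptions taken from the question; extend them as needed.
    static String stripWords(String name) {
        String s = name.trim();
        // "^" anchors at the start of the string: drop a leading "the" or "a".
        s = s.replaceAll("^(the|a)\\s+", "");
        // "$" anchors at the end of the string: drop one trailing suffix word.
        s = s.replaceAll("\\s+(co|inc|company|incorporated)$", "");
        return s;
    }

    public static void main(String[] args) {
        System.out.println(stripWords("the acme company")); // prints: acme
    }
}
```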

Here is a good tutorial on it: http://java.sun.com/docs/books/tutorial/essential/regex/index.html

Is this what you were looking for?


Author Comment

by:aaron_karp
ID: 22694291
I think this gets me most of the way there. Here is a little bit more about my problem and what I am trying to accomplish with this regex:

I am writing a function that will compare two strings in a language similar to Java (Salesforce.com's Apex, which uses the Java regex processing functions). String1 and String2 will each hold a company name; I want to process them with a regex, use a replaceAll function to strip out the "bad words", and then compare the two results to see if they're equal.
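For reference, the normalize-then-compare flow described here might look roughly like the sketch below. It is plain Java rather than Apex, so whether each call carries over unchanged is an assumption to verify. The class name, method names, and word lists are made up for illustration. Lowercasing first sidesteps the case question for this particular comparison.

```java
import java.util.Locale;

class CompanyNameCompare {
    // Reduces a company name to its core words.
    // Punctuation set and word lists are assumptions based on the question.
    static String normalize(String name) {
        String s = name.toLowerCase(Locale.ROOT);
        s = s.replaceAll("[.,;]", "");                 // drop punctuation
        s = s.trim().replaceAll("\\s+", " ");          // collapse runs of whitespace
        s = s.replaceAll("^(the|a)\\s+", "");          // drop a leading article
        // Drop one or more trailing suffix words such as "co inc".
        s = s.replaceAll("(\\s+(co|inc|company|incorporated))+$", "");
        return s;
    }

    // Two names count as the same company if their normalized forms are equal.
    static boolean sameCompany(String first, String second) {
        return normalize(first).equals(normalize(second));
    }

    public static void main(String[] args) {
        System.out.println(sameCompany("The Acme Co., Inc.", "Acme Company")); // prints: true
        System.out.println(sameCompany("Acme Inc.", "Apex Inc."));             // prints: false
    }
}
```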

So it looks like I'd want to use something like:

[^\.,;^co^the^inc]

Also, what would I need to do to make the matches on "co" etc. case-insensitive?

Thanks!
-- Matt
LVL 16

Expert Comment

by:Bryan Butler
ID: 22703355
The flag for case-insensitive matching is: (?i)

So I believe you would put this in front of the pattern, such as: (?i)[^\.,;^co^the^inc]
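A small Java sketch of the flag in use follows. Since everything between "[" and "]" is treated as a set of single characters in java.util.regex, this example applies the flag to a word list written as a group with alternation instead. The class name, method names, and suffix list are assumptions for illustration. It shows both the inline (?i) prefix and the equivalent Pattern.CASE_INSENSITIVE flag.

```java
import java.util.regex.Pattern;

class CaseInsensitiveDemo {
    // Inline flag: (?i) at the front makes the rest of the pattern case-insensitive.
    static String stripSuffixInline(String name) {
        return name.replaceAll("(?i)\\s+(co|inc|company|incorporated)$", "");
    }

    // Same pattern, compiled once with the CASE_INSENSITIVE flag instead of (?i).
    private static final Pattern SUFFIX =
            Pattern.compile("\\s+(co|inc|company|incorporated)$", Pattern.CASE_INSENSITIVE);

    static String stripSuffixCompiled(String name) {
        return SUFFIX.matcher(name).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(stripSuffixInline("Acme INC"));       // prints: Acme
        System.out.println(stripSuffixCompiled("Acme Company")); // prints: Acme
    }
}
```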

But the "Apex" thing might be a problem.  Cheers,
BB

Author Closing Comment

by:aaron_karp
ID: 31505106
Thanks - this was a big help!