Solved

How to write a regex to do this...

Posted on 2012-03-28
Last Modified: 2012-04-28
How would I write a regex that searches a paragraph of content for any "." period characters except those used to mark the end of a sentence?

Why?

I have a paragraph that I want to parse into individual sentences, by using the period at the end of each sentence as the delimiter.

But there are some extra periods in the content.  They are used after abbreviations.  

As a side note, these extra periods always show up in between brackets.

For example - my original paragraph:

Cleans brushes and floor, using solvent or soap and water. May transfer items to and from work area, using hoist or handtruck. May be designated according to article painted as Last-Code Striper (wood prod., nec); Painter, Drum (any industry); Painter, Mannequin (fabrication, nec); Pipe Coater (steel & rel.); or according to coating applied as Japanner (any industry); Lacquerer (machine shop); Car Varnisher (railroad equip.).

I want to remove any periods that are not end-of-sentence markers, so I get this:

Cleans brushes and floor, using solvent or soap and water. May transfer items to and from work area, using hoist or handtruck. May be designated according to article painted as Last-Code Striper (wood prod, nec); Painter, Drum (any industry); Painter, Mannequin (fabrication, nec); Pipe Coater (steel & rel); or according to coating applied as Japanner (any industry); Lacquerer (machine shop); Car Varnisher (railroad equip).

Any ideas?

I'm doing this in ColdFusion if that makes any difference.
Question by:bigmikey88
4 Comments
 
LVL 74

Expert Comment

by:käµfm³d 👽
ID: 37778432
I don't think regex is going to be a good tool for this as this is really more of a parsing question, but you might try:


ReReplace(input, "(\([^.)]*)\.([^)]*\))", "\1\2", "all")



...but I think it will require execution within a loop unless you are guaranteed never to encounter more than one period within any given set of brackets.
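For illustration, the loop described above might look like the following. This is a minimal sketch in Python rather than ColdFusion, and the function name `strip_bracketed_periods` is made up for the example. It assumes brackets are round, balanced, and not nested. Each pass removes at most one period per bracket group, so the substitution repeats until nothing changes.

```python
import re

# Matches one period inside a single (...) group.
# Group 1: the opening bracket and the text before the period.
# Group 2: the text after the period and the closing bracket.
BRACKETED_PERIOD = re.compile(r"(\([^.)]*)\.([^)]*\))")


def strip_bracketed_periods(text):
    """Remove every period that sits inside round brackets.

    Hypothetical helper: one pass drops a single period per bracket
    group, so we loop until a pass makes no replacements.
    """
    while True:
        text, count = BRACKETED_PERIOD.subn(r"\1\2", text)
        if count == 0:
            return text


if __name__ == "__main__":
    # Shortened excerpt from the paragraph in the question.
    sample = "Pipe Coater (steel & rel.); Car Varnisher (railroad equip.)."
    print(strip_bracketed_periods(sample))
```

The same idea should carry over to ColdFusion by calling ReReplace inside a loop and stopping once the string no longer changes between passes.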
 

Accepted Solution

by:
bigmikey88 earned 0 total points
ID: 37884603
I found a regex within a loop. But sorry, no time to document it.
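Since the regex was never posted, here is one way the whole task could be sketched. This is not the author's code. It is a Python illustration, and the helper names `remove_periods_in_brackets` and `split_sentences` are hypothetical. It assumes round, non-nested brackets and that every period outside brackets ends a sentence, as described in the question. A replacement callback strips all periods from each bracket group in one pass, so no loop is needed here.

```python
import re


def remove_periods_in_brackets(text):
    """Return text with periods removed from every (...) group."""
    # The callback receives each bracket group and returns it without periods.
    return re.sub(r"\([^()]*\)", lambda m: m.group(0).replace(".", ""), text)


def split_sentences(paragraph):
    """Split a paragraph into sentences, using the period as the delimiter."""
    cleaned = remove_periods_in_brackets(paragraph)
    # After cleaning, every remaining period is treated as a sentence end.
    return [part.strip() + "." for part in cleaned.split(".") if part.strip()]


if __name__ == "__main__":
    # Shortened excerpt from the paragraph in the question.
    paragraph = (
        "Cleans brushes and floor, using solvent or soap and water. "
        "May be designated as Last-Code Striper (wood prod., nec); "
        "Car Varnisher (railroad equip.)."
    )
    for sentence in split_sentences(paragraph):
        print(sentence)
```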
 
LVL 74

Expert Comment

by:käµfm³d 👽
ID: 37885613
It's super-awesome to actually get feedback from the author when something isn't quite doing the job. It makes it that much easier to tweak suggestions.

It's also quite amusing that you say "no complete solutions," yet you also say "found a regex within a loop." Maybe it's because I just woke up and I still have the eye crustees, but did I not say, "I think it will require execution within a loop"?
 

Author Closing Comment

by:bigmikey88
ID: 37905411
No complete solutions were offered by others.