Solved

How to write a regex to do this...

Posted on 2012-03-28
Last Modified: 2012-04-28
How would I write a regex that searches through a paragraph of content for any "." (period) characters except those that mark the end of a sentence?

Why?

I have a paragraph that I want to parse into individual sentences, by using the period at the end of each sentence as the delimiter.

But there are some extra periods in the content. They are used after abbreviations.

As a side note, these extra periods always appear inside brackets (parentheses).

For example - my original paragraph:

Cleans brushes and floor, using solvent or soap and water. May transfer items to and from work area, using hoist or handtruck. May be designated according to article painted as Last-Code Striper (wood prod., nec); Painter, Drum (any industry); Painter, Mannequin (fabrication, nec); Pipe Coater (steel & rel.); or according to coating applied as Japanner (any industry); Lacquerer (machine shop); Car Varnisher (railroad equip.).

I want to remove any periods that are not end-of-sentence markers, so that I get this:

Cleans brushes and floor, using solvent or soap and water. May transfer items to and from work area, using hoist or handtruck. May be designated according to article painted as Last-Code Striper (wood prod, nec); Painter, Drum (any industry); Painter, Mannequin (fabrication, nec); Pipe Coater (steel & rel); or according to coating applied as Japanner (any industry); Lacquerer (machine shop); Car Varnisher (railroad equip).

Any ideas?

I'm doing this in ColdFusion if that makes any difference.
Question by:bigmikey88
4 Comments
 
LVL 75

Expert Comment

by:käµfm³d 👽
ID: 37778432
I don't think regex is going to be a good tool for this as this is really more of a parsing question, but you might try:


ReReplace(input, "(\([^.]*)\.([^)]*\))", "\1\2")

...but I think it will require execution within a loop unless you are guaranteed never to encounter more than one period within any given set of brackets.
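
For anyone landing here later, the "regex within a loop" idea could be sketched roughly as follows. This is only an illustration, not the asker's undocumented solution. The function name stripBracketPeriods and the variable names are made up for the example, and it assumes brackets are balanced and never nested. The pattern keeps both captured groups in the replacement and excludes ")" from the first character class, so a match cannot start in one bracket pair and end in a later one.

```cfm
<cfscript>
// Hypothetical helper (name is an assumption, not from the thread):
// removes every period found between "(" and ")" and leaves
// sentence-ending periods outside brackets untouched.
function stripBracketPeriods(required string input) {
    var result = arguments.input;
    var previous = "";
    // One pass removes at most one period per bracket pair,
    // so repeat until a pass changes nothing.
    do {
        previous = result;
        result = reReplace(result, "(\([^.)]*)\.([^)]*\))", "\1\2", "all");
    } while (compare(result, previous) neq 0);
    return result;
}

// Abridged version of the paragraph from the question.
paragraph = "Cleans brushes and floor, using solvent or soap and water. "
          & "May be designated as Last-Code Striper (wood prod., nec); Car Varnisher (railroad equip.).";

cleaned = stripBracketPeriods(paragraph);

// The remaining periods are sentence delimiters; listToArray ignores empty elements.
sentences = listToArray(cleaned, ".");
</cfscript>
```

Each element of sentences may still carry a leading space, so a trim() per element is probably wanted before further processing.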
 

Accepted Solution

by:
bigmikey88 earned 0 total points
ID: 37884603
I found a solution that uses a regex within a loop. Sorry, but I have no time to document it.
 
LVL 75

Expert Comment

by:käµfm³d 👽
ID: 37885613
It's super-awesome to actually get feedback from the author when something isn't quite doing the job. It makes it that much easier to tweak suggestions.

It's also quite amusing that you say "no complete solutions," yet you also say "found a regex within a loop." Maybe it's because I just woke up and I still have the eye crustees, but did I not say, " I think it will require execution within a loop"?
 

Author Closing Comment

by:bigmikey88
ID: 37905411
No complete solutions were offered by others.