Solved

Perl regex to replace any capital letters not preceded by ">"

Posted on 2016-08-04
Last Modified: 2016-08-08
Argh... been going in circles.  Can someone please provide a Perl regex to replace all capital letters in a string that are not preceded by ">"?

e.g. <b>A</b>ll Good Boys <b>D</b>eserve Favor

I want to surround the G, B and F with <b> and </b> like the other capital letters

Thanks-
Question by:SAbboushi
 

Assisted Solution

by:Dan Craciun
Dan Craciun earned 450 total points
ID: 41743302
$subject = '<b>A</b>ll Good Boys <b>D</b>eserve Favor';
$subject =~ s![^>]([A-Z])!<b>$1</b>!g;
or
$subject =~ s!([A-Z])[^<]!<b>$1</b>!g;

HTH,
Dan
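For reference, a minimal sketch of what the two substitutions above produce on the sample string when run exactly as posted. The variable names $original, $first and $second are illustrative and not from the thread. Because [^>] and [^<] are part of the match, the neighbouring character they consume is replaced along with the capital:

```perl
use strict;
use warnings;

my $original = '<b>A</b>ll Good Boys <b>D</b>eserve Favor';

# Pattern 1 as posted: [^>] consumes the character before the capital,
# so the space in front of G, B and F is replaced as well.
(my $first = $original) =~ s![^>]([A-Z])!<b>$1</b>!g;

# Pattern 2 as posted: [^<] consumes the character after the capital,
# so the lowercase letter following G, B and F is replaced as well.
(my $second = $original) =~ s!([A-Z])[^<]!<b>$1</b>!g;

print "$first\n";   # <b>A</b>ll<b>G</b>ood<b>B</b>oys <b>D</b>eserve<b>F</b>avor
print "$second\n";  # <b>A</b>ll <b>G</b>od <b>B</b>ys <b>D</b>eserve <b>F</b>vor
```

Capturing the neighbouring character and putting it back in the replacement, or using a lookaround so it is never consumed, would avoid the lost characters.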
 

Accepted Solution

by:
foochar earned 50 total points
ID: 41743385
To generalize the solution you have to protect against some edge cases as well.  The examples provided by the previous commenter break at the first and last characters of the string, as there is no preceding character for the [^>] to match (or, in the second example, no following character for the [^<]).  To work around this, the first solution I came up with was:

s/(^|[^>])([A-Z])/$1<b>$2<\/b>/g



When I tested this, however, I realized that it doesn't work so well on a run of consecutive capitals: the capital that was just matched has already been consumed, so it cannot serve as the preceding character for the next one, and that letter is skipped.  It would also "miss" a capital preceded by another tag such as <i> or </i>.  To work around this I used the lookbehind functionality to come up with the following regex:

s/(?<!<b>)([A-Z])/<b>$1<\/b>/g



By using the (?<!...) negative lookbehind, the regex only matches a capital letter that is not preceded by the <b> tag.  It also eliminates the need to specifically catch the edge case of the beginning of the string: at the first character there is nothing to look back at, so the negative lookbehind is satisfied.  Thanks to the information at http://www.regular-expressions.info/lookaround.html for clarifying some of the lookaround specifics for me...
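A small sketch comparing the two regexes from this answer on a few inputs. The helper names bold_with_capture and bold_with_lookbehind and the extra test strings are made up for illustration; only the two substitutions themselves come from the post.

```perl
use strict;
use warnings;

# First attempt: capture the preceding character (or the start of the
# string) and put it back in the replacement.
sub bold_with_capture {
    my ($text) = @_;
    $text =~ s/(^|[^>])([A-Z])/$1<b>$2<\/b>/g;
    return $text;
}

# Second attempt: a negative lookbehind checks for "<b>" without
# consuming any characters.
sub bold_with_lookbehind {
    my ($text) = @_;
    $text =~ s/(?<!<b>)([A-Z])/<b>$1<\/b>/g;
    return $text;
}

my $sample = '<b>A</b>ll Good Boys <b>D</b>eserve Favor';
print bold_with_capture($sample), "\n";     # <b>A</b>ll <b>G</b>ood <b>B</b>oys <b>D</b>eserve <b>F</b>avor
print bold_with_lookbehind($sample), "\n";  # <b>A</b>ll <b>G</b>ood <b>B</b>oys <b>D</b>eserve <b>F</b>avor

# Consecutive capitals: the capture version skips every other letter.
print bold_with_capture('Good NASA'), "\n";     # <b>G</b>ood <b>N</b>A<b>S</b>A
print bold_with_lookbehind('Good NASA'), "\n";  # <b>G</b>ood <b>N</b><b>A</b><b>S</b><b>A</b>

# A capital right after a different tag: only the lookbehind version wraps it.
print bold_with_capture('<i>Fine</i>'), "\n";     # <i>Fine</i>
print bold_with_lookbehind('<i>Fine</i>'), "\n";  # <i><b>F</b>ine</i>
```

Both versions give the same result on the original sentence; they only diverge on the consecutive-capitals and other-tag cases described above.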
 

Author Closing Comment

by:SAbboushi
ID: 41746367
Thanks folks!

 

Author Comment

by:SAbboushi
ID: 41746369
btw - you gave me exactly what I asked for.  Would be grateful if you don't mind amending it to work only on word boundaries so the "D" isn't a match in Ph.D.
 

Expert Comment

by:Dan Craciun
ID: 41746378
[^>]([A-Z])(?=[a-z ].)

Will only match if the capital letter is followed by a lowercase letter.

Edit: allowed a space after the capital letter, to allow "I ".
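One possible way to turn the pattern above into a full substitution, as a sketch only: it swaps the leading [^>] for the (?<!<b>) lookbehind from the accepted solution so that no preceding character is consumed. That combination, the helper name bold_leading_capitals, and the test sentence are assumptions for illustration, not something posted in the thread.

```perl
use strict;
use warnings;

# Wrap a capital in <b>...</b> only if it is not already preceded by
# "<b>" and is followed by a lowercase letter or space plus one more
# character (the lookahead from the pattern above).
sub bold_leading_capitals {
    my ($text) = @_;
    $text =~ s/(?<!<b>)([A-Z])(?=[a-z ].)/<b>$1<\/b>/g;
    return $text;
}

# The "D" in "Ph.D." is followed by "." and is left alone.
print bold_leading_capitals('Ph.D. in Physics, I think'), "\n";
# <b>P</b>h.D. in <b>P</b>hysics, <b>I</b> think
```

Note that the trailing "." in the lookahead requires two characters after the capital, so a capital in the last or second-to-last position of the string is not matched.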
 

Author Comment

by:SAbboushi
ID: 41747978
k thanks
