Solved

Extra Spacing In Pattern Matching

Posted on 2014-02-21
Last Modified: 2014-02-21
This is a follow-up to this question:
http://www.experts-exchange.com/Programming/Languages/Regular_Expressions/Q_28370416.html

This is my regex
<b>([\s|\r\n]+)?<a href="(.+)?.html">(.+)?([\s|\r\n]+)?(.+)</a></b>


This is the replacement I am using
<a href="$2.html">$3 $5</a>


Here is the sample data I am using.
1.  <p><b>  <a href="http://www.thefrugallife.com/auto_lease.html">Getting Out of an Auto [LF] 
  Lease</a></b></p>

2.  <b><a href="http://www.thefrugallife.com/ants1.html">Click[LF]
      here!</a></b>

3.  <p><b>
          <a href="http://www.thefrugallife.com/new_car.html">New Car vs. Used Car</a></b> [LF]
</p>


Now here are the results I am getting
1.  <p><a href="http://www.thefrugallife.com/auto_lease.html">Getting Out of an Auto  Lease</a></p>

2.  <a href="http://www.thefrugallife.com/ants1.html">Click here!</a>

3.  <p><a href="http://www.thefrugallife.com/new_car.html">New Car vs. Used Ca r</a> 
</p>


With the space between $3 and $5 in the replacement, #2 comes out right, but #1 has two spaces between "Auto" and "Lease", and #3 has a stray space inside the word "Car" ("Ca r").

The problem seems to be that the line feed, shown above as [LF], sometimes has a space between it and the preceding text. How can I account for this in the regex?

Thanks,

Randal
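
Editor's note: the results described above can be reproduced outside Dreamweaver. The following is a minimal sketch in Python, assuming the [LF] markers in the sample stand for real line feeds and translating the $n backreferences to Python's \n form. The names PATTERN, REPLACEMENT, samples and results are illustrative, not from the thread.

```python
import re

# Pattern and replacement from the question, in Python syntax
# (Python writes backreferences as \2 rather than $2).
PATTERN = re.compile(
    r'<b>([\s|\r\n]+)?<a href="(.+)?.html">(.+)?([\s|\r\n]+)?(.+)</a></b>'
)
REPLACEMENT = r'<a href="\2.html">\3 \5</a>'

# The three samples, with [LF] written as a real "\n".
samples = [
    '<p><b>  <a href="http://www.thefrugallife.com/auto_lease.html">'
    'Getting Out of an Auto \n  Lease</a></b></p>',
    '<b><a href="http://www.thefrugallife.com/ants1.html">'
    'Click\n      here!</a></b>',
    '<p><b>\n          <a href="http://www.thefrugallife.com/new_car.html">'
    'New Car vs. Used Car</a></b> \n</p>',
]

results = [PATTERN.sub(REPLACEMENT, s) for s in samples]
for line in results:
    print(repr(line))
```

In sample 1 the greedy third group keeps the trailing space after "Auto", and the replacement adds a second one. In sample 3 there is no line break inside the link text, so the final (.+) must still take at least one character and takes the "r" from "Car", which the replacement then separates with a space.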
Question by:sharingsunshine
13 Comments
 
LVL 35

Expert Comment

by:Dan Craciun
ID: 39877286
I think you might need two regexes. The first:
<b>[\s|\r\n]*<a href="(.*).html">((.|\r\n)*?)</a>[\s|\r\n]*</b>

with the replacement
<a href="$1.html">$2</a>

will take care of the <b> tags, leaving only the line breaks to be solved.

HTH,
Dan
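
Editor's note on the character class [\s|\r\n] used throughout this thread: inside square brackets the pipe is a literal character rather than an alternation, and \s on its own already matches carriage return and line feed. A small Python check illustrates both points (the variable names are illustrative only; JavaScript-style engines treat these two points the same way):

```python
import re

class_from_thread = re.compile(r'[\s|\r\n]')
plain_whitespace = re.compile(r'\s')

# Inside [...] the pipe is an ordinary character, so the class accepts "|".
pipe_matches_class = class_from_thread.fullmatch('|') is not None

# \s alone already covers space, tab, carriage return and line feed.
whitespace_covered = all(
    plain_whitespace.fullmatch(ch) is not None for ch in ' \t\r\n'
)

print(pipe_matches_class, whitespace_covered)
```

So a plain \s* should accept the same whitespace as [\s|\r\n]*, without also accepting stray pipe characters.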

Author Comment

by:sharingsunshine
ID: 39877314
That only takes care of the 3rd example in my sample data. Examples 1 and 2 are passed over.
LVL 35

Expert Comment

by:Dan Craciun
ID: 39877315
For the line breaks you can use:

regex: .html">(.*?)[\r\n]+\s*?(\w+[\w\s]*)</a>
repl: .html">$1$2</a>

LVL 35

Expert Comment

by:Dan Craciun
ID: 39877320
Weird. In RegexBuddy, the results of the replacements are:

<a href="http://www.thefrugallife.com/auto_lease.html">Getting Out of an Auto [LF] 
  Lease</a>
<a href="http://www.thefrugallife.com/ants1.html">Click[LF]
      here!</a>
<a href="http://www.thefrugallife.com/new_car.html">New Car vs. Used Car</a>

LVL 35

Expert Comment

by:Dan Craciun
ID: 39877387
Just tested in Dreamweaver CS6, using your test data.

[Attached screenshots: before replace, after replace]
 

Author Comment

by:sharingsunshine
ID: 39877389
I have RegexBuddy too, and it won't highlight the 1st and 2nd examples. Consequently, when I do the replace, it only removes the <b> tags from the 3rd one.

Are you using JavaScript as the regex engine?

Author Comment

by:sharingsunshine
ID: 39877393
Then there must be something different between my test data and yours. How do we find the differences?
LVL 35

Expert Comment

by:Dan Craciun
ID: 39877418
RegexBuddy is just a testing tool. Instead of figuring out what's different between my setup and yours (I'm on a PC, you're on a Mac, and the line breaks are different: \r\n vs \n), let's focus on the end result.

Where do you want to use that regex? In Dreamweaver on multiple files?
On a web page?
0
 

Author Comment

by:sharingsunshine
ID: 39877483
In Dreamweaver CS5, on all the files in the site. If it matters, my RegexBuddy is on Windows 7 too; I use VMware Fusion to run both.
LVL 35

Expert Comment

by:Dan Craciun
ID: 39877496
OK, that means your sample was mangled by the formatting on EE.

Please post the sample as a file, so the line breaks aren't modified.

Author Comment

by:sharingsunshine
ID: 39877653
This is a page like the ones I am trying to pattern match.

123.html

Thanks,
LVL 35

Accepted Solution

by:
Dan Craciun earned 2000 total points
ID: 39877738
Yup, it's the \r\n vs \n problem.

Opened your file in HxD. The line endings are all 0x0A (\n)

I opened your file in DW, and my regex worked, no problems. You have the result attached.

After saving, the line endings were all 0x0D 0x0A (\r\n), which means DW converted the endings on open to match Windows style.

Long story short, if you use DW on a Mac, use the following:
find: <b>[\s|\n]*<a href="(.*).html">((.|\n)*?)</a>[\s|\n]*</b>
replace: <a href="$1.html">$2</a>

123-mod.html
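
Editor's note: for anyone doing the same cleanup in a script rather than in Dreamweaver's find and replace, the sketch below shows one line-ending-agnostic approach in Python. It is not the accepted pattern: it uses \s* so that \n and \r\n are both accepted, it does not require the URL to end in .html, and it collapses whitespace in the link text with a callback, which a plain find/replace pattern cannot do. LINK_IN_BOLD, unwrap_bold_link and clean are hypothetical names.

```python
import re

# <b>, optional whitespace, a link, optional whitespace, </b>.
# re.DOTALL lets the link text span line breaks of either style.
LINK_IN_BOLD = re.compile(
    r'<b>\s*<a href="([^"]*)">(.*?)</a>\s*</b>', re.DOTALL
)


def unwrap_bold_link(match):
    """Drop the <b> wrapper and collapse whitespace runs in the link text."""
    href = match.group(1)
    text = ' '.join(match.group(2).split())
    return f'<a href="{href}">{text}</a>'


def clean(html):
    return LINK_IN_BOLD.sub(unwrap_bold_link, html)


lf_sample = ('<p><b>  <a href="http://www.thefrugallife.com/auto_lease.html">'
             'Getting Out of an Auto \n  Lease</a></b></p>')
crlf_sample = ('<p><b>\r\n          '
               '<a href="http://www.thefrugallife.com/new_car.html">'
               'New Car vs. Used Car</a></b> \r\n</p>')

print(clean(lf_sample))
print(repr(clean(crlf_sample)))
```

One limitation to keep in mind: if a bold tag contains a link that is not immediately followed by </b>, the lazy (.*?) can run on to a later </a></b>, so the result is worth reviewing on real pages before saving.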

Author Closing Comment

by:sharingsunshine
ID: 39877922
That's great, thanks for getting to the bottom of the problem.