Solved

Get Rid of Spaces between > and <

Posted on 2014-01-28
Last Modified: 2014-01-31
I use the following code:

        $page_entire_code =~ s/> +?</></g;

to remove spaces between > and < in my HTML web pages. However, I noticed that it messes up my web page's breadcrumbs. For example:

  <div id="breadcrumb" itemprop="breadcrumb">
    <b>
      You are here: <a href="http://www.romancestuck.com/">RomanceStuck</a> > <a href="http://www.romancestuck.com/marriage/love-and-marriage.htm">Marriage</a> > 11 Tips for Improving a Strained Relationship
    </b>
  </div>


gets compressed to:

<div id="breadcrumb" itemprop="breadcrumb"><b>You are here: <a href="http://www.romancestuck.com/">RomanceStuck</a> ><a href="http://www.romancestuck.com/marriage/love-and-marriage.htm">Marriage</a> > 11 Tips for Improving a Strained Relationship</b></div>


The > after the RomanceStuck link no longer has a space after it, as it should. How can I change my Perl substitution so that it doesn't mess up my breadcrumbs? I was thinking I could match only a > that follows a character other than a space.

Thanks!
Question by:webstuck5
11 Comments
 
LVL 84

Expert Comment

by:ozo
ID: 39817135
Have you read the output of
perldoc -q html
?
 
LVL 14

Expert Comment

by:Phil Phillips
ID: 39817165
Using regexes on HTML is generally not a good idea. For minifying HTML, a module such as HTML::Packer comes to mind.

That said, if you really need something quick and dirty, you can try writing the breadcrumb separator as &gt; instead of a literal >.
 
LVL 26

Expert Comment

by:wilcoxon
ID: 39819074
As Phil suggests, you should never use > (or <) in html - you should always use &gt; or &lt;.  If you do that, your regex will work fine.
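To illustrate the suggestion above, here is a minimal sketch showing that the original substitution leaves the breadcrumb alone once the separators are written as &gt;. The sub name squeeze_tags and the shortened sample markup are made up for this example; they are not from the original page.

```perl
use strict;
use warnings;

# The substitution from the question, wrapped in a (hypothetical) helper.
sub squeeze_tags {
    my ($html) = @_;
    $html =~ s/> +?</></g;    # drop runs of spaces between a > and a <
    return $html;
}

# Shortened stand-in for the breadcrumb, with &gt; as the separator.
my $crumb = '<a href="/">Home</a> &gt; <a href="/m">Marriage</a> &gt; Tips';

# No literal > is followed by spaces and then a <, so nothing is removed.
print squeeze_tags($crumb), "\n";
```

Spaces between real tags, such as `</b> <p>`, are still removed as before.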

Author Comment

by:webstuck5
ID: 39819291
I thought about using &gt;, but the example at https://support.google.com/webmasters/answer/185417?hl=en appears to use >. I don't want to risk Google ignoring my breadcrumb because I used &gt;.

$page_entire_code =~ s/([^ ]>) +?</$1</g; seems to work!

Thanks for all your help!
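For anyone comparing the two substitutions, here is a small side-by-side sketch. The sub names and the shortened sample markup are invented for illustration; only the two regexes come from this thread.

```perl
use strict;
use warnings;

# Original substitution: removes spaces between any > and the next <.
sub strip_all {
    my ($html) = @_;
    $html =~ s/> +?</></g;
    return $html;
}

# Revised substitution: only matches a > that directly follows a
# non-space character, such as the > that closes a tag.
sub strip_after_tags {
    my ($html) = @_;
    $html =~ s/([^ ]>) +?</$1</g;
    return $html;
}

my $crumb = '<b>Here: <a href="/">Home</a> > <a href="/m">Marriage</a> > Tips</b>  <p>x</p>';

print strip_all($crumb), "\n";
# <b>Here: <a href="/">Home</a> ><a href="/m">Marriage</a> > Tips</b><p>x</p>

print strip_after_tags($crumb), "\n";
# <b>Here: <a href="/">Home</a> > <a href="/m">Marriage</a> > Tips</b><p>x</p>
```

Note that the revised pattern only protects a separator that has a space in front of it. A > preceded by a tab or newline, or written with no leading space, would still lose the spaces that follow it.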
 

Author Comment

by:webstuck5
ID: 39820030
I've requested that this question be closed as follows:

Accepted answer: 0 points for webstuck5's comment #a39819291

for the following reason:

I figured out how to do it.
 
LVL 26

Expert Comment

by:wilcoxon
ID: 39819335
I am very, very surprised that Google says to use >.  The HTML spec recommends escaping < and > in text (use &lt; and &gt;).  XHTML is based on XML, where a literal < in text is illegal (the doc is not well-formed if it contains one); a literal > is technically allowed there, but escaping it as &gt; is still the safer habit.
 
LVL 14

Expert Comment

by:Phil Phillips
ID: 39819379
That example is actually using a single right-pointing angle quotation mark (&rsaquo;), which looks awfully similar to a >.  I don't think they're saying that you *have* to do it that way though - it just happened to be the example.

I'd shy away from <> when possible, but glad to hear you came up with something that works for you.
 

Author Comment

by:webstuck5
ID: 39820031
I now see that the Google example doesn't actually use > but now I am more confused. In the Google example's HTML source, it shows the symbol as ›. I put &rsaquo; on my page as suggested but it shows as &rsaquo; when I view the page's HTML source. What is the difference between my page and the Google example page?
 
LVL 14

Expert Comment

by:Phil Phillips
ID: 39820042
In the source, it'll show the raw html, so all of the codes (such as &rsaquo;) will be in their original form.

On the google example page, they used the actual symbol ›.
 

Author Comment

by:webstuck5
ID: 39820075
So, how can I use the actual › symbol? When I put it in my code, it showed as a weird question mark when I viewed the web page.
 
LVL 14

Accepted Solution

by:
Phil Phillips earned 2000 total points
ID: 39820093
As far as Google is concerned, I'm sure &rsaquo; is fine.  If you *really* want to use the actual symbol, you'd have to save the file with UTF-8 encoding.
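If the page is generated from Perl, one way to get the literal symbol out as UTF-8 is sketched below. This is only an illustration under assumptions: the sub name page_bytes and the sample markup are hypothetical, and it assumes the page also declares its charset as UTF-8 (for example with a meta tag) so the browser decodes it the same way. The "weird question mark" is most likely a mismatch between the encoding the file was saved in and the one the browser assumed.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Encode a page (a Perl character string) into UTF-8 bytes for output.
sub page_bytes {
    my ($html) = @_;
    return encode('UTF-8', $html);
}

# \x{203A} is the single right-pointing angle quotation mark (the
# character that &rsaquo; stands for). Writing it as an escape keeps
# this script independent of the encoding of the source file itself.
my $html = qq{<meta charset="utf-8">\n}
         . qq{<a href="/">Home</a> \x{203A} Tips\n};

binmode(STDOUT);            # emit raw bytes, with no extra encoding layer
print page_bytes($html);
```

When the symbol is typed directly into a static HTML file instead, the equivalent step is saving that file as UTF-8 in the editor.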