Solved

Get Rid of Spaces between > and <

Posted on 2014-01-28
11
229 Views
Last Modified: 2014-01-31
I use the following code:

        $page_entire_code =~ s/> +?</></g;

to remove spaces between > and < in my HTML web pages. However, I noticed that it messes up my web page's breadcrumbs. For example:

  <div id="breadcrumb" itemprop="breadcrumb">
    <b>
      You are here: <a href="http://www.romancestuck.com/">RomanceStuck</a> > <a href="http://www.romancestuck.com/marriage/love-and-marriage.htm">Marriage</a> > 11 Tips for Improving a Strained Relationship
    </b>
  </div>

Open in new window


gets compressed to:

<div id="breadcrumb" itemprop="breadcrumb"><b>You are here: <a href="http://www.romancestuck.com/">RomanceStuck</a> ><a href="http://www.romancestuck.com/marriage/love-and-marriage.htm">Marriage</a> > 11 Tips for Improving a Strained Relationship</b></div>

Open in new window


The > after the RomanceStuck link doesn't have a space after it like it should. How can I change my Perl substitution line so that it doesn't mess up my breadcrumbs? I was thinking maybe I could say replace > that come after any characters except a space.

Thanks!
0
Comment
Question by:webstuck5
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 4
  • 4
  • 2
  • +1
11 Comments
 
LVL 84

Expert Comment

by:ozo
ID: 39817135
Have you read
perldoc -q html
0
 
LVL 13

Expert Comment

by:Phil Phillips
ID: 39817165
Using regexs for html is generally not a good idea.  Something like HTML::Packer comes to mind for minifying html.

Though, if you really need something quick and dirty, you can try using &gt; instead of a >
0
 
LVL 26

Expert Comment

by:wilcoxon
ID: 39819074
As Phil suggests, you should never use > (or <) in html - you should always use &gt; or &lt;.  If you do that, your regex will work fine.
0
Industry Leaders: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 

Author Comment

by:webstuck5
ID: 39819291
I thought about using &gt; but https://support.google.com/webmasters/answer/185417?hl=en shows to use >. I don't want to chance Google not using my breadcrumb because of my use of &gt;.

$page_entire_code =~ s/([^ ]>) +?</$1</g; looks to work!

Thanks for all your help!
0
 

Author Comment

by:webstuck5
ID: 39820030
I've requested that this question be closed as follows:

Accepted answer: 0 points for webstuck5's comment #a39819291

for the following reason:

I figured out how to do it.
0
 
LVL 26

Expert Comment

by:wilcoxon
ID: 39819335
I am very, very surprised that Google says to use >.  The HTML spec strongly suggests never using < or > (always use &lt; and &gt;).  XHTML being based on XML means that < and > are illegal characters (the doc is invalid if it contains either in text).
0
 
LVL 13

Expert Comment

by:Phil Phillips
ID: 39819379
That example is actually using a single right-pointing angle quotation mark (&rsaquo;), which looks awfully similar to a >.  I don't think they're saying that you *have* to do it that way though - it just happened to be the example.

I'd shy away from <> when possible, but glad to hear you came up with something that works for you.
0
 

Author Comment

by:webstuck5
ID: 39820031
I now see that the Google example doesn't actually use > but now I am more confused. In the Google example's HTML source, it shows the symbol as ›. I put &rsaquo; on my page as suggested but it shows as &rsaquo; when I view the page's HTML source. What is the difference between my page and the Google example page?
0
 
LVL 13

Expert Comment

by:Phil Phillips
ID: 39820042
In the source, it'll show the raw html, so all of the codes (such as &rsaquo;) will be in their original form.

On the google example page, they used the actual symbol ›.
0
 

Author Comment

by:webstuck5
ID: 39820075
So, how can I use the actual › symbol? When I put it in my code, it showed as a weird question mark when I viewed the web page.
0
 
LVL 13

Accepted Solution

by:
Phil Phillips earned 500 total points
ID: 39820093
As far as Google is concerned, I'm sure &rsaquo; is fine.  If you *really* want to use the actual symbol, you'd have to save the file with UTF-8 encoding.
0

Featured Post

Technology Partners: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

This article describes how to create custom column layout styles for Bootstrap. The article uses 5 columns to illustrate the concept, but the principle can be extended to any number of columns.
Today, the web development industry is booming, and many people consider it to be their vocation. The question you may be asking yourself is – how do I become a web developer?
In this tutorial viewers will learn how to embed videos in a webpage using HTML5. Ensure your DOCTYPE declaration is set to HTML5: "<!DOCTYPE html>": Use the <video> tag to insert a video. Define the src as the URL of your video; this is similar to …
The viewer will learn the basics of jQuery, including how to invoke it on a web page. Reference your jQuery libraries: (CODE) Include your new external js/jQuery file: (CODE) Write your first lines of code to setup your site for jQuery.: (CODE)

726 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question