Solved

Get Rid of Spaces between > and <

Posted on 2014-01-28
Last Modified: 2014-01-31
I use the following code:

        $page_entire_code =~ s/> +?</></g;

to remove spaces between > and < in my HTML web pages. However, I noticed that it messes up my web page's breadcrumbs. For example:

  <div id="breadcrumb" itemprop="breadcrumb">
    <b>
      You are here: <a href="http://www.romancestuck.com/">RomanceStuck</a> > <a href="http://www.romancestuck.com/marriage/love-and-marriage.htm">Marriage</a> > 11 Tips for Improving a Strained Relationship
    </b>
  </div>

gets compressed to:

<div id="breadcrumb" itemprop="breadcrumb"><b>You are here: <a href="http://www.romancestuck.com/">RomanceStuck</a> ><a href="http://www.romancestuck.com/marriage/love-and-marriage.htm">Marriage</a> > 11 Tips for Improving a Strained Relationship</b></div>

The > after the RomanceStuck link no longer has the space after it that it should have. How can I change my Perl substitution so that it doesn't mess up my breadcrumbs? I was thinking I could replace only a > that comes directly after a character other than a space.

Thanks!
Question by:webstuck5
11 Comments
 
LVL 84

Expert Comment

by:ozo
ID: 39817135
Have you read the FAQ entries that this command shows?

        perldoc -q html
 
LVL 14

Expert Comment

by:Phil Phillips
ID: 39817165
Using regexes for HTML is generally not a good idea.  Something like HTML::Packer comes to mind for minifying HTML.

Though, if you really need something quick and dirty, you can try using &gt; instead of a >
 
LVL 26

Expert Comment

by:wilcoxon
ID: 39819074
As Phil suggests, you should avoid a literal > (or <) in HTML text; use &gt; or &lt; instead.  If you do that, your regex will work fine.
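
To illustrate that point, here is a minimal sketch using the substitution from the question. The sub name strip_tag_gaps and the sample breadcrumb are made up for this example; only the regex itself comes from the thread. With the separator written as &gt;, there is no literal > followed by spaces and a <, so the breadcrumb passes through untouched while gaps between real tags are still removed.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the substitution from the question:
# it deletes runs of spaces that sit directly between a > and a <.
sub strip_tag_gaps {
    my ($html) = @_;
    $html =~ s/> +?</></g;
    return $html;
}

# Sample breadcrumb (invented for this sketch) with the separator escaped.
my $escaped = '<a href="/">Home</a> &gt; <a href="/marriage/">Marriage</a> &gt; 11 Tips';

# The escaped breadcrumb is left alone ...
print strip_tag_gaps($escaped) eq $escaped ? "breadcrumb unchanged\n" : "breadcrumb changed\n";

# ... while spaces between real tags are still collapsed.
print strip_tag_gaps('<b> <i>x</i> </b>'), "\n";   # prints <b><i>x</i></b>
```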

Author Comment

by:webstuck5
ID: 39819291
I thought about using &gt; but https://support.google.com/webmasters/answer/185417?hl=en shows to use >. I don't want to chance Google not using my breadcrumb because of my use of &gt;.

$page_entire_code =~ s/([^ ]>) +?</$1</g; seems to work!

Thanks for all your help!
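
For anyone trying the substitution above, here is a small sketch of how it behaves on a shortened, made-up version of the breadcrumb. The sub names are hypothetical. The second sub is an alternative I am adding for comparison, not something from the thread: it uses lookarounds so that only the spaces themselves are matched. One caveat of both forms is that a > at the very start of the string has no preceding character and is therefore never matched.

```perl
use strict;
use warnings;

# The substitution from the comment above: the > must directly follow
# a non-space character, so a free-standing " > " separator is skipped.
sub strip_keep_separator {
    my ($html) = @_;
    $html =~ s/([^ ]>) +?</$1</g;
    return $html;
}

# Assumed alternative (not from the thread): the same condition written
# with a lookbehind and a lookahead, so nothing needs to be put back.
sub strip_keep_separator_lookaround {
    my ($html) = @_;
    $html =~ s/(?<=[^ ]>) +(?=<)//g;
    return $html;
}

# Shortened breadcrumb, invented for this sketch.
my $crumb = '<b> <a href="/">RomanceStuck</a> > <a href="/marriage/">Marriage</a> </b>';

print strip_keep_separator($crumb), "\n";
# prints <b><a href="/">RomanceStuck</a> > <a href="/marriage/">Marriage</a></b>
```

The space after the " > " separator survives because the character before that > is a space, while the gaps after <b> and before </b> are still removed.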
 

Author Comment

by:webstuck5
ID: 39820030
I've requested that this question be closed as follows:

Accepted answer: 0 points for webstuck5's comment #a39819291

for the following reason:

I figured out how to do it.
 
LVL 26

Expert Comment

by:wilcoxon
ID: 39819335
I am very, very surprised that Google says to use >.  The HTML spec strongly suggests never using a literal < or > in text (always use &lt; and &gt;).  XHTML is based on XML, where a literal < in text makes the document invalid; a literal > is technically permitted outside the sequence ]]>, but escaping it as well is the safe convention.
 
LVL 14

Expert Comment

by:Phil Phillips
ID: 39819379
That example is actually using a single right-pointing angle quotation mark (&rsaquo;), which looks awfully similar to a >.  I don't think they're saying that you *have* to do it that way though - it just happened to be the example.

I'd shy away from <> when possible, but glad to hear you came up with something that works for you.
 

Author Comment

by:webstuck5
ID: 39820031
I now see that the Google example doesn't actually use > but now I am more confused. In the Google example's HTML source, it shows the symbol as ›. I put &rsaquo; on my page as suggested but it shows as &rsaquo; when I view the page's HTML source. What is the difference between my page and the Google example page?
 
LVL 14

Expert Comment

by:Phil Phillips
ID: 39820042
The page source shows the raw HTML, so any character references (such as &rsaquo;) appear exactly as they were typed.

On the Google example page, they used the actual symbol ›.
 

Author Comment

by:webstuck5
ID: 39820075
So, how can I use the actual › symbol? When I put it in my code, it showed as a weird question mark when I viewed the web page.
 
LVL 14

Accepted Solution

by:
Phil Phillips earned 500 total points
ID: 39820093
As far as Google is concerned, I'm sure &rsaquo; is fine.  If you *really* want to use the actual symbol, you'd have to save the file with UTF-8 encoding.
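
If the page is generated from Perl, one way to get the actual symbol out as UTF-8 is sketched below. This is an illustration under assumptions, not the asker's setup: the sub name write_utf8 and any output path are hypothetical, and the page would also need to declare the encoding (for example with a meta charset tag or an HTTP header) so the browser decodes the bytes as UTF-8. A "weird question mark" is typically what appears when the bytes on disk and the declared encoding disagree.

```perl
use strict;
use warnings;
use Encode qw(encode);

# U+203A SINGLE RIGHT-POINTING ANGLE QUOTATION MARK, written as an escape
# so this sketch does not depend on how the script file itself is saved.
my $rsaquo = "\x{203A}";

# Hypothetical helper: write text to a file, encoding it as UTF-8.
sub write_utf8 {
    my ($path, $text) = @_;
    open my $fh, '>:encoding(UTF-8)', $path or die "Cannot open $path: $!";
    print {$fh} $text;
    close $fh or die "Cannot close $path: $!";
    return;
}

# In UTF-8 the character is stored as three bytes.
my $bytes = encode('UTF-8', $rsaquo);
print join(' ', map { sprintf '%02X', ord } split //, $bytes), "\n";   # prints E2 80 BA
```

Calling write_utf8 with a path of your choice and a string containing $rsaquo would store those same three bytes in the file.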