Solved

Get Rid of Spaces between > and <

Posted on 2014-01-28
11
220 Views
Last Modified: 2014-01-31
I use the following code:

        $page_entire_code =~ s/> +?</></g;

to remove spaces between > and < in my HTML web pages. However, I noticed that it messes up my web page's breadcrumbs. For example:

  <div id="breadcrumb" itemprop="breadcrumb">
    <b>
      You are here: <a href="http://www.romancestuck.com/">RomanceStuck</a> > <a href="http://www.romancestuck.com/marriage/love-and-marriage.htm">Marriage</a> > 11 Tips for Improving a Strained Relationship
    </b>
  </div>

Open in new window


gets compressed to:

<div id="breadcrumb" itemprop="breadcrumb"><b>You are here: <a href="http://www.romancestuck.com/">RomanceStuck</a> ><a href="http://www.romancestuck.com/marriage/love-and-marriage.htm">Marriage</a> > 11 Tips for Improving a Strained Relationship</b></div>

Open in new window


The > after the RomanceStuck link doesn't have a space after it like it should. How can I change my Perl substitution line so that it doesn't mess up my breadcrumbs? I was thinking maybe I could say replace > that come after any characters except a space.

Thanks!
0
Comment
Question by:webstuck5
  • 4
  • 4
  • 2
  • +1
11 Comments
 
LVL 84

Expert Comment

by:ozo
ID: 39817135
Have you read
perldoc -q html
0
 
LVL 12

Expert Comment

by:Phil Phillips
ID: 39817165
Using regexs for html is generally not a good idea.  Something like HTML::Packer comes to mind for minifying html.

Though, if you really need something quick and dirty, you can try using &gt; instead of a >
0
 
LVL 26

Expert Comment

by:wilcoxon
ID: 39819074
As Phil suggests, you should never use > (or <) in html - you should always use &gt; or &lt;.  If you do that, your regex will work fine.
0
 

Author Comment

by:webstuck5
ID: 39819291
I thought about using &gt; but https://support.google.com/webmasters/answer/185417?hl=en shows to use >. I don't want to chance Google not using my breadcrumb because of my use of &gt;.

$page_entire_code =~ s/([^ ]>) +?</$1</g; looks to work!

Thanks for all your help!
0
 

Author Comment

by:webstuck5
ID: 39820030
I've requested that this question be closed as follows:

Accepted answer: 0 points for webstuck5's comment #a39819291

for the following reason:

I figured out how to do it.
0
Better Security Awareness With Threat Intelligence

See how one of the leading financial services organizations uses Recorded Future as part of a holistic threat intelligence program to promote security awareness and proactively and efficiently identify threats.

 
LVL 26

Expert Comment

by:wilcoxon
ID: 39819335
I am very, very surprised that Google says to use >.  The HTML spec strongly suggests never using < or > (always use &lt; and &gt;).  XHTML being based on XML means that < and > are illegal characters (the doc is invalid if it contains either in text).
0
 
LVL 12

Expert Comment

by:Phil Phillips
ID: 39819379
That example is actually using a single right-pointing angle quotation mark (&rsaquo;), which looks awfully similar to a >.  I don't think they're saying that you *have* to do it that way though - it just happened to be the example.

I'd shy away from <> when possible, but glad to hear you came up with something that works for you.
0
 

Author Comment

by:webstuck5
ID: 39820031
I now see that the Google example doesn't actually use > but now I am more confused. In the Google example's HTML source, it shows the symbol as ›. I put &rsaquo; on my page as suggested but it shows as &rsaquo; when I view the page's HTML source. What is the difference between my page and the Google example page?
0
 
LVL 12

Expert Comment

by:Phil Phillips
ID: 39820042
In the source, it'll show the raw html, so all of the codes (such as &rsaquo;) will be in their original form.

On the google example page, they used the actual symbol ›.
0
 

Author Comment

by:webstuck5
ID: 39820075
So, how can I use the actual › symbol? When I put it in my code, it showed as a weird question mark when I viewed the web page.
0
 
LVL 12

Accepted Solution

by:
Phil Phillips earned 500 total points
ID: 39820093
As far as Google is concerned, I'm sure &rsaquo; is fine.  If you *really* want to use the actual symbol, you'd have to save the file with UTF-8 encoding.
0

Featured Post

How your wiki can always stay up-to-date

Quip doubles as a “living” wiki and a project management tool that evolves with your organization. As you finish projects in Quip, the work remains, easily accessible to all team members, new and old.
- Increase transparency
- Onboard new hires faster
- Access from mobile/offline

Join & Write a Comment

Suggested Solutions

Someone recently asked me about how to display a progress indicator on a page while an iframe is loading. And I remember when I first came across this myself. It was a bit tricky to get my head around, but really, it's very simple. The most impor…
This article demonstrates how to create a simple responsive confirmation dialog with Ok and Cancel buttons using HTML, CSS, jQuery and Promises
The viewer will the learn the benefit of plain text editors and code an HTML5 based template for use in further tutorials.
The viewer will learn how to create a basic form using some HTML5 and PHP for later processing. Set up your basic HTML file. Open your form tag and set the method and action attributes.: (CODE) Set up your first few inputs one for the name and …

757 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

19 Experts available now in Live!

Get 1:1 Help Now