Alicia St Rose
asked on
Want to use Regex to dynamically encode ampersand in urls
Hi!
I've been scouring the web for an answer and I think my limitation is that I'm not that familiar with Regular Expressions and how they work. Especially, how to add the code to my loop or template file.
I found this code:
text = Regex.Replace(text, @"
# Match & that is not part of an HTML entity.
& # Match literal &.
(?! # But only if it is NOT...
\w+; # an alphanumeric entity,
| \#[0-9]+; # or a decimal entity,
| \#x[0-9A-F]+; # or a hexadecimal entity.
) # End negative lookahead.",
"&amp;",
RegexOptions.IgnoreCase | RegexOptions.IgnorePatternWhitespace);
But I don't know how to add it to my file. I have a custom field for an Indiebound link. Most of the links contain an ampersand, so the code isn't validating. Here is the section of code:
if ( ! is_active_sidebar( 'sidebar-books' ) ) {
return;
}
?>
<div id="buy-it" class="widget-area" role="complementary">
<?php if (is_single() && is_post_type('book')) : ?>
<aside class="beige buy">
<h2>Buy It!</h2>
<ul>
<?php if(get_field('indie_bookstores_link') !=false) { ?>
<li><a href="<?php the_field('indie_bookstores_link'); ?>" target="_blank"><img src="<?php bloginfo('url'); ?>/wp-content/uploads/2015/08/indiebound.png"></a></li><?php } ?>
<?php if(get_field('amazon_link') !=false) { ?>
<li><a href="<?php the_field('amazon_link'); ?>" target="_blank"><img src="<?php bloginfo('url'); ?>/wp-content/uploads/2015/08/amazon.png"></a></li><?php } ?>
<?php if(get_field('barnes_n_noble_link') !=false) { ?>
<li><a href="<?php the_field('barnes_n_noble_link'); ?>" target="_blank"><img src="<?php bloginfo('url'); ?>/wp-content/uploads/2015/08/barnes-noble.png"></a></li><?php } ?>
</ul>
</aside>
<?php endif; ?>
</div><!-- #buy-it -->
Maybe the best way to ask this question would be to show us exactly what you have for input and exactly what you want for output. Ampersands are an "overloaded" character -- they have different meanings in different contexts, and there may be PHP functions that already address the context you are using. But to know that, we would have to see the inputs and outputs.
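To illustrate the "different contexts" point, here is a minimal sketch in plain PHP. The search title and the example.com URL are made up for illustration. An ampersand that is data inside a query-string value gets percent-encoded, while an ampersand that separates parameters becomes an entity once the finished URL is written into an HTML attribute.

```php
<?php
// Hypothetical value a visitor might type into a search box.
$title = 'Tom & Jerry';

// Context 1: the ampersand is data inside a query-string value,
// so urlencode() percent-encodes it.
$query = 'searchfor=' . urlencode($title) . '&x=0';

// Context 2: the finished URL goes into an HTML attribute,
// so the separator ampersand is written as an entity.
$href = htmlspecialchars('http://example.com/search?' . $query, ENT_QUOTES, 'UTF-8');

echo $query, "\n"; // searchfor=Tom+%26+Jerry&x=0
echo $href, "\n";  // http://example.com/search?searchfor=Tom+%26+Jerry&amp;x=0
```

The browser decodes the entity back to a plain ampersand before it requests the URL, so the link target is unchanged.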
ASKER
I'm not a RegEx guru, but for designing and testing RegEx, I use the free tool Expresso.
frankhelk, thank you for the suggestion. I'll look into it.
Maybe the best way to ask this question would be to show us exactly what you have for input and exactly what you want for output.
Ray Paseur, here are a couple of examples of the links that have been added to the custom fields:
http://www.indiebound.org/search/book?searchfor=bruce+hale+chameleon+wore+chartreuse&x=0&y=0
http://www.amazon.com/Chameleon-Wore-Chartreuse-Gecko-Mystery/dp/0152024859/ref=sr_1_4?s=books&ie=UTF8&qid=1440576439&sr=1-4&refinements=p_82%3AB000APLXEC
They aren't validating in the W3C validator.
This one does not validate, but there is no instance of "amp" in the validator output.
https://validator.w3.org/check?uri=http%3A%2F%2Fwww.indiebound.org%2Fsearch%2Fbook%3Fsearchfor%3Dbruce%2Bhale%2Bchameleon%2Bwore%2Bchartreuse%26x%3D0%26y%3D0&charset=%28detect+automatically%29&doctype=Inline&group=0
The Amazon.com page does not validate either, but that may be more a matter of sensitivity of the W3 validator than an indication that anything is wrong. It's common to see URLs with ampersands in them. What kind of failure is this causing?
ASKER
Hi Ray,
It's not the page on the Indiebound site I'm trying to validate. It's this page on a site I'm building:
https://validator.w3.org/nu/?doc=http%3A%2F%2Fsandbox.intrepidrealist.com%2Fbruce-hale%2Fbooks%2Fchet-gecko-series%2Fthe-chameleon-wore-chartreuse-chet-gecko-mystery-no-1%2F
The links to indiebound and Amazon are causing errors because of the ampersand. I need to encode them apparently. But I want this to happen dynamically, because my client isn't going to remember to do it. And I've already got loads of these links all over the site for his books:
http://sandbox.intrepidrealist.com/bruce-hale
Have you tried translating the URLs with this?
http://php.net/manual/en/function.htmlspecialchars.php
I am not suggesting that is needed, just that it already exists and is what we usually use to "entitize" the special characters. I'm not sure that it's always needed, but it may be enough to satisfy your requirements. Just a thought.
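As a rough sketch of that suggestion: `htmlspecialchars()` takes a fourth argument, `double_encode`, and passing `false` leaves entities that are already present alone. The wrapper name `entitize_url` is hypothetical, and the sample URL is shortened from the Indiebound link posted above.

```php
<?php
/**
 * Entitize a URL for use inside an HTML attribute.
 * Hypothetical helper name, used only for this illustration.
 */
function entitize_url(string $url): string
{
    // double_encode = false: an existing "&amp;" is not turned into "&amp;amp;".
    return htmlspecialchars($url, ENT_QUOTES, 'UTF-8', false);
}

echo entitize_url('http://www.indiebound.org/search/book?searchfor=bruce+hale&x=0&y=0'), "\n";
// http://www.indiebound.org/search/book?searchfor=bruce+hale&amp;x=0&amp;y=0
```

In the template posted in the question, this would presumably mean echoing `entitize_url(get_field('indie_bookstores_link'))` instead of calling `the_field('indie_bookstores_link')`. WordPress also ships `esc_url()`, which is intended for this kind of output and may be the simpler choice.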
ASKER
Hi Ray,
I found this code in the comments section of the page you linked to. It looks like what I need, though it's been voted down!
<?php
function formspecialchars($var)
{
$pattern = '/&(#)?[a-zA-Z0-9]{0,};/';
if (is_array($var)) { // If variable is an array
$out = array(); // Set output as an array
foreach ($var as $key => $v) {
$out[$key] = formspecialchars($v); // Run formspecialchars on every element of the array and return the result. Also maintains the keys.
}
} else {
$out = $var;
while (preg_match($pattern,$out) > 0) {
$out = htmlspecialchars_decode($out,ENT_QUOTES);
}
$out = htmlspecialchars(stripslashes(trim($out)), ENT_QUOTES,'UTF-8',true); // Trim the variable, strip all slashes, and encode it
}
return $out;
}
?>
My issue is not knowing where to put the code!!
I'm still green on some things, I guess! ;)
Hmm... I am still not seeing a failure when I click links to Indiebound or Amazon. Can you please post a link to a page that illustrates the issue? Thanks.
ASKER
Hi Ray,
It has nothing to do with clicking the links. The links on this page do not validate in W3C because they have ampersands. Can you please tell me how to dynamically remove those ampersands and replace with HTML entities? It looks like I have to do it with regular expressions.
Are you not able to see the W3C errors on the link below? Numbers 7 through 12 are errors.
https://validator.w3.org/nu/?doc=http%3A%2F%2Fsandbox.intrepidrealist.com%2Fbruce-hale%2Fbooks%2Fchet-gecko-series%2Fthe-chameleon-wore-chartreuse-chet-gecko-mystery-no-1%2F
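For reference, the regular-expression approach from the first post can be ported to PHP with `preg_replace()`. This is a minimal sketch, not a tested WordPress solution: the function name `encode_bare_ampersands` is hypothetical, and the pattern is the same negative lookahead as the C# snippet quoted in the question.

```php
<?php
/**
 * Encode ampersands that are not already part of an HTML entity.
 * Hypothetical helper; the pattern mirrors the regex quoted in the question.
 */
function encode_bare_ampersands(string $text): string
{
    // Match "&" unless it starts a named, decimal, or hexadecimal entity.
    $pattern = '/&(?!\w+;|#[0-9]+;|#x[0-9a-f]+;)/i';

    // preg_replace() returns null on error; fall back to the input in that case.
    return preg_replace($pattern, '&amp;', $text) ?? $text;
}

$link = 'http://www.indiebound.org/search/book?searchfor=bruce+hale+chameleon+wore+chartreuse&x=0&y=0';
echo encode_bare_ampersands($link), "\n";
// http://www.indiebound.org/search/book?searchfor=bruce+hale+chameleon+wore+chartreuse&amp;x=0&amp;y=0
```

One plausible place for such a function is the theme's `functions.php`; the template would then echo `encode_bare_ampersands(get_field('indie_bookstores_link'))` inside the `href` attribute instead of calling `the_field()`. Because output is produced at render time, links already stored in the custom fields would not need to be edited.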
ASKER CERTIFIED SOLUTION
ASKER
I'm accepting your solution because you are giving me permission to cry uncle on this one!
Thanks!
:-)
I think you're on firm ground. Best of luck with the project! ~Ray
I'm not a RegEx guru, but for designing and testing RegEx, I use the free tool Expresso. Given some basic understanding of RegEx, it's nice for learning and experimenting, too.