Solved

Regex: Remove Subdomain from URL

Posted on 2011-03-20
Last Modified: 2012-05-11
I'm using PHP 5.x and I need to remove the "http://subdomain." prefix from my URLs. The subdomain can contain letters [a-zA-Z] and numbers [0-9].

I've got this much,
<?php
      \$full_URL = get_bloginfo('wpurl') ;
      \$http_URL = str_replace(\"http://www.\",\"\",\$full_URL) ;
      \$sub_URL = str_replace(\"http://\",\"\",\$http_URL) ;
      \$root_URL = str_replace( ?? ) ;
?>

but that leaves this,
      subdomain.root.com

I need to also remove the subdomain and the dot "." so what remains is,
      root.com

Thanks for your help.
Question by:WizeOwl
22 Comments
 
LVL 39

Accepted Solution

by:
Pratima Pharande earned 300 total points
ID: 35178518
function strip_out_subdomain($domain)
{
    $only_my_domain = preg_replace("/^(.*?)\.(.*)$/", "$2", $domain);
    return $only_my_domain;
}
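To see what this pattern actually does, here is a small sketch that runs the same function over a few inputs. The sample host names are assumptions made up for illustration. The lazy `(.*?)` stops at the first dot, so everything up to and including that dot is dropped, whether or not the "http://" prefix is still attached.

```php
<?php
// Same logic as the function in the accepted answer.
function strip_out_subdomain($domain)
{
    return preg_replace('/^(.*?)\.(.*)$/', '$2', $domain);
}

// Hypothetical sample inputs, not taken from the asker's site.
$samples = array(
    'subdomain.root.com',
    'http://subdomain.root.com',
    'www.root.com',
    'root.com', // no subdomain: the pattern still strips the first label
);

foreach ($samples as $sample) {
    echo $sample . ' => ' . strip_out_subdomain($sample) . "\n";
}
// subdomain.root.com => root.com
// http://subdomain.root.com => root.com
// www.root.com => root.com
// root.com => com
```

Note the last case: the function assumes a subdomain is always present, which matches the two URL shapes the asker describes but would need a guard for a bare domain.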
0
 

Author Comment

by:WizeOwl
ID: 35178541
Thanks, but can you convert that to the "str_replace" syntax with proper escape characters where needed? This code resides inside a Wordpress "page" using the execphp plugin.
0
 
LVL 10

Expert Comment

by:ollyatstithians
ID: 35179650
You won't be able to do this using only str_replace() unless you know what the exact subdomain string is. For more complex replacement rules (i.e. regex) you have to use preg_replace().

If you do know the subdomain name you are trying to strip out, you could do this:

$subdomain = 'mysubdomain';
$mynewurl = str_replace("http://", '', $myurl);
$mynewurl = str_replace("$subdomain.", '', $mynewurl);

I do appreciate that if execphp does not support preg functions then this leaves you a little stuck.
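The difference between the two functions can be shown with a short sketch. The host name and the pattern here are assumptions chosen for illustration: str_replace() searches for its first argument as literal text, while preg_replace() interprets it as a pattern.

```php
<?php
$host = 'subdomain.root.com'; // hypothetical sample value

// str_replace() looks for the literal text "^[a-z0-9]+\." which never
// occurs in the host name, so nothing is replaced.
$literal = str_replace('^[a-z0-9]+\.', '', $host);

// preg_replace() treats the same text (wrapped in ~ delimiters) as a
// pattern: one leading label followed by its dot.
$pattern = preg_replace('~^[a-z0-9]+\.~i', '', $host);

echo $literal . "\n"; // subdomain.root.com
echo $pattern . "\n"; // root.com
```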
0

 
LVL 10

Expert Comment

by:ollyatstithians
ID: 35179662
That said, I can't see why you can't use preg unless it is not available in your php installation (which is unlikely).
0
 
LVL 13

Expert Comment

by:darren-w-
ID: 35179996
Perhaps use substr ?:

<?php
      $url = "http://devd.domainnamffe.co.uk";
        echo substr ($url,strpos($url,".")+1)."<br>";
        $url = "devd.domainnamffe.co.uk";
        echo substr ($url,strpos($url,".")+1);

?>
0
 
LVL 13

Expert Comment

by:darren-w-
ID: 35180393
PS: ignore the above, it's incorrect.
0
 
LVL 75

Expert Comment

by:käµfm³d 👽
ID: 35180610
What about something like this?
<?php
      $full_URL = get_bloginfo('wpurl') ;
      $pos = strpos($full_URL, "root.com");
      $clean_URL = substr($full_URL, $pos);
?>


0
 

Author Comment

by:WizeOwl
ID: 35184592
The problem with the above solution is I don't know what the root string is. It is not "root.com", and it can be anything, depending on what domain the script is running on. This needs to be generic.

So, basically, here's what I have so far:
 - I have stripped the http:// from the URL which provides $sub_URL
 - Now I need to strip all characters from the left to the next "dot".

This:  
      subdomain.root.com

Needs to become this:
      root.com

I just don't know the syntax to do this. Also, since it's being passed from within a Wordpress page, all the special characters need to be escaped.

      \$root_URL = str_replace ( \"\^\[a-z,A-Z,0-9\]\",\"\",\$sub_URL ) ;      // All alpha characters
      \$root_URL = str_replace( \"\.\",\"\",\$root_URL ) ;                            // the "dot"
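"Strip everything from the left up to the next dot" does not actually need a regular expression. A minimal sketch using strpos() and substr() follows; the function name strip_first_label() and the sample URL are assumptions for illustration, and the code is shown unescaped (add the backslashes if it is stored inside a double-quoted string).

```php
<?php
// Hypothetical helper: drop everything up to and including the first dot.
function strip_first_label($host)
{
    $dot = strpos($host, '.');
    if ($dot === false) {
        return $host; // no dot at all, nothing to strip
    }
    return substr($host, $dot + 1);
}

// Stand-in for the value the asker already has after removing "http://".
$sub_URL  = str_replace('http://', '', 'http://subdomain.root.com');
$root_URL = strip_first_label($sub_URL);

echo $root_URL . "\n"; // root.com
```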


0
 

Author Comment

by:WizeOwl
ID: 35184640
Perhaps I can rephrase the question...

I need to isolate the "root" domain. Perhaps there is another way to do this of which I am not aware, such as with a Wordpress function, or internal PHP method.

From this:
      http://subdomain.root.com

I need to access this:
      root.com
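One built-in option worth considering is parse_url(), which extracts the host regardless of scheme, port, or path. The sketch below assumes a hard-coded sample URL in place of get_bloginfo('wpurl'), and the helper name root_from_url() is hypothetical. It still assumes exactly one subdomain label is present.

```php
<?php
// Hypothetical helper: isolate the host with parse_url(), then drop the
// first label. Returns null when the URL has no recognisable host.
function root_from_url($url)
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host)) {
        return null;
    }
    // Split into "first label" and "everything after the first dot".
    $parts = explode('.', $host, 2);
    return count($parts) === 2 ? $parts[1] : $host;
}

echo root_from_url('http://subdomain.root.com') . "\n";                    // root.com
echo root_from_url('https://www.root.com:8080/blog/?page=2') . "\n";       // root.com
```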

0
 
LVL 75

Expert Comment

by:käµfm³d 👽
ID: 35184664
....

But you're trying to use regular expressions with a function that doesn't use such. Or am I not understanding how str_replace works?


>>  Now I need to strip all characters from the left to the next "dot"

How would you intend to handle multiple sub-domains? For instance:  www.hub1.example.com ?
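If multiple subdomain levels did need handling, one possible approach is to keep only the last two labels instead of removing the first one. This is a sketch under the assumption that the registrable domain is always two labels long, which is false for suffixes such as .co.uk; the function name last_two_labels() is hypothetical.

```php
<?php
// Hypothetical helper: keep the last two dot-separated labels of a host.
function last_two_labels($host)
{
    $labels = explode('.', $host);
    return implode('.', array_slice($labels, -2));
}

echo last_two_labels('www.hub1.example.com') . "\n"; // example.com
echo last_two_labels('example.com') . "\n";          // example.com
echo last_two_labels('www.example.co.uk') . "\n";    // co.uk (wrong: shows the limitation)
```

Handling multi-part suffixes correctly would require a list of known public suffixes, which is outside what this sketch attempts.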
0
 

Author Comment

by:WizeOwl
ID: 35186636
Please re-read all my posts from the beginning. It answers your question.
0
 

Author Comment

by:WizeOwl
ID: 35186994
Also, I only need to handle the following types of URLs:

http://www.Root.com
and
http://subdomain.Root.com
0
 
LVL 10

Assisted Solution

by:ollyatstithians
ollyatstithians earned 100 total points
ID: 35187646
OK, how about this:
Assuming that the domain is a .com (or another TLD without reserved second-level domains, so NOT .co.uk or similar), you can use:

$rootdomain = preg_replace('~\.([a-z]+\.[a-z]+)~i', '$1', $url);

so if you want to eval() that code (which is what I think your plugin is doing) you need to escape the quotes:

$stringtoeval = '$rootdomain = preg_replace(\'~\.([a-z]+\.[a-z]+)~i\', \'$1\', $url);';

You should only need to escape $ characters in double quoted strings.

Just to reiterate what kaufmed said: You cannot use regular expressions with str_replace(). At all.
0
 
LVL 10

Expert Comment

by:ollyatstithians
ID: 35187664
Here is a regex that does it the other way round (i.e. it loses the lowest order subdomain from the url):

preg_replace('~[a-z]\.((?:[a-z]+\.)+[a-z]+)$~i', '$1', $url);
0
 
LVL 75

Expert Comment

by:käµfm³d 👽
ID: 35190295
>>  Please re-read all my posts from the beginning. It answers your question.

Perhaps you should re-read mine. pratima_mcs gave you an example of how to use preg_replace and you asked to convert it to str_replace. Again I ask, is there some magic about str_replace you are privy to that the rest of the world is not?
0
 

Author Comment

by:WizeOwl
ID: 35194800
ollyatstithians, your two suggestions do not seem to be working, or perhaps I'm not using it correctly. It's not stripping the subdomain. Can you suggest a fix for it?

<?php
      \$full_URL = get_bloginfo( 'wpurl' ) ;
      \$strip_WWW = str_replace( 'http://www.','',\$full_URL ) ;
      \$strip_HTTP = str_replace( 'http://','',\$strip_WWW ) ;

      \$rootDomain1 = preg_replace( '~.([a-z]+.[a-z]+)~i', '\$1', \$strip_HTTP ) ;
      \$rootDomain2 = preg_replace( '~[a-z]\.((?:[a-z]+\.)+[a-z]+)\$~i', '\$1', \$strip_HTTP ) ;
?>

<?php echo \$full_URL ; ?>
<?php echo \$strip_HTTP ; ?>
<?php echo \$rootDomain1 ; ?>
<?php echo \$rootDomain2 ; ?>

[ I will comment on the other issues and suggestions by pratima_mcs and kaufmed in my next post. ]
0
 
LVL 10

Expert Comment

by:ollyatstithians
ID: 35196836
Looking at the Exec-PHP FAQ, I can't see any mention of having to escape $ characters. Have you tried it without the escapes?
I'll install the plugin and try it out myself.
0
 
LVL 10

Expert Comment

by:ollyatstithians
ID: 35196840
Also, please post the urls you tried that didn't work.
0
 
LVL 10

Expert Comment

by:ollyatstithians
ID: 35197036
Here is the code in the WP post and its outputted text.
I did get the regex a bit wrong, but there is no need to escape variable names.

The code:  
Normal text
<?php
  $url = 'http://www.donkey.bites.com';
  $rootdomain = preg_replace('~.*\.([a-z]+\.[a-z]+)~i', '$1', $url);
  $rootdomain2 = preg_replace('~.*\.((?:[a-z]+\.)+[a-z]+)$~i', '$1', $url);
?>
<p><?php echo $url; ?></p>
<p><?php echo $rootdomain; ?></p>
<p><?php echo $rootdomain2; ?></p>



The output:  
Normal text

http://www.donkey.bites.com

bites.com

bites.com


0
 
LVL 75

Assisted Solution

by:käµfm³d 👽
käµfm³d   👽 earned 100 total points
ID: 35197608
@ollyatstithians

>>  Here is the code in the WP post and its outputted text.

What about the site:  www.extreme1.com ?

Domain names' valid character list is:  a-zA-Z0-9-

You would need to modify the pattern to accommodate. All "[a-z]" should become "[a-z0-9-]" (leaving out A-Z since you added the "i" modifier).
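Applied to the earlier pattern, that change might look like the sketch below. The anchors (^ and $) and the variable names are my own additions for illustration, not part of the code posted above. The original [a-z] pattern leaves www.extreme1.com untouched because the digit breaks the match; the widened character class fixes that.

```php
<?php
// Pattern as posted earlier in the thread: letters only.
$lettersOnly = '~.*\.([a-z]+\.[a-z]+)~i';

// Widened to the characters valid in host name labels: letters, digits, hyphen.
// The ^ and $ anchors are an assumption added for this sketch.
$widened = '~^.*\.([a-z0-9-]+\.[a-z0-9-]+)$~i';

$host = 'www.extreme1.com';

echo preg_replace($lettersOnly, '$1', $host) . "\n"; // www.extreme1.com (no match, unchanged)
echo preg_replace($widened, '$1', $host) . "\n";     // extreme1.com
echo preg_replace($widened, '$1', 'shop.my-site.com') . "\n"; // my-site.com
```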
0
 
LVL 10

Expert Comment

by:ollyatstithians
ID: 35197975
kaufmed:
I agree. Well spotted.
0
 

Author Comment

by:WizeOwl
ID: 35204719
Thanks to everyone for their input.

Here's the solution I was able to come up with:

SOLUTION

ConfigFile.php
(this file contains variables used by the main script to create the Wordpress page.)
$pageContent = "(variable definition begins with double quotes) Then lots of text including HTML tags.
 Then the PHP code as follows, which must be rendered as text on the web page (using the Exec-PHP plugin)...
<?php
	\$full_URL = get_bloginfo( 'wpurl' ) ;
	\$strip_www = str_replace( 'http://www.','',\$full_URL ) ;
	\$strip_http = str_replace( 'http://','',\$strip_www ) ;
	\$rootDomain = preg_replace( '/^(.*?)\.(.*)\$/','\$2', \$full_URL ) ;

	echo '<br />' . \$strip_www ;
	echo '<br />' . \$strip_http ;
	echo '<br />' . \$rootDomain ; 
?>

More page text. Variable definition terminated with closing double quotes."



NOTES

1. Although neither piece of code provided by ollyatstithians worked, those contributions helped me sort out the regular expression syntax and determine which characters needed to be escaped. I was then able to apply that knowledge to the original suggestion by pratima_mcs.

Thanks also to kaufmed for valuable contributions.

2. Single quotes do not need to be escaped, as long as the surrounding block uses double quotes, e.g. $myVariable = "text & html content plus php code".
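The reason the backslashes are needed in this setup appears to be PHP's own string rules rather than anything specific to the plugin: inside a double-quoted string, an unescaped $name is interpolated when the string is built. A minimal sketch, with made-up variable contents:

```php
<?php
$full_URL = 'VALUE-AT-DEFINITION-TIME'; // hypothetical placeholder value

// Escaped: the backslash keeps "$full_URL" as literal text in the string.
$escaped = "echo \$full_URL;";

// Unescaped: PHP substitutes the variable's current value immediately.
$unescaped = "echo $full_URL;";

// Single-quoted strings never interpolate, so no backslash is needed.
$single = 'echo $full_URL;';

echo $escaped . "\n";   // echo $full_URL;
echo $unescaped . "\n"; // echo VALUE-AT-DEFINITION-TIME;
echo $single . "\n";    // echo $full_URL;

// The regex from the solution: "\$" becomes "$", while "\." is left as is.
$regex = "/^(.*?)\.(.*)\$/";
echo $regex . "\n";     // /^(.*?)\.(.*)$/
```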
0
