Regex: Remove Subdomain from URL

I'm using PHP 5.x and I need to remove the http://subdomain. prefix from my URLs. The subdomain can contain letters [a-zA-Z] and digits [0-9].

I've got this much,
<?php
      \$full_URL = get_bloginfo('wpurl') ;
      \$http_URL = str_replace(\"http://www.\",\"\",\$full_URL) ;
      \$sub_URL = str_replace(\"http://\",\"\",\$http_URL) ;
      \$root_URL = str_replace( ?? ) ;
?>

but that leaves this,
      subdomain.root.com

I need to also remove the subdomain and the dot "." so what remains is,
      root.com

Thanks for your help.
Asked by WizeOwl

Pratima Pharande commented:
function strip_out_subdomain($domain)
{
    $only_my_domain = preg_replace("/^(.*?)\.(.*)$/", "$2", $domain);
    return $only_my_domain;
}
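A minimal sketch of how this pattern behaves, assuming the input has exactly one label in front of the root domain. It removes everything up to and including the first dot, so a bare root domain with no subdomain loses its name as well. The sample host names are illustrative only.

```php
<?php
// Same pattern as above, written with single quotes so PHP does no string interpolation.
function strip_out_subdomain($domain)
{
    // (.*?) lazily matches up to the first dot; $2 is everything after that dot.
    return preg_replace('/^(.*?)\.(.*)$/', '$2', $domain);
}

echo strip_out_subdomain('subdomain.root.com'), "\n"; // root.com
echo strip_out_subdomain('www.root.com'), "\n";       // root.com
echo strip_out_subdomain('root.com'), "\n";           // com (no subdomain to strip)
```

The last case shows why the input needs to be checked first if the site might ever run on the bare root domain.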

WizeOwl (author) commented:
Thanks, but can you convert that to the "str_replace" syntax with proper escape characters where needed? This code resides inside a Wordpress "page" using the execphp plugin.

ollyatstithians commented:
You won't be able to do this using only str_replace() unless you know what the exact subdomain string is. For more complex replacement rules (ie. regex) you have to use preg_replace().

If you do know the subdomain name you are trying to strip out, you could do this:

$subdomain = 'mysubdomain';
$mynewurl = str_replace("http://", '', $myurl);
$mynewurl = str_replace("$subdomain.", '', $mynewurl);

I do appreciate that if execphp does not support preg functions then this leaves you a little stuck.

ollyatstithians commented:
That said, I can't see why you can't use preg unless it is not available in your php installation (which is unlikely).

darren-w- commented:
Perhaps use substr ?:

<?php
      $url = "http://devd.domainnamffe.co.uk";
        echo substr ($url,strpos($url,".")+1)."<br>";
        $url = "devd.domainnamffe.co.uk";
        echo substr ($url,strpos($url,".")+1);

?>

darren-w- commented:
P.S. Ignore the above; it's incorrect.

käµfm³d 👽 commented:
What about something like this?
<?php
      $full_URL = get_bloginfo('wpurl') ;
      $pos = strpos($full_URL, "root.com");
      $clean_URL = substr($full_URL, $pos);
?>

WizeOwl (author) commented:
The problem with the above solution is I don't know what the root string is. It is not "root.com", and it can be anything, depending on what domain the script is running on. This needs to be generic.

So, basically, here's what I have so far:
 - I have stripped the http:// from the URL which provides $sub_URL
 - Now I need to strip all characters from the left to the next "dot".

This:  
      subdomain.root.com

Needs to become this:
      root.com

I just don't know the syntax to do this. Also, since it's being passed from within a Wordpress page, all the special characters need to be escaped.

      \$root_URL = str_replace ( \"\^\[a-z,A-Z,0-9\]\",\"\",\$sub_URL ) ;      // All alpha characters
      \$root_URL = str_replace( \"\.\",\"\",\$root_URL ) ;                            // the "dot"



WizeOwl (author) commented:
Perhaps I can rephrase the question...

I need to isolate the "root" domain. Perhaps there is another way to do this of which I am not aware, such as with a Wordpress function, or internal PHP method.

From this:
      http://subdomain.root.com

I need to access this:
      root.com
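One way to do this with built-in PHP functions and no regex is sketched below. It assumes the root domain is always the last two labels of the host (so it would not suit .co.uk and similar), and the function name `root_domain` is hypothetical, not a WordPress or PHP built-in.

```php
<?php
// Hypothetical helper: returns the last two dot-separated labels of the host.
// Assumption: the root domain is exactly two labels (e.g. root.com, not root.co.uk).
function root_domain($url)
{
    // parse_url() returns the host when the URL has a scheme, otherwise null.
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host)) {
        $host = $url; // no scheme present, treat the input as a bare host name
    }
    $labels = explode('.', strtolower($host));
    return implode('.', array_slice($labels, -2));
}

echo root_domain('http://subdomain.Root.com'), "\n"; // root.com
echo root_domain('http://www.Root.com'), "\n";       // root.com
echo root_domain('subdomain.root.com'), "\n";        // root.com
```

Because it keeps the last two labels instead of dropping the first one, this variant also leaves a bare root domain unchanged.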


käµfm³d 👽 commented:
....

But you're trying to use regular expressions with a function that doesn't support them. Or am I not understanding how str_replace works?


>>  Now I need to strip all characters from the left to the next "dot"

How would you intend to handle multiple sub-domains? For instance:  www.hub1.example.com ?

WizeOwl (author) commented:
Please re-read all my posts from the beginning. It answers your question.

WizeOwl (author) commented:
Also, I only need to handle the following types of URLs:

http://www.Root.com
and
http://subdomain.Root.com

ollyatstithians commented:
OK, how about this:
Assuming that the domain is a .com (or a tld with a non reserved sub-domain, so NOT .co.uk or similar) you can use:

$rootdomain = preg_replace('~\.([a-z]+\.[a-z]+)~i', '$1', $url);

so if you want to eval() that code (which is what I think your plugin is doing) you need to escape the quotes:

$stringtoeval = '$rootdomain = preg_replace(\'~\.([a-z]+\.[a-z]+)~i\', \'$1\', $url);';

You should only need to escape $ characters in double quoted strings.

Just to reiterate what kaufmed said: You cannot use regular expressions with str_replace(). At all.
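The quoting rule can be checked directly. The sketch below builds the same line of code once as a single-quoted string and once as a double-quoted string and compares the two; it illustrates PHP string escaping only and makes no assumption about what the plugin itself does with the text.

```php
<?php
// Single quotes: only \' and \\ are escapes, so $ and \. pass through untouched.
$single = '$rootdomain = preg_replace(\'~\.([a-z]+\.[a-z]+)~i\', \'$1\', $url);';

// Double quotes: every $ must be written as \$ and each backslash doubled,
// otherwise PHP tries to interpolate $rootdomain and $url.
$double = "\$rootdomain = preg_replace('~\\.([a-z]+\\.[a-z]+)~i', '\$1', \$url);";

echo $single, "\n";
var_dump($single === $double); // bool(true)
```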

ollyatstithians commented:
Here is a regex that does it the other way round (ie. it loses the lowest order subdomain from the url):

preg_replace('~[a-z]\.((?:[a-z]+\.)+[a-z]+)$~i', '$1', $url);

käµfm³d 👽 commented:
>>  Please re-read all my posts from the beginning. It answers your question.

Perhaps you should re-read mine. pratima_mcs gave you an example of how to use preg_replace and you asked to convert it to str_replace. Again I ask, is there some magic about str_replace you are privy to that the rest of the world is not?

WizeOwl (author) commented:
ollyatstithians, your two suggestions do not seem to be working, or perhaps I'm not using it correctly. It's not stripping the subdomain. Can you suggest a fix for it?

<?php
      \$full_URL = get_bloginfo( 'wpurl' ) ;
      \$strip_WWW = str_replace( 'http://www.','',\$full_URL ) ;
      \$strip_HTTP = str_replace( 'http://','',\$strip_WWW ) ;

      \$rootDomain1 = preg_replace( '~.([a-z]+.[a-z]+)~i', '\$1', \$strip_HTTP ) ;
      \$rootDomain2 = preg_replace( '~[a-z]\.((?:[a-z]+\.)+[a-z]+)\$~i', '\$1', \$strip_HTTP ) ;
?>

<?php echo \$full_URL ; ?>
<?php echo \$strip_HTTP ; ?>
<?php echo \$rootDomain1 ; ?>
<?php echo \$rootDomain2 ; ?>

[ I will comment on the other issues and suggestions by pratima_mcs and kaufmed in my next post. ]

ollyatstithians commented:
Looking at the Exec-PHP FAQ, I can't see any mention of having to escape $ characters. Have you tried it without the escapes?
I'll install the plugin and try it out myself.

ollyatstithians commented:
Also, please post the urls you tried that didn't work.

ollyatstithians commented:
Here is the code in the WP post and its outputted text.
I did get the regex a bit wrong, but there is no need to escape variable names.

The code:  
Normal text
<?php
  $url = 'http://www.donkey.bites.com';
  $rootdomain = preg_replace('~.*\.([a-z]+\.[a-z]+)~i', '$1', $url);
  $rootdomain2 = preg_replace('~.*\.((?:[a-z]+\.)+[a-z]+)$~i', '$1', $url);
?>
<p><?php echo $url; ?></p>
<p><?php echo $rootdomain; ?></p>
<p><?php echo $rootdomain2; ?></p>



The output:  
Normal text

http://www.donkey.bites.com

bites.com

bites.com

käµfm³d 👽 commented:
@ollyatstithians

>>  Here is the code in the WP post and its outputted text.

What about the site:  www.extreme1.com ?

Domain names' valid character list is:  a-zA-Z0-9-

You would need to modify the pattern to accommodate. All "[a-z]" should become "[a-z0-9-]" (leaving out A-Z since you added the "i" modifier).
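A short sketch of that change applied to the first pattern from the WP post above, comparing the two character classes on the www.extreme1.com case. The variable names are illustrative; as before, this assumes a two-label root domain.

```php
<?php
$url = 'http://www.extreme1.com';

// Original character class: letters only, so the digit in "extreme1" blocks the match.
$lettersOnly = preg_replace('~.*\.([a-z]+\.[a-z]+)~i', '$1', $url);

// Widened class: letters, digits and hyphens, per the valid-character list above.
$widened = preg_replace('~.*\.([a-z0-9-]+\.[a-z0-9-]+)~i', '$1', $url);

echo $lettersOnly, "\n"; // http://www.extreme1.com (no match, returned unchanged)
echo $widened, "\n";     // extreme1.com
```

When the pattern does not match, preg_replace() returns the subject unchanged, which is why the letters-only version appears to do nothing.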

ollyatstithians commented:
kaufmed:
I agree. Well spotted.

WizeOwl (author) commented:
Thanks to everyone for their input.

Here's the solution I was able to come up with:

SOLUTION

ConfigFile.php
(this file contains variables used by the main script to create the Wordpress page.)
$pageContent = "(variable definition begins with double quotes) Then lots of text including html tags.
 Then the php code as follows that must be rendered as text on the web page (using php-exec plugin)...
<?php
	\$full_URL = get_bloginfo( 'wpurl' ) ;
	\$strip_www = str_replace( 'http://www.','',\$full_URL ) ;
	\$strip_http = str_replace( 'http://','',\$strip_www ) ;
	\$rootDomain = preg_replace( '/^(.*?)\.(.*)\$/','\$2', \$full_URL ) ;

	echo '<br />' . \$strip_www ;
	echo '<br />' . \$strip_http ;
	echo '<br />' . \$rootDomain ; 
?>

More page text. Variable definition terminated with closing double quotes."



NOTES

1. Although neither piece of code provided by ollyatstithians worked for me, those contributions helped me sort out the regular expression syntax and determine which characters needed to be escaped. I was then able to apply that knowledge to the original suggestion by pratima_mcs.

Thanks also to kaufmed for valuable contributions.

2. Single quotes do not need to be escaped, as long as the surrounding block uses double quotes, e.g. $myVariable = "text & html content plus php code".
Question has a verified solution.
