If you want to remove the part of the domain name that is administered by domain name registrars, you will need a list of such suffixes, such as the Public Suffix List.
But since walking through this list and testing every suffix against the domain name is not very efficient, use the list only to build an index like this:
$tlds = array(
    // ac : http://en.wikipedia.org/wiki/.ac
    'ac',
    'com.ac',
    'edu.ac',
    'gov.ac',
    'net.ac',
    'mil.ac',
    'org.ac',
    // ad : http://en.wikipedia.org/wiki/.ad
    'ad',
    'nom.ad',
    // …
);
$tldIndex = array_flip($tlds);
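Rather than typing the list by hand, the index could be generated from the text of the list itself. The following is only a minimal sketch: `buildSuffixIndex` is a hypothetical helper name, the sample text is made up for illustration, and wildcard (`*.`) and exception (`!`) rules are simply skipped, so a real implementation would need to handle them.

```php
<?php
// Hypothetical helper: turn the text of a suffix list into a lookup index.
function buildSuffixIndex(string $listText): array
{
    $index = array();
    foreach (preg_split('/\R/', $listText) as $line) {
        $rule = strtolower(trim($line));
        // Skip blank lines and comment lines.
        if ($rule === '' || strpos($rule, '//') === 0) {
            continue;
        }
        // Simplification (assumption): wildcard and exception rules are ignored.
        if ($rule[0] === '*' || $rule[0] === '!') {
            continue;
        }
        // Store the suffix as a key so that isset() can be used for lookups.
        $index[$rule] = true;
    }
    return $index;
}

// Illustrative sample only, not the real list.
$sample = "// ac\nac\ncom.ac\n\n// ad\nad\nnom.ad\n*.ck\n!www.ck\n";
$tldIndex = buildSuffixIndex($sample);
```

The result has the same shape as the `array_flip($tlds)` index in the sense that every suffix is a key, which is all that the `isset()` lookup relies on.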
Searching for the best match would then go like this:
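$levels = explode('.', $domain);
$n = count($levels);
$length = 0; // number of trailing labels that form a known suffix
while ($length < $n) {
    $candidate = implode('.', array_slice($levels, $n - $length - 1));
    if (!isset($tldIndex[$candidate])) {
        break;
    }
    $length++;
}
// Use non-negative offsets: array_slice($levels, -0) would return the whole array.
$suffix = implode('.', array_slice($levels, $n - $length));
$prefix = implode('.', array_slice($levels, 0, $n - $length));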
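The loop above stops at the first candidate that is missing from the index, so it relies on every shorter suffix also being listed. If that cannot be guaranteed, one possible variation is to test the longest candidate first and shorten it until something matches. This is a sketch only: `splitBySuffixIndex` is a hypothetical name, and the sample index is illustrative.

```php
<?php
// Hypothetical helper: return array($prefix, $suffix) for $domain.
function splitBySuffixIndex(string $domain, array $tldIndex): array
{
    $levels = explode('.', strtolower($domain));
    $n = count($levels);
    // Try the longest candidate first so a gap in the list does not stop the search.
    for ($start = 0; $start < $n; $start++) {
        $candidate = implode('.', array_slice($levels, $start));
        if (isset($tldIndex[$candidate])) {
            $prefix = implode('.', array_slice($levels, 0, $start));
            return array($prefix, $candidate);
        }
    }
    // No known suffix: the whole name is returned as the prefix.
    return array(implode('.', $levels), '');
}

// Illustrative data only, in the shape produced by array_flip($tlds).
$tldIndex = array_flip(array('ac', 'com.ac', 'ad', 'nom.ad'));

list($prefix, $suffix) = splitBySuffixIndex('www.example.com.ac', $tldIndex);
// $prefix is 'www.example', $suffix is 'com.ac'
```

The trade-off is that up to one lookup per label is always attempted from the long end, which is negligible for typical domain names.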
Or build a tree that represents the hierarchy of the domain name levels as follows:
$tldTree = array(
    // ac : http://en.wikipedia.org/wiki/.ac
    'ac' => array(
        'com' => true,
        'edu' => true,
        'gov' => true,
        'net' => true,
        'mil' => true,
        'org' => true,
    ),
    // ad : http://en.wikipedia.org/wiki/.ad
    'ad' => array(
        'nom' => true,
    ),
    // …
);
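A tree of this shape does not have to be written by hand either; it could be derived from a flat list of suffixes. The sketch below assumes a hypothetical helper named `buildSuffixTree` and uses a shortened, illustrative input list. Leaves are stored as `true` and inner levels as arrays, matching the structure shown above.

```php
<?php
// Hypothetical helper: build the nested tree from a flat list of suffixes.
function buildSuffixTree(array $suffixes): array
{
    $tree = array();
    foreach ($suffixes as $suffix) {
        // Walk the labels from the top level down, e.g. 'com.ac' => ac, com.
        $labels = array_reverse(explode('.', $suffix));
        $last = count($labels) - 1;
        $node = &$tree;
        foreach ($labels as $i => $label) {
            if ($i === $last) {
                // End of the suffix: add a leaf, but keep any existing children.
                if (!isset($node[$label])) {
                    $node[$label] = true;
                }
            } else {
                // An inner label must be an array so that it can hold children.
                if (!isset($node[$label]) || $node[$label] === true) {
                    $node[$label] = array();
                }
                $node = &$node[$label];
            }
        }
        unset($node); // break the reference before the next suffix
    }
    return $tree;
}

// Illustrative input only.
$tldTree = buildSuffixTree(array('ac', 'com.ac', 'edu.ac', 'ad', 'nom.ad'));
```

Note that in this representation every key on a path counts as a suffix, so a parent level is treated as listed even when only one of its children appears in the input.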
Then you can use the following to find the match:
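$levels = explode('.', $domain);
$n = count($levels);
$node = $tldTree;
$length = 0; // number of trailing labels that form a known suffix
foreach (array_reverse($levels) as $level) {
    // A leaf is stored as true, so only descend while $node is still an array.
    if (!is_array($node) || !isset($node[$level])) {
        break;
    }
    $node = $node[$level];
    $length++;
}
// Use non-negative offsets: array_slice($levels, -0) would return the whole array.
$suffix = implode('.', array_slice($levels, $n - $length));
$prefix = implode('.', array_slice($levels, 0, $n - $length));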