Is it possible to validate a domain name without using a regular expression?

For example, I want to check that a string is a valid domain:

domain-name
abcd
example

are valid domains. These are invalid, of course:

domaia@name
ab$%cd

And so on. So basically it should start with an alphanumeric character, then there may be more alnum characters plus also a hyphen. And it must end with an alnum character, too.

If it's not possible, could you suggest a regexp pattern to do this?

EDIT:

Why doesn't this work? Am I using preg_match incorrectly?

$domain = '@djkal';
$regexp = '/^[a-zA-Z0-9][a-zA-Z0-9\-\_]+[a-zA-Z0-9]$/';
if (false === preg_match($regexp, $domain)) {
    throw new Exception('Domain invalid');
}
+3  A: 

A regular expression is the most effective way to validate a domain. If you're dead set on not using one (which IMO is stupid), then you could split the domain into its parts:

  • www. / sub-domain
  • domain name
  • .extension

You would then have to check each character in some sort of a loop to see that it matches a valid domain.

Like I said, it's much more effective to use a regular expression.
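
For comparison, the character-by-character loop described above might look roughly like the sketch below. It checks a single label only (no dots), follows the rules from the question (alphanumerics plus hyphens, no hyphen at either end), and the function name isValidLabelLoop is made up for illustration.

```php
<?php
// Hypothetical helper: checks one domain label character by character.
function isValidLabelLoop(string $label): bool
{
    $length = strlen($label);
    if ($length === 0) {
        return false; // an empty label is never valid
    }
    for ($i = 0; $i < $length; $i++) {
        $char = $label[$i];
        if (ctype_alnum($char)) {
            continue; // letters and digits are allowed anywhere
        }
        // A hyphen is allowed, but not as the first or last character.
        if ($char === '-' && $i > 0 && $i < $length - 1) {
            continue;
        }
        return false; // anything else makes the label invalid
    }
    return true;
}

var_dump(isValidLabelLoop('domain-name')); // bool(true)
var_dump(isValidLabelLoop('ab$%cd'));      // bool(false)
```

It works, but it is a lot more code than a one-line pattern.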

James Brooks
+1  A: 
/^[a-zA-Z0-9][a-zA-Z0-9\-\_]+[a-zA-Z0-9]$/

should match the domain name part (without .).

kender
What about i18n domains?
meder
When I use the above to validate domain with preg_match(), domain like dahjd#hjkasfh is valid.
Richard Knop
@Richard Knop - the regex looks good to me. Are you sure you're using it correctly? Post some code!
Dominic Rodger
I have added code to my post.
Richard Knop
this regexp is completely wrong. It won't accept domain names with less than three characters.
Alnitak
+1  A: 

If you don't want to use regular expressions, you can try this:

$str = 'domain-name';

if (ctype_alnum(str_replace('-', '', $str)) && $str[0] != '-' && $str[strlen($str) - 1] != '-') {
    echo "Valid domain\n";
} else {
    echo "Invalid domain\n";
}

but, as said, regexps are the best tool for this.

kemp
+1  A: 

Here is another way without regex.

$myUrl = "http://www.domain.com/link.php";
$myParsedURL = parse_url($myUrl);
$myDomainName= $myParsedURL['host'];
$ipAddress = gethostbyname($myDomainName);
if($ipAddress === $myDomainName)
{
   echo "There is no url";
}
else
{
   echo "url found";
}
Erkan BALABAN
valid != existing
kemp
+1  A: 

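Your regular expression is fine, but you're not using preg_match right. It returns an int (1 for a match, 0 for no match), and false only when an error occurs, so false === preg_match(...) is not true for a string that simply fails to match. Just write if(!preg_match($regex, $string)) { ... }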
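
To make the return values concrete, here is a small sketch that reuses the pattern and the sample value from the question. The wrapper function assertValidDomain is hypothetical and only there to show the corrected check.

```php
<?php
$regexp = '/^[a-zA-Z0-9][a-zA-Z0-9\-\_]+[a-zA-Z0-9]$/';

// preg_match() returns 1 on a match, 0 on no match, and false on error.
var_dump(preg_match($regexp, '@djkal'));      // int(0)
var_dump(preg_match($regexp, 'domain-name')); // int(1)

// Hypothetical wrapper: treat anything other than 1 as "invalid".
function assertValidDomain(string $domain, string $regexp): void
{
    if (preg_match($regexp, $domain) !== 1) {
        throw new Exception('Domain invalid');
    }
}
```

Comparing against 1 rejects both a non-match (0) and an error (false), which is usually what you want in a validator.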

Arthur Reutenauer
+1  A: 

I think once you have isolated the domain name, say, using Erkan's idea:

$myUrl = "http://www.domain.com/link.php";
$myParsedURL = parse_url($myUrl);
$myDomainName= $myParsedURL['host'];

you could use :

if( false === filter_var( $myDomainName, FILTER_VALIDATE_URL ) ) {
// failed test

}

PHP 5's filter functions are for just such a purpose, I would have thought.

It does not strictly answer your question as it does not use Regex, I realise.
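
One caveat worth sketching: FILTER_VALIDATE_URL expects a full URL including the scheme, so a bare host name does not pass it. Assuming PHP 7.0 or later, FILTER_VALIDATE_DOMAIN combined with FILTER_FLAG_HOSTNAME is probably closer to what is wanted here. The function name isValidHostname below is made up for illustration.

```php
<?php
$host = parse_url('http://www.domain.com/link.php', PHP_URL_HOST);

// FILTER_VALIDATE_URL wants a full URL, so a bare host name fails it.
var_dump(filter_var($host, FILTER_VALIDATE_URL)); // bool(false)

// Hypothetical helper, assuming PHP 7.0+: FILTER_VALIDATE_DOMAIN with
// FILTER_FLAG_HOSTNAME checks host-name syntax, not existence.
function isValidHostname(string $host): bool
{
    return filter_var($host, FILTER_VALIDATE_DOMAIN, FILTER_FLAG_HOSTNAME) !== false;
}

var_dump(isValidHostname($host)); // bool(true)
```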

Cups
A: 

This is simple. Some PHP engines have a problem with split(). The code below will work.

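<?php
// Placeholder address; example.com is reserved for documentation.
$email = "user@example.com";
// The first strtok() call returns the part before "@", the second the part after it.
$domain = strtok($email, "@");
$domain = strtok("@");
// getmxrr() returns true when MX records are found for the domain.
if (@getmxrr($domain, $mxrecords)) {
    echo "This " . $domain . " EXISTS!";
} else {
    echo "This " . $domain . " does not exist!";
}
?>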

bong
+1  A: 

Firstly, you should clarify whether you mean:

  1. individual domain name labels
  2. entire domain names (i.e. multiple dot-separated labels)
  3. host names

The reason the distinction is necessary is that a label can technically include any characters, including the NUL, @ and '.' characters. DNS is 8-bit capable and it's perfectly possible to have a zone file containing an entry reading "an\0odd\.l@bel". It's not recommended of course, not least because people would have difficulty telling a dot inside a label from those separating labels, but it is legal.

However, URLs require a host name in them, and those are governed by RFCs 952 and 1123. Valid host names are a subset of domain names. Specifically, only letters, digits and hyphens are allowed. Furthermore, the first and last characters cannot be a hyphen. RFC 952 didn't permit a digit as the first character, but RFC 1123 subsequently relaxed that.

Hence:

  • a - valid
  • 0 - valid
  • a- - invalid
  • a-b - valid
  • xn--dasdkhfsd - valid (punycode encoding of an IDN)

Off the top of my head I don't think it's possible to invalidate the a- example with a single simple regexp. The best I can come up with to check a single _host_ label is:

if (preg_match('/^[a-z\d][a-z\d-]{0,62}$/i', $label) &&
   !preg_match('/-$/', $label))
{
    # label is legal within a hostname
}
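
For what it's worth, the two checks can probably be folded into one pattern by making the tail of the label an optional group that has to end in a letter or digit. This is only a sketch under the same rules (1 to 63 characters, letters, digits and hyphens, no hyphen at either end); the function name isValidHostLabel is made up for illustration.

```php
<?php
// Hypothetical helper: a single-pattern check for one host name label.
function isValidHostLabel(string $label): bool
{
    // First char: letter or digit. Optional tail: up to 61 letters, digits
    // or hyphens, then a final letter or digit. \z anchors at the very end
    // of the string, so a trailing newline is not accepted.
    return preg_match('/^[a-z\d](?:[a-z\d-]{0,61}[a-z\d])?\z/i', $label) === 1;
}

// The examples from the list above.
foreach (['a', '0', 'a-', 'a-b', 'xn--dasdkhfsd'] as $label) {
    echo $label, ' - ', isValidHostLabel($label) ? 'valid' : 'invalid', "\n";
}
```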

To further complicate matters, some domain name entries (typically SRV records) use labels prefixed with an underscore, e.g. _sip._udp.example.com. These are not host names, but are legal domain names.
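
If such names need to be accepted too, one possible approach is to split the name on dots and allow an optional leading underscore on each label. The sketch below is deliberately loose: it does not check the overall name length, and both function names are made up for illustration.

```php
<?php
// Hypothetical helper: a host-style label, optionally prefixed with "_".
function isHostOrServiceLabel(string $label): bool
{
    // The lookahead caps the whole label at 63 characters, underscore included.
    return preg_match('/^(?=.{1,63}\z)_?[a-z\d](?:[a-z\d-]*[a-z\d])?\z/i', $label) === 1;
}

// Hypothetical helper: every dot-separated label must pass the check above.
function isLooseDomainName(string $name): bool
{
    foreach (explode('.', $name) as $label) {
        if (!isHostOrServiceLabel($label)) {
            return false;
        }
    }
    return true;
}

var_dump(isLooseDomainName('_sip._udp.example.com')); // bool(true)
var_dump(isLooseDomainName('a-.example.com'));        // bool(false)
```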

Alnitak