Hi,

Which PHP function validates whether a string is HTML? My goal is to take input from the user and check that the input is HTML and not just a plain string.

Examples of non-HTML strings:

sdkjshdk<div>jd</h3>ivdfadfsdf or sdkjshdkivdfadfsdf

Example of an HTML string:

<div>sdfsdfsdf<label>dghdhdgh</label> fdsgfgdfgfd</div>

Thanks

+2  A: 

Do you mean HTML or XHTML?

The HTML standard and interpretation are so loose that your first snippet might work. It won't be pretty but you might get something.

XHTML is quite a bit more strict and at minimum will expect your snippet to be well-formed (all opened tags are closed; tags can nest but not overlap) and may throw warnings if you have unrecognized elements or attributes.

Something like Tidy - http://php.net/manual/en/book.tidy.php - is probably a good start. Once you load your snippet using that, you can use *tidy_error_count* or *tidy_get_error_buffer* to see if it's "okay enough" for your needs.
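
As a rough sketch of the Tidy approach described above: the helper name tidy_report and the 'show-body-only' option are illustrative choices, not something from this answer, and the code assumes the tidy extension may or may not be installed. Tidy tends to report recoverable problems (such as a stray closing tag) as warnings rather than errors, so it is probably worth looking at the warning count too.

```php
<?php
// Hypothetical helper: summarise what Tidy reports for a snippet.
// Returns null when the tidy extension is not available.
function tidy_report(string $input): ?array
{
    if (!function_exists('tidy_parse_string')) {
        return null;
    }
    // 'show-body-only' treats the input as a fragment rather than a full page.
    $tidy = tidy_parse_string($input, ['show-body-only' => true], 'utf8');
    if ($tidy === false) {
        return null;
    }
    return [
        'errors'   => tidy_error_count($tidy),
        'warnings' => tidy_warning_count($tidy),
        'report'   => (string) tidy_get_error_buffer($tidy),
    ];
}

// Inspect what Tidy says about the first snippet from the question.
var_dump(tidy_report('sdkjshdk<div>jd</h3>ivdfadfsdf'));
```

What counts as "okay enough" is then a threshold you pick yourself, for example zero errors and zero warnings.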

CaseySoftware
My goal is to take input from the user and check that the input is HTML and not just a plain string.
Yosef
Ok. And both are HTML... the HTML spec is *so* loose that it almost doesn't matter. The second one is, in addition, well-formed XHTML. If that's what you're looking for, explore Tidy and see what you can do.
CaseySoftware
+1  A: 

You can use DOMDocument's loadHTML method.
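
A minimal sketch of how that might look, assuming the dom and libxml extensions are enabled. The helper name html_parse_errors is made up for illustration. Note that loadHTML is forgiving: it will happily wrap a plain string in html/body tags, so an empty error list means "parsed without complaints", not "definitely HTML".

```php
<?php
// Hypothetical helper: parse $input as HTML and return the messages
// libxml recorded while doing so (an empty array means no complaints).
function html_parse_errors(string $input): array
{
    if (trim($input) === '') {
        return ['empty input'];   // loadHTML() does not accept an empty string
    }
    // Collect parser errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    $doc = new DOMDocument();
    $doc->loadHTML($input);

    $messages = [];
    foreach (libxml_get_errors() as $error) {
        $messages[] = trim($error->message);
    }

    // Restore the previous libxml error-handling state.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return $messages;
}

// The two snippets from the question.
var_dump(html_parse_errors('<div>sdfsdfsdf<label>dghdhdgh</label> fdsgfgdfgfd</div>'));
var_dump(html_parse_errors('sdkjshdk<div>jd</h3>ivdfadfsdf'));
```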

a1ex07
+1  A: 

Maybe you need to check if the string is well formed.

I would use a function like this:

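function check($string) {
  // Isolate the candidate markup: from the first '<' to the last '>'.
  $start = strpos($string, '<');
  if ($start === false) {
    return false; // no tag at all, so this is a plain string
  }
  $end = strrpos($string, '>');
  if ($end === false || $end < $start) {
    return false; // a '<' with no '>' after it cannot be well formed
  }
  $string = substr($string, $start, $end - $start + 1);
  // Collect libxml errors instead of emitting PHP warnings.
  $previous = libxml_use_internal_errors(true);
  libxml_clear_errors();
  $xml = simplexml_load_string($string);
  // Well formed only if parsing succeeded and no errors were recorded.
  $ok = $xml !== false && count(libxml_get_errors()) == 0;
  libxml_clear_errors();
  libxml_use_internal_errors($previous);
  return $ok;
}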

Just a warning: HTML permits unbalanced markup like the following. It is not a valid XML chunk, but it is a legal HTML chunk:

<ul><li>Hi<li> I'm another li</li></ul>

Disclaimer: I've modified the code (without testing it) in order to detect well-formed HTML inside the string.

A last thought: maybe you should use strip_tags to control user input (as I've seen in your comments).

Eineki
+1  A: 

Are you trying to prevent users from posting HTML tags instead of plain strings? Because if that is what you want to do, you just need strip_tags(), which will remove any HTML tags from the string.
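
For illustration, a small sketch using the first snippet from the question. The helper contains_tags is a hypothetical name, and comparing the input with its stripped version is only a heuristic for "the user typed something tag-like", not a validity check.

```php
<?php
// Hypothetical helper: true when strip_tags() would change the input,
// i.e. when the input contains something that looks like a tag.
function contains_tags(string $input): bool
{
    return strip_tags($input) !== $input;
}

$input = 'sdkjshdk<div>jd</h3>ivdfadfsdf';

// Remove the tags and keep only the text between them.
echo strip_tags($input), "\n";   // prints "sdkjshdkjdivdfadfsdf"

// Or just detect that tags were present and reject the input.
var_dump(contains_tags($input));                 // bool(true)
var_dump(contains_tags('sdkjshdkivdfadfsdf'));   // bool(false)
```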

Iznogood