Given a block of content, I'm looking to create a PHP function that checks whether a keyword or keyword phrase appears inside any h1-h3 header tag.

For example, if the keyword was "Blue Violin" and the block of text was...

You don't see many blue violins. Most violins have a natural finish. <h1>If you see a blue violin, its really a rarity</h1>

I'd like my function to return:

  • The keyword phrase does appear in an h1 tag
  • The keyword phrase does not appear in an h2 tag
  • The keyword phrase does not appear in an h3 tag
+3  A: 
Gordon
Can DOM/XPATH be executed from a PHP script?
Scott B
@Scott did you follow the links? ;) Yes.
Gordon
Just beware that loading the DOMDocument object is fairly heavy. If you want to search for multiple keywords, it is best to build the DOM and XPath objects once and evaluate each query against them, rather than rebuilding them every time (a class may work better, depending on exactly what you're trying to do)...
ircmaxell
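The reuse suggested in the comment above could be sketched roughly like this. It is only an illustration, not the original answer: the class name `HeadingKeywordChecker` and its `check` method are hypothetical, and the match is a plain case-insensitive substring test.

```php
<?php
// Hypothetical helper: parses the HTML once and reuses the same
// DOMXPath object for every keyword lookup.
class HeadingKeywordChecker
{
    /** @var DOMDocument */
    private $dom;
    /** @var DOMXPath */
    private $xpath;

    public function __construct(string $html)
    {
        $this->dom = new DOMDocument();
        // Collect parser warnings internally; real-world HTML rarely validates.
        $previous = libxml_use_internal_errors(true);
        $this->dom->loadHTML($html);
        libxml_clear_errors();
        libxml_use_internal_errors($previous);
        $this->xpath = new DOMXPath($this->dom);
    }

    // Returns an array such as ['h1' => true, 'h2' => false, 'h3' => false].
    public function check(string $keyword): array
    {
        $result = [];
        foreach (['h1', 'h2', 'h3'] as $tag) {
            $result[$tag] = false;
            foreach ($this->xpath->query('//' . $tag) as $node) {
                // Case-insensitive substring match on the heading text.
                if (stripos($node->textContent, $keyword) !== false) {
                    $result[$tag] = true;
                    break;
                }
            }
        }
        return $result;
    }
}

$html = "You don't see many blue violins. Most violins have a natural finish. "
      . "<h1>If you see a blue violin, its really a rarity</h1>";

$checker = new HeadingKeywordChecker($html);
$first  = $checker->check('Blue Violin');    // parsed document is built once...
$second = $checker->check('natural finish'); // ...and reused for the next keyword
```

With the sample text from the question, the first call reports a match in h1 only, and the second reports no heading matches because "natural finish" appears only in body text.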
@Gordon: Sweet! Thanks for the help.
Scott B
I need to also check for keyword in the first and last sentence of the content. Should I create a function specific to that task or could this one be used to do that?
Scott B
It will be evaluating the post/page content in a WordPress editor, so I don't need to evaluate the whole body. The editor wraps its content in a div with id="content". Can I seed the evaluate statement with the #content div?
Scott B
@Scott sentences are not a concept for DOM. It knows nodes. If you have two sentences in a TextNode, you will have to match those with string functions. However, you can `$xpath->query("/path/to/div[@id='content']")` to get the DOMElement for that div and then access the content via its API. Or use `$dom->getElementById('content')`. Have a look around SO for some DOM snippets or [look through some of my former answers](http://stackoverflow.com/search?q=user%3A208809+dom)
Gordon
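A rough sketch of that split of responsibilities: XPath locates the div with id="content", and ordinary string functions deal with the sentences. The function name `keywordInFirstAndLastSentence`, the sample markup, and the naive split on `.`, `!` and `?` are all assumptions for illustration; abbreviations or ellipses would need a more careful splitter.

```php
<?php
// Hypothetical helper, not taken from the original answer.
function keywordInFirstAndLastSentence(string $html, string $keyword): array
{
    $dom = new DOMDocument();
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Restrict the search to the div with id="content".
    $xpath = new DOMXPath($dom);
    $div = $xpath->query("//div[@id='content']")->item(0);
    if ($div === null) {
        return ['first' => false, 'last' => false];
    }

    // DOM only knows nodes, so sentences are split with string functions.
    // Naive rule: a sentence ends at ., ! or ? followed by whitespace.
    $text = trim($div->textContent);
    $sentences = preg_split('/(?<=[.!?])\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);
    if (!$sentences) {
        return ['first' => false, 'last' => false];
    }

    $first = $sentences[0];
    $last  = $sentences[count($sentences) - 1];

    return [
        'first' => stripos($first, $keyword) !== false,
        'last'  => stripos($last, $keyword) !== false,
    ];
}

// Made-up sample content for the example.
$html = '<div id="content"><p>You don\'t see many blue violins. '
      . 'Most violins have a natural finish. A painted one is a rarity.</p></div>';

$result = keywordInFirstAndLastSentence($html, 'Blue Violin');
```

Note that `textContent` joins the text of adjacent block elements without adding whitespace, so content spread over several paragraphs or headings may need to be collected per node before splitting.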
Gordon, I think I'm there but I can't seem to convert your example from loading a static file to loading the WordPress content... $dom->load(??post_content or content div??);
Scott B
@Scott `load` is for XML/XHTML files. `loadHTMLFile` is for HTML files. Both can be used with URLs if fopen wrappers are enabled. To load an XML/XHTML string, use `loadXML`. To load HTML, use `loadHTML`.
Gordon
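For content that is already held in a string, the distinction might look like the sketch below. `$postContent` is a placeholder with made-up markup; how the string is obtained from the editor is outside this example.

```php
<?php
// $postContent stands in for the HTML string to inspect (sample data).
$postContent = '<h2>Caring for a blue violin</h2><p>Keep it out of direct sunlight.</p>';

$dom = new DOMDocument();
$previous = libxml_use_internal_errors(true); // tolerate imperfect markup
$dom->loadHTML($postContent);                 // HTML from a string
libxml_clear_errors();
libxml_use_internal_errors($previous);

// The other loaders, for comparison (not executed here):
//   $dom->loadHTMLFile('page.html'); // HTML from a file or URL
//   $dom->load('page.xml');          // XML/XHTML from a file or URL
//   $dom->loadXML($xmlString);       // XML/XHTML from a string

$headings = $dom->getElementsByTagName('h2');
$firstHeading = $headings->item(0)->textContent;
```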
Got it. Works now! Thanks :)
Scott B