As usual I have trouble writing a good regex.

I am trying to write a Joomla plugin that adds a button next to the optional print, email and PDF buttons that the core renders to the right of article titles. If I succeed I will distribute it under the GPL. None of the examples I found seem to work, and I would like a PHP-only solution.

The idea is to exploit the distinctive markup Joomla outputs for article titles and buttons, using one or more regexes. A first regex would find the right table: one with class "contentpaneopen" (there are several in a page) that contains a cell with class "contentheading". A second regex could check whether that table also contains a cell with class "buttonheading". There can be anywhere from zero to three of these cells, but I could use this check if the first regex returns more than one match. I would then replace the table with the same table plus an extra cell holding the button I want to add. I could do that by stripping the closing tags of the last row and of the table, inserting my button cell, and then appending those closing tags again.

The normal Joomla output looks like this:

<table class="contentpaneopen">
    <tbody>
     <tr>
      <td width="100%" class="contentheading">
       <a class="contentpagetitle" href="url">Title Here</a>
      </td>
      <td width="100%" align="right" class="buttonheading">
       <a rel="nofollow" onclick="etc" title="PDF" href="url"><img alt="PDF" src="/templates/neutral/images/pdf_button.png"/></a>
      </td>
      <td width="100%" align="right" class="buttonheading">
       <a rel="nofollow" onclick="etc" title="Print" href="url"><img alt="Print" src="/templates/neutral/images/printButton.png" ></a>
      </td>
     </tr>
    </tbody>
</table>

The code would very roughly be something like this:

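$subject = $article;
$pattern1 = '[regex1]'; // matches <table class="contentpaneopen">etc</table>
preg_match($pattern1, $subject, $match); // $match[0] holds the whole matched table
$pattern2 = '[regex2]'; // matches the closing </tr></tbody></table>
$replacement = '[mybutton]'; // button cell followed by the closing tags again
echo preg_replace($pattern2, $replacement, $match[0]);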

Without a good regex there is little point doing the rest of the code, so I hope someone can help with that!

+1  A: 

Is there a reason that you need to use regex for this? DOM parsing would be much more straightforward.

Amber
+1 DOM parsing would make this really easy to do.
RageZ
+2  A: 

This is a common question on SO and the answer is always the same: regular expressions are a poor choice for parsing or processing HTML or XML. There are many ways they can break down. PHP comes with at least three built-in HTML parsers that will be far more robust.

Take a look at Parse HTML With PHP And DOM and use something like:

$html = new DOMDocument;
$html->preserveWhiteSpace = false; // must be set before loading to take effect
$html->loadHTML($source);
$tables = $html->getElementsByTagName('table');
foreach ($tables as $table) {
  // Exact match on the class attribute value
  if ($table->getAttribute('class') == 'contentpaneopen') {
    // replace it with something else
  }
}
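
To make the "replace it with something else" step concrete, here is a minimal sketch that appends an extra button cell to the title row and serializes the result. It rests on several assumptions: the sample markup in $source is a shortened copy of the question's output, the link text and href of the new button are placeholders, and the XPath query treats the class attributes as exact strings.

```php
<?php
// Sample markup modeled on the question's output (shortened for the example).
$source = '<table class="contentpaneopen"><tbody><tr>'
        . '<td width="100%" class="contentheading"><a class="contentpagetitle" href="url">Title Here</a></td>'
        . '<td width="100%" align="right" class="buttonheading"><a title="PDF" href="url">PDF</a></td>'
        . '</tr></tbody></table>'
        . '<table class="contentpaneopen"><tbody><tr><td>Article body</td></tr></tbody></table>';

$doc = new DOMDocument();
$doc->preserveWhiteSpace = false;
// Real-world HTML is rarely valid; collect libxml warnings instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($source);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
// Title rows: a <tr> with a td.contentheading inside a table.contentpaneopen.
$rows = $xpath->query('//table[@class="contentpaneopen"]//tr[td[@class="contentheading"]]');

foreach ($rows as $row) {
    // Hypothetical button: the link text and href are placeholders.
    $link = $doc->createElement('a', 'My button');
    $link->setAttribute('href', '#');

    $cell = $doc->createElement('td');
    $cell->setAttribute('align', 'right');
    $cell->setAttribute('class', 'buttonheading');
    $cell->appendChild($link);

    $row->appendChild($cell); // becomes the last cell of the title row
}

// saveHTML() serializes the whole document, including the
// <html>/<body> wrapper that loadHTML() adds around a fragment.
$result = $doc->saveHTML();
```

Because loadHTML() wraps fragments in a full document, this fits best when $source is the complete page rather than a single article fragment.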
cletus
I had no idea that was possible! It looks very convenient. But since I have not yet seen any Joomla plugin do it this way, I'll have to check whether and how the modified source can be fed back to the system.
E Wierda
+1  A: 

Since a plugin in the scenario you describe is called every time a page loads, a regex approach is faster than a DOM call, which is why a lot of people use it. Joomla's documentation also explains why a regex is a better fit than a DOM approach in this scenario.
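
For reference, the regex approach from the question could be sketched roughly like this. It is only a sketch under assumptions: the sample $article string is a shortened copy of the markup in the question, the button markup is a placeholder, and the pattern expects the table's class attribute to be written exactly as class="contentpaneopen" with the title cell in the table's first row.

```php
<?php
// Sample markup modeled on the question's output (shortened for the example).
$article = '<table class="contentpaneopen"><tbody><tr>'
         . '<td width="100%" class="contentheading"><a class="contentpagetitle" href="url">Title Here</a></td>'
         . '<td width="100%" align="right" class="buttonheading"><a title="PDF" href="url">PDF</a></td>'
         . '</tr></tbody></table>'
         . '<table class="contentpaneopen"><tbody><tr><td>Article body</td></tr></tbody></table>';

// Hypothetical button cell: the link text and href are placeholders.
$button = '<td align="right" class="buttonheading"><a href="#">My button</a></td>';

// Group 1: the first row of a table.contentpaneopen up to (not including) its </tr>,
// matched only if the row contains class="contentheading".
// The (?:(?!</tr>).)*? parts stop the match from running past the end of that row.
// Group 2: the closing </tr>, re-inserted after the new cell.
$pattern = '~(<table\s+class="contentpaneopen">\s*(?:<tbody>\s*)?<tr>'
         . '(?:(?!</tr>).)*?class="contentheading"(?:(?!</tr>).)*?)(</tr>)~is';

// $count receives the number of title rows that were changed.
$result = preg_replace($pattern, '$1' . $button . '$2', $article, -1, $count);
```

Tables without a "contentheading" cell in their first row are left untouched, so $count reports how many title rows were modified. Any change in attribute order or quoting in a template would break the pattern, which is the fragility the other answers warn about.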

The problem with your solution is that it's tied to Joomla's default template. I don't remember whether all templates use the same class="contentheading" structure. If you plan to release such an extension under the GPL, you should be careful about that.

What you're trying to do looks to me like a job for a template override, explained in more detail here, which is a much simpler solution. For example, this is the PHP that renders your article's title:

<div class="componentheading<?php echo $this->params->get('pageclass_sfx')?>">
    <h2><?php echo $this->escape($this->params->get('page_title')); ?></h2>
</div>

You just need to override the com_content article template and echo the HTML for the PDF buttons after the ->get('page_title') call. If you don't want to echo the HTML directly, you can create a module or a component, import it in the template, and after the ->get('page_title') call invoke the methods in your component that output the HTML.

This component could have various checkboxes, such as "show PDF (yes/no)", and other interesting options.

GmonC