Well, the obvious easy answer is to present your "summary" without any BBCode markup at all (regex below taken from here):
$summary = substr( preg_replace( '|\[[\/\!]*?[^\[\]]*?\]|si', '', $article ), 0, 200 );
However, to do the job you explicitly describe is going to require more than just a regex. A lexer/parser would do the trick, but that's a moderately complicated topic. I'll see if I can come up with something.
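To make the problem concrete, here's a quick sketch. The sample string and the 20-character cutoff are made up for illustration, and the regex is the one above with its brackets escaped. Cutting the raw string can land in the middle of a tag, while stripping first is safe but throws away all the formatting:

```php
<?php
// Hypothetical sample article; example.com is just a placeholder URL.
$article = 'Intro [b]bold text[/b] and [url="http://example.com"]a link[/url].';

// Cutting the raw BBCode at a fixed length leaves [b] unclosed and splits [/b] in half.
$naive = substr( $article, 0, 20 );

// Stripping the tags before cutting is safe, but the markup is gone.
$plain = substr( preg_replace( '|\[[\/\!]*?[^\[\]]*?\]|si', '', $article ), 0, 20 );

echo $naive, "\n"; // Intro [b]bold text[/
echo $plain, "\n"; // Intro bold text and
```

Keeping the markup and cutting cleanly means knowing where each tag starts and ends, which is what the lexer is for.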
Here's a pretty crude version of a lexer, but for this example it works. It converts an input string into BBCode tokens:
class SimpleBBCodeLexer {
    const TOKEN_TEXT      = 'TEXT';
    const TOKEN_OPEN_TAG  = 'OPEN_TAG';
    const TOKEN_CLOSE_TAG = 'CLOSE_TAG';

    protected $tokens   = array();
    protected $fragment = ''; // characters read since the last complete token
    protected $patterns = array(
        self::TOKEN_OPEN_TAG  => "/\\[[a-z].*?\\]/",
        self::TOKEN_CLOSE_TAG => "/\\[\\/[a-z].*?\\]/",
    );

    public function __construct( $input ) {
        for ( $i = 0, $l = strlen( $input ); $i < $l; $i++ ) {
            $this->processChar( $input[$i] ); // the $input{$i} syntax was removed in PHP 8
        }
        $this->processChar(); // null marks end of input and flushes any trailing text
    }

    protected function processChar( $char = null ) {
        // Emit any tag completed by the previous character before consuming this one.
        $this->fragment = $this->processTokenFragment( $this->fragment );
        if ( is_null( $char ) ) {
            if ( $this->fragment !== '' ) {
                $this->addToken( $this->fragment );
            }
            $this->fragment = '';
        } else {
            $this->fragment .= $char;
        }
    }

    protected function processTokenFragment( $tokenFragment ) {
        foreach ( $this->patterns as $type => $pattern ) {
            if ( preg_match( $pattern, $tokenFragment, $matches ) ) {
                if ( $matches[0] != $tokenFragment ) {
                    // Whatever precedes the tag becomes its own TEXT token.
                    $this->addToken( substr( $tokenFragment, 0, -( strlen( $matches[0] ) ) ) );
                }
                $this->addToken( $matches[0], $type );
                return '';
            }
        }
        return $tokenFragment;
    }

    protected function addToken( $token, $type = self::TOKEN_TEXT ) {
        $this->tokens[] = array( $type => $token );
    }

    public function getTokens() {
        return $this->tokens;
    }
}
$l = new SimpleBBCodeLexer( 'some [b]sample[/b] bbcode that [i] should [url="http://www.google.com"]support[/url] what [/i] you need.' );
echo '<pre>';
print_r( $l->getTokens() );
echo '</pre>';
The next step would be to create a parser that loops over these tokens and takes action as it encounters each type. Maybe I'll have time to make it later...
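In the meantime, here's a rough sketch of what such a parser might look like for the summary case. This is not a finished implementation: `summarizeTokens` is a name I made up, it assumes the tokens are shaped like `array( TYPE => string )` with the type names 'TEXT', 'OPEN_TAG' and 'CLOSE_TAG', and it assumes the tags in the input are properly nested. The token list is written out by hand so the snippet runs on its own:

```php
<?php
/**
 * Hypothetical helper: builds a summary of at most $limit visible characters
 * from lexer tokens, then closes whatever tags are still open.
 * Assumes well-nested input and the token type names used below.
 */
function summarizeTokens( array $tokens, $limit ) {
    $out   = '';
    $open  = array(); // stack of tag names currently open
    $count = 0;       // visible (non-tag) characters emitted so far

    foreach ( $tokens as $token ) {
        if ( $count >= $limit ) {
            break; // limit reached; remaining open tags are closed below
        }
        $type = key( $token );
        $text = $token[$type];

        if ( $type === 'TEXT' ) {
            // Only text counts toward the limit; cut the last piece if needed.
            $piece  = substr( $text, 0, $limit - $count );
            $out   .= $piece;
            $count += strlen( $piece );
        } elseif ( $type === 'OPEN_TAG' ) {
            // '[b]' and '[url="..."]' both reduce to the bare tag name.
            if ( preg_match( '/^\[([a-z][a-z0-9]*)/', $text, $m ) ) {
                $open[] = $m[1];
            }
            $out .= $text;
        } elseif ( $type === 'CLOSE_TAG' ) {
            array_pop( $open );
            $out .= $text;
        }
    }

    // Close anything left open, innermost tag first.
    while ( $open ) {
        $out .= '[/' . array_pop( $open ) . ']';
    }
    return $out;
}

// Tokens for the sample string above, written out by hand.
$tokens = array(
    array( 'TEXT'      => 'some ' ),
    array( 'OPEN_TAG'  => '[b]' ),
    array( 'TEXT'      => 'sample' ),
    array( 'CLOSE_TAG' => '[/b]' ),
    array( 'TEXT'      => ' bbcode that ' ),
    array( 'OPEN_TAG'  => '[i]' ),
    array( 'TEXT'      => ' should ' ),
    array( 'OPEN_TAG'  => '[url="http://www.google.com"]' ),
    array( 'TEXT'      => 'support' ),
    array( 'CLOSE_TAG' => '[/url]' ),
    array( 'TEXT'      => ' what ' ),
    array( 'CLOSE_TAG' => '[/i]' ),
    array( 'TEXT'      => ' you need.' ),
);

echo summarizeTokens( $tokens, 35 ), "\n";
// some [b]sample[/b] bbcode that [i] should [url="http://www.google.com"]sup[/url][/i]
```

Only TEXT tokens count toward the limit, so the cut never lands inside a tag, and the stack of open tag names is what lets the summary end with matching close tags. Note that `substr` counts bytes; for multibyte text you'd probably want `mb_substr` and `mb_strlen` instead.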