I'm searching for a PHP syntax highlighting engine that can be customized (i.e. I can provide my own tokenizers for new languages) and that can handle several languages simultaneously (i.e. on the same output page). This engine has to work well together with CSS classes, i.e. it should format the output by inserting <span> elements that are adorned with class attributes. Bonus points for an extensible schema.

I am not looking for a client-side syntax highlighting script (JavaScript).

So far, I'm stuck with GeSHi. Unfortunately, GeSHi fails abysmally for several reasons. The main reason is that the different language files define completely different, inconsistent styles. I've spent hours trying to refactor the different language definitions down to a common denominator, but since most definition files are in themselves quite bad, I'd finally like to switch.

Ideally, I'd like to have an API similar to CodeRay, Pygments or the JavaScript dp.SyntaxHighlighter.

Clarification:

I'm looking for code highlighting software written in PHP, not for PHP (since I need to use it from inside PHP).

+5  A: 

[I marked this answer as Community Wiki because you're specifically not looking for Javascript]

http://softwaremaniacs.org/soft/highlight/ is a syntax highlighting library written in JavaScript. It highlights PHP as well as the other languages in the following list:

Python, Ruby, Perl, PHP, XML, HTML, CSS, Django, Javascript, VBScript, Delphi, Java, C++, C#, Lisp, RenderMan (RSL and RIB), Maya Embedded Language, SQL, SmallTalk, Axapta, 1C, Ini, Diff, DOS .bat, Bash

It uses <span class="keyword"> style markup.

It has also been integrated in the dojo toolkit (as a dojox project: dojox.lang.highlight)

Strictly speaking, JavaScript is not limited to the client side: although it is not the most popular way to run a web server, server-side JavaScript engine/platform combinations exist too.

micahwittman
RIB highlighting? Awesome! I guess I should get around to getting my editor to highlight it properly one of these days...
Mike Boers
+1  A: 

Another option is to use the GPL-licensed Highlight GUI program by Andre Simon, which is available for most platforms. It converts PHP (and other languages) to HTML, RTF, XML, etc., which you can then copy and paste into the page you want. This way, the processing is only done once.

The HTML is also CSS based, so you can change the style as you please.

Personally, I use dp.SyntaxHighlighter, but that uses client-side JavaScript, so it doesn't meet your needs. It does have a nice Windows Live plugin, though, which I find useful.

Rob Prouse
+1 - Exactly what I was looking for. Thanks!
Alan
+2  A: 

It might be worth looking at the PEAR package Text_Highlighter (see its documentation).

I don't think its default HTML output is exactly what you want, but it does provide extensive capabilities for customisation (i.e. you can create different renderers/parsers).

Tom Haigh
A: 

Krijn Hoetmer's PHP Highlighter provides a completely customizable PHP class to highlight PHP syntax. The HTML it generates validates under a strict doctype and is completely stylable with CSS.

Mathias Bynens
This only works for PHP though, not for other languages as well.
Konrad Rudolph
+6  A: 

Since no existing tool satisfied my needs, I wrote my own. Lo and behold:

Hyperlight

Usage is extremely easy: just use

 <?php hyperlight($code, 'php'); ?>

to highlight code. Writing new language definitions is relatively easy, too – using regular expressions and a powerful but simple state machine. By the way, I still need a lot of definitions so feel free to contribute.
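To give a rough idea of what a regex-driven language definition can look like, here is a minimal sketch. It is not Hyperlight's actual code or API: the function name highlight_sketch, the $definitions array and its rule format are all made up for illustration, and it has no state machine (so no nested or recursive tokens). It only shows the basic idea of mapping regular expressions to CSS class names and wrapping matches in <span> elements, with several languages sharing the same class names.

```php
<?php
// Hypothetical sketch, NOT Hyperlight's real implementation.
// Each language maps a CSS class name to a regex. Rules are tried in order,
// so earlier rules win. The "A" modifier anchors a match at the given offset.
$definitions = [
    'php' => [
        'comment' => '~//[^\n]*|/\*.*?\*/~sA',
        'string'  => '~"[^"]*"|\'[^\']*\'~A',   // escape sequences are not handled
        'keyword' => '~\b(?:echo|function|return|if|else)\b~A',
        'number'  => '~\b\d+(?:\.\d+)?\b~A',
    ],
    'sql' => [
        'comment' => '~--[^\n]*~A',
        'string'  => '~\'[^\']*\'~A',
        'keyword' => '~\b(?:SELECT|FROM|WHERE)\b~Ai',
        'number'  => '~\b\d+\b~A',
    ],
];

function highlight_sketch(string $code, array $rules): string
{
    $out   = '';
    $plain = '';            // text that matched no rule, flushed before each token
    $pos   = 0;
    $len   = strlen($code);

    while ($pos < $len) {
        foreach ($rules as $class => $regex) {
            // Try to match this rule exactly at the current position.
            if (preg_match($regex, $code, $m, 0, $pos) === 1 && $m[0] !== '') {
                $out  .= htmlspecialchars($plain, ENT_QUOTES, 'UTF-8');
                $plain = '';
                $out  .= '<span class="' . $class . '">'
                       . htmlspecialchars($m[0], ENT_QUOTES, 'UTF-8')
                       . '</span>';
                $pos  += strlen($m[0]);
                continue 2;  // restart with the first rule at the new position
            }
        }
        // No rule matched here: keep the byte as plain text and move on.
        $plain .= $code[$pos];
        $pos++;
    }

    return $out . htmlspecialchars($plain, ENT_QUOTES, 'UTF-8');
}
```

With this sketch, calling highlight_sketch('echo 42; // done', $definitions['php']) and highlight_sketch('SELECT 1', $definitions['sql']) on the same page would emit the same class names (keyword, number, comment), so a single stylesheet could cover both languages.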

For now, I've hosted the source code on Google Code (see link above), which makes collaboration very easy.

Konrad Rudolph
+1  A: 

A little late to chime in here, but I've been working on my own PHP syntax highlighting library. It is still in its early stages, but I am using it for all code samples on my blog.

Just checked out Hyperlight. It looks pretty cool, but it is doing some pretty crazy stuff. Nested loops, processing line by line, etc. The core class is over 1000 lines of code.

If you are interested in something simple and lightweight check out Nijikodo: http://www.craigiam.com/nijikodo

Craig
@Craig: “The core class is over 1000 lines of code” Wait, what? Noo, I’d know that. It’s considerably shorter – I just tried to put all the core functionality (i.e. *several* classes) into one *file* to make it easier to distribute. A mistake, in hindsight. Furthermore, there’s no line-by-line processing. It’s basically a normal lexical analyzer (only it can also handle recursive token definitions). – That said, your code looks nice too. I’ll definitely have a look at it.
Konrad Rudolph