This is a VERY convoluted question. Sure, it seems simple on the surface, but there's so much to it that it's far from trivial.
There are the basics, like the meta "generator" tag. If that's set, it can tell you something. But it's not always set (and you can't trust it even when it is set).
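As a rough sketch of that first check, the snippet below pulls the generator value out of a page using only Python's standard library. The function name `find_generators` and the sample HTML are made up for illustration, and an empty result tells you nothing, since the tag is optional and easy to fake.

```python
from html.parser import HTMLParser


class GeneratorTagParser(HTMLParser):
    """Collects the content of every <meta name="generator"> tag."""

    def __init__(self):
        super().__init__()
        self.generators = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so normalise them.
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("name", "").lower() == "generator":
            self.generators.append(attributes.get("content", ""))


def find_generators(html):
    """Return the generator strings found in an HTML document (may be empty)."""
    parser = GeneratorTagParser()
    parser.feed(html)
    return parser.generators


# Sample input, invented for this example.
sample = '<html><head><meta name="generator" content="WordPress 6.4"></head></html>'
print(find_generators(sample))
```

Treat whatever comes back as a hint to confirm with the other techniques, not as an answer.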
Then you come down to subtler things, such as quirks in output generation and header order. You would need to build a metric that scores what you observe against known fingerprints of each application.
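One possible shape for such a metric, assuming you have already recorded the order of response header names for applications you know: compare the observed order against a reference order and get a score between 0 and 1. The function name and the reference list are hypothetical, and `difflib.SequenceMatcher` is just one convenient similarity measure, not an established fingerprinting method.

```python
from difflib import SequenceMatcher


def header_order_similarity(observed, reference):
    """Score how closely two ordered lists of header names match (0.0 to 1.0)."""
    # Header names are case-insensitive, so compare them lowercased.
    a = [name.lower() for name in observed]
    b = [name.lower() for name in reference]
    return SequenceMatcher(None, a, b).ratio()


# Both lists are invented for illustration; a real reference would have to
# be collected from installations you have identified by other means.
reference_order = ["Date", "Server", "X-Powered-By", "Content-Type"]
observed_order = ["Date", "Server", "Content-Type", "X-Powered-By"]
print(header_order_similarity(observed_order, reference_order))
```

A proxy or CDN in front of the site can reorder or rewrite headers, so a low score does not rule anything out.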
Then you can look at directory structure. For example, WordPress has the wp-includes directory.
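A minimal sketch of probing for such paths follows. The WordPress directory is spelled `wp-includes` on disk, and `wp-login.php` is another file a stock install ships. The `KNOWN_PATHS` table and both function names are assumptions made for this example. Anything other than a 404 is counted as a hit, which misfires on sites that answer every URL with 200 or every URL with 404.

```python
import urllib.error
import urllib.request
from urllib.parse import urljoin

# Illustrative table only; extend it with paths you have verified yourself.
KNOWN_PATHS = {
    "WordPress": ["wp-includes/", "wp-login.php"],
}


def path_exists(base_url, path, timeout=5.0):
    """Return True if the server answers with anything other than 404."""
    request = urllib.request.Request(urljoin(base_url, path), method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout):
            return True
    except urllib.error.HTTPError as error:
        # 403 and similar still suggest the path is there.
        return error.code != 404
    except OSError:
        # DNS failures, refused connections, timeouts.
        return False


def score_candidates(base_url, exists=path_exists):
    """Return the fraction of known paths found for each application."""
    scores = {}
    for app, paths in KNOWN_PATHS.items():
        hits = sum(1 for path in paths if exists(base_url, path))
        scores[app] = hits / len(paths)
    return scores
```

Calling `score_candidates("https://example.com/")` issues one HEAD request per path. The `exists` parameter is there so the lookup can be swapped out, for instance to reuse responses you have already fetched.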
You can also look for known files at predictable paths (an XML file that a particular application always ships, for example).
And you can also look at the server to get a hint: detect what kind of web server it's running (a non-trivial problem in itself) and you can get clues to the application. For example, if the server is Mongrel, it's almost certainly a Ruby application, most likely Ruby on Rails. If the server is IIS, you have a decent idea it's .NET (though it could be something else). If it's not IIS, .NET becomes less likely, but you can't rule it out: Mono can run ASP.NET behind Apache, and a reverse proxy can hide the real server entirely.
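As a sketch of the simplest version of this, the function below maps a `Server` response header value to loose hints. The `SERVER_HINTS` table and the function name are invented for the example, the header is trivially spoofed or stripped, and real server detection goes well beyond reading one header.

```python
# (substring to look for in the Server header, hint it suggests)
SERVER_HINTS = [
    ("mongrel", "Ruby, likely Ruby on Rails"),
    ("microsoft-iis", "possibly ASP.NET"),
]


def hints_from_server_header(server_value):
    """Return the hints suggested by a Server header value (may be empty)."""
    lowered = (server_value or "").lower()
    return [hint for needle, hint in SERVER_HINTS if needle in lowered]


print(hints_from_server_header("Microsoft-IIS/10.0"))
print(hints_from_server_header("nginx"))
```

An empty list means only that this table has no opinion, not that the application is unidentifiable.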
It's a VERY hard problem. There's no easy solution that works most of the time, and every method can be bypassed by a competent developer/system administrator. But best of luck trying...