views:

87

answers:

3

Hi,
I'm trying to learn some aspects of the Chomsky Hierarchy which are related to programming languages, and i still have to read the Dragon Book.

I've read that most programming languages can be parsed as a context free grammar (CFG). In term of computational power, it equals the one of a pushdown non deterministic automaton. Am I right?

If it's true, then how could a CFG hold an unrestricted grammar (UG), which is turing complete? I'm asking because, even if programming languages are described by CFGs, they are actually used to describe turing machines, and so via an UG.

I think that's because of at least two different levels of computing, the first, which is the parsing of a CFG focuses on the syntax related to the structure ( representation ? ) of the language, while the other focuses on the semantic ( sense, interpretation of the data itself ? ) related to the capabilities of the programming language which is turing complete. Again, are these assumptions right?

Thanks a lot.

+1  A: 

This is absolutely not true. Most programming languages have a syntax that can be described by a CFG or BNG, but conforming to the syntax does not guarantee a legal program. There are all sorts of extra conditions such as "variables must be declared before use" or "the types in this expression must be combined in a legal way" that are not covered by the grammar, and that is what makes the languages non-context-free. (This is a bit like XML, which has a formally verifiable definition, but usually also extra constraints that a parser cannot verify.)

Kilian Foth
+1  A: 

Very good example of a language,that does not have CFG for its syntax is C++. You seem not to understand the UG exactly. The universal grammar is a problem of interpretation described as a language of words which contain code for turing machine and word which is accepted by that turing machine. So you do not encode the language itself (set of words), but the turing machine for it. Now comes the point - you can have a language of infinite words, but you cannot have a word of infinite symbols. This means, that UG as well contains finite words and therefore all descriptions of turing machines are finite. The description of the turing machine (program in a programming language) has therefore finite number of symbols (statements), so the language of descriptions (programming language syntax grammar) can be even regular. Look for example at the Binary Combinatory Logic.

Gabriel Ščerbák
+1  A: 
Norman Ramsey
FYI, you *can* do a BNF for Tcl. It's just less informative than in most languages because the usual recursive terms (`if`, `while`, program blocks in general) are defined entirely at the semantic level. That is, they're standard library functions, nothing more. (The flip side of this is that it is *really* easy to embed foreign syntaxes inside Tcl programs, provided they're parenthetically balanced. Virtually everything is…)
Donal Fellows
@Donal: Yes, except any program can add arbitrary new productions to the "grammar", dynamically. Having a parser is not much use in practice---you can't really analyze a Tcl program---and Tcl doesn't have much of a parser. But embedding strangeness is indeed very, *very* easy.
Norman Ramsey
@Norman,Thx a lot ! It was the kind of response i was looking for. Not sure that everything about this is clear, but it's clearer.And I think I got the point, "the magic is in the interpretation, not in the syntax".
dader51