views:

77

answers:

1

Hi,

The usual way to represent way the grammar associated with a textual network protocol is using ABNF.

Just like any EBNF-related meta-syntax, ABNF enables to describe context-free grammars.

These context-free grammars can represent a non-regular language, right ?

The usual way to implement a network stack is developing a state machine. Is there any textual network protocol which is not a regular language ?

A: 

I suppose you refer to "traditional" line-based textual protocols. For instance, any protocol that uses XML is not regular, since XML is not a regular language (in fact, XML is not even context-free if you look at the level of individual characters). In that case, I cannot really think of any non-regular protocols. The most common way to go non-regular in language syntax is to require the parser to be able to count, and it seems to me that a protocol that would require that kind of ability to parse the messages would just be complex (e.g., matching parentheses) or limited (e.g., by having explicit counts instead of allowing arbitrarily long lists).

The use of BNF is probably because it is easy to understand as a syntax description and not because context-freedom gives you any necessary additional power. The main benefit, I would think, is in the ability to use variables to stand for common pieces of syntax. If you look at the BNFs in common Internet protocol specifications, you'll note that they really use just features of regular languages: unbounded repetition, choice, optionality.

Your statement referring to state-machine implementations of protocols sounds to me like some misunderstanding. It is not the parser that is implemented as a state machine but the protocol engine, and state transitions are not triggered by individual characters or tokens in the input but by complete messages. And usually, the states in a protocol state machine are more concerned with establishment and tear-down of communication than the actual communication. For instance, the TCP state machine has 11 states, of which only one suffices for the connection-established state, which is where all the actual data transmission happens, and the rest are all about opening and closing the connection. (Yes, I know TCP is not textual, but it is a well-known protocol with an established state machine, so it serves as a good example; at the level of protocol engines, it does not matter whether the message syntax is text or binary.)

jk