How do I define a working lexer and parser (e.g. flex and bison) that support C++0x-style raw string literals?

As you may already know, new string literals in C++0x can be expressed in a very flexible way.

R"<delim>(...)<delim>"; - in this form <delim> can be almost any short sequence of characters (or empty), and no escape sequences are needed inside the string.

With an empty delimiter, the parentheses alone mark the beginning and the end of the string:

R"(I love those who yearn for the impossible. (Von Goethe, "Faust"))";
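To illustrate the "no escapes needed" point, here is a small sketch (assuming a C++11 compiler; the variable names are purely illustrative) that writes the same text once as a raw string literal and once with conventional escapes:

```cpp
#include <string>

// The quote from above, written as a raw string literal and as an ordinary
// literal with escaped double quotes. Both hold the same characters.
const std::string raw_quote =
    R"(I love those who yearn for the impossible. (Von Goethe, "Faust"))";
const std::string escaped_quote =
    "I love those who yearn for the impossible. (Von Goethe, \"Faust\")";

// Backslashes are ordinary characters inside a raw string literal,
// so a Windows-style path needs no doubling.
const std::string raw_path = R"(C:\temp\new)";
const std::string escaped_path = "C:\\temp\\new";
```

Note that the inner parentheses and quotes do not end the literal early: only the exact sequence `)"` closes it when the delimiter is empty.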

Blocks of text can be delimited by repeating the same sequence of characters before the opening parenthesis and after the closing one:

R";**************(
  ; TINY BASIC FOR INTEL 8080
  ;       VERSION 2.0
  ;     BY LI-CHEN WANG
  ; MODIFIED AND TRANSLATED
  ;    TO INTEL MNEMONICS
  ;     BY ROGER RAUSKOLB
  ;     10 OCTOBER, 1976
  ;       @COPYLEFT
  ;  ALL WRONGS RESERVED
  );**************";
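To make the matching rule concrete, here is a minimal hand-written sketch of what I expect the tokenizer to do. It is not flex code: `scan_raw_string` is a hypothetical helper of my own, and the delimiter restrictions in it (at most 16 characters, no whitespace, backslash, closing parenthesis or quote) are my reading of the C++11 rules, so treat them as assumptions. The difficulty, as far as I can tell, is that the closing sequence depends on the delimiter read at the start, which I don't see how to express as a single fixed pattern.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: tries to scan one raw string literal of the form
//   R"delim( body )delim"
// starting at input[pos]. On success it stores the body, moves pos just
// past the closing quote and returns true. On failure it leaves pos and
// body untouched and returns false. Requires pos <= input.size().
bool scan_raw_string(const std::string& input, std::size_t& pos, std::string& body) {
    // The literal must begin with R" .
    if (input.compare(pos, 2, "R\"") != 0) return false;

    // The delimiter is everything between the quote and the first '('.
    const std::size_t delim_start = pos + 2;
    const std::size_t open = input.find('(', delim_start);
    if (open == std::string::npos) return false;
    const std::string delim = input.substr(delim_start, open - delim_start);

    // Assumed restrictions on the delimiter (see the note above).
    if (delim.size() > 16) return false;
    if (delim.find_first_of(" \t\n\\)\"") != std::string::npos) return false;

    // The literal ends at the first occurrence of )delim" after the '('.
    const std::string closer = ")" + delim + "\"";
    const std::size_t body_start = open + 1;
    const std::size_t close = input.find(closer, body_start);
    if (close == std::string::npos) return false;

    body = input.substr(body_start, close - body_start);
    pos = close + closer.size();
    return true;
}
```

In a flex-based lexer, I imagine a rule matching the `R"` prefix could hand control to a routine like this, but that integration is exactly the part I am unsure about.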

More information can be found here (Wikipedia) and here (att).

I would like to use this fantastic feature in a language I am developing now.

So, how can I define a proper tokenizer and syntax analyzer to achieve this result?

Thanks in advance for your answers!