I'm implementing a programming language and I'm considering the following syntax:

@NamespaceX
 {
   +@ClassY <> : BaseTypeA 
    {
      +@NestedClassW<>
       {
       }

      +@MethodZ() : ReturnTypeC
       { 
         //".?" is a null-coalescing member access operator 
         @varD : ClassY =   predicateP ? objectQ.?PropertyS 
                          : predicateR ? valueB;
         @varE = predicateP ? varD  : throw Exception1();
         @f : ClassY -> ClassZ = param1 => MethodV(param1);
       }
    }

   +@UnitTest() [Test]
    { 
    }
 }

The following is the equivalent code in C# 3.0:

namespace NamespaceX
 { 
   public class ClassY : BaseTypeA
    { 
      public class NestedClassW
       { 
       }

      public ReturnTypeC MethodZ()
       { 
         ClassY varD =   predicateP ? ((objectQ!=null) ? objectQ.PropertyS
                                                       : null) 
                       : predicateR ? valueB
                       : null;

         ClassY varE;
         if (predicateP) varE = varD;
         else throw new Exception1();

         Func<ClassY,ClassZ> f = param1 => MethodV(param1);                 
       }
    } 

   public class _NamespaceX //non-type namespace members
    { 
      [Test] 
      public void UnitTest() {}
    }
 }

Which one would you rather write in? Which one is more readable? The code is terse in both examples but in the case of the C# example, once you're familiar with the idioms used I think it should be readable enough. I just want to know what you find good or bad about this syntax, not necessarily suggested improvements although those are fine, too, to explain what you're more into.

The language I'm designing is primarily an alternative to C# (but not necessarily meant for existing C# users to wholeheartedly adopt). The main consideration I had for readability is for having a single character declaration operator "@" with a single character accessibility operator ("+" for public or "-" for private) so that the name is close to the left margin where it can be easily scanned for instead of rooting through "public static void TheName ()" and such. I think that the "@" also makes it easier to scan for declarations of any kind as opposed to using a keyword which is indistinguishable from other keywords when scanning.

I'm going for both the stream-of-consciousness writing and at-a-glance reading aspects, so I want feedback on more than just the "+@" part. For me, English keywords get in the way of these two activities (more so in VB than C#), so I've decided to go with terse symbols to imply the declaration constructs:

 @foo { }          //namespace foo {}
 +@foo() : Bar {}  //public Bar foo() {}
 +@Foo<> : Bar {}  //public class Foo : Bar {}
 @foo : Bar = baz; //Bar foo = baz;
 @foo = p ? bar    //var foo = p ? bar : null;
 @foo = bar.?baz;  //var foo = (bar != null) ? bar.baz : null; 
 foo  = p  ? bar : throw E(); //if (p) foo = bar; else throw new E();

Feel free to change the identifiers to something more meaningful but please preserve the code's structure. I am open to suggestions, however, for making the C# example more readable while remaining faithful as a translation of the first example.

[I put in a close vote for this "being no longer relevant" since to really do this justice would take too much work for my purposes - Mark]

+3  A: 

I'd write in http://en.wikipedia.org/wiki/Brainfuck

Mehrdad Afshari
I'm not trying to implement an esoteric language—I'm actually trying to design a language that's easier to read and write than C#.
Mark Cidade
I read your question. My answer was pretty much a joke. But does readability only depend on the syntax? I believe what you are used to matters much more. For instance, as a mostly .NET guy, I can't really look at Win32 C code (those uppercase constants)
Mehrdad Afshari
heh, that was my first thought too.
Charlie Martin
Mark, it already exists. It's called python :)
Stefano Borini
After the boilerplate stuff is done with, I find C# easier to read/write (particularly with LINQ and lambda syntax)
Mark Cidade
+4  A: 

Well, as a novice, it is far easier to guess what the C# code is doing than what your code is doing. What you've got might make sense to someone immersed in the language, but you've beaten Perl at its own game with your notation.

I think that it would be very hard to teach your notation; it would also be hard to learn. When familiar with it, you might be productive because there is less to type, but I think that redundancy helps the human (and compiler, and debugger) understand what is going on. You have to strike a balance between minimalism and verbosity (maximalism?).

Is this piece a syntax error?

     @d : y = p ? q.?s : 
              r ? b;

At least, there'd have to be an explanation of what the 'q.?s' bit means (it might somehow combine a non-null test with the member access, but it is hardly transparent).

If you're going down this route of more extensive use of symbols, you need to target Unicode. See also Fortress. (Yes, you run into problems with current systems that are not as conversant with Unicode as they should be, but you have a far wider choice of symbols that can be used to good effect, if you are careful.)


For the benefit of those who come later, the notation used at the time I wrote my answer was:

@x
 {
   +@y <> : a 
    {
      +@z() : c
       { 
         @d : y = p ? q.?s : 
                  r ? b;
         @e = p ? d : throw exception1();
         @f : y -> y2 = x1 => z2(x1);
       }
    }

   +@x() [g]
    { 
    }
 }

When I last looked, the identifiers had been changed to the long forms now on display, so that @x became @NamespaceX, for example. While I understand that it is an illustrative example, the need to explain that what follows the first @ is a namespace and that what follows other @ symbols are classes, methods, functions, etc. is indicative of the problem with overly compact notation. Keywords used sparingly (but not too sparingly) help a lot. COBOL and SQL both have lots of keywords; C has a few keywords.

Jonathan Leffler
I can agree and disagree at the same time on the Unicode part. The issue is mapping unicode characters to allow for good flow in writing code.
DouglasH
I didn't change the syntax at all. I only changed the identifiers.
Mark Cidade
OK - fair enough...I'll fix the comment accordingly.
Jonathan Leffler
+6  A: 

I would go with the C# style syntax.

I'm assuming that, because you are comparing this to C#, you are targeting your language at C# programmers.

If that's the case, you would be better off using the syntax they are already familiar with. That would allow them to focus on the unique value added by your language, and would eliminate the need for them to spend time learning a lot of new syntax. If it looks familiar, except for some cool new features, people will think "wow.. look at the cool stuff you can do in Mark's language". If the language looks too different, many people will get scared away by it, and will never notice all the cool new things your language enables.

If the only value you are planning on adding is a "shortened syntax", then I would rethink your design. People pick programming languages because of the cool things they can do with them, not because of the syntax.

Scott Wisniewski
I have to disagree with respect to why people pick programming languages, otherwise we'd all be programming in Common Lisp since it's unbounded as to what you can do with it. I think the main reason there aren't more lispers is because of Lisp's (lack of) syntax.
Mark Cidade
I actually agree with Mark that people are somewhat irrational and will prefer a language with syntax that "feels good" to them...
j_random_hacker
... But OTOH, I agree with Scott that if you want to build a base of users of your new language, making your language syntactically similar to a popular existing language (e.g. C#) is a no-brainer.
j_random_hacker
@Mark: I guess what I'm trying to say is that marketing a language as "just like C#, but with terse syntax" probably won't work. It's not compelling enough of a reason to switch.
Scott Wisniewski
@Mark: About the LISP thing. I would say that people use LISP despite its syntax (I know many LISPers would disagree, saying they love the syntax, but that's more of a learned affection). If Lisp didn't have its cool features, and just had odd syntax, then no one would use it.
Scott Wisniewski
I'm not marketing the language to all C# programmers, but I would like it to appeal to those who are experienced in C#, build large complex applications or prefer a more pseudocode-like syntax, and find C# too tedious.
Mark Cidade
Many Lispers do like Lisp's syntax but I think that the majority of mainstream programmers are not comfortable with S-expressions and so avoid lisp despite its cool features.
Mark Cidade
But Lisp has a community of VERY dedicated users. That dedication comes from the expressiveness LISP offers, not its syntax. If you had LISP without the cool features it would have 0 users.
Scott Wisniewski
I think if your goal is just to convert keywords into single character symbols, then you are going to have problems attracting people to your language.
Scott Wisniewski
+3  A: 

Readability and writeability of a language tend to be opposites. Your example is very writable – once one becomes familiar with the syntax and idioms of your language, they will be able to easily write in it. You have removed some of the verbosity of C#, but as a result, you have lost readability.

As an example, imagine a programmer who has worked in OOP languages, but never one with C-style syntax. Which do you think will be easier for them to reverse engineer, C# or your example? With C#, it's very clear what each code block is – a namespace, a class, and it's not hard to guess about methods and properties. Jonathan Leffler's example shows what your language is like when removing the name of the construct from the type or member name. It gets a lot more confusing.

And, what about nested types? In order to read into nested types, you'll have to start counting brackets!

Matt Olenik
+3  A: 

I actually prefer languages with no more symbols than necessary. This is because I'm a touch-typist and find typing a symbol, especially one on the number keys, slower than typing a short lower-case word. That's not to say I'm a big fan of COBOL (I actually can't even write hello world in it), but I usually get annoyed when, in any language, I have to type long sequences of ^&^%$)(&^$$^&*('s, so I'd prefer those only appear infrequently, and where they follow from an obvious notation, like arithmetic.

TokenMacGuy
+1  A: 

Personally, I like symbols, as long as they are used consistently and in a logical manner.
For example in C it's really straightforward: you use &var to get an address and use *ptr to dereference it, pointers are declared by appending a * to the type. Once you have understood the 2 characters and what they do, you just use them without thinking about it and are happy as you don't have to write "address(var)" or "deref(ptr)".
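
As a minimal sketch of the two symbols described above (the helper names swap_ints and read_through_pointer are made up for illustration and are not from any library):

```c
/* Hypothetical helper: swaps two ints through pointers. */
void swap_ints(int *a, int *b)   /* '*' after the type declares a pointer parameter */
{
    int tmp = *a;   /* '*a' dereferences: read the int that a points at */
    *a = *b;        /* write through a, read through b */
    *b = tmp;
}

/* Hypothetical helper: shows '&' and '*' applied to a local variable. */
int read_through_pointer(void)
{
    int var = 7;
    int *ptr = &var;   /* '&var' takes the address of var */
    *ptr = 42;         /* assigning through the pointer changes var itself */
    return var;
}
```

A caller would write swap_ints(&x, &y) for two int variables x and y; the same two characters cover declaration, address-taking and dereferencing.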

However, in Perl, you have scalars ($), arrays (@), hashes (%) and references ($).
This is where the trouble starts: to create an array, you write @array=(1,2,3). To create an array reference, you can either prefix the @ with a backslash ($arrayref=\@array) or create an anonymous array reference by putting the elements in square brackets ($arrayref=[1,2,3]). For dereferencing, you can either prefix the reference with the corresponding sigil ($$arrayref[1]) or put -> between the name and the subscript operator ($arrayref->[1]).

So if you have to create your own syntax, plan it from the beginning and only use symbols in ways that are intuitive to understand. For readability, make sure the operator precedence isn't different from what users coming from a similar language would expect, and that commonly used operations (e.g. *ptr++) are possible without many parentheses.
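
To illustrate the *ptr++ point, here is a small sketch in C (sum_ints and bump_first are hypothetical names chosen for this example). In C, postfix ++ binds tighter than unary *, so the common "read, then advance" idiom needs no parentheses, while the rarer "increment the pointed-to value" does:

```c
/* Hypothetical helper: sums n ints, walking the array with *ptr++. */
int sum_ints(const int *ptr, int n)
{
    int total = 0;
    while (n-- > 0) {
        total += *ptr++;   /* parsed as *(ptr++): read current element, then advance ptr */
    }
    return total;
}

/* Hypothetical helper: contrasts (*ptr)++, which needs explicit parentheses. */
int bump_first(int *ptr)
{
    (*ptr)++;      /* increments the pointed-to int; ptr itself does not move */
    return *ptr;
}
```

A new language that reversed this precedence would force parentheses onto the frequent case, which is the kind of surprise the paragraph above warns against.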

tstenner