Some people say that every programming language has its "complexity budget" which it can use to accomplish its purpose. But if the complexity budget is depleted, every minor change becomes increasingly complicated and hard to implement in a backward-compatible way.

After reading the current provisional syntax for Lambda (≙ Lambda expressions, exception transparency, defender methods and method references) from August 2010 I wonder if people at Oracle completely ignored Java's complexity budget when considering such changes.

These are the questions I'm thinking about - some of them more about language design in general:

  • Are the proposed additions comparable in complexity to approaches other languages chose?
  • Is it generally possible to add such features to a language and still protect the developer from the complexity of the implementation?
  • Are these additions a sign of reaching the end of the evolution of Java-as-a-language or is this expected when changing a language with a huge history?
  • Have other languages taken a totally different approach at this point of language evolution?

Thanks!

+1  A: 

It's not much more complicated than lambda expressions in other languages.

Consider...

int square(int x) {
    return x*x;
}

Java:

#(x){x*x}

Python:

lambda x:x*x

C#:

x => x*x

I think the C# approach is slightly more intuitive. Personally I would prefer...

x#x*x
Pace
It's not the syntax I'm concerned about, it is the actual technical implementation of the whole thing. But yes, I'm still wondering why they chose such an arcane syntax ... personally I think something like x : x*x would be good considering that the enhanced for-loop also uses the : as a sign of "magic happens here".
soc
@soc: Using `:` for lambda functions might interfere, grammar-wise, with the conditional operator and the for-each (i.e. in) operator. And it might require a lot of hard work to disambiguate `:` in different contexts. It's not the syntax of the lambda operator that worries me, but I don't want it to be half-baked and half-implemented like Java Generics because of backward compatibility or other issues. If it is not possible to introduce lambdas elegantly into the language, the Java team should just omit them.
Bytecode Ninja
I think the current proposal for Java is `{x -> x*x}`.
Tom Hawtin - tackline
The problem is not really how the syntax looks, which is just the upper layer, but what using that syntax means. If you ask me, I would go for the C++ syntax, which is much more explicit: `[](int x) { return x*x; }` for your example. In more complex scenarios, the list of captured variables can be made explicit, and whether they are captured by value or by reference is explicit too `[ }`. The C# version, by comparison, implicitly takes variables by reference, potentially causing weird results...
David Rodríguez - dribeas
@David: Implicit capturing of `final` variables (plus the object context) would be in keeping with existing Java inner and anonymous classes. Whether or not there *should* be such capturing, that's what current users are used to and so should be favored; minimizing the number of basic semantic patterns is a very good goal for any language designer.
Donal Fellows
+3  A: 

Modulo some scope-disambiguation constructs, almost all of these methods follow from the actual definition of a lambda abstraction:

λx.E

To answer your questions in order:

I don't think there are any particular things that make the proposals by the Java community better or worse than anything else. As I said, it follows from the mathematical definition, and therefore all faithful implementations are going to have almost exactly the same form.

Anonymous first-class functions bolted onto imperative languages tend to end up as a feature that some programmers love and use frequently, and that others ignore completely - therefore it is probably a sensible choice to give it some syntax that will not confuse the kinds of people who choose to ignore the presence of this particular language feature. I think hiding the complexity and particulars of implementation is what they have attempted to do by using syntax that blends well with Java, but which has no real connotation for Java programmers.

It's probably desirable for them to use some bits of syntax that are not going to complicate existing definitions, and so they are slightly constrained in the symbols they can choose to use as operators and such. Certainly Java's insistence on remaining backwards-compatible limits the language evolution slightly, but I don't think this is necessarily a bad thing. The PHP approach is at the other end of the spectrum (i.e. "hey guys, let's break everything every time there is a new point release!"). I don't think that Java's evolution is inherently limited except by some of the fundamental tenets of its design - e.g. adherence to OOP principles, VM-based.

I think it's very difficult to make strong statements about language evolution from Java's perspective. It is in a reasonably unique position. For one, it's very, very popular, but it's relatively old. Microsoft had the benefit of at least 10 years' worth of Java legacy before they decided to even start designing a language called "C#". The C programming language basically stopped evolving at all. C++ has had few significant changes that found any mainstream acceptance. Java has continued to evolve through a slow but consistent process. If anything, I think it is better equipped to keep on evolving than any other language with a similarly huge installed code base.

Gian
@Gian: C++0x added lambdas, redefined auto (type inference), r-value references, variadic template args and more. To me these are huge additions. I think the C++ committee has changed the way it works and only changes the language when it's impossible or virtually impossible to add a useful feature as a library. There have been very brave efforts to do lambdas, type inference and return value optimization as libraries (see Boost), but in the end it won't work without language support. I think that is a sound approach.
FuleSnabel
@FuleSnabel, true, I was deliberately ignoring C++0x because I considered it a reasonably bold redesign, and it remains unproven whether it will be fully adopted or not. Your point is well-taken though, and I think it will be a credit to the C++ community if they manage to get these language features well-integrated, widely supported in implementations and used by the community.
Gian
@Gian: I have to disagree with 'C++ has had few significant changes that found any mainstream acceptance'. Most changes to the core language are widely used, and a big part of the library changes are also being used. C++ evolves slowly, but safely. Very few changes to the standard have turned out to be considered regrets, with the `std::vector<bool>` specialization being the one I can remember, followed by `auto_ptr`, which has been found to be less than perfect but is still widely used and effective, just not usable in containers.
David Rodríguez - dribeas
@David, fair enough. I'm willing to concede that point, because I am not a C++ programmer, so I'll take your word for it! I was referring specifically to core syntax changes, rather than library-based extensions.
Gian
+1  A: 

Maybe this is not really an answer to your question, but this may be comparable to the way Objective-C (which of course has a very narrow user base in contrast to Java) was extended by blocks (examples). While the syntax does not fit the rest of the language (IMHO), it is a useful addition, and the added complexity in terms of language features is rewarded, for example, with lower complexity of concurrent programming (simple things like concurrent iteration over an array or complicated techniques like Grand Central Dispatch).

In addition, many common tasks are simpler when using blocks, for example making one object a delegate (or - in Java lingo - "listener") for multiple instances of the same class. In Java, anonymous classes can already be used for that cause, so programmers know the concept and can just spare a few lines of source code using lambda expressions.

In Objective-C (or the Cocoa/Cocoa Touch frameworks), new functionality is now often only accessible using blocks, and it seems like programmers are adopting it quickly (given that they have to give up backwards compatibility with old OS versions).

FRotthowe
You mean: let's make it so ugly nobody wants to use it :) I prefer Smalltalk blocks...
Stephan Eggermont
+1  A: 

This is really, really close to the lambda functions proposed in the new generation of C++ (C++0x), so I think the Oracle guys have looked at the other implementations before cooking up their own.

http://en.wikipedia.org/wiki/C%2B%2B0x

[](int x, int y) { return x + y; }
SleepyCod
There are HUGE differences between the C++ proposal and that in Java. The C++ proposal is `[capture-list](argument-list)->return-type { expression-list }`. Java lacks the `capture-list` (together with the ability to define how each variable is to be captured, by reference or by value), it offers two different syntaxes, `#(argument-list)(expression)` and `#return-type(argument-list){ expression-list }`, and the semantics can never be matched as Java does not have the concept of C++ references. The syntax in the simplest case might be similar, but for all others it is not.
David Rodríguez - dribeas
The Java syntax is closer to the C# version of lambdas, with the difference that the semantics are turned completely around --C# captures *by-reference*, with changes to the external variable being visible in the lambda. Java requires captured variables to be *effectively final*, so no changes to the variables can be performed there.
David Rodríguez - dribeas
+3  A: 

I have not followed the process and evolution of the Java 7 lambda proposal, I am not even sure of what the latest proposal wording is. Consider this as a rant/opinion rather than statements of truth. Also, I have not used Java for ages, so the syntax might be rusty and incorrect at places.

First, what are lambdas to the Java language? Syntactic sugar. While in general lambdas enable code to create small function objects in place, that support was already present, to some extent, in the Java language through the use of inner classes.

So how much better is the syntax of lambdas? Where does it outperform previous language constructs? Where could it be better?

For starters, I dislike the fact that there are two available syntaxes for lambda functions (but this goes in the line of C#, so I guess my opinion is not widespread). I guess if we want to sugar coat, then #(int x)(x*x) is sweeter than #(int x){ return x*x; } even if the double syntax does not add anything else. I would have preferred only the second syntax, which is more generic, at the extra cost of writing return and ; in the short versions.

To be really useful, lambdas can take variables from the scope where they are defined and form a closure. Being consistent with inner classes, lambdas are restricted to capturing 'effectively final' variables. Consistency with the previous features of the language is a nice feature, but for sweetness, it would be nice to be able to capture variables that can be reassigned. For that purpose, they are considering that variables present in the context and annotated with @Shared will be captured by-reference, allowing assignments. To me this seems weird, as how a lambda can use a variable is determined at the place of declaration of the variable rather than where the lambda is defined. A single variable could be used in more than one lambda and this forces the same behavior in all of them.
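The two capture behaviours described above can be tried out in a minimal sketch. It assumes the lambda syntax that later shipped in Java 8, not the `#(...)` draft discussed here, and it uses a one-element array as a stand-in for what a by-reference `@Shared` variable would do. The names `CaptureDemo`, `perIterationCopies` and `sharedCell` are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntSupplier;

class CaptureDemo {
    // Each iteration declares a fresh local that is never reassigned
    // (effectively final), so every lambda keeps its own value.
    static List<IntSupplier> perIterationCopies(int n) {
        List<IntSupplier> out = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            int copy = i;
            out.add(() -> copy);
        }
        return out;
    }

    // Assumption: a one-element array imitates a shared, mutable capture.
    // All lambdas read the same cell, so they all see the last write.
    static List<IntSupplier> sharedCell(int n) {
        int[] cell = {0};
        List<IntSupplier> out = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            cell[0] = i + 1;
            out.add(() -> cell[0]);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(perIterationCopies(3).get(1).getAsInt());
        System.out.println(sharedCell(3).get(0).getAsInt());
    }
}
```

With per-iteration copies each lambda answers with its own value, while with the shared cell every lambda answers with whatever was written last, which is the behaviour that a declaration-site annotation would impose on every lambda using that variable.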

Lambdas try to simulate actual function objects, but the proposal does not get completely there. Up to now an identifier has denoted either an object or a method; to keep the parser simple, that distinction has been kept, and calling a lambda requires a ! after the lambda name: #(int x)(x*x)!(5) will return 25. This brings a new call syntax for lambdas that differs from the rest of the language, where ! stands somehow as a synonym for .execute on a virtual generic interface Lambda<Result,Args...>. But why not make it complete?

A new generic (virtual) interface Lambda could be created. It would have to be virtual as the interface is not a real interface, but a family of such: Lambda<Return>, Lambda<Return,Arg1>, Lambda<Return,Arg1,Arg2>... They could define a single execution method, which I would like to be like C++ operator(), but if that is a burden then any other name would be fine, embracing the ! as a shortcut for the method execution:

 interface Lambda<R> {
    R exec();
 }
 interface Lambda<R,A> {
    R exec( A a );
 }

Then the compiler need only translate identifier!(args) to identifier.exec( args ), which is simple. The translation of the lambda syntax would require the compiler to identify the proper interface being implemented and could be matched as:

 #( int x )(x *x)
 // translated to
 new Lambda<int,int>{ int exec( int x ) { return x*x; } }
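As a rough sketch of that translation in Java as it exists today: generics cannot take primitive type arguments and interfaces cannot be overloaded on arity, so this assumes a boxed `Integer` and a separately named one-argument interface. `Lambda1` and `DesugarDemo` are hypothetical names, not part of any proposal.

```java
// Hypothetical one-argument member of the Lambda family.
interface Lambda1<R, A> {
    R exec(A a);
}

class DesugarDemo {
    // Hand-written equivalent of what #(int x)(x*x) could be translated to.
    static final Lambda1<Integer, Integer> SQUARE =
        new Lambda1<Integer, Integer>() {
            @Override
            public Integer exec(Integer x) {
                return x * x;
            }
        };

    public static void main(String[] args) {
        // identifier!(5) would become identifier.exec(5)
        System.out.println(SQUARE.exec(5)); // prints 25
    }
}
```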

This would also allow users to define inner classes that can be used as lambdas in more complex situations. For example, if a lambda function needed to capture a variable annotated as @Shared in a read-only manner, or maintain the state of the captured object at the place of capture, manual implementation of the Lambda would be available:

 new Lambda<int,int>{ int value = context_value;
     int exec( int x ) { return x * value; }
 };

In a manner similar to what the current Inner classes definition is, and thus being natural to current Java users. This could be used, for example, in a loop to generate multiplier lambdas:

 Lambda<int,int> array[10] = new Lambda<int,int>[10]();
 for (int i = 0; i < 10; ++i ) {
    array[i] = new Lambda<int,int>{ final int multiplier = i;
       int exec( int x ) { return x * multiplier; }
    };
 }
 // note this is disallowed in the current proposal, as `i` is
 // not effectively final and as such cannot be 'captured'. Also
 // if `i` was marked @Shared, then all the lambdas would share
 // the same `i` as the loop and thus would produce the same
 // result: multiply by 10 --probably quite unexpectedly.
 //
 // I am aware that this can be rewritten as:
 // for (int ii = 0; ii < 10; ++ii ) { final int i = ii; ...
 //
 // but that is not simplifying the system, just pushing the
 // complexity outside of the lambda.
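For comparison, here is a runnable approximation of the multiplier loop in plain Java. In real Java a field initializer inside an anonymous class cannot read the non-final loop variable `i`, so this sketch assumes a small named class that takes the value through its constructor, which snapshots it at the point of capture. `IntLambda`, `Times` and `MultiplierDemo` are hypothetical names, and a `List` replaces the array because generic arrays cannot be created directly.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical int-to-int function interface standing in for Lambda<int,int>.
interface IntLambda {
    int exec(int x);
}

// Manual function object: the multiplier is copied when the object is built.
final class Times implements IntLambda {
    private final int multiplier;

    Times(int multiplier) {
        this.multiplier = multiplier;
    }

    @Override
    public int exec(int x) {
        return x * multiplier;
    }
}

class MultiplierDemo {
    static List<IntLambda> multipliers(int n) {
        List<IntLambda> out = new ArrayList<IntLambda>();
        for (int i = 0; i < n; ++i) {
            out.add(new Times(i)); // passing i by value takes the snapshot
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(multipliers(10).get(3).exec(7)); // prints 21
    }
}
```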

This would allow usage of lambdas and methods that accept lambdas both with the new simple syntax: #(int x){ return x*x; } or with the more complex manual approach for specific cases where the sugar coating interferes with the intended semantics.

Overall, I believe that the lambda proposal can be improved in different directions, that the way it adds syntactic sugar is a leaky abstraction (you have to deal externally with issues that are particular to the lambda) and that by not providing a lower-level interface it makes user code less readable in use cases that do not perfectly fit the simple use case.

David Rodríguez - dribeas