Inlining
It's hard to live with none of: lexically scoped local functions; a macro system; and
inlined functions.
It's far from obvious how one hints that a method should be inlined, or otherwise go
real fast. Does final do it? Does private final do it? Given that there is no
preprocessor to let you do per-function shorthand, and no equivalent of Common Lisp's
flet (or even macrolet), one ends up either duplicating code, or allowing the code to
be inefficient. Those are both bad choices.
The distinction between slots and methods is stupid. Doing foo.x should be defined to
be equivalent to foo.x(), with lexical magic for "foo.x = ..." assignment. Compilers
should be trivially able to inline zero-argument accessor methods to be inline
object+offset loads. That way programmers wouldn't break every single one of their
callers when they happen to change the internal implementation of something from
something which happened to be a "slot" to something with slightly more complicated
behavior.
Some Java 1.0-era JITs already did method inlining (the Symantec one for certain, the Borland one probably), so these points were wrong when he wrote the article. HotSpot and similar VMs do a decent job of inlining code. They can even inline calls that regular C++ compilers cannot (virtual method inlining is possible). In C++ you can hint that things should be inlined, and the compiler can ignore that hint. In Java you hint at inlining by writing small methods (and probably ones without loops).
new
By having new be the only possible interface to allocation, and by having no back
door through which you can escape from the type safety prison, there are a whole class
of ancient, well-known optimizations that one just cannot perform. If something isn't
done about this, the language is never going to be fast enough for some tasks, no
matter how good the JITs get. And "write once run everywhere" will continue to be the
marketing fantasy that it is today.
Has not changed, except of course that the JVMs keep getting better - go with the "marketing fantasy" :-)
Function pointers
I really hate the lack of downward-funargs; anonymous classes are a lame substitute. (I
can live without long-lived closures, but I find lack of function pointers a huge
pain.)
Closures are in the works. A bit of a controversial topic in the Java community.
Properties
The distinction between slots and methods is stupid. Doing foo.x should be defined to
be equivalent to foo.x(), with lexical magic for "foo.x = ..." assignment. Compilers
should be trivially able to inline zero-argument accessor methods to be inline
object+offset loads. That way programmers wouldn't break every single one of their
callers when they happen to change the internal implementation of something from
something which happened to be a "slot" to something with slightly more complicated
behavior.
Part of this is his misunderstanding of how inlining worked at the time he wrote the article. The other part would be addressed by Properties if they make it into Java.
Multi-dispatch
I sure miss multi-dispatch. (The CLOS notion of doing method lookup based on the types
of all of the arguments, rather than just on the type of the implicit zero'th argument,
this).
No change, but if you search on Google you will find "ways" of "doing it" (what you think of those ways is another thing :-)
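One of those "ways" is double dispatch (the basis of the visitor pattern). A minimal sketch follows; the Shape, Circle and Square names are invented for illustration and are not from the original article. The first virtual call picks the method by the receiver's type, and that method immediately makes a second virtual call on the argument, so the final method depends on both runtime types.

```java
// Hypothetical types, invented only to illustrate double dispatch.
interface Shape {
    String collideWith(Shape other);      // first dispatch: on the receiver
    String collideWithCircle(Circle c);   // second dispatch: on the argument
    String collideWithSquare(Square s);
}

class Circle implements Shape {
    public String collideWith(Shape other) { return other.collideWithCircle(this); }
    public String collideWithCircle(Circle c) { return "circle/circle"; }
    public String collideWithSquare(Square s) { return "square/circle"; }
}

class Square implements Shape {
    public String collideWith(Shape other) { return other.collideWithSquare(this); }
    public String collideWithCircle(Circle c) { return "circle/square"; }
    public String collideWithSquare(Square s) { return "square/square"; }
}

class DoubleDispatchDemo {
    public static void main(String[] args) {
        Shape a = new Circle();
        Shape b = new Square();
        // Both static types are Shape, yet the result depends on both runtime types.
        System.out.println(a.collideWith(b));
        System.out.println(b.collideWith(a));
    }
}
```

The cost is obvious: every new Shape means touching every existing one, which is rather the point of his complaint.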
Adding methods to other classes (I am sure there is a proper word for that)
The notion of methods "belonging" to classes is lame. Anybody anytime should be allowed
to define new, non-conflicting methods on any class (without overriding existing
methods.) This causes no abstraction-breakage, since code which cares couldn't, by
definition, be calling the new, "externally-defined" methods.
This is just another way of saying that the pseudo-Smalltalk object model loses and
that generic functions (suitably constrained by the no-external-overrides rule) win.
This is an Objective-Cism. AspectJ seems to be the only solution for this. I'm not entirely convinced of the utility of doing this, though.
Static methods
The fact that static methods aren't really class methods (they're actually global
functions: you can't override them in a subclass) is pretty dumb.
This has not changed.
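A small sketch of what "can't override them" means in practice (Base and Derived are made-up names): a static method in a subclass hides the parent's method instead of overriding it, so a call made from inside the parent class is bound at compile time.

```java
class Base {
    static String name() { return "Base"; }

    // This call is resolved at compile time to Base.name().
    String describe() { return name(); }
}

class Derived extends Base {
    // Hides Base.name(); it does not override it.
    static String name() { return "Derived"; }
}

class StaticHidingDemo {
    public static void main(String[] args) {
        System.out.println(new Derived().describe()); // prints "Base"
        System.out.println(Derived.name());           // prints "Derived"
    }
}
```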
Arrays and hashCode
Two identical byte[] arrays aren't equal and don't hash the same.
This is somewhat solved by java.util.Arrays.equals() and Arrays.hashCode() (plus deepEquals() and deepHashCode() for nested arrays), although raw arrays still won't work as HashMap keys. There are also cases when you would not want them to hash to the same value (you can please some of the people some of the time).
His possible solution ("What you can do is wrap an Object around an Array and let that implement hashCode and equals by digging around in its contained array, but that adds not-insignificant memory overhead (16 bytes per object, today.)") is less of a problem nowadays since memory is cheap, but still can be annoying in embedded development. I don't know of any better solution.
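For what it's worth, the wrapper he describes is short to write these days. This is only a sketch; ByteArrayKey is a hypothetical name, and the defensive clone() is my own choice (it costs a copy, but keeps the key from changing after it has been put in a map).

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical wrapper giving a byte[] value-based equals() and hashCode().
final class ByteArrayKey {
    private final byte[] bytes;

    ByteArrayKey(byte[] bytes) {
        this.bytes = bytes.clone(); // copy so later changes to the caller's array don't affect the key
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof ByteArrayKey
            && Arrays.equals(bytes, ((ByteArrayKey) o).bytes);
    }

    @Override
    public int hashCode() {
        return Arrays.hashCode(bytes);
    }
}

class ByteArrayKeyDemo {
    static String lookup() {
        Map<ByteArrayKey, String> map = new HashMap<ByteArrayKey, String>();
        map.put(new ByteArrayKey(new byte[] {1, 2, 3}), "found");
        // A different array object with the same contents finds the entry.
        return map.get(new ByteArrayKey(new byte[] {1, 2, 3}));
    }

    public static void main(String[] args) {
        System.out.println(lookup());                                  // prints "found"
        System.out.println(new byte[] {1}.equals(new byte[] {1}));     // prints "false"
    }
}
```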
String iteration
I can't seem to manage to iterate the characters in a String without implicitly
involving half a dozen method calls per character.
The other alternative is to convert the String to a byte[] first, and iterate the
bytes, at the cost of creating lots of random garbage.
Still can't access things directly, but with HotSpot, those "half a dozen method calls" should be inlined if they're a problem. Also CharSequence could be made Iterable.
The alternative is to convert to a char[] (not a byte[]).
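To make the two options concrete, here is a rough sketch (the class and method names are made up). The first loop goes through charAt() on every step; the second pays for one char[] copy up front and then reads plain array elements.

```java
class StringIterationDemo {
    // One charAt() call per character; a JIT is free to inline it.
    static int countSpacesCharAt(String s) {
        int n = 0;
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) == ' ') n++;
        }
        return n;
    }

    // One copy of the characters, then direct array access.
    static int countSpacesCharArray(String s) {
        int n = 0;
        for (char c : s.toCharArray()) {
            if (c == ' ') n++;
        }
        return n;
    }

    public static void main(String[] args) {
        String s = "a b c d";
        System.out.println(countSpacesCharAt(s));    // prints 3
        System.out.println(countSpacesCharArray(s)); // prints 3
    }
}
```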
Unicode and Strings in general
Generally, I'm dissatisfied with the overhead added by Unicode support in those cases
where I'm sure that there are no non-ASCII characters. There ought to be two subclasses
of an abstract String class, one that holds Unicode, and one that holds 8-bit
quantities. They should offer identical APIs and be indistinguishable, except for the
fact that if a string has only 8-bit characters, it takes up half as much memory!
Of course, String being final eliminates even the option of implementing that.
Hasn't changed. Memory prices have (small consolation if you are keen to reduce memory).
The benefit is that things are likely easier to internationalize.
If String were not final there would be a whole lot of defensive copying going on... and I'd be sitting here agreeing with the argument he would have had 10 years ago about why String should have been made final :-)
Interfaces
Interfaces seem a huge, cheesy copout for avoiding multiple inheritance; they really
seem like they were grafted on as an afterthought. Maybe there's a good reason for them
being the way they are, but I don't see it; it looks like they were just looking for a
way to multiply-inherit methods without allowing call-next-method and without allowing
instance variables?
They were grafted on as an afterthought (early in the language development). They haven't changed since he wrote the article.
Covariant Return Types
There's something kind of screwy going on with type promotion that I don't totally
understand yet but that I'm pretty sure I don't like. This gets a compiler error about
type conflicts:
    abstract class List {
        abstract List next();
    }
    class foo extends List {
        foo n;
        foo next() { return n; }
    }
I think that's wrong, because every foo is-a List. The compiler seems to be using
type-of rather than typep.
JDK 1.5 addressed this.
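With covariant return types, essentially his example now compiles. In this sketch I renamed his List to Node and foo to Foo (my choice, to avoid confusion with java.util.List); otherwise it is the same shape.

```java
abstract class Node {
    abstract Node next();
}

class Foo extends Node {
    Foo n;

    // Legal since Java 5: the override narrows the return type from Node to Foo.
    @Override
    Foo next() { return n; }
}

class CovariantDemo {
    public static void main(String[] args) {
        Foo head = new Foo();
        head.n = new Foo();
        Foo second = head.next();   // no cast needed
        System.out.println(second == head.n); // prints "true"
    }
}
```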
Array iteration
And in related news, it's a total pain that one can't iterate over the contents of an
array without knowing intimate details about its contents: you have to know whether
it's byte[], or int[], or Object[]. I mean, it is not rocket science to have a language
that can transparently access both boxed and unboxed storage. It's not as if Java isn't
doing all the requisite runtime type checks already! It's as if they went out of their
way to make this not work...
Is there some philosophical point I'm missing? Is the notion of separating your
algorithms from your data structures suddenly no longer a part of the so-called
"object oriented" pantheon?
Generics, autoboxing/unboxing, and the new style for loop probably address this.
ints are not Objects
This "integers aren't objects" nonsense really pisses me off. Why did they do that?
Is the answer as lame as, "we wanted the 'int' type to be 32 bits instead of 31"?
(You only really need one bit of type on the pointer if you don't need small conses,
after all.)
The way this bit me is, I've got code that currently takes an array of objects, and
operates on them in various opaque ways (all it cares about is equality, they're just
cookies.) I was thinking of changing these objects to be shorts instead of objects, for
compactness of their containing objects: they'd be indexes into a shared table, instead
of pointers to shared objects.
To do this, I would have to rewrite that other code to know that they're shorts instead
of objects. Because one can't assign a short to a variable or argument that expects an
Object, and consequently, one can't invoke the equal method on a short.
Wrapping them up in Short objects would kind of defeat the purpose: then they'd be
bigger than the pointer to the original object rather than smaller.
At the start (before it was public) Java did have ints as objects... it was too slow. Rather than fix it the Smalltalk way they went the C route... hasn't changed since the article was written. We now have a half-baked solution to this: Autoboxing!
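A quick sketch of why I call it half-baked, using his "cookies" scenario (sameCookie is a made-up helper). Autoboxing does let you pass a short where an Object is expected and call equals() on it, but the box remembers its exact primitive type, and each box is still a full object, so his memory argument stands.

```java
class AutoboxingDemo {
    // Opaque comparison: all this code knows is that it has Objects.
    static boolean sameCookie(Object a, Object b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        short index = 5;
        Object cookie = index;   // autoboxed to a java.lang.Short

        System.out.println(sameCookie(cookie, (short) 5)); // prints "true"
        // The literal 5 boxes to an Integer, and Short.equals(Integer) is false.
        System.out.println(sameCookie(cookie, 5));         // prints "false"
    }
}
```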
ints are not Objects part II
After all this time, people still think that integer overflow is better than degrading
to bignums, or raising an exception?
Of course, they have Bignums now (ha!) All you have to do (ha!) is rewrite your code to
look like this:
result = x.add(y.multiply(BigInteger.valueOf(7))).pow(3).abs().setBit(27);
Note that some parameters must be BigIntegers, and some must be ints, and some must be
longs, with largely no rhyme or reason. (This complaint is in the "language" section
and not the "library" section because this shit should be part of the language, i.e.,
at the syntax level.)
That is related to ints not being objects... still the same as when he wrote the article.
typedef
I miss typedef. If I have integers that represent something, I can't make type
assertions about them except that they are ints. Unless I'm willing to swaddle them in
blankets by wrapping Integer objects around them.
No typedef (for various reasons). However, given that he was wrong about inlining, and given the advances in the VM with respect to object creation and GC, making small classes to represent such values is actually better (you can validate things in the constructor). So no change since his original article from a typing point of view, but from a VM point of view the state is better.
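As an illustration of the "small class instead of typedef" idea, here is a sketch with a hypothetical Port type (the name and the 0..65535 range are just an example I picked). The compiler will refuse to mix a Port with a bare int, and the constructor is the one place where the value gets checked.

```java
// Hypothetical value class standing in for "typedef int port".
final class Port {
    private final int value;

    Port(int value) {
        if (value < 0 || value > 65535) {
            throw new IllegalArgumentException("port out of range: " + value);
        }
        this.value = value;
    }

    int value() { return value; }

    @Override
    public boolean equals(Object o) {
        return o instanceof Port && ((Port) o).value == value;
    }

    @Override
    public int hashCode() { return value; }
}

class PortDemo {
    public static void main(String[] args) {
        Port p = new Port(8080);
        System.out.println(p.value()); // prints 8080
    }
}
```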
enums
Similarly, I think the available idioms for simulating enum and :keywords are fairly
lame. (There's no way for the compiler to issue that life-saving warning, "enumeration
value 'x' not handled in switch", for example.)
They go to the trouble of building a single two-element enumerated type into the
language (Boolean) but won't give us a way to define our own?
Enums were added in Java 5, but for some reason, they never bothered to add that particular warning to the compiler. (ponders modifying javac... shudders when remembers the code that is in javac... wonders again how compiler writers actually manage to get stuff to compile... forgets about javac again :-)
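A sketch of the missing warning (Color and describe are made-up names). As far as I know, javac accepts the switch below without complaint by default, even though BLUE is never handled; some IDEs will flag it, but that is the IDE, not the compiler.

```java
enum Color { RED, GREEN, BLUE }

class EnumSwitchDemo {
    static String describe(Color c) {
        switch (c) {
            case RED:
                return "warm";
            case GREEN:
                return "cool";
            // No case for BLUE: execution falls out of the switch.
        }
        return "unhandled";
    }

    public static void main(String[] args) {
        System.out.println(describe(Color.RED));  // prints "warm"
        System.out.println(describe(Color.BLUE)); // prints "unhandled"
    }
}
```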
assert
As far as I can see, there's no efficient way to implement assert or #ifdef DEBUG.
Java gets half a point for this by promising that if you have a static final boolean,
then conditionals that use it will get optimized away if appropriate. This means you
can do things like
if (randomGlobalObject.DEBUG) { assert(whatever, "whatever!"); }
but that's so gratuitously verbose that it makes my teeth hurt. (See also, lack of any
kind of macro system.)
assert was added in Java 1.4. #ifdef DEBUG is still hard to simulate.
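Roughly, the two idioms look like this (AssertDemo, DEBUG and half are illustrative names). The assert statement is only checked when the JVM is started with -ea, and the static final boolean is the closest thing to #ifdef DEBUG: because DEBUG is a compile-time constant, the guarded branch can be dropped.

```java
class AssertDemo {
    // Compile-time constant; code guarded by it can be removed when it is false.
    static final boolean DEBUG = false;

    static int half(int n) {
        // Evaluated only when assertions are enabled (java -ea).
        assert n % 2 == 0 : "n must be even: " + n;

        if (DEBUG) {
            System.err.println("half(" + n + ")");
        }
        return n / 2;
    }

    public static void main(String[] args) {
        System.out.println(half(10)); // prints 5
    }
}
```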
Finalization
The finalization system is lame. Worse than merely being lame, they brag about how lame
it is! To paraphrase the docs: "Your object will only be finalized once, even if it's
resurrected in finalization! Isn't that grand?!" Post-mortem finalization was figured
out years ago and works well. Too bad Sun doesn't know that.
No changes since he wrote it (except that people don't use finalization normally).
References
Relatedly, there are no "weak pointers." Without weak pointers and a working
finalization system, you can't implement a decent caching mechanism for, e.g., a
communication framework that maintains proxies to objects on other machines, and
likewise keeps track of other machines' references to your objects.
The reference classes added in 1.2 (WeakReference, SoftReference, PhantomReference) deal with this.
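A bare-bones sketch of the kind of cache he is asking for, built on WeakReference (WeakCache is a hypothetical class, and a real proxy cache would need more, such as a ReferenceQueue for cleanup). Values stay available for as long as something else holds a strong reference to them; after that the collector is free to clear the reference and get() returns null.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

// Hypothetical minimal cache whose values are only weakly held.
class WeakCache<K, V> {
    private final Map<K, WeakReference<V>> map = new HashMap<K, WeakReference<V>>();

    void put(K key, V value) {
        map.put(key, new WeakReference<V>(value));
    }

    V get(K key) {
        WeakReference<V> ref = map.get(key);
        if (ref == null) {
            return null;
        }
        V value = ref.get();
        if (value == null) {
            map.remove(key); // the value was collected; drop the stale entry
        }
        return value;
    }
}

class WeakCacheDemo {
    public static void main(String[] args) {
        WeakCache<String, Object> cache = new WeakCache<String, Object>();
        Object proxy = new Object();   // strong reference keeps the value alive
        cache.put("remote-1", proxy);
        System.out.println(cache.get("remote-1") == proxy); // prints "true"
    }
}
```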
Inner classes and final variables
You can't close over anything but final variables in an inner class! Their rationale is
that it might be "confusing." Of course you can get the effect you want by manually
wrapping your variables inside of one-element arrays. The very first time I tried using
inner classes, I got bitten by this -- that is, I naively attempted to modify a
closed-over variable and the compiler complained at me, so I in fact did the
one-element array thing. The only other time I've used inner classes, again, I needed
the same functionality; I started writing it the obvious way and let out a huge sigh of
frustration when, half way through, I realized what I had done and manually walked back
through the code turning my
Object foo = <whatever>;
into
final Object[] foo = { <whatever> };
and all the occurrences of foo into foo[0]. Arrrgh!
No change yet, but closures might "fix" that.
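For anyone who has not seen the one-element array trick he describes, this is a sketch of it (countRuns is a made-up method). The array reference itself is final, so the inner class may capture it, while the element inside stays mutable.

```java
class ClosureCounterDemo {
    static int countRuns(int times) {
        // The reference is final; the slot it points to is not.
        final int[] count = { 0 };

        Runnable r = new Runnable() {
            public void run() {
                count[0]++;
            }
        };

        for (int i = 0; i < times; i++) {
            r.run();
        }
        return count[0];
    }

    public static void main(String[] args) {
        System.out.println(countRuns(3)); // prints 3
    }
}
```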
Access model and final
The access model with respect to the mutability (or read-only-ness) of objects blows.
Here's an example:
System.in, out and err (the stdio streams) are all final variables. They didn't used to
be, but some clever applet-writer realized that you could change them and start
intercepting all output and do all sorts of nasty stuff. So, the whip-smart folks at
Sun went and made them final. But hey! Sometimes it's okay to change them! So, they
also added System.setIn, setOut, and setErr methods to change them!
"Change a final variable?!" I hear you cry. Yep. They sneak in through native code
and change finals now. You might think it'd give 'em pause to think and realize that
other people might also want to have public read-only yet privately writable variables,
but no.
Oh, but it gets even better: it turns out they didn't really have to sneak in through
native code anyway, at least as far as the JVM is concerned, since the JVM treats final
variables as always writable to the class they're defined in! There's no special case
for constructors: they're just always writable. The javac compiler, on the other hand,
pretends that they're only assignable once, either in static init code for static
finals or once per constructor for instance variables. It also will optimize access to
finals, despite the fact that it's actually unsafe to do so.
It is possible to change final fields with reflection (it's actually pretty easy).
final variables
Something else related to this absurd lack of control over who can modify an object and
who cannot is that there is no notion of constant space: constantry is all per-class,
not per-object. If I've got a loop that does
String foo = "x";
it does what you'd expect, because the loader happens to have special-case magic that
interns strings, but if I do:
String foo[] = { "x", "y" };
then guess what, it conses up a new array each time through the loop! Um, thanks, but
don't most people expect literal constants to be immutable? If I wanted to copy it, I
would copy it. The language also should impose the contract that literal constants are
immutable.
Even without the language having immutable objects, a non-losing compiler could
eliminate the consing in some limited situations through static analysis, but I'm not
holding my breath.
Using final on variables doesn't do anything useful in this case; as far as I can tell,
the only reason that final works on variables at all is to force you to specify it on
variables that are closed over in inner classes.
No changes since he wrote this. Final is very useful (I have about 90%-95% of every variable I make as final... it avoids a lot of silly mistakes).
Locking/Synchronization
The locking model is broken.
First, they impose a full word of overhead on each and every object, just in case
someone somewhere sometime wants to grab a lock on that object. What, you say that you
know that nobody outside of your code will ever get a pointer to this object, and that
you do your locking elsewhere, and you have a zillion of these objects so you'd like
them to take up as little memory as possible? Sorry. You're screwed.
Any piece of code can assert a lock on an object and then never un-lock it,
causing deadlocks. This is a gaping security hole for denial-of-service attacks.
In any half-way-rational design, the lock associated with an object would be treated
just like any other slot, and only methods statically "belonging" to that class could
frob it.
But then you get into the bug of Java not doing closures properly. See, you want
to write a method:
    public synchronized void with_this_locked (thunk f)
    {
        f.funcall ();
    }
but then actually writing any code becomes a disaster because of the mind-blowing
worthlessness of inner classes.
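A number of changes have been made in the concurrency area, most notably the java.util.concurrent packages added in Java 5 (explicit Lock objects, atomics, executors).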
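One of those changes is java.util.concurrent.locks, which lets you keep the lock in a private field, so that only the class's own methods can touch it. Here is a sketch of his with_this_locked idea on top of it (Counter and withLocked are illustrative names, and Runnable stands in for his thunk). Note that this does not remove the per-object monitor overhead he complains about; it only addresses who can grab the lock.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

class Counter {
    // Private lock: outside code cannot acquire it and forget to release it.
    private final Lock lock = new ReentrantLock();
    private int value;

    // Runs the given body while holding the lock; always releases it.
    void withLocked(Runnable body) {
        lock.lock();
        try {
            body.run();
        } finally {
            lock.unlock();
        }
    }

    int increment() {
        lock.lock();
        try {
            return ++value;
        } finally {
            lock.unlock();
        }
    }
}

class CounterDemo {
    public static void main(String[] args) {
        final Counter c = new Counter();
        // ReentrantLock allows increment() to re-acquire the lock held by withLocked().
        c.withLocked(new Runnable() {
            public void run() {
                c.increment();
                c.increment();
            }
        });
        System.out.println(c.increment()); // prints 3
    }
}
```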
Exception handling
There is no way to signal without throwing: that is, there is no way to signal an
exceptional condition, and have some condition handler tell you "go ahead and proceed
anyway." By the time the condition handler is run, the excepting scope has already
been exited.
Termination is still the exception handling model used.
Collections
It comes with hash tables, but not qsort? Thanks!
Since 1.2, there are Collections.sort() and Arrays.sort(): a modified merge sort for objects, and a tuned quicksort for arrays of primitives.
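In practice the sorting calls look like this (SortDemo and the sortedCopy helpers are made up for the example; they copy first so the caller's data is left alone).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class SortDemo {
    // Sorts a copy of a primitive array with Arrays.sort().
    static int[] sortedCopy(int[] in) {
        int[] out = in.clone();
        Arrays.sort(out);
        return out;
    }

    // Sorts a copy of a list with Collections.sort() (natural ordering).
    static List<String> sortedCopy(List<String> in) {
        List<String> out = new ArrayList<String>(in);
        Collections.sort(out);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(sortedCopy(new int[] {3, 1, 2}))); // prints [1, 2, 3]
        System.out.println(sortedCopy(Arrays.asList("b", "c", "a")));         // prints [a, b, c]
    }
}
```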
String and memory usage
String has length+24 bytes of overhead over byte[]:
    class String implements java.io.Serializable {
        private char value[]; // 4 bytes + 12 bytes of array header
        private int offset; // 4 bytes
        private int count; // 4 bytes
    }
The only reason for this overhead is so that String.substring() can return strings
which share the same value array. Doing this at the cost of adding 8 bytes to each and
every String object is not a net savings...
If you have a huge string, pull out a substring() of it, hold on to the substring and
allow the longer string to become garbage (in other words, the substring has a longer
lifetime) the underlying bytes of the huge string never go away.
There have been several proposals for changing String, but I believe all of them have been rejected. I agree that there is room for improvement.
File System
The file manipulation primitives are inadequate; for example, there's no way to ask
questions like "is the file system case-insensitive?" or, "what is the maximum file
name length?", or "is it required that file extensions be exactly three characters
long?" Which could be worked around, but for:
The architecture-interrogation primitives are inadequate; there is no robust way to ask
"am I running on Windows" or "am I running on Unix."
There is no way to access link() on Unix, which is the only reliable way to implement
file locking.
There is no way to do ftruncate(), except by copying and renaming the whole file.
NIO may help with some of this, but overall there are many issues with the file system still.
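Two of his specific complaints do have at least partial answers. The sketch below (FileTruncateDemo and its method names are mine) shows the usual, admittedly not very robust, os.name check, and truncating a file in place through its FileChannel, which is the closest thing to ftruncate() that I know of.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.Locale;

class FileTruncateDemo {
    // Heuristic only: relies on the os.name system property.
    static boolean onWindows() {
        String os = System.getProperty("os.name", "");
        return os.toLowerCase(Locale.ROOT).startsWith("windows");
    }

    // Shrinks the file to newLength bytes without copying or renaming it.
    static void truncate(File f, long newLength) throws IOException {
        RandomAccessFile raf = new RandomAccessFile(f, "rw");
        try {
            raf.getChannel().truncate(newLength);
        } finally {
            raf.close();
        }
    }

    public static void main(String[] args) {
        System.out.println("Windows? " + onWindows());
    }
}
```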
printf
Is "%10s %03d" really too much to ask? Yeah, I know there are packages out on the net
trying to reproduce every arcane nuance of printf(), but controlling field width and
padding seems pretty darned basic to me.
Since Java 5, System.out.format() and System.out.printf() handle this.
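His exact format string works as-is. A tiny sketch (FormatDemo and row are illustrative names), using String.format() so the result can be inspected as well as printed:

```java
class FormatDemo {
    // Right-aligns the name in 10 columns and zero-pads the number to 3 digits.
    static String row(String name, int n) {
        return String.format("%10s %03d", name, n);
    }

    public static void main(String[] args) {
        System.out.println(row("abc", 7));              // prints "       abc 007"
        System.out.printf("%10s %03d%n", "widgets", 42);
    }
}
```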
RandomAccessFile
A RandomAccessFile cannot be used as a FileInputStream. More specifically, there is no
class or interface which those two classes have in common. So, despite the fact that
both implement read() and a slew of other like-functioning methods, there is no way to
write a method which works on streams of either type.
Identical lossage exists for the pairing of RandomAccessFile and FileOutputStream. WHAT
WERE THEY THINKING?
NIO probably deals with this, but the RandomAccessFile is still essentially the same.
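One way NIO helps here: a RandomAccessFile exposes a FileChannel, and java.nio.channels.Channels can wrap that channel as an ordinary InputStream. A sketch follows (ChannelStreamDemo and its methods are made-up names). The stream reads from the channel's current position, which is shared with the RandomAccessFile's file pointer, so seek() still works.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.nio.channels.Channels;

class ChannelStreamDemo {
    // Code written against InputStream, unaware of RandomAccessFile.
    static int firstByte(InputStream in) throws IOException {
        return in.read();
    }

    // Adapts a RandomAccessFile to the InputStream API via its channel.
    static InputStream asStream(RandomAccessFile raf) {
        return Channels.newInputStream(raf.getChannel());
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 1) {
            RandomAccessFile raf = new RandomAccessFile(args[0], "r");
            try {
                System.out.println(firstByte(asStream(raf)));
            } finally {
                raf.close();
            }
        }
    }
}
```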
markSupported is stupid.
markSupported is stupid.
markSupported is still stupid.
System and Runtime
What in the world is the difference between System and Runtime? The division seems
completely random and arbitrary to me.
Probably is random and arbitrary... was cleaned up a bit... but still there.
Library bloat
What in the world is application-level crap like checkPrintJobAccess() doing in the
base language class library? There's all kinds of special-case abstraction-breaking
garbage like this.
Good question. I am sure that he would think the library is worse now.
TofuBeer's and mmyers' posts were both great, both extremely long and have now been merged into this wiki post (by TofuBeer again), which is even better. Kudos & many thanks!
-- Hanno