Assuming you have a hypothetical enum in Java like this (purely for demonstration purposes, this isn't code I'm seriously expecting to use):

enum Example{
    FIRST,
    SECOND,
    THIRD,
    ...
    LAST;
}

What's the maximum number of members you could have inside that enum before the compiler stops you?

Secondly, is there any performance difference at runtime when your code is referencing an enum with say, 10 members as opposed to 100 or 1,000 (other than just the obvious memory overhead required to store the large class)?

A: 

If you have to ask, you're probably doing something wrong. The actual limit is probably fairly high, but an enum with more than 10 or so values would be highly suspect, I think. Break that up into related collections, or a type hierarchy, or something.

Mark Bessey
Java enums aren't integer constants. They're instances of a subclass of the Enum class.
Laurence Gonsalves
Oops. I suppose I could try to weasel out of this by claiming that *in the common usage case* the JIT compiler will translate them to integer constants, but the truth is I was thinking of C/C++/C# enumerated types.
Mark Bessey
Edited to remove my previous mention of integer constants, since Java Enums are a thing unto themselves.
Mark Bessey
Just to clarify, the reason I'm asking about this is that I'm using a tool that does some code generation and was curious about any technical limits of the compiler and runtime environment that might cause it to fail. I'm very aware of the potential design problems in having a gigantic enum, but there's more to the story than what you're assuming here.
mpobrien
Thanks for the clarification. You should probably include that in the question.
Mark Bessey
+9  A: 

The language specification itself doesn't set a limit. However, the class file format imposes several limits that bound the number of enum values, with the upper bound being around 65,535 (2^16 - 1):

Number of Fields The JVMS 4.1 specifies that a ClassFile may have up to 65,535 (2^16 - 1) fields, since the field count is stored as a 16-bit value. Enum values are stored in the class file as static fields, so the number of enum values plus any other fields of the enum cannot exceed that limit.

Constant Pool The JVMS also stores the constant pool count as a 16-bit value, so the pool holds fewer than 65,536 entries. The constant pool stores all String literals, type literals, the supertype, super interface types, method signatures, method names, AND enum value names. So there must be fewer than 2^16 enum values, since the name strings need to share that constant pool limit.

Static Method Initialization The maximum size of a method is 65,535 bytes of bytecode, so the static initializer for the enum has to be smaller than 64 KB. While the compiler could split it into several methods (see Bug ID: 4262078) to distribute the initializations into smaller blocks, it doesn't currently do that.

Long story short, there is no easy answer, and the answer depends not only on the number of enum values there are, but also the number of methods, interfaces, and fields the enums have!
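
To see why all three limits scale with the number of constants, it helps to look at roughly what an enum compiles to. The following is a hand-written approximation, not actual javac output: the class name ExampleDesugared and the VALUES field are made up for illustration, and a real enum extends java.lang.Enum, which ordinary code cannot do directly. Each constant costs one static field, one name string in the constant pool, and a few instructions in the single static initializer.

```java
// Hand-written approximation of what an enum with two constants compiles to.
// Hypothetical names; a real enum extends java.lang.Enum and its array field is synthetic.
final class ExampleDesugared {
    // One static field per constant (counts toward the class file's field limit).
    static final ExampleDesugared FIRST;
    static final ExampleDesugared SECOND;
    // Backing array; values() hands out a copy of it.
    private static final ExampleDesugared[] VALUES;

    private final String name;
    private final int ordinal;

    private ExampleDesugared(String name, int ordinal) {
        this.name = name;
        this.ordinal = ordinal;
    }

    // All constants are created in one static initializer, i.e. one method,
    // so its bytecode has to fit within the per-method size limit.
    static {
        FIRST = new ExampleDesugared("FIRST", 0);   // the name string is a constant pool entry
        SECOND = new ExampleDesugared("SECOND", 1);
        VALUES = new ExampleDesugared[] { FIRST, SECOND };
    }

    static ExampleDesugared[] values() {
        return VALUES.clone();
    }

    String name() { return name; }

    int ordinal() { return ordinal; }
}
```

Adding a constant to this sketch means adding a field, a string, and two more statements in the static block, which is why the limits above interact.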

notnoop
You don't need to put the enum values in the constant pool: the BIPUSH and SIPUSH operands allow you to "inline" the int values into the bytecode.
Neil Coffey
My post was referring to the names of the enum values being in the constant pool, not their values or ordinals.
notnoop
OK, sorry, I misunderstood the bit where you say "This count includes both the enum values themselves and the member fields of the enums".
Neil Coffey
I guess the code size of the static initialiser is going to be a limit. Looks about 23 bytes a constant (`javap -c java.lang.Thread.State`). It could be broken up or done more efficiently, or a special feature could be introduced into the class file format, but it doesn't seem worth it.
Tom Hawtin - tackline
@Tom Thanks, updated the post to include your comment.
notnoop
+8  A: 

The best way to find out the answer to this type of question is to try it. Start with a little Python script to generate the Java files:

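# Python 3: read the number of enum constants from stdin
n = int(input())
print("class A{public static void main(String[] a){}enum B{")
# All constants go on one line: C0,C1,C2,...
print(','.join("C%d" % x for x in range(n)))
print('}}')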

Now try with 1,10,100,1000... works fine, then BAM:

A.java:2: code too large C0,C1,C2,C3,C4,C5,C6,C7,C8,C9,C10,C11,C12,C13,C14,C15,C16,C17,C18,C19,C20,C21,C22,...

Seems like I hit some sort of internal limit. Not sure if it's a documented limit, if it's dependent on the specific version of my compiler, or if it's some system-dependent limit. But for me the limit was around 3000 and appears to be related to the source code size. Maybe you could write your own compiler to bypass this limit.
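
One possible explanation, raised in the comments, is that "code too large" comes from the 64 KB limit on a single method's bytecode rather than from the source size. A back-of-envelope check in Java, assuming roughly 23 bytes of initializer bytecode per constant (the javap figure quoted in the comments on the other answer; it is an approximation, not a measurement of this generated file). The class and method names are made up for illustration.

```java
// Back-of-envelope estimate of how many enum constants fit in one static initializer.
final class EnumLimitEstimate {
    // JVM limit on the bytecode length of a single method, in bytes.
    static final int MAX_METHOD_BYTES = 65_535;

    // Assumption: average initializer bytecode per constant. The real cost
    // varies with the constant's index and how the compiler emits it.
    static final int ASSUMED_BYTES_PER_CONSTANT = 23;

    // Integer division: how many whole constants fit under the limit.
    static int estimateMaxConstants(int bytesPerConstant) {
        return MAX_METHOD_BYTES / bytesPerConstant;
    }

    public static void main(String[] args) {
        System.out.println(estimateMaxConstants(ASSUMED_BYTES_PER_CONSTANT)); // prints 2849
    }
}
```

With 24 bytes per constant the same division gives 2730, so both the "around 3000" seen here and the 2747 reported in the comments are in the range this estimate predicts.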

Mark Byers
Why would you rewrite a Java compiler for the sake of having over 3000 enums?
Earlz
I probably wasn't being entirely serious.
Mark Byers
try inserting some newlines and see if the error still occurs...
Jason S
Interesting - I tried it with newlines, and it seems to fail at exactly 2747 items. Any fewer and it works fine.
mpobrien
A single method in Java is limited to 64K bytes in length. This is a JVM issue, but since there are workarounds it's a compiler issue in this case. My guess is that 2747 enums result in an initialisation method which is about 64K in length.
Peter Lawrey
A: 

This is an extension of the comments to the original question.

There are multiple problems with having a LOT of enums.

The main reason is that when you have a lot of data it tends to change, or if not, you often want to add new items. There are exceptions to this, like unit conversions that would never change, but for the most part you want to read data like this from a file into a collection of classes rather than an enum.

Adding new items is problematic because, since it's an enum, you need to physically modify your code unless you are ALWAYS using the enums as a collection, and if you are ALWAYS using them as a collection, why make them enums at all?

Then there is the case where your data doesn't change, like "conversion units" where you are converting feet, inches, etc. You COULD do this as enums and there WOULD be a lot of them, but by coding them as enums you lose the ability to have data drive your program. For instance, a user could select from a pull-down list populated by your "Units", but again, this is not an "ENUM" usage, it's using it as a collection.

The other problem will be repetition around the references to your enum. You will almost certainly have something very repetitive like:

if(userSelectedCard() == cards.HEARTS)
    graphic=loadFile("Heart.jpg");
if(userSelectedCard() == cards.SPADES)
    graphic=loadFile("Spade.jpg");

Which is just wrong (If you can squint to where you can't read the letters and see this kind of pattern in your code, you KNOW you are doing it wrong).

If the cards were stored in a card collection, it would be easier to just use:

graphic=cards.getGraphicFor(userSelectedCard());

I'm not saying that this can't be done with an enum as well, but I am saying that I can't see how you would use these as enums without having some nasty code-block like the one I posted above.
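
For comparison, here is a minimal sketch of what the enum-based variant could look like if each constant carries its own file name. The Suit type and the graphicFile() accessor are hypothetical names chosen for this example; only the two file names come from the snippet above.

```java
// Each constant carries its own data, so no if-chain is needed at the call site.
enum Suit {
    HEARTS("Heart.jpg"),
    SPADES("Spade.jpg");

    // File name of the image associated with this suit.
    private final String graphicFile;

    Suit(String graphicFile) {
        this.graphicFile = graphicFile;
    }

    String graphicFile() {
        return graphicFile;
    }
}
```

Assuming userSelectedCard() returned a Suit, the call site would shrink to graphic=loadFile(userSelectedCard().graphicFile()); though the file names are still hard-coded in source rather than read from data.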

I'm also not saying that there aren't cases for enums--there are lots of them, but when you get more than a few (7 was a good number), you're probably better off with some other structure.

I guess the exception is when you are modeling real-world stuff that has that many types and each must be addressed with different code, but even then you are probably better off using a data file to bind a name to some code to run and storing them in a hash so you can invoke them with code like: hash.get(nameString).executeCode(). This way, again, your "nameString" is data and not hard-coded, allowing refactoring elsewhere.
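
A rough sketch of that hash-based dispatch, under a few assumptions: ActionRegistry, register, and run are invented names, Runnable stands in for whatever "code to run" means in your program, and the registrations would normally be driven by a data file rather than written by hand.

```java
import java.util.HashMap;
import java.util.Map;

// Binds names (ideally loaded from data) to the code that should run for them.
final class ActionRegistry {
    private final Map<String, Runnable> actions = new HashMap<>();

    // Associates a name with an action, replacing any earlier binding.
    void register(String name, Runnable action) {
        actions.put(name, action);
    }

    // Runs the action bound to the name; returns false if the name is unknown,
    // so a bad name in the data does not end in a NullPointerException.
    boolean run(String name) {
        Runnable action = actions.get(name);
        if (action == null) {
            return false;
        }
        action.run();
        return true;
    }
}
```

The trade-off is the one raised in the comments below: an unknown name is only detected at runtime, whereas a misspelled enum constant is rejected by the compiler.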

If you get in the habit of brutally factoring your code like this, you can reduce many programs by 50% or more in size.

Bill K
Enums are objects: why not just define `#asFilename()` on your enum type?
Adrian
-1: A string can be misspelled, no? If you type "nameStirng" somewhere you're hosed. If you try to change all "nameString" constants in your code base to "nameAndAddressString" and miss one, you're hosed. If you type Operation.SUTRACT the compiler tells you.
Jim Ferrans
That's why nameString shouldn't be in quotes--that should be data as well. You should be iterating over a dataset with NO unique data in your code. I'm surprised at how many people don't really get factoring.
Bill K
Oh and @Adrian, you are absolutely right, but if you do use it correctly like that, when do you actually ever use the fact that it is an enum? Pretty much never, so why make it an enum (where you have to define it in code) instead of an object collection where you can abstract out the data from your code?
Bill K
+1  A: 

The maximum number of enum values will I think be just under the 65536 maximum number of fields/constant pool entries in the class. (As I mentioned in a comment above, the actual values shouldn't take up constant pool entries: they can be "inlined" into the bytecode, but the names will.)

As far as the second question is concerned, there's no direct performance difference, but it's conceivable that there'll be small indirect performance differences, partly because of the class file size as you say. Another thing to bear in mind is that when you use enum collections, there are optimised versions of some of the classes for when all of the enum values fit within a certain range (for EnumSet, up to 64 values, which fit in the bits of a single long). So yes, there could be a small difference. I wouldn't get paranoid, though.
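
A small sketch for checking this on your own JDK. EnumSet.noneOf and EnumSet.of are standard library calls; EnumSetDemo, Small, and the helper methods are names made up for this example. In OpenJDK, to my knowledge, the factory picks a bit-vector implementation backed by a single long for enums with at most 64 constants and an array-backed one for larger enums, but the concrete class names are internal details and may differ between JVMs.

```java
import java.util.EnumSet;

final class EnumSetDemo {
    enum Small { A, B, C }

    // Reports which concrete EnumSet implementation the running JDK chose
    // for the given enum type (implementation-specific, so not asserted here).
    static <E extends Enum<E>> String implementationOf(Class<E> type) {
        return EnumSet.noneOf(type).getClass().getSimpleName();
    }

    // Ordinary usage is the same regardless of which implementation is picked.
    static EnumSet<Small> firstAndLast() {
        return EnumSet.of(Small.A, Small.C);
    }

    public static void main(String[] args) {
        System.out.println(implementationOf(Small.class));
        System.out.println(firstAndLast().contains(Small.B)); // prints false
    }
}
```

Running the same check against a generated enum with more than 64 constants would show whether a different implementation is selected.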

Neil Coffey