I've been writing C# for seven years now, and I keep wondering, why do enums have to be of an integral type? Wouldn't it be nice to do something like:

enum ErrorMessage 
{ 
     NotFound: "Could not find",
     BadRequest: "Malformed request"
}

Is this a language design choice, or are there fundamental incompatibilities on a compiler, CLR, or IL level?

Do other languages have enums with string or complex (i.e. object) types? What languages?

(I'm aware of workarounds; my question is, why are they needed?)

EDIT: "workarounds" = attributes or static classes with consts :)

+6  A: 

What are the advantages, because I can only see drawbacks:

  • ToString will return a different string to the name of the enumeration. That is, ErrorMessage.NotFound.ToString() will be "Could not find" instead of "NotFound".
  • Conversely, with Enum.Parse, what would it do? Would it still accept the string name of the enumeration as it does for integer enumerations, or does it work with the string value?
  • You would not be able to implement [Flags]: what would ErrorMessage.NotFound | ErrorMessage.BadRequest equal in your example? (I know it doesn't really make sense in this particular case, and I suppose you could just say that [Flags] is not allowed on string-based enumerations, but that still seems like a drawback to me.)
  • While the comparison errMsg == ErrorMessage.NotFound could be implemented as a simple reference comparison, errMsg == "Could not find" would need to be implemented as a string comparison.

I can't think of any benefits, especially since it's so easy to build up your own dictionary mapping enumeration values to "custom" strings.
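For what it's worth, the dictionary approach could look roughly like the sketch below. The class name ErrorMessageText and the method For are made up for illustration; only the enum members and strings come from the question.

```csharp
using System;
using System.Collections.Generic;

enum ErrorMessage { NotFound, BadRequest }

static class ErrorMessageText
{
    // Hypothetical lookup table: enum value -> custom display string.
    private static readonly Dictionary<ErrorMessage, string> Texts =
        new Dictionary<ErrorMessage, string>
        {
            { ErrorMessage.NotFound, "Could not find" },
            { ErrorMessage.BadRequest, "Malformed request" },
        };

    // Falls back to the member name when no custom string is registered.
    public static string For(ErrorMessage value)
    {
        string text;
        return Texts.TryGetValue(value, out text) ? text : value.ToString();
    }
}
```

With this, ErrorMessageText.For(ErrorMessage.NotFound) gives the custom string, while ToString and Enum.Parse keep working with the member names, which sidesteps the first two bullets.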

Dean Harding
With regard to your first bullet, that's exactly the point my friend!
JamesBrownIsDead
But couple it with my second point: if `ToString` returns "Could not find" what would `Enum.Parse` do?
Dean Harding
+7  A: 

The purpose of an Enum is to give more meaningful names to integers. You're looking for something else besides an Enum. Enums are compatible with older Windows APIs and COM, and have a long history on other platforms besides.

Maybe you'd be satisfied with public const members of a struct or a class.

Or maybe you're trying to restrict some specialized type's values to only certain string values? But how a value is stored and how it's displayed can be two different things. Why use more space than necessary to store it?

And if you want to have something like that readable in some persisted format, just make a utility or Extension method to spit it out.
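A minimal sketch of such an extension method, assuming the enum from the question. The method name ToDisplayString is hypothetical, not something the framework provides.

```csharp
using System;

enum ErrorMessage { NotFound, BadRequest }

static class ErrorMessageExtensions
{
    // Hypothetical extension method: the enum stays an int in storage,
    // and the readable text is produced only when it is asked for.
    public static string ToDisplayString(this ErrorMessage value)
    {
        switch (value)
        {
            case ErrorMessage.NotFound: return "Could not find";
            case ErrorMessage.BadRequest: return "Malformed request";
            default: return value.ToString(); // unnamed values fall back to the default text
        }
    }
}
```

A caller would then write ErrorMessage.BadRequest.ToDisplayString() wherever the readable form is needed.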

This response is a little messy because there are just so many reasons. Comparing two strings for validity is much more expensive than comparing two integers. Comparing literal strings to known enums for static type-checking would be kinda unreasonable. Localization would be ... weird. Compatibility would be broken. Enums as flags would be meaningless/broken.

It's an Enum. That's what Enums do! They're integral!

uosɐſ
You can't have `static` and `const` members in C# because const is implicitly static. But maybe you didn't mean the actual keyword static.
John K
Good point. Also, static public readonly could work, right?
uosɐſ
+1  A: 

Perhaps because then this wouldn't make sense:

enum ErrorMessage: string
{
    NotFound,
    BadRequest
}
Vilx-
Moo goo gai pan is good food!
JamesBrownIsDead
@JamesBrownIsDead - what? O_o
Vilx-
+2  A: 

Not really answering your question but presenting alternatives to string enums.

public struct ErrorMessage  
{  
     public const string NotFound="Could not find";
     public const string BadRequest="Malformed request";
} 
Syd
Yes, this is probably the only other option besides attributing an enum. Good thinking, but this is an "alternative" to which I was referring.
JamesBrownIsDead
+6  A: 

Perhaps use the Description attribute from System.ComponentModel and write a helper function to retrieve the associated string from an enum value? (I've seen this in a codebase I work with and it seemed like a perfectly reasonable alternative.)

enum ErrorMessage 
{ 
     [Description("Could not find")]
     NotFound,
     [Description("Malformed request")]
     BadRequest
}
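The helper function could be sketched along these lines. EnumDescriptions and GetDescription are names I've made up here; the reflection calls themselves (GetField, Attribute.GetCustomAttribute) are standard framework APIs.

```csharp
using System;
using System.ComponentModel;
using System.Reflection;

enum ErrorMessage
{
    [Description("Could not find")]
    NotFound,
    [Description("Malformed request")]
    BadRequest
}

static class EnumDescriptions
{
    // Hypothetical helper: returns the [Description] text of an enum member,
    // or the member's default ToString() when no attribute is present.
    public static string GetDescription(Enum value)
    {
        // Each named enum member is a public static field on the enum type.
        FieldInfo field = value.GetType().GetField(value.ToString());
        if (field == null)
            return value.ToString(); // value has no named member

        var attribute = (DescriptionAttribute)Attribute.GetCustomAttribute(
            field, typeof(DescriptionAttribute));
        return attribute != null ? attribute.Description : value.ToString();
    }
}
```

Since this uses reflection on every call, caching the results in a dictionary is probably worthwhile if it sits on a hot path.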
Kilanash
He said he's "aware of the alternatives".... but I still like this answer.
Mark
Yes, attribute-based workarounds are what I was referring to when I said "alternatives." If this workaround is prevalent, doesn't that indicate a language deficiency?
JamesBrownIsDead
@James: No, it does not indicate a language deficiency, it indicates a fundamental misunderstanding of enumerations. Note the root word **numer** in there - the type exists *specifically* for the purpose of giving semantic meaning to arbitrary numbers.
Aaronaught
@Aaronaught: Enumeration has to do with *counting* and *listing* (sets of) things; from **numerare** "to count", not from "number". See the [MW](http://www.merriam-webster.com/dictionary/enumeration) definition or the [Wikipedia](http://en.wikipedia.org/wiki/Enumeration) article.
Stephen P
@JamesBrownIsDead: My fault then for not reading the question closely. I jumped to an immediate 'solution' to the problem instead of trying to enumerate (haha) why enumerations as a language feature in the CLR don't allow this. I'm inclined to agree with Rob Paulsen's answer: it just wasn't needed until now. Attributes are an acceptable way of adding extra meaning through metadata to existing language features without too much hassle, except for the unsightly reflection code one has to write. If and when the language designers decide C# needs Java-like enums, I'm fine with this as is.
Kilanash
+1  A: 

It's a language decision - e.g., Java's enum doesn't directly correspond to an int, but is instead an actual class. There are a lot of nice tricks that an int enum gives you - you can bitwise them for flags, iterate them (by adding or subtracting 1), etc. But there are some downsides to it as well - the lack of additional metadata, being able to cast any int to an invalid value, etc.
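To make the "casting any int" downside concrete, here is a small sketch. ErrorCode and EnumValidity are hypothetical names; Enum.IsDefined and Enum.GetValues are the framework calls doing the work.

```csharp
using System;
using System.Collections.Generic;

enum ErrorCode { NotFound, BadRequest }

static class EnumValidity
{
    // True only when the value matches one of the named members.
    public static bool IsNamed(ErrorCode code)
    {
        return Enum.IsDefined(typeof(ErrorCode), code);
    }

    // Lists the named members without assuming they are sequential.
    public static List<string> NamedValues()
    {
        var names = new List<string>();
        foreach (ErrorCode code in Enum.GetValues(typeof(ErrorCode)))
            names.Add(code.ToString());
        return names;
    }
}
```

The cast (ErrorCode)42 compiles and runs without complaint, but EnumValidity.IsNamed((ErrorCode)42) returns false, so any code receiving an ErrorCode has to decide what to do with values like that.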

I think the decision was probably made, as with most design decisions, because int enums are "good enough". If you need something more complex, a class is cheap and easy enough to build.

Static readonly members give you the effect of complex enums, but don't incur the overhead unless you need it.

 // Not static: instances are created below, but only from inside the class.
 sealed class ErrorMessage {
     public string Description { get; private set; }
     public int Ordinal { get; private set; }
     // Private constructor keeps the set of values closed.
     private ErrorMessage() { }

     public static readonly ErrorMessage NotFound = new ErrorMessage() { 
         Ordinal = 0, Description = "Could not find" 
     };
     public static readonly ErrorMessage BadRequest = new ErrorMessage() { 
         Ordinal = 1, Description = "Malformed Request" 
     };
 }
Mark Brackett
You shouldn't iterate over enums with integer addition/subtraction because enums are not guaranteed to be sequential, and you can easily set an enum to an invalid value. Use `Enum.GetValues()` instead.
Robert Paulson
@Robert - That's not entirely accurate. Undefined values *are* valid as far as C# is concerned. It's up to the callee to handle an undefined case. From the spec: "The set of values...is not limited by its enum members." and "[A]ny value of the underlying type...is a distinct valid value..." http://msdn.microsoft.com/en-us/library/aa664601(v=VS.71).aspx
Mark Brackett
@Mark your comment only reinforces my point. Fine I'll rephrase. AVOID iterating over enums with integer addition/subtraction. Just because you can doesn't mean you should (and how do you know where to start or stop). If you want to know and/or iterate over all values in an enum, use `Enum.GetValues()`
Robert Paulson
@Robert - We're coming at the argument from 2 different sides. I'm pointing out that there is no "start or stop" - they're all valid values. It's the *callee's responsibility* to handle all underlying values - not the caller's responsibility to only pass defined values. That being said, I'll agree that it's rare to *want* to pass an undefined value (unless they're Flags), so `Enum.GetValues` is more often appropriate.
Mark Brackett
@Mark you say it's "a nice trick" to be able to iterate via the underlying type, and this is what I disagree with: it's bad practice. When _iterating_, code needs to start somewhere and end somewhere (else). For C# int enums, that's 4294967296 possible values unless you hardcode your loop with knowledge of what the smallest and biggest values are (bad), and this also assumes all values are sequential (bad). Code that did this would fail any code review I've ever done. A callee verifying an enum parameter's validity is best practice, but I never said nor meant to imply this wasn't the case.
Robert Paulson
A: 

The main advantage of integral enums is that they don't take up much space in memory. An instance of a default System.Int32-backed enum takes up just 4 bytes of memory and can be compared quickly to other instances of that enum.

In contrast, string-backed enums would be reference types that require each instance to be allocated on the heap and comparisons to involve checking each character in a string. You could probably minimize some of the issues with some creativity in the runtime and with compilers, but you'd still run into similar problems when trying to store the enum efficiently in a database or other external store.

C. Dragon 76
While not wrong per se, I doubt storage is/was a consideration. A bunch of references to the same interned string in memory is, for all intents, the same as a bunch of integer values, and just as cheap to compare.
Robert Paulson
+1  A: 

Strictly speaking, the intrinsic representation of an enum shouldn't matter, because by definition, they are enumerated types. What this means is that

public enum PrimaryColor { Red, Blue, Yellow }

represents a set of values.

Firstly, some sets are small and others are large. The .NET CLR therefore allows one to base an enum on any integral type, so that the domain size for enumerated values can be increased or decreased: an enum based on a byte cannot contain more than 256 distinct values, whereas one based on a long can contain 2^64 distinct values. This is possible because a long is 8 times larger than a byte.
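As a rough illustration of choosing the base type (the enum names Small and Large and the helper class are invented for this sketch):

```csharp
using System;

// Underlying type is given after the colon; int is the default when omitted.
enum Small : byte { A, B, C }
enum Large : long { A, B, C }

static class UnderlyingTypeDemo
{
    // Reports which integral type backs a given enum type.
    public static Type UnderlyingOf(Type enumType)
    {
        return Enum.GetUnderlyingType(enumType);
    }
}
```

Here UnderlyingTypeDemo.UnderlyingOf(typeof(Small)) is typeof(byte) and UnderlyingTypeDemo.UnderlyingOf(typeof(Large)) is typeof(long).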

Secondly, an added benefit of restricting the base type of enums to integral values is that one can perform bitwise operations on enum values, as well as create bitmaps of them to represent more than one value.
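A minimal sketch of that bitmap idea, using a made-up FileAccessRights enum (not a framework type) with one bit per member:

```csharp
using System;

[Flags]
enum FileAccessRights
{
    None = 0,
    Read = 1,     // bit 0
    Write = 2,    // bit 1
    Execute = 4   // bit 2
}

static class FlagsDemo
{
    // Combines two values into a single bitmap with bitwise OR.
    public static FileAccessRights ReadWrite()
    {
        return FileAccessRights.Read | FileAccessRights.Write;
    }

    // Tests a single bit with bitwise AND.
    public static bool CanWrite(FileAccessRights rights)
    {
        return (rights & FileAccessRights.Write) == FileAccessRights.Write;
    }
}
```

The combined value is just the integer 3 underneath, which is exactly what a string-backed enum would have no natural equivalent for.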

Finally, integral types are among the most efficient data types available inside a computer, so there is a performance advantage when it comes to comparing different enum values.

For the most part, I would say representing enums by integral types seems to be a CLR and/or CLS design choice, though one that is probably not very difficult to arrive at.

Umar Farooq Khawaja
+3  A: 

The real answer why: There's never been a compelling reason to make enums any more complicated than they are. If you need a simple closed list of values - they're it.

In .NET, enums were given the added benefit of a two-way mapping between the internal representation and the name used to define them (ToString and Enum.Parse). This one little change adds some versioning downsides, but improves upon enums in C++.

The enum keyword is used to declare an enumeration, a distinct type that consists of a set of named constants called the enumerator list.

Ref: msdn

Your question is with the chosen storage mechanism, an integer. This is just an implementation detail. We only get to peek beneath the covers of this simple type in order to maintain binary compatibility. Enums would otherwise have very limited usefulness.

Q: So why do enums use integer storage? As others have pointed out:

  1. Integers are quick and easy to compare.
  2. Integers are quick and easy to combine (bitwise for [Flags] style enums)
  3. With integers, it's trivially easy to implement enums.

* None of these are specific to .NET, and it appears the CLR designers didn't feel compelled to change anything or add any gold plating to them.

Now that's not to say your syntax is entirely unappealing. But is the effort to implement this feature in the CLR, and all the compilers, justified? For all the work that goes into this, has it really bought you anything you couldn't already achieve (with classes)? My gut feeling is no, there's no real benefit. (There's a post by Eric Lippert I wanted to link to, but I couldn't find it)

You can write 10 lines of code to implement in user-space what you're trying to achieve without all the headache of changing a compiler. Your user-space code is easily maintained over time - although perhaps not quite as pretty as if it's built-in, but at the end of the day it's the same thing. You can even get fancy with a T4 code generation template if you need to maintain many of your custom enum-esque values in your project.

So, enums are as complicated as they need to be.


Robert Paulson
A: 

While it also counts as an "alternative", you can still do better than just a bunch of consts:

struct ErrorMessage
{
    public static readonly ErrorMessage NotFound =
        new ErrorMessage("Could not find");
    public static readonly ErrorMessage BadRequest =
        new ErrorMessage("Bad request");

    private string s;

    private ErrorMessage(string s)
    {
        this.s = s;
    }

    public static explicit operator ErrorMessage(string s)
    {
        return new ErrorMessage(s);
    }

    public static explicit operator string(ErrorMessage em)
    {
        return em.s;
    }
}

The only catch here is that, as with any value type, this one has a default value, which will have s==null. But this isn't really different from Java enums, which themselves can be null (being reference types).

In general, Java-like advanced enums cross the line between actual enums, and syntactic sugar for a sealed class hierarchy. Whether such sugar is a good idea or not is arguable.

Pavel Minaev