After becoming more involved in training new engineers, and after reading Jon Skeet's DevDays presentation, I have begun to recognize that many engineers aren't clear on when to use which numeric data type. I appreciate the role a formal computer science degree plays in helping with this, but I see a lot of new engineers showing uncertainty because they have never worked with large data sets, financial software, physics or statistics problems, or complex datastore issues.

My experience is that people really grok concepts when they are explained in context. I am looking for good examples of real programming problems where certain data is best represented using a particular data type. Try to stay away from textbook examples if possible. I am tagging this with Java, but feel free to give examples in other languages and retag:

Integer, Long, Double, Float, BigInteger, etc...

+3  A: 

BigDecimal is the best choice when you need exact decimal arithmetic and want to specify the desired precision and rounding yourself. I believe float (and to some extent double) offer performance benefits over BigDecimal, but at the cost of accuracy and usability.
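To make "specify the desired accuracy" concrete, here is a minimal sketch. The class name, the price, and the tax rate are made up for illustration; MathContext, RoundingMode, divide and setScale are the standard java.math API.

```java
import java.math.BigDecimal;
import java.math.MathContext;
import java.math.RoundingMode;

public class PrecisionSketch {
    // 1/3 has no finite decimal expansion, so divide() needs to be told
    // how many significant digits to keep (here: 10, rounding half-even).
    static BigDecimal oneThird() {
        MathContext tenDigits = new MathContext(10, RoundingMode.HALF_EVEN);
        return BigDecimal.ONE.divide(new BigDecimal("3"), tenDigits);
    }

    // Round a computed amount to exactly two decimal places.
    static BigDecimal priceWithTax() {
        BigDecimal price = new BigDecimal("19.99");  // hypothetical price
        BigDecimal rate = new BigDecimal("0.0825");  // hypothetical tax rate
        return price.add(price.multiply(rate)).setScale(2, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        System.out.println(oneThird());     // 0.3333333333
        System.out.println(priceWithTax()); // 21.64
    }
}
```

Without a MathContext or scale, BigDecimal.ONE.divide(new BigDecimal("3")) throws an ArithmeticException because the exact quotient cannot be represented.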

Kaleb Brasee
+1 to counter some idiot who down voted this answer
kar
LOL, thank you!
Kaleb Brasee
+14  A: 

I really don't think you need examples or anything complex. This is simple:

  • Is it a whole number?
    • Can it be > 2^63? BigInteger
    • Can it be > 2^31? long
    • Otherwise int
  • Is it a decimal number?
    • Is an approximate value ok?
      • double
    • Does it need to be exact? (example: monetary amounts!)
      • BigDecimal

(When I say ">", I mean "greater in absolute value", of course.)
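A small sketch of what goes wrong when the whole-number type is too small for the value. The class and method names are only illustrative; Math.addExact and BigInteger are standard Java.

```java
import java.math.BigInteger;

public class WholeNumberSketch {
    // int arithmetic wraps silently once it passes 2^31 - 1.
    static int wrappedInt() {
        int max = Integer.MAX_VALUE;
        return max + 1; // -2147483648
    }

    // Widening to long before the addition keeps the true value.
    static long widened() {
        return (long) Integer.MAX_VALUE + 1; // 2147483648
    }

    // Math.addExact throws ArithmeticException instead of wrapping.
    static boolean longAdditionOverflows(long a, long b) {
        try {
            Math.addExact(a, b);
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    // BigInteger has no fixed upper bound.
    static BigInteger beyondLong() {
        return BigInteger.valueOf(Long.MAX_VALUE).add(BigInteger.ONE); // 2^63
    }

    public static void main(String[] args) {
        System.out.println(wrappedInt());
        System.out.println(widened());
        System.out.println(longAdditionOverflows(Long.MAX_VALUE, 1L));
        System.out.println(beyondLong());
    }
}
```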

I've never used a byte or char to represent a number, and I've never used a short, period. That's in 12 years of Java programming. Float? Meh. If you have a huge array and you are having memory problems, I guess.

Note that BigDecimal is somewhat misnamed; your values do not have to be large at all to need it.
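For instance, even values well below 1 show the difference. This is just a sketch with an illustrative class name:

```java
import java.math.BigDecimal;

public class SmallValuesSketch {
    // 0.1, 0.2 and 0.3 have no exact binary representation,
    // so the double sum is not equal to the double literal 0.3.
    static boolean doubleSumIsExact() {
        return 0.1 + 0.2 == 0.3;
    }

    // BigDecimal built from strings holds the decimal digits exactly.
    // (Use the String constructor; new BigDecimal(0.1) would copy the
    // binary approximation of the double.)
    static boolean decimalSumIsExact() {
        BigDecimal sum = new BigDecimal("0.1").add(new BigDecimal("0.2"));
        return sum.compareTo(new BigDecimal("0.3")) == 0;
    }

    public static void main(String[] args) {
        System.out.println(0.1 + 0.2);           // 0.30000000000000004
        System.out.println(doubleSumIsExact());  // false
        System.out.println(decimalSumIsExact()); // true
    }
}
```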

Kevin Bourrillion
I really like the way you broke it down. 100% agree with you on char, short, and Floats with Java.
Scanningcrew
One additional hint: if it's a decimal number that needs to be exact, but the maximum number of decimal places is known in advance (such as with $ values: 2 decimal places), then you can just use int/long and divide on output. That avoids the problems with BigDecimal (performance, awkward operators).
sleske
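A rough sketch of the int/long approach from the comment above, assuming amounts are held as a whole number of cents. The class name, method names, and formatting are illustrative only:

```java
public class CentsSketch {
    // All arithmetic is done on whole cents; multiplyExact guards against overflow.
    static long total(long unitPriceCents, int quantity) {
        return Math.multiplyExact(unitPriceCents, (long) quantity);
    }

    // Divide only when producing output, e.g. 5997 -> "59.97".
    static String format(long cents) {
        long abs = Math.abs(cents);
        String sign = cents < 0 ? "-" : "";
        return sign + (abs / 100) + "." + String.format("%02d", abs % 100);
    }

    public static void main(String[] args) {
        long total = total(1999, 3);       // three items at 19.99 each
        System.out.println(format(total)); // 59.97
    }
}
```

This only holds up while every operation stays in whole cents; anything that produces fractions of a cent (interest, percentages, currency conversion) still needs an explicit rounding rule.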
+1  A: 

Normally, if we're talking about machine-independent (32/64-bit) data type sizes, they are as below:

integer: 4 bytes

long: 8 bytes

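float: 4 bytes

double: 8 bytes

The size in bytes is the same for signed and unsigned types, but the maximum positive value is roughly halved for signed ones (e.g. for 4 bytes: about 4 billion unsigned, about 2 billion signed). Note that in Java all integer primitives except char are signed.

bigInt: depends on the language implementation, sometimes up to 10 bytes (Java's BigInteger is arbitrary-precision and grows as needed).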

For high-volume data archiving (such as a search engine index) I would highly recommend byte and short to save space.

byte: 1 byte (0 to 255 unsigned, -128 to 127 signed)

short: 2 bytes (0 to 65,535 unsigned, -32,768 to 32,767 signed)


Let's say you want to store a record of someone's AGE. Since nobody lives past 150, you can use the data type BYTE (see above for its size). If you use INTEGER you have already wasted an extra 3 bytes, and seriously, who lives for over 4 billion years?
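In Java specifically, a sketch of the AGE case could look like this. Names are illustrative; the BYTES constants and Byte.toUnsignedInt exist since Java 8. Since Java's byte is signed, an age above 127 has to be read back as unsigned:

```java
public class StorageSizeSketch {
    // Stores an age (0..255) in a single byte. Java's byte is signed,
    // so values above 127 are held as negative numbers internally.
    static byte packAge(int age) {
        if (age < 0 || age > 255) {
            throw new IllegalArgumentException("age out of range: " + age);
        }
        return (byte) age;
    }

    // Reads the stored byte back as an unsigned value in 0..255.
    static int unpackAge(byte stored) {
        return Byte.toUnsignedInt(stored);
    }

    public static void main(String[] args) {
        System.out.println("byte:  " + Byte.BYTES + " byte,  " + Byte.MIN_VALUE + " to " + Byte.MAX_VALUE);
        System.out.println("short: " + Short.BYTES + " bytes, " + Short.MIN_VALUE + " to " + Short.MAX_VALUE);
        System.out.println("int:   " + Integer.BYTES + " bytes, " + Integer.MIN_VALUE + " to " + Integer.MAX_VALUE);
        System.out.println("long:  " + Long.BYTES + " bytes, " + Long.MIN_VALUE + " to " + Long.MAX_VALUE);

        byte stored = packAge(150);
        System.out.println(stored);            // -106 (raw signed view)
        System.out.println(unpackAge(stored)); // 150
    }
}
```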

kar
Classic example of premature optimization... Unless you are saving a HUGE array or database of people's ages, USE INT. There is no downside if size is not a problem (and in most modern cases it's not), and the upside is that you don't fall prey to stupid bugs like the Y2K bug because of premature optimization.
Savvas Dalkitsis
I was talking about AGE, not birth date, lol. Take a look at the topics I have created: I've been developing a custom web-scale search engine (with a custom, highly optimized index format) for almost 2 years, and trust me, web indexes are a big deal when it comes to disk space optimisation. And I use both INT (32-bit) and LONG (64-bit) for dates to save space and avoid Y2K-style limits.
kar
Also, just FYI Savvas, we keep some of our data in 3 bytes and 5 bytes and load it into memory as normal 4/8 bytes, for maximum space efficiency, instead of using VInt (Lucene), for extra speed when loading it. So yes, I know what I'm talking about.
kar
michaelc
+2  A: 

One important point you might want to articulate is that it's almost always an error to compare floating-point numbers for equality. For example, the following code is very likely to fail:

double euros = convertToEuros(item.getCostInDollars());
if (euros == 10.0) {
  // this line will most likely never be reached
}

This is one of many reasons why you want to use discrete numbers to represent currency.

When you absolutely must compare floating-point numbers, you can only do so approximately; something to the effect of:

double euros = convertToEuros(item.getCostInDollars());
if (Math.abs(euros - 10.0) < EPSILON) {
  // this might work
}
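A self-contained sketch of the same idea. EPSILON is not defined in the snippet above, so the value here is an assumption, and the right tolerance depends on the magnitude of the numbers being compared:

```java
public class ApproxEqualsSketch {
    // Assumed absolute tolerance; pick one that suits your value range.
    static final double EPSILON = 1e-9;

    // True when a and b differ by less than EPSILON.
    static boolean approxEquals(double a, double b) {
        return Math.abs(a - b) < EPSILON;
    }

    public static void main(String[] args) {
        double sum = 0.1 + 0.2;
        System.out.println(sum == 0.3);              // false
        System.out.println(approxEquals(sum, 0.3));  // true
        System.out.println(approxEquals(10.0, 10.01)); // false
    }
}
```

A fixed absolute tolerance like this breaks down for very large or very small values, where a relative tolerance is usually the better fit.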

As for practical examples, my usual rule of thumb is something like this:

  • double: think long and hard before using it; is the pain worth it?
  • float: don't use it
  • byte: most often used as byte[] to represent some raw binary data
  • int: this is your best friend; use it to represent most stuff
  • long: use this for timestamps and database IDs
  • BigDecimal and BigInteger: if you know about these, chances are you know what you're doing already, so you don't need my advice
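As one illustration of the "long for timestamps" rule, here is a short sketch (class and method names are made up) showing that a millisecond timestamp does not fit in an int:

```java
public class TimestampSketch {
    // True when the value can be stored in an int without loss.
    static boolean fitsInInt(long value) {
        return value >= Integer.MIN_VALUE && value <= Integer.MAX_VALUE;
    }

    // Math.toIntExact throws ArithmeticException rather than silently truncating.
    static boolean narrowingFails(long value) {
        try {
            Math.toIntExact(value);
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis(); // milliseconds since 1970-01-01 UTC
        System.out.println(fitsInInt(now));      // false
        System.out.println(narrowingFails(now)); // true
    }
}
```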

I realize that these aren't terribly scientific rules of thumb, but if your target audience is not computer scientists, it might be best to stick to basics.

Bugmaster
I'm not a big fan of your example code, because you shouldn't be using double for monetary data to begin with. You should use BigDecimal. See e.g. this: http://stackoverflow.com/questions/965831/how-to-parse-a-currency-amount-us-or-eu-to-float-value-in-java/965858#965858 and Item 48 in the book Effective Java (2nd ed).
Jonik
Jonik -- he's showing us why double is bad. And he says "here's how to do it *if you absolutely have to* use floating-point". There's no foul here.
Kevin Bourrillion
btw, I think this is an excellent answer, at least as good as mine. the only exception is that BigDecimal really should be urged as the only way to handle decimal numbers precisely; not just "oh, you probably know what you're doing..".
Kevin Bourrillion
Kevin - Well, yeah, but that wasn't awfully clear from the answer, especially when there isn't a word about the fact that BigDecimal would be the right way! Such examples would better serve as an appendix to e.g. your own answer which lays out the basic rules of thumb way more clearly (as voters agree)
Jonik