I have cells whose numeric value can be anything between 0 and Integer.MAX_VALUE. I would like to color-code these cells accordingly.

If the value = 0, then r = 0. If the value is Integer.MAX_VALUE, then r = 255. But what about the values in between?

I'm thinking I need a function whose limit as x => Integer.MAX_VALUE is 255. What is this function? Or is there a better way to do this?

I could just do (value / (Integer.MAX_VALUE / 255)) but that will cause many low values to be zero. So perhaps I should do it with a log function.

Most of my values will be in the range [0, 10,000]. So I want to highlight the differences there.

A: 

I could just do (value / (Integer.MAX_VALUE / 255)) but that will cause many low values to be zero.

One approach you could take is to use the modulo operator (r = value%256;). Although this wouldn't ensure that Integer.MAX_VALUE turns out as 255, it would guarantee a number between 0 and 255. It would also allow for low numbers to be distributed across the 0-255 range.

EDIT:

Funnily, as I test this, Integer.MAX_VALUE % 256 does result in 255 (I had originally tested against % 255 by mistake, which yielded the wrong results). This seems like a pretty straightforward solution.
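A quick check of the modulo idea, as a minimal sketch (the class and method names are made up for illustration). It also shows that values 256 apart collide, so the result carries no information about magnitude.

```java
class ModuloColor {
    // For non-negative ints, value % 256 keeps only the low 8 bits.
    static int red(int value) {
        return value % 256;
    }

    public static void main(String[] args) {
        System.out.println(red(Integer.MAX_VALUE)); // 2147483647 % 256
        System.out.println(red(1));
        System.out.println(red(257));               // same result as red(1)
    }
}
```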

akf
It's not explicitly stated, but it seems implicit in the question that the colours should group similar values - i.e. if two cells share a colour then their values will be roughly equivalent. In your answer the colour has no relation to the magnitude of the value.
Kirk Broadhurst
What? This isn't even a remotely good idea.
rlbond
Kirk, I agree that the idea of distribution into buckets of similar values was *an* idea. It also appears that the OP was open to other solutions. It does seem, however, that most of the answerers have taken the grouping of similar values as the only solution.
akf
+1  A: 

The value you're looking for is: r = 255 * (value / Integer.MAX_VALUE). So you'd have to turn this into a double, then cast back to an int.
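In Java the formula above might look like the following sketch (class and method names are illustrative). The cast to double has to happen before the division, since int division would give 0 for everything below Integer.MAX_VALUE.

```java
class LinearDouble {
    static int red(int value) {
        // Promote to double first, then scale and truncate back to int.
        return (int) (255 * ((double) value / Integer.MAX_VALUE));
    }

    public static void main(String[] args) {
        System.out.println(red(0));
        System.out.println(red(10_000));            // still 0: the question's concern
        System.out.println(red(Integer.MAX_VALUE));
    }
}
```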

Kaleb Brasee
-1. Totally wrong, sorry.
Artelius
Ugh, you're right, I typed that too fast. You could have specified the correction, but I took your subtle clue and ran with it.
Kaleb Brasee
Stupid SO. I can't reverse the downvote now.
Artelius
@Artelius: Yes you can. Click on the down-vote again (not on the up-vote arrow)
Eric J.
I have edited the post, you should now be able to reverse your vote.
akf
LOL that is OK, that is what I get for clicking Add Comment without reading what I typed.
Kaleb Brasee
@Eric J.: No, I couldn't. Until akf edited the post that is.
Artelius
+2  A: 

In general (since it's not clear to me if this is a Java or Language-Agnostic question) you would divide the value you have by Integer.MAX_VALUE, multiply by 255 and convert to an integer.

pavium
The only way to do that is to cast to a double at one point, it is more efficient to do it the other way around.
Erich
@Erich, But then you chance overflow.
strager
You won't overflow if you divide first!
Erich
That's what I did. What did you mean by 'do it the other way around'?
pavium
+1  A: 

This works! r= value /8421504;

8421504 is actually the 'magic' number, which equals MAX_VALUE/255 under integer division. Thus, MAX_VALUE/8421504 = 255 (and some change, but small enough that integer math will get rid of it).

If you want one that doesn't have magic numbers in it, this should work (and with equal performance, since any good compiler will replace it with the actual value):

r= value/ (Integer.MAX_VALUE/255);

The nice part is, this will not require any floating-point values.
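A small sketch of this integer-only version (names are illustrative). Since 255 * 8421504 = 2147483520, only the top 128 values reach 255, while every other bucket spans 8421504 values, so the spread is close to even but not exact.

```java
class IntegerOnlyScale {
    static final int STEP = Integer.MAX_VALUE / 255; // 8421504 after truncation

    static int red(int value) {
        return value / STEP; // integer division, no floating point involved
    }

    public static void main(String[] args) {
        System.out.println(STEP);
        System.out.println(red(8_421_503));          // last value in bucket 0
        System.out.println(red(2_147_483_519));      // last value in bucket 254
        System.out.println(red(Integer.MAX_VALUE));
    }
}
```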

Erich
Not exactly equal distribution, but good enough.
strager
How so? The values from 0-Int.Max will be distributed evenly through 0-255, won't they?
Erich
+11  A: 

The "fairest" linear scaling is actually done like this:

floor(256 * value / (Integer.MAX_VALUE + 1))

Note that this is just pseudocode and assumes floating-point calculations.

If we assume that Integer.MAX_VALUE + 1 is 2^31, and that / will give us integer division, then it simplifies to

value / 8388608
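As a hedged Java sketch of the scaling above (names are illustrative): the `L` suffixes force long arithmetic so that neither `256 * value` nor `Integer.MAX_VALUE + 1` overflows. For non-negative input it agrees with `value / 8388608` and with `value >> 23`.

```java
class FairLinearScale {
    static int red(int value) {
        // 256L and 1L promote the whole expression to long.
        return (int) ((256L * value) / (Integer.MAX_VALUE + 1L));
    }

    public static void main(String[] args) {
        int[] samples = {0, 8_388_607, 8_388_608, 10_000, Integer.MAX_VALUE};
        for (int v : samples) {
            System.out.println(v + " -> " + red(v) + " (shift: " + (v >> 23) + ")");
        }
    }
}
```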

Why other answers are wrong

Some answers (as well as the question itself) suggested a variation of (255 * value / Integer.MAX_VALUE). Presumably this has to be converted to an integer, either using round() or floor().

If using floor(), the only value that produces 255 is Integer.MAX_VALUE itself. This distribution is uneven.

If using round(), 0 and 255 will each get hit half as many times as 1-254. Also uneven.

Using the scaling method I mention above, no such problem occurs.

Non-linear methods

If you want to use logs, try this:

255 * log(value + 1) / log(Integer.MAX_VALUE + 1)

You could also just take the square root of the value (this wouldn't go all the way to 255, but you could scale it up if you wanted to).
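A possible Java rendering of both non-linear ideas (names are illustrative, and the square-root version is scaled up to reach 255, which is one choice among several). Adding `1.0` rather than `1` keeps the arithmetic in double, so `Integer.MAX_VALUE + 1` cannot overflow.

```java
class NonLinearScale {
    static final double LOG_MAX = Math.log(Integer.MAX_VALUE + 1.0); // ln(2^31)
    static final double SQRT_MAX = Math.sqrt(Integer.MAX_VALUE);

    // Logarithmic: spreads low values out, compresses high values.
    static int logRed(int value) {
        return (int) (255 * (Math.log(value + 1.0) / LOG_MAX));
    }

    // Square root: a milder skew towards low values than the log.
    static int sqrtRed(int value) {
        return (int) (255 * (Math.sqrt(value) / SQRT_MAX));
    }

    public static void main(String[] args) {
        int[] samples = {0, 10, 100, 1_000, 10_000, Integer.MAX_VALUE};
        for (int v : samples) {
            System.out.println(v + " -> log " + logRed(v) + ", sqrt " + sqrtRed(v));
        }
    }
}
```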

Artelius
This works better when it's `log(Integer.MAX_VALUE)` instead of `log(Integer.MAX_VALUE + 1)`. Otherwise, every result is 0, because `Integer.MAX_VALUE + 1` overflows to a negative int and `log(Integer.MAX_VALUE + 1)` is `NaN`.
Rosarch
Maybe the critics can now see how it is language agnostic, even though the coder is working in THE ONE TRUE LANGUAGE
Nicholas Jordan
@Rosarch: Good point. I might have tested my code before posting, but why would I break my vow never to use Java over something so trivial? ;) Still, as a C programmer, I should have known better...
Artelius
A statement that Java could ever be in any way construed to be THE ONE TRUE LANGUAGE has GOT to be intentional flame-bait. How did that get up-voted?
Bob Aman
A: 

The best answer really depends on the behavior you want.

If you want each cell just to generally have a color different than the neighbor, go with what akf said in the second paragraph and use a modulo (x % 256).

If you want the color to have some bearing on the actual value (like "blue means smaller values" all the way to "red means huge values"), you would have to post something about your expected distribution of values. Since you worry about many low values being zero I might guess that you have lots of them, but that would only be a guess.

In this second scenario, you really want to distribute your likely responses into 256 "percentiles" and assign a color to each one (where an equal number of likely responses fall into each percentile).
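One way to sketch the percentile idea, assuming you have a representative sample of the values you expect (the class name and the sample-based approach are illustrative, not something the question specifies). The colour is derived from a value's rank within the sorted sample rather than from its magnitude.

```java
import java.util.Arrays;

class QuantileScale {
    private final int[] sorted;

    QuantileScale(int[] sample) {
        sorted = sample.clone();
        Arrays.sort(sorted);
    }

    // Maps value to 0-255 by its position in the sorted sample.
    int red(int value) {
        int pos = Arrays.binarySearch(sorted, value);
        if (pos < 0) {
            pos = -pos - 1; // not found: insertion point = count of smaller values
        }
        // With duplicates in the sample, pos may be any of the matching indices.
        return (int) Math.min(255L, 256L * pos / sorted.length);
    }

    public static void main(String[] args) {
        QuantileScale scale = new QuantileScale(new int[] {1000, 0, 100, 10});
        System.out.println(scale.red(5));
        System.out.println(scale.red(5000));
    }
}
```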

Eric J.
+2  A: 

For a linear mapping of the range 0-2^32 to 0-255, just take the high-order byte. Here is how that would look using binary & and bit-shifting:

r = (value & 0xff000000) >>> 24
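A Java sketch of the bit-shifting approach (names are illustrative). Two details matter in Java specifically: shift operators bind more tightly than `&`, so the mask needs parentheses, and a non-negative int only has 31 significant bits, so its high-order byte never exceeds 127. Shifting by 23 instead gives the full 0-255 range.

```java
class HighBits {
    // High-order byte of the 32-bit pattern.
    static int highByte(int value) {
        return (value & 0xff000000) >>> 24;
    }

    // Top 8 of the 31 bits a non-negative int can use.
    static int top8Of31(int value) {
        return value >>> 23;
    }

    public static void main(String[] args) {
        int v = 12_345;
        // Without parentheses this parses as v & (0xff000000 >> 24), i.e. v & -1.
        System.out.println(v & 0xff000000 >> 24);
        System.out.println(highByte(Integer.MAX_VALUE));
        System.out.println(top8Of31(Integer.MAX_VALUE));
    }
}
```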

Using mod 256 will certainly return a value 0-255, but you won't be able to draw any grouping sense from the results - 1, 257, 513, 1025 will all map to the scaled value 1, even though they are far from each other.

If you want to be more discriminating among low values, and merge many more large values together, then a log expression will work:

r = log(value)/log(pow(2,32))*256

EDIT: Yikes, my high school algebra teacher Mrs. Buckenmeyer would faint! log(pow(2,32)) is the same as 32*log(2), and much cheaper to evaluate. And now we can also factor this better, since 256/32 is a nice even 8:

r = 8 * log(value)/log(2)

log(value)/log(2) is actually log-base-2 of value, which log does for us very neatly:

r = 8 * log(value,2)

There, Mrs. Buckenmeyer - your efforts weren't entirely wasted!

Paul McGuire
Oops, sorry, I slipped into Python there. Java's Math.log does not take a second argument, so you are stuck with log(value)/log(2).
Paul McGuire
A: 

If you are complaining that the low numbers are becoming zero, then you might want to normalize the values to 255 against the largest value in your set rather than against the entire possible range of values.

The formula would become:

255 * currentValue / (max value of the set)
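A minimal sketch of normalizing against the largest value actually present, scaled to 0-255 (names are illustrative, and non-negative input is assumed as in the question).

```java
import java.util.Arrays;

class SetMaxScale {
    static int[] reds(int[] values) {
        int max = Arrays.stream(values).max().orElse(0);
        int[] out = new int[values.length];
        if (max == 0) {
            return out; // empty or all zeros: everything stays 0
        }
        for (int i = 0; i < values.length; i++) {
            // 255L forces long arithmetic so the product cannot overflow.
            out[i] = (int) (255L * values[i] / max);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(reds(new int[] {0, 5_000, 10_000})));
    }
}
```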

monksy
A: 

Note that if you want the cells to get brighter and brighter, perceived luminosity is not linear, so a straight mapping from value to color will not give a good result.

The Color class has a method to make a brighter color. Have a look at that.
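Assuming the class meant here is `java.awt.Color`, its `brighter()` method can be applied repeatedly, as in this small sketch (the helper name `step` is made up). How many steps correspond to which value range is left open; that mapping is a design choice the answer does not specify.

```java
import java.awt.Color;

class BrighterDemo {
    // Applies Color.brighter() the given number of times.
    static Color step(Color c, int times) {
        for (int i = 0; i < times; i++) {
            c = c.brighter();
        }
        return c;
    }

    public static void main(String[] args) {
        Color base = new Color(100, 0, 0);
        for (int i = 0; i <= 3; i++) {
            System.out.println(i + " -> " + step(base, i).getRed());
        }
    }
}
```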

Thorbjørn Ravn Andersen
A: 

The linear implementation is discussed in most of these answers, and Artelius' answer seems to be the best. But the best formula would depend on what you are trying to achieve and the distribution of your values. Without knowing that it is difficult to give an ideal answer.

But just to illustrate, any of these might be the best for you:

  • Linear distribution, each mapping onto a range which is 1/256th of the overall range.
  • Logarithmic distribution (skewed towards low values) which will highlight the differences in the lower magnitudes and diminish differences in the higher magnitudes
  • Reverse logarithmic distribution (skewed towards high values) which will highlight differences in the higher magnitudes and diminish differences in the lower magnitudes.
  • Uniform distribution of incidence of colours, where each colour appears the same number of times as every other colour.

Again, you need to determine what you are trying to achieve & what the data will be used for. If you have been tasked to build this then I would strongly recommend you get this clarified to ensure that it is as useful as possible - and to avoid having to redevelop it later on.

Kirk Broadhurst
+1  A: 

Ask yourself the question, "What value should map to 128?" If the answer is about a billion (I doubt that it is) then use linear. If the answer is in the range of 10-100 thousand, then consider square root or log.

Another answer suggested this (I can't comment or vote yet). I agree.

r = log(value)/log(pow(2,32))*256

Rick
+2  A: 

I figured a log fit would be good for this, but looking at the results, I'm not so sure.

However, Wolfram|Alpha is great for experimenting with this sort of thing:

I started with that, and ended up with:

r(x) = floor(((11.5553 * log(14.4266 * (x + 1.0))) - 30.8419) / 0.9687)

Interestingly, it turns out that this gives nearly identical results to Artelius's answer of:

r(x) = floor(255 * log(x + 1) / log(2^31))

IMHO, you'd be best served with a split function for 0-10000 and 10000-2^31.
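A rough sketch of what such a split function could look like. The split point comes from the question's [0, 10,000] range; giving 192 of the 256 levels to the low range, and using a linear piece below the split and a logarithmic piece above it, are assumptions made purely for illustration.

```java
class SplitScale {
    static final int SPLIT = 10_000;   // upper end of the "interesting" range
    static final int LOW_LEVELS = 192; // hypothetical share of the 256 levels

    static int red(int value) {
        if (value < SPLIT) {
            // Linear over [0, SPLIT): yields 0 .. LOW_LEVELS - 1.
            return (int) ((long) LOW_LEVELS * value / SPLIT);
        }
        // Logarithmic over [SPLIT, 2^31): yields LOW_LEVELS .. 255.
        double t = (Math.log(value) - Math.log(SPLIT))
                 / (Math.log(Integer.MAX_VALUE + 1.0) - Math.log(SPLIT));
        return LOW_LEVELS + (int) ((256 - LOW_LEVELS) * t);
    }

    public static void main(String[] args) {
        int[] samples = {0, 100, 5_000, 9_999, 10_000, 1_000_000, Integer.MAX_VALUE};
        for (int v : samples) {
            System.out.println(v + " -> " + red(v));
        }
    }
}
```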

Bob Aman
Following my algebraic simplification below, I think this might serve well for you: `r(x) = floor((256./31) * log(x)/log(2))`
Paul McGuire
Yup, that's certainly simpler, though it still suffers from the issue of a bad fit compared to the values desired.
Bob Aman