We have a heatmap we want to display. The numbers that will make up the values being displayed are unknown (except that they will be positive integers). The range of numbers is also unknown (again, except that they will be positive integers). The range could be between 0 and 200 or 578 and 1M or whatever. It depends on the data, which is unknown.

We want to take an unknown range of positive integers and turn it into a scaled (compressed) range to be displayed with RGB values in a heatmap. I hope this makes sense. Thanks!

I want to clarify that the min/max values need to be "plugged" into the formula.

+4  A: 

You first need to find the min and max of those values to get their range. Then you need to create a colour scale like the bar below this image. You can experiment with different functions to map an integer to an RGB. You need 3 functions R(X), G(X), B(X). Looking at the image below it looks like B(X) peaks in the middle, R(X) peaks at the end and green is somewhere else. As long as you make sure that you never get the same RGB for two different values of X then you've got your conversion.

[image: heatmap with colour scale bar]

EDIT: Come to think of it, you could sample a unit circle in YUV space.

Or even just download a high-res colour bar and sample that.

Chris H
I would add that creating the color bar will be done most easily in the HSV space
daveb
I like this idea . . . but since we won't know the min/max until we pull the data to populate the heatmap, how do we create the color bar values dynamically? The min/max values have to be part of the algorithm/formula . . .
Richard
@Richard: In that case you'd have to redraw the colour display every time new data came in that was outside the original range. The R(X), G(X), B(X) functions could take normalized X values, and you would just need to update the normalization function when higher values come in. This would cause all colors to appear to get cooler as higher values came in. I don't see a way around this without knowing the range ahead of time.
Chris H
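To make the "update the normalization when higher values come in" idea concrete, here is a minimal sketch. The class name RunningRange and its methods are made up for illustration and are not from any library; it assumes a plain linear normalization and that the caller redraws whenever update() reports a change.

```python
class RunningRange:
    """Tracks the smallest and largest value seen so far (hypothetical helper)."""

    def __init__(self):
        self.lo = None
        self.hi = None

    def update(self, value):
        """Widen the range if needed; return True when the range changed."""
        changed = False
        if self.lo is None or value < self.lo:
            self.lo = value
            changed = True
        if self.hi is None or value > self.hi:
            self.hi = value
            changed = True
        return changed

    def normalize(self, value):
        """Map value linearly to 0..1 using the current range."""
        if self.lo is None or self.hi == self.lo:
            return 0.0  # no spread yet, so everything sits at the cold end
        return (value - self.lo) / (self.hi - self.lo)
```

When update() returns True, every cell drawn so far needs recolouring, because its normalized position has shifted.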
Yes, we will redraw the colour display every time. My question is, how do I normalize these colors? And how do I do it in a logarithmic or exponential way, i.e. if the range of values is huge (for instance between 200 and 10,000,000), how do we still represent that 200? Especially if most of the values fall closer to the 200. Somehow it has to be weighted toward the end of the spectrum where most of the values are . . .
Richard
On the other hand, if it is fairly even, for instance 200, 500k, 1M, then the distribution should be even as well . ..
Richard
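One way to plug min/max into a logarithmic formula is to normalize the logs instead of the raw values: t = (ln v - ln min) / (ln max - ln min). The sketch below is only an illustration; the function name log_normalize is made up here, and it assumes every value is strictly positive (if 0 can occur, shifting everything by 1 before taking the log is a common workaround).

```python
import math


def log_normalize(value, lo, hi):
    """Map value in lo..hi to 0..1 on a logarithmic scale.

    Assumes 0 < lo <= value <= hi.
    """
    if hi == lo:
        return 0.0  # degenerate range: nothing to spread out
    return (math.log(value) - math.log(lo)) / (math.log(hi) - math.log(lo))
```

With a range of 200..10,000,000, a value of 2,000 lands at about 0.21 on this scale, whereas a linear scale would put it at about 0.0002, indistinguishable from the minimum.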
+1  A: 

Without knowing the range of values, there isn't much you can do to come up with a meaningful function mapping an arbitrary range of positive integers to a heat-map type range of colors.

I think you're going to have to run through your data at least once to get the min/max or know them ahead of time. Once you have that you can normalize appropriately and use any number of color schemes. The simplest solution would be to specify something like "hue" and convert from HSV to RGB.
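As a rough sketch of the hue idea (an illustration, not the poster's code): once a value has been normalized to t in 0..1, sweep the hue from blue to red and convert with Python's standard colorsys module. The name heat_color and the choice of 240 degrees (blue) as the cold end are assumptions.

```python
import colorsys


def heat_color(t):
    """Map t in 0..1 to an (r, g, b) tuple of ints in 0..255.

    t = 0 gives blue (hue 240 degrees), t = 1 gives red (hue 0 degrees).
    """
    t = min(1.0, max(0.0, t))            # clamp out-of-range input
    hue = (1.0 - t) * (240.0 / 360.0)    # colorsys expects hue in 0..1
    r, g, b = colorsys.hsv_to_rgb(hue, 1.0, 1.0)
    return (round(r * 255), round(g * 255), round(b * 255))
```

Any normalization (linear, log, or rank-based) can feed t; the colour mapping itself does not need to know the min/max.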

job
Are there no mathematical formulas or algorithms that can normalize a set of data using the min/max values as inputs? We need some logarithmic function I think, or something to that effect. We don't want some values lost in the heatmap because they are so different from other values. For example, if we had a value 20 and a value 1M, we want both to be represented, not for the 20 to appear as 0.
Richard
Then look at either a log distribution or histogram equalisation
Martin Beckett
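For anyone wondering what histogram equalisation could look like here: the colour position of a value is taken from its rank in the sorted data rather than from its magnitude, so crowded regions get spread out and evenly spaced data stays even. A minimal sketch, with the function name equalize made up for illustration:

```python
from bisect import bisect_left, bisect_right


def equalize(values):
    """Return a function mapping a value to 0..1 by its rank in values.

    Ties share the midpoint of their ranks, so equal values get one colour.
    """
    ordered = sorted(values)
    n = len(ordered)

    def position(v):
        if n < 2:
            return 0.0
        lo = bisect_left(ordered, v)     # first index holding v
        hi = bisect_right(ordered, v)    # one past the last index holding v
        mid_rank = (lo + hi - 1) / 2     # average index of the ties
        return min(1.0, max(0.0, mid_rank / (n - 1)))

    return position
```

The trade-off is that distances between values are no longer visible: 20 and 30 end up as far apart in colour as 50 and 1M would be if they are neighbours in the sorted data.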
That is getting beyond me, I'm afraid . . .
Richard
A: 

Simple algorithm

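// given a max and min value (assumes max > min);
// count is the number of values (a name assumed here)
float red, green, blue;
float half = (max - min) / 2.0;   // half the range, so t below spans -1..1
float mid = (max + min) / 2.0;

// foreach value
for (int ii = 0; ii < count; ii++) {
    float t = (value[ii] - mid) / half;   // -1 at min, 0 at mid, +1 at max
    if (t > 0.0) {
        // above mid = green to red
        red = t;
        green = 1.0 - t;
        blue = 0.0;
    } else {
        // lower half = blue to green
        red = 0.0;
        blue = -t;
        green = 1.0 - blue;
    }
    // use (red, green, blue) for value[ii] here
}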

More complex:
If your range is a few million but most are around 0 you want to scale it so that 'red' in the above example is the log of the distance from the midpoint. The code is a little more complicated if the values are +/-

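// assume equally distributed around 0, so max is the largest magnitude
// (the largest positive or the most negative number)
float range = log(fabs(max));
float mid = 0.0;

// foreach value
// note: magnitudes below 1 (including 0) give a negative or infinite log
// and need clamping before use
if (value[ii] > 0.0) {
    // above mid = red-green
    red = log(value[ii]) / range;
    blue = 0.0;
    green = 1.0 - red;
} else {
    // below mid = green-blue; take the log of the magnitude,
    // since the log of a negative number is undefined
    blue = log(fabs(value[ii])) / range;
    green = 1.0 - blue;
    red = 0.0;
}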

note - I haven't tested this code, just spinning ideas!

Martin Beckett
Martin, I just tried that in Excel . . . no luck. The RGB values are all around .5 or 0, especially with a big difference in values.
Richard
Yes - it was meant for the simple case, just a brain dump for the search engines. If all your values are around 0.5 with outliers can you simply take the log of value-mid ?
Martin Beckett
Not sure what you mean by "just a brain dump for the search engines" . . . I don't think that will work. Some of the red values are negative, so the log doesn't work there. We really need a logarithmic compression of the numbers in the range.
Richard
Sorry - I meant that I posted the simple algorithm here so I could point people to it, or they would find it in search.
Martin Beckett
A: 

man, you could probably use YUV color space and only for demonstration purposes convert it to RGB.

Alexander Solonsky
+1  A: 

You want to convert your data values to a wavelength of light:

  • shorter wavelength = cooler colors = bluish
  • longer wavelength = warmer colors = redder

The wavelengths of visible light go from about 380nm (violet) to 650nm (red):

[image: visible light spectrum]

The following function converts numbers in your specified range to the range of visible light, then gets the RGB:

function DataPointToColor(Value, MinValue, MaxValue: Real): TColor;
var
   r, g, b: Byte;
   WaveLength: Real;
begin
   WaveLength := GetWaveLengthFromDataPoint(Value, MinValue, MaxValue);
   WavelengthToRGB(Wavelength, r, g, b);
   Result := RGB(r, g, b);
end;

With the function I wrote off the top of my head:

function GetWaveLengthFromDataPoint(Value: Real; MinValue, MaxValue: Real): Real;
const
   //WavelengthToRGB returns black below 380nm, so start the scale there
   MinVisibleWaveLength = 380.0;
   MaxVisibleWaveLength = 650.0;
begin
   //Convert data value in the range of MinValue..MaxValue to the
   //range 380..650 (assumes MaxValue > MinValue)

   Result := (Value - MinValue) / (MaxValue - MinValue) *
         (MaxVisibleWaveLength - MinVisibleWaveLength) +
         MinVisibleWaveLength;
end;

And a function I found on the internets, that converts a wavelength into RGB:

PROCEDURE WavelengthToRGB(CONST Wavelength:  Nanometers;
                          VAR R,G,B:  BYTE);
  CONST
    Gamma        =   0.80;
    IntensityMax = 255;
  VAR
    Blue   :  DOUBLE;
    factor :  DOUBLE;
    Green  :  DOUBLE;
    Red    :  DOUBLE;
  FUNCTION Adjust(CONST Color, Factor:  DOUBLE):  INTEGER;
  BEGIN
    IF   Color = 0.0
    THEN RESULT := 0     // Don't want 0^x = 1 for x <> 0
    ELSE RESULT := ROUND(IntensityMax * Power(Color * Factor, Gamma))
  END {Adjust};
BEGIN
  CASE TRUNC(Wavelength) OF
    380..439:
      BEGIN
        Red   := -(Wavelength - 440) / (440 - 380);
        Green := 0.0;
        Blue  := 1.0
      END;
    440..489:
      BEGIN
        Red   := 0.0;
        Green := (Wavelength - 440) / (490 - 440);
        Blue  := 1.0
      END;
    490..509:
      BEGIN
        Red   := 0.0;
        Green := 1.0;
        Blue  := -(Wavelength - 510) / (510 - 490)
      END;
    510..579:
      BEGIN
        Red   := (Wavelength - 510) / (580 - 510);
        Green := 1.0;
        Blue  := 0.0
      END;
    580..644:
      BEGIN
        Red   := 1.0;
        Green := -(Wavelength - 645) / (645 - 580);
        Blue  := 0.0
      END;
    645..780:
      BEGIN
        Red   := 1.0;
        Green := 0.0;
        Blue  := 0.0
      END;
    ELSE
      Red   := 0.0;
      Green := 0.0;
      Blue  := 0.0
  END;
  // Let the intensity fall off near the vision limits
  CASE TRUNC(Wavelength) OF
    380..419:  factor := 0.3 + 0.7*(Wavelength - 380) / (420 - 380);
    420..700:  factor := 1.0;
    701..780:  factor := 0.3 + 0.7*(780 - Wavelength) / (780 - 700)
    ELSE       factor := 0.0
  END;
  R := Adjust(Red,   Factor);
  G := Adjust(Green, Factor);
  B := Adjust(Blue,  Factor)
END {WavelengthToRGB}; 

Sample use:

Data set in the range of 10..65,000,000. And this particular data point has a value of 638,328:

color := DataPointToColor(638328, 10, 65000000);
Ian Boyd
That's a great idea. The only problem is that the conversion to the wavelength has to be unevenly compressed. i.e. if the range of values is huge, we don't want decimal values for the smaller values. They still need to be represented.
Richard
@Richard: You have to decide what is uneven. e.g. if the entire world is in a deep freeze, except for one town in Angola, which is too hot to live in: then everything *should* be blue, except for that one dot which should be red. Unless of course you want to **exaggerate** values in the lower range. You're going to have to decide what kind of lying you want to perform, in order to best represent your data. If the source data isn't linear (as you suggest), then a common trick is to use a log scale; apply a natural log ln(value) to the values before plotting them.
Ian Boyd
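To illustrate the log-scale trick from the comment above, here is a sketch of the data-to-wavelength step with an optional natural log applied first. The function name, the use_log flag and the 380..650 nm bounds are assumptions made for this example; 380 is used as the lower bound because the WavelengthToRGB procedure above returns black for anything shorter.

```python
import math

MIN_NM = 380.0  # assumed lower bound (WavelengthToRGB is black below 380)
MAX_NM = 650.0  # assumed upper bound (red)


def data_point_to_wavelength(value, lo, hi, use_log=False):
    """Map value in lo..hi to a wavelength in MIN_NM..MAX_NM.

    With use_log=True the natural log is applied to value, lo and hi
    first, which requires lo > 0.
    """
    if use_log:
        value, lo, hi = math.log(value), math.log(lo), math.log(hi)
    if hi == lo:
        return MIN_NM  # degenerate range
    t = (value - lo) / (hi - lo)
    return MIN_NM + t * (MAX_NM - MIN_NM)
```

For the sample data point (638,328 in a range of 10..65,000,000) the linear mapping stays within a few nanometres of the violet end, while the log mapping lands at roughly 570 nm, in the green-to-yellow band.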