I have a vertex shader (2.0) doing some instancing - each vertex specifies an index into an array.
If I have an array like this:
float instanceData[100];
The compiler allocates 100 constant registers for it. Each constant register is a float4, so it's allocating four times as much space as is needed.
I need a way to make it allocate just 25 constant registers and store four values in each of them.
Ideally I'd like a method where it still looks like a float[] on both the CPU and GPU (I'm using XNA, and right now I am calling EffectParameter.SetValue(Single[])). But manually packing and unpacking a float4[] is an option, too.
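For reference, the manual fallback I have in mind would look roughly like the sketch below. It is untested, and the names packedInstanceData and GetInstanceValue are made up for illustration. It assumes the per-vertex index arrives as a float holding a whole number, and it picks the component with a mask and a dot product rather than relying on dynamic component indexing.

```hlsl
// Hypothetical layout: 100 floats packed four per constant register.
float4 packedInstanceData[25];

// Hypothetical helper: fetch element `index` (0..99) from the packed array.
float GetInstanceValue(float index)
{
    // Register that holds the value, and the component within that register.
    float reg  = floor(index / 4);
    float comp = index - reg * 4;

    float4 v = packedInstanceData[(int)reg];

    // Build a one-hot mask (1 in the wanted component, 0 elsewhere)
    // and use a dot product to extract that component.
    float4 mask = float4(comp == 0, comp == 1, comp == 2, comp == 3);
    return dot(v, mask);
}
```

On the CPU side I would presumably have to copy the values into a Vector4[] of length 25 and pass that to SetValue instead of the Single[], which is the part I was hoping to avoid.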
Also: what are the performance implications of doing this? Is it actually worth it? (For me, this will save about one batch in every four or five.)