How do I initialize a multi-dimensional array of a primitive type as fast as possible?
I am stuck with using multi-dimensional arrays. My problem is performance. The following routine initializes a 100x100 array in approx. 500 ticks. Removing the int.MaxValue initialization results in approx. 180 ticks just for the looping. Approximately 100 ticks to create the array without looping and without initializing to int.MaxValue.
- Routines similiar to this are called a few hundred-thousand to several million times during a "run".
- The array size will not change during a run and arrays are created one-at-a-time, used, then discarded, and a new array created.
- A "run" which may last from one minute (using 10x10 arrays) to forty-five minutes (100x100).
- The application creates arrays of int, bool, and struct.
- There can be multiple "runs" executing at same time, but are not because performance degrades terribly.
- I am using 100x100 as a base-line.
I am open to suggestions on how to optimize this non-default initialization of an array. One idea I had is to use a smaller primitive type when available. For instance, using byte instead of int, saves 100 ticks. I would be happy with this, but I am hoping that I don't have to change the primitive data type.
public int[,] CreateArray(Size size) {
int[,] array = new int[size.Width, size.Height];
for (int x = 0; x < size.Width; x++) {
for (int y = 0; y < size.Height; y++) {
array[x, y] = int.MaxValue;
}
}
return array;
}
Down to 450 ticks with the following:
public int[,] CreateArray1(Size size) {
int iX = size.Width;
int iY = size.Height;
int[,] array = new int[iX, iY];
for (int x = 0; x < iX; x++) {
for (int y = 0; y < iY; y++) {
array[x, y] = int.MaxValue;
}
}
return array;
}
CreateArray5; Accepted Implementation: Limitation: Unable to Resize, can be changed
Down to approximately 165 ticks after a one-time initialization of 2800 ticks. (See my answer below.) If I can get stackalloc
to work with multi-dimensional arrays, I should be able to get the same performance without having to intialize the private static
array.
private static bool _arrayInitialized5;
private static int[,] _array5;
public static int[,] CreateArray5(Size size) {
if (!_arrayInitialized5) {
int iX = size.Width;
int iY = size.Height;
_array5 = new int[iX, iY];
for (int x = 0; x < iX; x++) {
for (int y = 0; y < iY; y++) {
_array5[x, y] = int.MaxValue;
}
}
_arrayInitialized5 = true;
}
return (int[,])_array5.Clone();
}
CreateArray8; Accepted Implemention; Limitation: Requires Full Trust
Down to approximately 165 ticks without using the "clone technique" above. (See my answer below.) I am sure I can get the ticks lower, if I can just figure out the return of CreateArray9
.
public unsafe static int[,] CreateArray8(Size size) {
int iX = size.Width;
int iY = size.Height;
int[,] array = new int[iX, iY];
fixed (int* pfixed = array) {
int count = array.Length;
for (int* p = pfixed; count-- > 0; p++)
*p = int.MaxValue;
}
return array;
}
Notes
I am providing all code and notes regarding this question as answers. Hopefully, it will save someone's time in the future.
Arrays allocated on the Large Object Heap (LOH) are not part of this discussion. The performance improvements noted hear are only for arrays allocated on the heap.
Stackalloc
My idea of using stackalloc
to eliminate initializing array to default values did not work out because the allocated stack memory must be copied out of the method. Meaning, I would have to create another array to hold the results. This array would be initialized defeating the whole purpose of using stackalloc
.
CreateArray8; unsafe/fixed method
The CLR will only execute unsafe
code if it is in a fully trusted assembly.
CreateArray5; clone method
Requires variables to determine if array is initialized and to store the initialized array. Performance is the same as the unsafe/fixed method after initialization. See Dan Tao's answer for possible solution.
300% Performance Increase?
I suck at percentages, but 300% is what I figured (500 to 165 ticks).
Final Implementation for Application
For this application, I settled on using the "clone" method. Following is a "lean" Generic implementation used in the application with performance samples.
Initialization:
Grid<int>
; generic clone class initalize: 4348, 4336, 4339, 4654Grid<bool>
; generic clone class initalize: 2692, 2684, 3916, 2680Grid<Color>
; generic clone class initalize: 3747, 4630, 2702, 2708
Use:
Grid<int>
; generic clone class: 185, 159, 152, 290Grid<bool>
; generic clone class: 39, 36, 44, 46Grid<Color>
; generic clone class: 2229, 2431, 2460, 2496public class Grid<T> { private T[,] _array; private T _value; private bool _initialized; private int _x; private int _y; public Grid(Size size, T value, bool initialize) { _x = size.Width; _y = size.Height; _value = value; if (initialize) { InitializeArray(); } } private void InitializeArray() { int iX = _x; int iY = _y; _array = new T[iX, iY]; for (int y = 0; y < iY; y++) { for (int x = 0; x < iX; x++) { _array[x, y] = _value; } } _initialized = true; } public T[,] CreateArray() { if (!_initialized) { InitializeArray(); } return (T[,])_array.Clone(); } }