The documentation on the Bigarray module is somewhat vague. It states that the purpose of arrays in that module is to hold "large arrays", but it doesn't really define what it means by "large array". When should I use a Bigarray over a regular array? Is there a certain number of elements beyond which I should just use a Bigarray? Is it in the thousands? Millions? Billions?

And what makes a Bigarray better at dealing with big arrays? What makes a regular array better at dealing with... non-big arrays?

+6  A: 

I found the answer to this (from this page):

The bigarray library implements large, multi-dimensional, numerical arrays. These arrays are called “big arrays” to distinguish them from the standard Caml arrays described in Module Array. The main differences between “big arrays” and standard Caml arrays are as follows:

  • Big arrays are not limited in size, unlike Caml arrays (float array are limited to 2097151 elements on a 32-bit platform, other array types to 4194303 elements).
  • Big arrays are multi-dimensional. Any number of dimensions between 1 and 16 is supported. In contrast, Caml arrays are mono-dimensional and require encoding multi-dimensional arrays as arrays of arrays.
  • Big arrays can only contain integers and floating-point numbers, while Caml arrays can contain arbitrary Caml data types. However, big arrays provide more space-efficient storage of integer and floating-point elements, in particular because they support “small” types such as single-precision floats and 8 and 16-bit integers, in addition to the standard Caml types of double-precision floats and 32 and 64-bit integers.
  • The memory layout of big arrays is entirely compatible with that of arrays in C and Fortran, allowing large arrays to be passed back and forth between Caml code and C / Fortran code with no data copying at all.
  • Big arrays support interesting high-level operations that normal arrays do not provide efficiently, such as extracting sub-arrays and “slicing” a multi-dimensional array along certain dimensions, all without any copying.
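To make the points above concrete, here is a minimal sketch (not taken from the manual) that creates big arrays with a "small" element kind and takes a sub-array and a slice without copying. The names `a`, `view`, `m` and `row1` are illustrative only, and depending on your OCaml version you may need to link the bigarray library explicitly.

```ocaml
open Bigarray

(* A 1-D big array of single-precision floats in C layout (0-based indices). *)
let a = Array1.create float32 c_layout 10

let () =
  for i = 0 to Array1.dim a - 1 do
    a.{i} <- float_of_int i
  done

(* Array1.sub returns a view on 4 elements starting at index 2; no data is copied. *)
let view = Array1.sub a 2 4

let () =
  view.{0} <- 100.0;  (* writes through to a.{2}, since the storage is shared *)
  Printf.printf "a.{2} = %g\n" a.{2}

(* A 3x4 big array of double-precision floats. *)
let m = Array2.create float64 c_layout 3 4
let () = Array2.fill m 0.0

(* slice_left extracts row 1 as a 1-D view, again sharing storage with m. *)
let row1 = Array2.slice_left m 1

let () =
  row1.{0} <- 42.0;  (* visible as m.{1,0} *)
  Printf.printf "m.{1,0} = %g\n" m.{1,0}
```

Because `view` and `row1` share storage with the arrays they came from, writes through them are visible in the originals. A regular `Array.sub`, by contrast, returns a fresh copy.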
Jason Baker
The compatibility with C/Fortran is the big case I see for bigarrays. It can drastically reduce memory usage when interfacing with C or Fortran array-based libraries (e.g. BLAS).
Michael E
The size limit is 16 MB and affects float arrays, int arrays and strings (which are byte arrays). Big arrays are used to allow large arrays on 32-bit platforms. Your best bet is to use a 64-bit platform and forget about big arrays...
Jon Harrop
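If you want to check the limits discussed above on your own machine, a small sketch like the following prints them. It only uses constants from the standard `Sys` module; the values depend on the platform's word size, so no expected output is shown.

```ocaml
(* Print the platform limits that apply to regular arrays and strings. *)
let () =
  Printf.printf "word size: %d bits\n" Sys.word_size;
  Printf.printf "max array length: %d elements\n" Sys.max_array_length;
  Printf.printf "max string length: %d bytes\n" Sys.max_string_length
```

On a 64-bit platform these limits are large enough that they rarely matter in practice, which is the point made in the last comment.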