Hello all, I'm currently working on video processing software in which the picture data (8-bit signed and unsigned) is stored in 16-byte-aligned arrays of int, allocated as
__declspec(align(16)) int *pData = (__declspec(align(16)) int *)_mm_malloc(width*height*sizeof(int),16);
Generally, wouldn't reading and writing be faster if signed/unsigned char arrays were used instead, like this?
__declspec(align(16)) unsigned char *pData = (__declspec(align(16)) unsigned char *)_mm_malloc(width*height*sizeof(unsigned char),16);
I know little about cache line sizes and data transfer optimization, but I do know that they matter. Beyond that, SSE will be used in the future, and in that case char arrays, unlike int arrays, are already in a packed format. So which version would be faster?
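To make the comparison concrete, here is a rough sketch of the arithmetic I have in mind. It is not taken from the software itself: the helper names are made up for illustration, and the 64-byte cache line and 128-bit SSE register sizes are assumptions about a typical x86 machine.

```c
#include <assert.h>
#include <stddef.h>

/* Assumptions for illustration only, not taken from the actual software: */
#define CACHE_LINE_BYTES 64   /* assumed cache line size on a typical x86 CPU */
#define SSE_REG_BYTES    16   /* one 128-bit XMM register */

/* Total bytes for a width x height image with elem_size bytes per pixel. */
static size_t buffer_bytes(size_t width, size_t height, size_t elem_size)
{
    assert(elem_size > 0);
    return width * height * elem_size;
}

/* Cache lines touched when streaming once through the buffer, rounded up.
   Assumes the buffer starts on a cache line boundary. */
static size_t cache_lines_touched(size_t bytes)
{
    return (bytes + CACHE_LINE_BYTES - 1) / CACHE_LINE_BYTES;
}

/* Pixels that fit into one SSE register without any repacking. */
static size_t pixels_per_sse_register(size_t elem_size)
{
    assert(elem_size > 0 && elem_size <= SSE_REG_BYTES);
    return SSE_REG_BYTES / elem_size;
}
```

For a 1920x1080 frame, this gives 2,073,600 bytes (32,400 cache lines) with 1-byte pixels versus 8,294,400 bytes (129,600 cache lines) with 4-byte ints, and 16 versus 4 pixels per SSE register. Whether that fourfold difference in memory traffic actually shows up as a speedup is exactly what I am unsure about.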