I am working on auto-vectorization with GCC. Because of a customer requirement, I cannot use intrinsics or attributes. (I cannot rely on user input to support vectorization.)
If the alignment of an array that could be vectorized is unknown, GCC invokes 'loop versioning'. Loop versioning is performed as part of loop vectorization on trees. When a loop is identified as vectorizable but a constraint on data alignment or data dependence stands in the way (because it cannot be determined at compile time), two versions of the loop are generated: a vectorized and a non-vectorized version, together with runtime checks for alignment or dependence that control which version is executed.
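To make the description above concrete, here is a hand-written sketch of what I understand loop versioning to do, expressed at source level. It is not GCC's actual output: the names `both_aligned`, `copy_versioned` and `VEC_ALIGN` are hypothetical, and the 16-byte alignment is an assumption that depends on the target.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed vector alignment in bytes; the real value is target dependent. */
#define VEC_ALIGN 16u

/* Hypothetical helper: nonzero when both addresses are VEC_ALIGN-aligned. */
static int both_aligned(uintptr_t p, uintptr_t q)
{
    return ((p | q) & (VEC_ALIGN - 1u)) == 0;
}

/* Hypothetical source-level picture of a versioned copy loop.
   Returns 1 if the aligned version ran, 0 if the fallback ran. */
static int copy_versioned(short *dst, const short *src, size_t n)
{
    size_t i;

    if (both_aligned((uintptr_t)dst, (uintptr_t)src)) {
        /* Version 1: the compiler would emit aligned vector
           loads and stores for this loop. */
        for (i = 0; i < n; i++)
            dst[i] = src[i];
        return 1;
    }

    /* Version 2: scalar fallback, taken when the runtime check fails. */
    for (i = 0; i < n; i++)
        dst[i] = src[i];
    return 0;
}
```

Both branches copy the same data; only the code the compiler would generate for them differs. The return value is there purely so a caller can see which version was selected at run time.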
My question is: how can I enforce the alignment? If I have found a loop that is vectorizable, I do not want two versions of the loop to be generated just because alignment information is missing.
For example, consider the code below:
short a[15]; short b[15]; short c[15];
int i;
void foo()
{
    for (i=0; i<15; i++)
    {
        a[i] = b[i] ;
    }
}
Tree dump (options: -fdump-tree-optimized -ftree-vectorize)
<SNIP>
vector short int * vect_pa.49;
vector short int * vect_pb.42;
vector short int * vect_pa.35;
vector short int * vect_pb.30;
<bb 2>:
vect_pb.30 = (vector short int *) &b;
vect_pa.35 = (vector short int *) &a;
if (((signed char) vect_pa.35 | (signed char) vect_pb.30) & 3 == 0) ;; <== (A)
goto <bb 3>;
else
goto <bb 4>;
<bb 3>:
</SNIP>
The vectorized version of the loop is generated at 'bb 3' and the non-vectorized version at 'bb 4'; the alignment check (statement 'A') selects between them. Without using intrinsics or attributes, how can I get only the vectorized code, without this runtime alignment check?