views:

60

answers:

1

(Sorry if this sounds like a rant, but it's a real question and I'd appreciate real answers)

I understand that since C is so old, it might not have made sense to add them back then (MMX didn't even exist yet). But since then there was C99, and as far as I know there is still no standard for SIMD variables.

By "SIMD variables", I mean something like:

vec2_int a = {2, 2};
vec2_int b = {3, 3};
a += b;

I also understand that this can be done with structs, and that (in theory) the compiler should optimize it to use SIMD where appropriate anyway.

But I recently saw a post from Qt Labs which includes an example with types like "__m128i" (which are clearly non-standard), instead of relying on compiler optimizations. Considering Qt is advertising this as greatly improving Qt's speed, I'm guessing compiler optimizations are insufficient, at least for some programmers.

If it were just C, I'd think C was being stupid. But, as far as I know, newer languages such as C++, Java and C# don't include these either. C# has Mono.SIMD, but it's not a primitive type (and since C# has a "decimal" keyword, I don't think they were trying to keep the number of types down).

So here's what I'm noticing: Languages with vector primitive types seem to be the exception and not the rule. Because vector primitive types look so obvious, I'm guessing there's got to be some decent reasons NOT to include these types.

Does anyone here know why these types are so frequently excluded? Some links to rationales against adding them?

+4  A: 

Because not all processors support SIMD instructions. Languages like C and C++ (and even Java and C#) are designed to be used on different kinds of hardware, like microcontrollers, in addition to desktop computers.

Currently, vectorization of algorithms is not automatic (although that is being actively researched). Algorithms that are "vectorizable" must be explicitly written to take advantage of any SIMD capabilities of the execution environment.

In silico
Yes, but automatically SIMD-ifying algorithms is hard. Writing non-SIMD fallback versions of SIMD algorithms, on the other hand, is trivial.
luiscubal
True, but not all algorithms are SIMD-izable in the first place.
In silico
@In silico - Not all algorithms are related to strings, and yet C# has a string primitive type. Your arguments might be valid, but still not enough to justify such wide lack of native SIMD types. Like I said, if C was alone in this, I'd understand. But popular languages with no native SIMD support outnumber by far the ones with SIMD...
luiscubal
If popularity is the issue, then the answer would probably be that not a lot of people heard about it or used it much. At least for C++, the philosophy is to only add language features that are "generally useful" to programmers. SIMD probably doesn't make that cut, even though it can be extremely useful in certain contexts.
In silico
@In silico - I guess that explanation will have to be enough.
luiscubal
@luiscubal: you're missing a lot of important points - one in particular is that there is no *single standard SIMD implementation*: machine vector widths can be 64 bits, 128 bits, 256 bits and greater. Some SIMD implementations support only floating point, some support both integer and float, some support only 32-bit elements whereas others support variable-width elements (from 1 bit to 32 or 64 bits). How would you propose to standardise all current and future SIMD implementations as a set of generic language extensions?
Paul R
@Paul R: I don't think standardization would be that much of a problem. It could easily be solved the way OpenCL did it. The language just defines vector types of varying length for primitive types (e.g. float2, float4, float8, ...) and defines elementwise operations that behave as if they had been applied to each element independently. On platforms without SIMD support they can easily map to a bunch of scalar operations, giving at least some loop unrolling. It seems more like there just isn't enough reason to do so. Out of curiosity: 1-bit/element SIMD? How does that make sense?
Grizzly
@Grizzly: the problem with abstracting SIMD types and operations as you suggest is that you then lose performance benefit as you have to compromise on the implementation for each supported architecture. This is also a problem with OpenCL - if you want maximum performance on each architecture you still have to optimise for each architecture, which rather defeats the purpose of the abstraction. Oh and 1 bit/element is useful for a lot of things - typically bitwise operations on large data structures (e.g. large arrays of booleans).
Paul R
@Paul R: Of course you won't get maximum speed with an abstraction (I think most of the "lost" performance would come from special operations which didn't make the cut into the interface but could still be used platform-specifically). But hasn't "if you want the best performance on X, optimize your code for X" always been true? That doesn't mean abstractions which give some performance benefit while being platform-independent and easier to code against aren't useful. Regarding the 1-bit/element case: since the operations are bitwise, wouldn't e.g. int fit that definition? So why call it a SIMD implementation?
Grizzly
@Grizzly: based on experience of several different SIMD architectures, I'd say that the architectures are just too disjoint and non-orthogonal to get any useful benefits from abstraction. The two most similar architectures are AltiVec and SSSE3/SSE4, but even in this case it's pretty hard to have an abstraction layer that works effectively in all SIMD use cases, because you often need to take a different approach at implementation time depending on what instructions are available. If you throw away most/all of the benefits of SIMD in the interest of abstraction then it's a waste of effort.
Paul R
@Grizzly: most SIMD architectures support bitwise operations on entire vectors - how you interpret this is up to you (the processor doesn't know or care about the SIMD data type in this case) but it's effectively a 128 x 1 bit SIMD operation.
Paul R