In the last couple of years I've been doing a lot of SIMD programming, and most of the time I've relied on compiler intrinsics (such as the ones for SSE) or on hand-written assembly to get to the really nifty stuff. However, so far I've hardly been able to find any programming language with built-in support for SIMD.

Now obviously there are the shader languages such as HLSL, Cg and GLSL that have native support for this kind of stuff. However, I'm looking for something that can at least compile to SSE without relying on autovectorization, but with built-in support for vector operations. Does such a language exist?

This is an example of (part of) a Cg shader that implements a spotlight; in terms of syntax, this is probably the closest to what I'm looking for.

float4 pixelfunction(
 output_vs IN, 
 uniform sampler2D texture : TEX0, 
 uniform sampler2D normals : TEX1, 
 uniform float3 light, 
 uniform float3 eye ) : COLOR
{
 float4 color = tex2D( texture, IN.uv );
 float4 normal = tex2D( normals, IN.uv ) * 2 - 1;

 float3 T = normalize(IN.T);
 float3 B = normalize(IN.B);

 float3 N = 
  normal.b * normalize(IN.normal) +
  normal.r * T +
  normal.g * B;

 float3 V = normalize(eye - IN.pos.xyz);
 float3 L = normalize(light - IN.pos.xyz);
 float3 H = normalize(L + V);

 float4 diffuse = color * saturate( dot(N, L) );
 float4 specular = color * pow(saturate(dot(N, H)), 15);
 float falloff = dot(L, normalize(light));

 return pow(falloff, 5) * (diffuse + specular);
}

Stuff that would be a real must in this language is:

  • Built-in swizzle operators
  • Vector operations (dot, cross, normalize, saturate, reflect et cetera)
  • Support for custom data types (structs)
  • Dynamic branching would be nice (for loops, if statements)
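
For readers less familiar with the shader built-ins in the list above, here is a minimal scalar C sketch of what a few of them compute. It is only a reference for the semantics, not SIMD code and not how any shader compiler implements them. The names float3, dot3, cross3, saturate1 and reflect3 are hypothetical, chosen for this illustration; normalize is left out to keep the sketch free of math-library calls (it divides a vector by the square root of its dot product with itself).

```c
/* Hypothetical stand-in for the shader type float3. */
typedef struct { float x, y, z; } float3;

/* dot(a, b): sum of the component-wise products. */
float dot3(float3 a, float3 b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

/* cross(a, b): vector perpendicular to both a and b (right-handed). */
float3 cross3(float3 a, float3 b) {
    float3 r = { a.y * b.z - a.z * b.y,
                 a.z * b.x - a.x * b.z,
                 a.x * b.y - a.y * b.x };
    return r;
}

/* saturate(v): clamp a scalar to the range [0, 1]. */
float saturate1(float v) {
    return v < 0.0f ? 0.0f : (v > 1.0f ? 1.0f : v);
}

/* reflect(i, n): mirror incident vector i about unit normal n,
   computed as i - 2 * dot(n, i) * n. */
float3 reflect3(float3 i, float3 n) {
    float k = 2.0f * dot3(n, i);
    float3 r = { i.x - k * n.x, i.y - k * n.y, i.z - k * n.z };
    return r;
}
```

For example, with i = (1, -1, 0) and n = (0, 1, 0), reflect3 returns (1, 1, 0). In a language with built-in vector types, each of these would be a single operator or built-in call rather than hand-written per-component code.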
+1  A: 

Hi Jasper

That would be Fortran that you are looking for. If memory serves even the open-source compilers (g95, gfortran) will take advantage of SSE if it's implemented on your hardware.

Regards

Mark

High Performance Mark
Those Fortran implementations still rely on automatic vectorization, the same way most C++ compilers do. The problem I have with this is that it's very hard to predict which code will be vectorized and which won't. I don't know the state of this in Fortran compilers because my background is in C++, so I think I'd prefer a high-level, shader-like approach that gives me more control over the final output.
Jasper Bekkers
+5  A: 

It's not really part of the language itself, but there is a library for Mono (Mono.Simd) that exposes vector types to you and optimises the operations on them into SSE whenever possible.

viraptor
This solution looks nice, and far better than the C++ intrinsics. However, it is roughly equivalent to them and not what I'm looking for (I was looking for actual languages designed with SIMD built in instead of bolted on). Still, it's definitely something to remember when doing a .NET-based solution.
Jasper Bekkers
A: 

Currently the best solution is to do it myself by creating a back-end for the open-source Cg frontend that Nvidia released, but I'd like to save myself the effort so I'm curious if it's been done before. Preferably I'd start using it right away.

Jasper Bekkers
Cg isn't open source, it's proprietary to Nvidia. It would be an enormous amount of work to create a back-end that generated SIMD code for a CPU. As Louis answers, you should seriously check out OpenCL. You can write processing kernels in a C-based language (very similar to Cg and GLSL) and run them either on the GPU or on the CPU (where it will generate SIMD code for you). OpenCL is cross-platform, supported by many vendors (Nvidia, ATI, Apple, etc.) and you can get an SDK right away.
gavinb
The Cg front-end source code is available at http://developer.nvidia.com/object/cg_compiler_code.html and is provided specifically for creating a back-end for the compiler. However, I do prefer existing solutions such as OpenCL.
Jasper Bekkers
+3  A: 

Your best bet is probably OpenCL. I know it has mostly been hyped as a way to run code on GPUs, but OpenCL kernels can also be compiled and run on CPUs. OpenCL is basically C with a few restrictions:

  1. No function pointers
  2. No recursion

and a bunch of additions, in particular vector types:

float4 x = (float4)(1.0f, 2.0f, 3.0f, 4.0f);
float4 y = (float4)(10.0f, 10.0f, 10.0f, 10.0f);

float4 z = y + x.s3210; // add the vector y to a swizzle of x that reverses the element order
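
To make the swizzle in the snippet above concrete, here is a small plain C sketch that emulates what `y + x.s3210` computes, lane by lane. It is only an illustration of the semantics: the float4 struct and the helpers add4, swizzle4 and reversed_add_example are hypothetical names made up for this sketch, not part of OpenCL, and an OpenCL compiler would be free to map the real expression onto SIMD instructions instead.

```c
#include <assert.h>

/* Hypothetical stand-in for OpenCL's built-in float4; s[0]..s[3] are the lanes. */
typedef struct { float s[4]; } float4;

/* Component-wise add: what `a + b` means for two vector values. */
float4 add4(float4 a, float4 b) {
    float4 r;
    for (int i = 0; i < 4; ++i) r.s[i] = a.s[i] + b.s[i];
    return r;
}

/* General swizzle: result lane k takes source lane idx[k].
   The OpenCL expression x.s3210 corresponds to swizzle4(x, 3, 2, 1, 0). */
float4 swizzle4(float4 v, int i0, int i1, int i2, int i3) {
    const int idx[4] = { i0, i1, i2, i3 };
    float4 r;
    for (int k = 0; k < 4; ++k) {
        assert(idx[k] >= 0 && idx[k] < 4);  /* lane index must be valid */
        r.s[k] = v.s[idx[k]];
    }
    return r;
}

/* Mirrors the snippet above: z = y + x.s3210 */
float4 reversed_add_example(void) {
    float4 x = { { 1.0f, 2.0f, 3.0f, 4.0f } };
    float4 y = { { 10.0f, 10.0f, 10.0f, 10.0f } };
    return add4(y, swizzle4(x, 3, 2, 1, 0));  /* lanes: 14, 13, 12, 11 */
}
```

The point of the built-in version is that the whole body of reversed_add_example collapses into one expression, with no intrinsics and no per-lane loops in the source.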

One big caveat is that the code has to be cleanly separable: OpenCL can't call out to arbitrary libraries, etc. But if your compute kernels are reasonably independent, then you basically get a vector-enhanced C where you don't need to use intrinsics.

Here is a quick reference/cheatsheet with all of the extensions.

Louis Gerbarg
Can I still link OpenCL libs to a C application and hand it a set of vectors?
Jasper Bekkers
Come to think of it, it doesn't need to be able to link, I just need to be able to pass it some data :-)
Jasper Bekkers
Basically, you compile an OpenCL compute kernel which has a C function as its entry point, then you tell OpenCL to run the kernel using parameters you specify, which could be vectors, data sets, or even textures.
Louis Gerbarg
This seems to be the best solution for the problem I have at hand, thanks.
Jasper Bekkers