The .NET Framework's Math functions mostly operate on double-precision floats; there are no single-precision (float) overloads. When working with single-precision data in a high-performance scenario, this results in unnecessary casting and in computing functions to more precision than is required, both of which hurt performance to some degree.
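For concreteness, here is a rough sketch of the kind of code I have in mind (the `Sigmoid` name and the class are just illustrative, not from any particular library):

```csharp
using System;

static class SinglePrecisionExample
{
    // Math.Exp has no float overload, so the float argument is widened to
    // double, the function is evaluated in double precision, and the result
    // is narrowed back down to float.
    static float Sigmoid(float x)
    {
        return (float)(1.0 / (1.0 + Math.Exp(-x)));
    }
}
```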
Is there any way of avoiding some of this additional CPU overhead? For example, is there an open-source math library with float overloads that calls the underlying FPU instructions directly? (My understanding is that this would require support in the CLR.) And actually I'm not sure whether modern CPUs even have single-precision instructions.
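To illustrate why I think CLR support would be needed: a purely managed "float overload" like the hypothetical wrapper below (the `FloatMath` name is made up) only hides the casts, since it still evaluates `Math.Exp` in double precision internally, so as far as I can tell it doesn't remove the overhead described above:

```csharp
using System;

// Hypothetical wrapper: exposes a float-typed API, but the work is still
// done in double precision, so nothing is actually saved.
static class FloatMath
{
    public static float Exp(float x) => (float)Math.Exp(x);
}
```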
This question was partly inspired by this question about optimizing a sigmoid function: