views:

1753

answers:

9

Edit: The code here still has some bugs, and it could do better in the performance department, but instead of trying to fix it here, for the record, I took the problem over to the Intel discussion groups and got lots of great feedback. If all goes well, a polished version of an atomic float will be included in a near-future release of Intel's Threading Building Blocks.

OK, here's a tough one: I want an atomic float, not for super-fast graphics performance, but to use routinely as a data member of classes. And I don't want to pay the price of putting locks on these classes, because that provides no additional benefit for my needs.

Now, Intel's TBB and the other atomic libraries I've seen support integer types, but not floating point. So I went ahead and implemented one, and it works... but I'm not sure whether it REALLY works or I'm just very lucky that it works.

Does anyone here know whether this is some form of threading heresy?

typedef unsigned int uint_32;

  struct AtomicFloat
  {
    private:
    tbb::atomic<uint_32> atomic_value_;

    public:
    template<memory_semantics M>
    float fetch_and_store( float value ) 
    {
     const uint_32 value_ = atomic_value_.tbb::atomic<uint_32>::fetch_and_store<M>((uint_32&)value);
     return reinterpret_cast<const float&>(value_);
    }

    float fetch_and_store( float value ) 
    {
     const uint_32 value_ = atomic_value_.tbb::atomic<uint_32>::fetch_and_store((uint_32&)value);
     return reinterpret_cast<const float&>(value_);
    }

    template<memory_semantics M>
    float compare_and_swap( float value, float comparand ) 
    {
     const uint_32 value_ = atomic_value_.tbb::atomic<uint_32>::compare_and_swap<M>((uint_32&)value,(uint_32&)comparand);
     return reinterpret_cast<const float&>(value_);
    }

    float compare_and_swap(float value, float compare)
    {
     const uint_32 value_ = atomic_value_.tbb::atomic<uint_32>::compare_and_swap((uint_32&)value,(uint_32&)compare);
     return reinterpret_cast<const float&>(value_);
    }

    operator float() const volatile // volatile qualifier here for backwards compatibility 
    {
     const uint_32 value_ = atomic_value_;
     return reinterpret_cast<const float&>(value_);
    }

    float operator=(float value)
    {
     const uint_32 value_ = atomic_value_.tbb::atomic<uint_32>::operator =((uint_32&)value);
     return reinterpret_cast<const float&>(value_);
    }

    float operator+=(float value)
    {
     volatile float old_value_, new_value_;
     do
     {
      old_value_ = reinterpret_cast<float&>(atomic_value_);
      new_value_ = old_value_ + value;
     } while(compare_and_swap(new_value_,old_value_) != old_value_);
     return (new_value_);
    }

    float operator*=(float value)
    {
     volatile float old_value_, new_value_;
     do
     {
      old_value_ = reinterpret_cast<float&>(atomic_value_);
      new_value_ = old_value_ * value;
     } while(compare_and_swap(new_value_,old_value_) != old_value_);
     return (new_value_);
    }

    float operator/=(float value)
    {
     volatile float old_value_, new_value_;
     do
     {
      old_value_ = reinterpret_cast<float&>(atomic_value_);
      new_value_ = old_value_ / value;
     } while(compare_and_swap(new_value_,old_value_) != old_value_);
     return (new_value_);
    }

    float operator-=(float value)
    {
     return this->operator+=(-value);
    }

    float operator++() 
    {
     return this->operator+=(1);
    }

    float operator--() 
    {
     return this->operator+=(-1);
    }

    float fetch_and_add( float addend ) 
    {
     return this->operator+=(-addend);
    }

    float fetch_and_increment() 
    {
     return this->operator+=(1);
    }

    float fetch_and_decrement() 
    {
     return this->operator+=(-1);
    }
   };

Thanks!

Edit: Changed size_t to uint32_t as Greg Rogers suggested; that way it's more portable.

Edit: Added a listing for the entire thing, with some fixes.

More Edits: Performance-wise, using a locked float for 5,000,000 += operations with 100 threads on my machine takes 3.6 s, while my atomic float, even with its silly do-while, takes 0.2 s to do the same work. So the roughly 18x performance boost means it's worth it, if (and this is the catch) it's correct.
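
For reference, the comparison was roughly of this shape; this is a minimal sketch rather than the exact harness (it drives the work through tbb::parallel_for instead of 100 hand-made threads, and the counts are only illustrative):

  #include <tbb/task_scheduler_init.h>
  #include <tbb/parallel_for.h>
  #include <tbb/blocked_range.h>
  #include <tbb/mutex.h>
  #include <tbb/tick_count.h>
  #include <cstdio>

  static float       locked_value = 0.0f;   // protected by value_mutex
  static tbb::mutex  value_mutex;
  static AtomicFloat atomic_value;           // the AtomicFloat from the listing above

  struct LockedAdd
  {
    void operator()( const tbb::blocked_range<int>& r ) const
    {
      for( int i = r.begin(); i != r.end(); ++i )
      {
        tbb::mutex::scoped_lock lock(value_mutex);
        locked_value += 1.0f;
      }
    }
  };

  struct AtomicAdd
  {
    void operator()( const tbb::blocked_range<int>& r ) const
    {
      for( int i = r.begin(); i != r.end(); ++i )
        atomic_value += 1.0f;                // lock-free read-modify-write
    }
  };

  int main()
  {
    tbb::task_scheduler_init init;           // default number of worker threads
    const int iterations = 5000000;

    tbb::tick_count t0 = tbb::tick_count::now();
    tbb::parallel_for(tbb::blocked_range<int>(0, iterations), LockedAdd());
    tbb::tick_count t1 = tbb::tick_count::now();
    tbb::parallel_for(tbb::blocked_range<int>(0, iterations), AtomicAdd());
    tbb::tick_count t2 = tbb::tick_count::now();

    std::printf("locked: %.3fs  atomic: %.3fs\n",
                (t1 - t0).seconds(), (t2 - t1).seconds());
    return 0;
  }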

Even More Edits: As Awgn pointed out, my fetch_and_xxxx parts were all wrong. I fixed that, removed the parts of the API I'm not sure about (templated memory models), and implemented the other operations in terms of operator+= to avoid code repetition.

Added: operator*= and operator/=, since floats wouldn't be floats without them. Thanks to peterchen's comment for pointing this out.

Edit: The latest version of the code follows (I'll leave the old version above for reference, though).

  #include <tbb/atomic.h>
  typedef unsigned int   uint_32;
  typedef unsigned __TBB_LONG_LONG uint_64;

  template<typename FLOATING_POINT,typename MEMORY_BLOCK>
  struct atomic_float_
  {
    /* CRC Card -----------------------------------------------------
    | Class:   atomic float template class
    |
    | Responsibility: handle integral atomic memory as if it were a
    |     float, partially bypassing the FPU/SSE/MMX, so it is
    |     slower than a plain float, but faster and smaller
    |     than a locked float.
    |      *Warning* If your float usage is thwarted by
    |     the A-B-A problem, this class isn't for you.
    |      *Warning* The atomic specification says we return
    |     values, not l-values. So (i = j) = k doesn't work.
    |
    | Collaborators: Intel's tbb::atomic handles memory atomicity
    ----------------------------------------------------------------*/
    typedef atomic_float_<FLOATING_POINT,MEMORY_BLOCK> self_t;

    tbb::atomic<MEMORY_BLOCK> atomic_value_;

    template<memory_semantics M>
    FLOATING_POINT fetch_and_store( FLOATING_POINT value ) 
    {
     const MEMORY_BLOCK value_ = 
      atomic_value_.tbb::atomic<MEMORY_BLOCK>::template fetch_and_store<M>((MEMORY_BLOCK&)value);
     //atomic specification requires returning old value, not new one
     return reinterpret_cast<const FLOATING_POINT&>(value_);
    }

    FLOATING_POINT fetch_and_store( FLOATING_POINT value ) 
    {
     const MEMORY_BLOCK value_ = 
      atomic_value_.tbb::atomic<MEMORY_BLOCK>::fetch_and_store((MEMORY_BLOCK&)value);
     //atomic specification requires returning old value, not new one
     return reinterpret_cast<const FLOATING_POINT&>(value_);
    }

    template<memory_semantics M>
    FLOATING_POINT compare_and_swap( FLOATING_POINT value, FLOATING_POINT comparand ) 
    {
     const MEMORY_BLOCK value_ = 
      atomic_value_.tbb::atomic<MEMORY_BLOCK>::template compare_and_swap<M>((MEMORY_BLOCK&)value,(MEMORY_BLOCK&)comparand);
     //atomic specification requires returning old value, not new one
     return reinterpret_cast<const FLOATING_POINT&>(value_);
    }

    FLOATING_POINT compare_and_swap(FLOATING_POINT value, FLOATING_POINT compare)
    {
     const MEMORY_BLOCK value_ = 
      atomic_value_.tbb::atomic<MEMORY_BLOCK>::compare_and_swap((MEMORY_BLOCK&)value,(MEMORY_BLOCK&)compare);
     //atomic specification requires returning old value, not new one
     return reinterpret_cast<const FLOATING_POINT&>(value_);
    }

    operator FLOATING_POINT() const volatile // volatile qualifier here for backwards compatibility 
    {
     const MEMORY_BLOCK value_ = atomic_value_;
     return reinterpret_cast<const FLOATING_POINT&>(value_);
    }

    //Note: atomic specification says we return a copy of the base value, not an l-value
    FLOATING_POINT operator=(FLOATING_POINT rhs) 
    {
     const MEMORY_BLOCK value_ = atomic_value_.tbb::atomic<MEMORY_BLOCK>::operator =((MEMORY_BLOCK&)rhs);
     return reinterpret_cast<const FLOATING_POINT&>(value_);
    }

    //Note: atomic specification says we return an l-value when assigning among atomics
    self_t& operator=(const self_t& rhs) 
    {
     atomic_value_ = (MEMORY_BLOCK)rhs.atomic_value_; //copy the stored bits through the atomic's own load and store
     return *this;
    }

    FLOATING_POINT& _internal_reference() const
    {
     return reinterpret_cast<FLOATING_POINT&>(atomic_value_.tbb::atomic<MEMORY_BLOCK>::_internal_reference());
    }

    FLOATING_POINT operator+=(FLOATING_POINT value)
    {
     FLOATING_POINT old_value_, new_value_;
     do
     {
      old_value_ = reinterpret_cast<FLOATING_POINT&>(atomic_value_);
      new_value_ = old_value_ + value;
     //floating point binary representation is not an issue because
     //we are using our self's compare and swap, thus comparing floats and floats
     } while(self_t::compare_and_swap(new_value_,old_value_) != old_value_);
     return (new_value_); //return resulting value
    }

    FLOATING_POINT operator*=(FLOATING_POINT value)
    {
     FLOATING_POINT old_value_, new_value_;
     do
     {
      old_value_ = reinterpret_cast<FLOATING_POINT&>(atomic_value_);
      new_value_ = old_value_ * value;
     //floating point binary representation is not an issue because
     //we are using our self's compare and swap, thus comparing floats and floats
     } while(self_t::compare_and_swap(new_value_,old_value_) != old_value_);
     return (new_value_); //return resulting value
    }

    FLOATING_POINT operator/=(FLOATING_POINT value)
    {
     FLOATING_POINT old_value_, new_value_;
     do
     {
      old_value_ = reinterpret_cast<FLOATING_POINT&>(atomic_value_);
      new_value_ = old_value_ / value;
     //floating point binary representation is not an issue because
     //we are using our self's compare and swap, thus comparing floats and floats
     } while(self_t::compare_and_swap(new_value_,old_value_) != old_value_);
     return (new_value_); //return resulting value
    }

    FLOATING_POINT operator-=(FLOATING_POINT value)
    {
     return this->operator+=(-value); //return resulting value
    }

    //Prefix operator
    FLOATING_POINT operator++()
    {
     return this->operator+=(1); //return resulting value
    }

    //Prefix operator
    FLOATING_POINT operator--() 
    {
     return this->operator+=(-1); //return resulting value
    }

    //Postfix operator
    FLOATING_POINT operator++(int)
    {
     const FLOATING_POINT temp = this;
     this->operator+=(1);
     return temp//return resulting value
    }

    //Postfix operator
    FLOATING_POINT operator--(int) 
    {
     const FLOATING_POINT temp = this;
     this->operator+=(1);
     return temp//return resulting value
    }

    FLOATING_POINT fetch_and_add( FLOATING_POINT addend ) 
    {
     const FLOATING_POINT old_value_ = *this; //read the stored bits as a float, not an integer-to-float conversion
     this->operator+=(addend);
     //atomic specification requires returning old value, not new one as in operator x=
     return old_value_; 
    }

    FLOATING_POINT fetch_and_increment() 
    {
     const FLOATING_POINT old_value_ = *this; //read the stored bits as a float, not an integer-to-float conversion
     this->operator+=(+1);
     //atomic specification requires returning old value, not new one as in operator x=
     return old_value_; 
    }

    FLOATING_POINT fetch_and_decrement() 
    {
     const FLOATING_POINT old_value_ = *this; //read the stored bits as a float, not an integer-to-float conversion
     this->operator+=(-1);
     //atomic specification requires returning old value, not new one as in operator x=
     return old_value_; 
    }
  };

  typedef atomic_float_<float,uint_32> AtomicFloat;
  typedef atomic_float_<double,uint_64> AtomicDouble;
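
To show the intended usage, the class behaving like a plain float data member, here is a tiny hypothetical example (Particle and Deposit are made-up names, and the counts are only illustrative):

  #include <tbb/task_scheduler_init.h>
  #include <tbb/parallel_for.h>
  #include <tbb/blocked_range.h>
  #include <cstdio>

  // Hypothetical class that keeps an atomic float as an ordinary-looking member.
  struct Particle
  {
    AtomicFloat energy; // shared and updated concurrently, without a lock
  };

  // TBB body that deposits energy into the same particle from many tasks.
  struct Deposit
  {
    Particle* p;
    void operator()( const tbb::blocked_range<int>& r ) const
    {
      for( int i = r.begin(); i != r.end(); ++i )
        p->energy += 0.5f; // lock-free read-modify-write
    }
  };

  int main()
  {
    tbb::task_scheduler_init init;
    Particle particle;
    particle.energy = 0.0f;                   // assignment goes through the atomic

    Deposit body = { &particle };
    tbb::parallel_for(tbb::blocked_range<int>(0, 1000000), body);

    const float total = particle.energy;      // implicit conversion back to float
    std::printf("total energy: %f\n", total); // expected 500000.0
    return 0;
  }
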
+3  A: 

It looks like your implementation assumes that sizeof(size_t) == sizeof(float). Will that always be true for your target platforms?

And I wouldn't say threading heresy so much as casting heresy. :)

Greg Hewgill
Well, not necessarily, but I plan on putting in a static assert that checks sizeof(float) == sizeof(size_t) as a guard at compile time.
Robert Gould
What does that gain you over just using uint32_t?
Greg Rogers
Good point my friend!
Robert Gould
It looks like your implementation assumes that sizeof(uint32_t) == sizeof(float). Will that always be true for your target platforms? Will that always be true for your compilers?
Windows programmer
It's probably good enough for his current platform; if there are any future platforms, a static assert will let him know when that isn't the case, and if he wants to get really clever he can probably do different macro defines for different platforms.
Dan
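
A compile-time guard along the lines Robert mentions could be written in plain C++03, without macros; a sketch (the helper names here are made up, and BOOST_STATIC_ASSERT or any STATIC_ASSERT macro would do the same job):

  // Undefined for false, defined only for true: instantiating the false case
  // is a compile error.
  template<bool> struct atomic_float_size_check;
  template<>     struct atomic_float_size_check<true> {};

  template<typename FLOATING_POINT, typename MEMORY_BLOCK>
  struct atomic_float_size_guard
    : atomic_float_size_check<sizeof(FLOATING_POINT) == sizeof(MEMORY_BLOCK)>
  {};

  // Instantiating these anywhere forces the checks; the build breaks on any
  // platform where the sizes don't match.
  inline void atomic_float_size_checks()
  {
    atomic_float_size_guard<float,  unsigned int>       check_float;
    atomic_float_size_guard<double, unsigned long long> check_double;
    (void)check_float; (void)check_double;
  }
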
+5  A: 

I would seriously advise against public inheritance. I don't know what the atomic implementation is like, but I'm assuming it has overloaded operators that use it as the integral type, which means that those promotions will be used instead of your float in many (maybe most?) cases.

I don't see any reason why that wouldn't work, but like you I have no way to prove it...

One note: your operator float() routine does not have load-acquire semantics, and shouldn't it be marked const volatile (or definitely at least const)?

EDIT: If you are going to provide operator--() you should provide both prefix/postfix forms.

Greg Rogers
Doing composition is probably the better solution. I should probably refactor the class if the implementation is ok.
Robert Gould
Fully agree with inheritance -> composition.
xtofl
A: 

From my reading of that code, I would be really mad at any compiler that put out non-atomic assembly for this.

Joshua
A: 

Have your compiler generate assembly code and take a look at it. If the operation is more than a single assembly-language instruction, then it's not an atomic operation, and requires locks to operate properly in multiprocessor systems.

Unfortunately, I'm not certain that the opposite is also true -- that single-instruction operations are guaranteed to be atomic. I don't know the details of multiprocessor programming down to that level. I could make a case for either result. (If anyone else has some definitive information on that, feel free to chime in.)

Head Geek
Single ASM instructions should be considered non-atomic until proven otherwise, especially on x86 and other CISCy architectures, since an instruction is broken down into micro-ops, betwixt which you might have a context switch. Atomic insns like compare-and-swap disable interrupts to elide this.
Matt J
Single assembly language instructions are non-atomic in multiprocessor systems regardless of whether any of the processors does a context switch. The way to obtain atomicity is to use operations that are specially designed for it, such as compare-and-swap, or lock, or Dekker's algorithm.
Windows programmer
Of course, in a multiprocessor system, the context switch itself is irrelevant, but the fact that you should examine every possible interleaving of thread execution doesn't change whether multiple threads are arbitrarily time-multiplexed onto a core, or time-multiplexed into shared memory.
Matt J
A: 

This is the state of the code as it stands now, after the discussion on the Intel boards; it is the same listing as the "latest version" in the question above, and it still hasn't been thoroughly verified to work correctly in all scenarios.

Robert Gould
A: 

Either you or I have some studying to do about references to objects that were formerly on the stack.

Windows programmer
Probably both of us :)
Robert Gould
I get downvoted for it and you don't. Until we study it, what would happen if you change your casts in "return reinterpret_cast<const float" to lose the ampersands?
Windows programmer
I'll upvote you, because this is not a silly point. However, I suspect you might have been downvoted because this should be a comment and not an answer. As for your question: if I lose the ampersands, all hell breaks loose. They are there so the memory is reinterpreted as a float, not converted to one.
Robert Gould
+1  A: 

Although the size of a uint32_t may be the same as that of a float on a given arch, by casting one into the other you are implicitly assuming that atomic increments, decrements, and all the other operations on the bits are semantically equivalent on both types, which they are not in reality. I doubt it works as expected.

Nicola Bonelli
No, no I'm not; that's why I'm pulling the actual operations out into a transaction-style while loop (a known parallel pattern). Anyway, I can assure you the code works correctly in a single thread, and it has even been working correctly multithreaded. I'm just not sure whether that's something I can trust...
Robert Gould
I didn't pay much attention to the operators. But the question is: are you sure that fetch_and_add, fetch_and_increment, etc. are working in the right way?
Nicola Bonelli
You are right! I hadn't actually given them much thought since I was testing the operators. The fetch_xxxx are all wrong! Silly of me to miss that; they need the same treatment as the operators.
Robert Gould
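
One way to give the fetch_xxxx functions that same treatment is to fold the read into the CAS loop itself, so the value returned is exactly the one the winning swap replaced. An untested sketch of such a member for the atomic_float_ template above:

    //Sketch only: a member of atomic_float_, in the same style as the listing above
    FLOATING_POINT fetch_and_add( FLOATING_POINT addend )
    {
     FLOATING_POINT old_value_, new_value_;
     do
     {
      old_value_ = reinterpret_cast<FLOATING_POINT&>(atomic_value_);
      new_value_ = old_value_ + addend;
     } while(self_t::compare_and_swap(new_value_,old_value_) != old_value_);
     //atomic specification requires returning the old value; here it is the
     //value the successful compare_and_swap actually replaced
     return old_value_;
    }
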
+1  A: 

I strongly doubt that you get the correct values in fetch_and_add etc, as float addition is different from int addition.

Here's what I get from these arithmetics:

1   + 1    =  1.70141e+038  
100 + 1    = -1.46937e-037  
100 + 0.01 =  1.56743e+038  
23  + 42   = -1.31655e-036

So yeah, thread-safe, but not what you expect.

The lock-free algorithms (operator+= etc.) should work as far as atomicity goes (I haven't checked the algorithm itself, though).


Other solution: As it is all additions and subtractions, you might be able to give every thread its own instance, then add the results from multiple threads.

peterchen
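
To see concretely where numbers like those come from, here is what adding the raw bit patterns of two floats as integers produces (a standalone demo; the posted class avoids this by doing the addition on real floats inside the CAS loop):

  #include <cstdio>
  #include <cstring>

  int main()
  {
    float a = 1.0f, b = 1.0f;
    unsigned int ia, ib, isum;
    std::memcpy(&ia, &a, sizeof ia);
    std::memcpy(&ib, &b, sizeof ib);
    isum = ia + ib;                           // integer addition of the bit patterns

    float bogus;
    std::memcpy(&bogus, &isum, sizeof bogus);
    std::printf("1 + 1 \"=\" %g\n", bogus);   // prints roughly 1.70141e+38
    return 0;
  }
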
Note I'm not doing that. I'm casting the ints into float refs, which means they are handled correctly: old_value_ = reinterpret_cast<float&>(atomic_value_); new_value_ = old_value_ + value;
Robert Gould
That would be a fine solution for a "reduce", but I need floats as members of data structures (properties) that have long lives. But your comment does remind me that floats are silly without multiplication and division. Gonna add those too.
Robert Gould
The revised code looks much better! :) And yes, the lock-free loops look ok to me, but I haven't done enough with those to really judge.
peterchen
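
For the per-thread-accumulation idea peterchen suggests, a sketch with tbb::parallel_reduce over plain, non-atomic floats (names are illustrative; this fits one-shot sums rather than the long-lived members discussed above):

  #include <tbb/parallel_reduce.h>
  #include <tbb/blocked_range.h>
  #include <cstddef>

  // Each task accumulates into its own plain float; the partial sums are joined
  // at the end, so no atomics or locks are needed.
  struct PartialSum
  {
    float sum;
    const float* data;

    PartialSum( const float* d ) : sum(0.0f), data(d) {}
    PartialSum( PartialSum& other, tbb::split ) : sum(0.0f), data(other.data) {}

    void operator()( const tbb::blocked_range<std::size_t>& r )
    {
      for( std::size_t i = r.begin(); i != r.end(); ++i )
        sum += data[i];
    }

    void join( const PartialSum& other ) { sum += other.sum; }
  };

  float parallel_total( const float* values, std::size_t n )
  {
    PartialSum body(values);
    tbb::parallel_reduce(tbb::blocked_range<std::size_t>(0, n), body);
    return body.sum;
  }
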
+1  A: 

Just a note about this (I wanted to make a comment but apparently new users aren't allowed to comment): Using reinterpret_cast on references produces incorrect code with gcc 4.1 -O3. This seems to be fixed in 4.4 because there it works. Changing the reinterpret_casts to pointers, while slightly uglier, works in both cases.

Lutorm
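
For reference, the pointer flavor Lutorm mentions, plus a memcpy variant that sidesteps the aliasing question entirely, would look roughly like this (illustrative helpers, not part of the posted class):

  #include <cstring>

  // Pointer variant of the bits-to-float conversion, as suggested above:
  inline float bits_as_float( const unsigned int& bits )
  {
    return *reinterpret_cast<const float*>(&bits);
  }

  // memcpy variant, which avoids the strict-aliasing issue altogether;
  // compilers typically optimize the copy away:
  inline float bits_as_float_by_copy( unsigned int bits )
  {
    float result;
    std::memcpy(&result, &bits, sizeof result);
    return result;
  }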