This follows on from a discussion that started in the comments of this question.
How would one go about writing a spinlock without compare-and-swap (CAS) operations?
As the other question states:
The memory ordering model is such that writes will be atomic (if two concurrent threads write a memory location at the same time, the result will be one or the other). The platform will not support atomic compare-and-set operations.
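For concreteness, here is a minimal sketch of one classic approach that fits these constraints: Peterson's algorithm, which needs only plain atomic loads and stores. It is not necessarily what the linked discussion had in mind, and it rests on two assumptions that go beyond the quoted guarantee: there are exactly two contending threads, and loads and stores are sequentially consistent (here obtained from the default ordering of `std::atomic`). The names `PetersonLock` and `run_demo` are purely illustrative.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Two-thread Peterson lock: uses only atomic loads and stores, no CAS.
// Assumption: sequentially consistent ordering (the std::atomic default).
class PetersonLock {
    std::atomic<bool> interested_[2];  // interested_[i]: thread i wants the lock
    std::atomic<int> turn_;            // which thread must wait on a tie

public:
    PetersonLock() : turn_(0) {
        interested_[0].store(false);
        interested_[1].store(false);
    }

    // self must be 0 or 1 and must be unique to each of the two threads.
    void lock(int self) {
        assert(self == 0 || self == 1);
        const int other = 1 - self;
        interested_[self].store(true);  // announce intent
        turn_.store(other);             // give priority to the other thread
        // Spin while the other thread wants the lock and has priority.
        while (interested_[other].load() && turn_.load() == other) {
            std::this_thread::yield();
        }
    }

    void unlock(int self) {
        interested_[self].store(false);
    }
};

// Hypothetical demo: two threads increment a plain int under the lock.
int run_demo(int iterations) {
    PetersonLock lock;
    int counter = 0;  // deliberately non-atomic; protected by the lock
    auto worker = [&](int id) {
        for (int i = 0; i < iterations; ++i) {
            lock.lock(id);
            ++counter;
            lock.unlock(id);
        }
    };
    std::thread a(worker, 0);
    std::thread b(worker, 1);
    a.join();
    b.join();
    return counter;
}
```

If the platform only guarantees that individual writes are atomic, with no ordering between a store and a later load of a different location, this sketch is not safe as written and would need whatever memory barriers the platform offers. Extending it beyond two threads calls for a different scheme, such as Lamport's bakery algorithm.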