Maybe I misunderstand the example, but the "unaligned pointer" problem
is the same as on a single-core system. If a datum can be partially
written to memory, then different threads can see partial updates (if
there's no appropriate locking) on any machine with preemptive
multitasking (even on a single-CPU system).
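To make the partial-update case concrete, here is a minimal sketch in C. It assumes a 64-bit value that gets stored as two 32-bit halves (as might happen on a 32-bit target), and it fakes the preemption by running the "reader" between the two stores, so no second core or thread is needed. All names (`split_u64`, `write_with_preemption`, and so on) are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical 64-bit value held as two 32-bit halves. */
typedef struct {
    uint32_t lo;
    uint32_t hi;
} split_u64;

/* Join the two halves the way a reader thread would. */
uint64_t read_value(const split_u64 *v) {
    return ((uint64_t)v->hi << 32) | v->lo;
}

/* Non-atomic update made of two separate stores. The read in the middle
   stands in for another thread being scheduled between the stores. */
uint64_t write_with_preemption(split_u64 *v, uint64_t new_value) {
    v->lo = (uint32_t)new_value;          /* first store */
    uint64_t seen = read_value(v);        /* simulated preemption point */
    v->hi = (uint32_t)(new_value >> 32);  /* second store */
    return seen;                          /* what the reader observed */
}

/* Start from 0x1111111111111111 and write 0x2222222222222222;
   returns the value the simulated reader saw mid-update. */
uint64_t demo_torn_read(void) {
    split_u64 v = { 0x11111111u, 0x11111111u };
    return write_with_preemption(&v, 0x2222222222222222ull);
}
```

The reader gets `0x1111111122222222`, a value that was never written by anyone: the low half is new and the high half is old. Nothing here involves a cache or a second processor.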
You don't have to worry about the cache unless you are writing drivers
for DMA-capable peripherals. Modern multiprocessors are cache
coherent, so the hardware guarantees that a thread on processor A will
have the same view of memory as a thread on processor B. If the thread
on A reads a memory location that is cached on B, then the thread on A
will get the correct value from B's cache.
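You do have to worry about values held in registers, since those are
private to a thread until they are written back to memory. From a
programming standpoint the difference between a value sitting in a
register and one sitting in a cache may not be visible, but in my
opinion involving the cache in a concurrency discussion often just
introduces unnecessary confusion.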
Any operation that is labeled "indivisible" by the programming manual
for an ISA must reasonably remain indivisible in a multiprocessing
system built with processors using that ISA, or backwards compatibility
would break. However, this does not mean that operations that were
never promised to be indivisible, but happened to be so in a particular
processor implementation, will be indivisible in future
implementations (such as in a multiprocessor system).
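The same "promised versus happens to work" distinction shows up at the language level. As a rough sketch, C11's `<stdatomic.h>` lets you ask the implementation whether it promises lock-free atomic access for a given object instead of assuming it. The function name `pointer_store_is_lock_free` is my own; only `atomic_init` and `atomic_is_lock_free` come from the standard header.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Ask the C implementation whether atomic operations on a pointer-sized
   object are lock-free on this target, rather than relying on what one
   particular processor happens to do. */
bool pointer_store_is_lock_free(void) {
    _Atomic(void *) slot;
    atomic_init(&slot, NULL);
    return atomic_is_lock_free(&slot);
}
```

If this returns false, the implementation still gives you atomicity, but it does so with a lock behind the scenes rather than with a single indivisible instruction.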
[Edit] Addition in response to the comment below:
- Anything written to memory will be coherently visible to all
threads, regardless of the number of cores (in a cache coherent
system).
- Anything written to memory non-atomically can end up being partially
read by unsynchronized threads in the presence of preemption (even
on a single-core system).
If the pointer is written to an unaligned address in a single, atomic
write, then the cache coherence hardware will make sure that all
threads see either the completed write or none of it. If the pointer
is written non-atomically (such as with two separate write operations),
then any thread may see the partial update, even on a single-core
system with true preemption.
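For the atomic case, a minimal sketch using C11 atomics could look like the following. The names `shared_slot`, `publish` and `observe` are hypothetical. Note that the compiler aligns `_Atomic` objects suitably, so this sidesteps the unaligned case rather than demonstrating it.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical shared slot that several threads may read.
   Static storage, so it starts out as a null pointer. */
static _Atomic(int *) shared_slot;

/* Writer: the whole pointer is replaced by one indivisible store. */
void publish(int *p) {
    atomic_store(&shared_slot, p);
}

/* Reader: gets either the previous pointer or the new one, never a mix
   of the two. */
int *observe(void) {
    return atomic_load(&shared_slot);
}
```

A reader calling `observe()` at any point gets either the old pointer or the one passed to `publish()`, which is the "completed, or not at all" behaviour described above.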