Frequently updated data is actually a perfect application for caching. As jdt mentioned, modern CPU caches are quite large, and 0.5 MB may well fit entirely in cache. More importantly, though, a read-modify-write (RMW) to uncached memory is VERY slow - the initial read has to block on memory, and then the write ALSO has to block on memory in order to commit. And just to add insult to injury, the CPU might implement no-cache memory by loading the data into cache, then immediately invalidating the cache line - leaving you in a position guaranteed to be worse than before.
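To make that concrete, here's a hypothetical sketch of the kind of update in question (the buffer name and access pattern are made up for illustration); the comments mark where an uncached mapping would stall:

    #include <stddef.h>
    #include <stdint.h>

    #define BUF_SIZE (512 * 1024)  /* the ~0.5 MB region discussed above */

    static uint8_t framebuf[BUF_SIZE];

    void update(size_t i, uint8_t delta)
    {
        /* One read-modify-write. With normal (cached) memory, the load
         * and store hit L1/L2 after the first touch. With an uncached
         * mapping, the load stalls for a full round trip to RAM, and
         * the store must also go all the way out to RAM to commit. */
        framebuf[i] += delta;
    }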
Before you try outsmarting the CPU like this, you really should benchmark the entire program and see where the real slowdown is. Profilers such as Valgrind's Cachegrind can measure cache misses, so you can find out whether they're actually a significant source of slowdown.
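For example, assuming your binary is ./your_program (substitute your own), a typical Cachegrind run looks like:

    valgrind --tool=cachegrind ./your_program
    cg_annotate cachegrind.out.<pid>

Cachegrind writes its counts to a cachegrind.out.<pid> file, and cg_annotate breaks the misses down per function and per source line, so you can see exactly which accesses hurt.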
On another, more practical note: if you're doing 30 RMWs per second, the worst case is each access touching a distinct 64-byte cache line, i.e. 30 × 64 = 1920 bytes of cache footprint. That's only about 1/16 of the 32 KB L1 data cache of a modern Core 2 processor, and likely to be lost in the general noise of the system. So don't worry about it too much :)
That said, if by 'accessed simultaneously' you mean 'accessed by multiple threads simultaneously', be careful about cache lines bouncing between CPUs. Uncached RAM wouldn't help with this - if anything it would be worse, as the data would have to travel all the way back to physical RAM each time instead of possibly passing through the faster inter-CPU bus - and the only way to avoid the problem is to minimize how often the shared data is accessed. For more about this, see http://www.ddj.com/hpc-high-performance-computing/217500206
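A related trap is false sharing, where two threads' otherwise-independent data happen to sit in the same cache line and bounce it between cores anyway. Here's a minimal C sketch of the usual mitigation - aligning each thread's data to the cache-line size (64 bytes is assumed; the counter layout is purely illustrative, not from the question):

    #include <pthread.h>
    #include <stdio.h>

    #define CACHE_LINE 64  /* assumed line size; typical on x86 */

    /* _Alignas pads each struct out to a full line, so the two
     * counters can never share a cache line and ping-pong
     * between cores. */
    struct padded_counter {
        _Alignas(CACHE_LINE) volatile long value;
    };

    static struct padded_counter counters[2];

    static void *worker(void *arg)
    {
        struct padded_counter *c = arg;
        for (int i = 0; i < 1000000; i++)
            c->value++;  /* RMW stays local to one core's cache */
        return NULL;
    }

    int main(void)
    {
        pthread_t t[2];
        for (int i = 0; i < 2; i++)
            pthread_create(&t[i], NULL, worker, &counters[i]);
        for (int i = 0; i < 2; i++)
            pthread_join(t[i], NULL);
        printf("%ld %ld\n", counters[0].value, counters[1].value);
        return 0;
    }

Compile with -pthread. Without the _Alignas, both counters would likely land on one line, and every increment on one core would invalidate the other core's copy.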