Is there a way to safely populate a byte array from multiple threads (e.g. first thread fills the first half, the second thread fills the second half using System.arraycopy) without synchronizing on the array itself using Java 6 or 7? The jsr166 related libraries only contain int arrays (AtomicIntegerArray, ParallelIntegerArray).
+3
A:
Yes, it works. Writing to one array element does not interfere with neighbouring elements. However, you need to make sure that all the writing threads have finished before you read the array (that is, you need a happens-before relationship between the writes and the reads). The fact that you are using arrays makes no difference.
Tom Hawtin - tackline
2009-09-04 11:37:20
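As an illustration of the answer above, here is a minimal sketch (not the answerer's code) in which two tasks fill disjoint ranges of one byte array and the caller waits on both `Future`s before reading. The class name `SplitFill`, the method `parallelCopy` and the split point 4095 are assumptions made for the example. It is written without lambdas so that it compiles on Java 6 or 7.

```java
import java.util.Arrays;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SplitFill {
    // Copies source into a new array using two tasks that write disjoint ranges.
    // split is the index where the second task's range begins.
    static byte[] parallelCopy(final byte[] source, final int split) throws Exception {
        final byte[] target = new byte[source.length];
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Future<?> first = pool.submit(new Runnable() {
                public void run() {
                    System.arraycopy(source, 0, target, 0, split);
                }
            });
            Future<?> second = pool.submit(new Runnable() {
                public void run() {
                    System.arraycopy(source, split, target, split, source.length - split);
                }
            });
            // Future.get() establishes the happens-before edge: everything each
            // task wrote is visible to this thread once get() has returned.
            first.get();
            second.get();
        } finally {
            pool.shutdown();
        }
        return target;
    }

    public static void main(String[] args) throws Exception {
        byte[] source = new byte[8192];
        for (int i = 0; i < source.length; i++) {
            source[i] = (byte) i;
        }
        // Deliberately uneven split (4095 + 4097) so the ranges are not aligned.
        byte[] copy = parallelCopy(source, 4095);
        System.out.println(Arrays.equals(source, copy));
    }
}
```

The array is only read after both `get()` calls have returned, which is the point the answer makes about waiting for the writers to finish.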
Yes, the join is there in the form of Future.get(). I'm worried about cache alignment. For example, an [8192] array written as 4096+4096 is likely to work without issues, but might 4095+4097 not?
kd304
2009-09-04 11:43:54
It seems to work, at least the MD5 of the data remains the same all the time. Thank you.
kd304
2009-09-04 11:52:38
Does this make Java byte arrays quite slow for single-threaded access, compared with writing bytes using non-atomic byte operations? I mean, assuming the JIT or the arraycopy implementation has moved any bounds checking outside the loop.
Steve Jessop
2009-09-04 12:16:06
onebyone: Not really. There are some optimisations that can't be done. For instance, you can't read several bytes with a single 32/64-bit read, change one or two of them and write the entire thing back out. In multithreaded, multiprocessor scenarios you might get poor performance when accessing nearby elements of an array.
Tom Hawtin - tackline
2009-09-04 13:09:49
I've no objection to a slowdown in multi-core cases, because there you're only paying for what you're using. I'm slightly surprised that whatever instructions the JIT actually emits to modify a single byte atomically, couldn't be bettered if the atomicity requirement were to be removed. Or alternatively, if they can be bettered on a given CPU, then I'm very impressed that the JIT notices in single-threaded cases that atomicity is not required, and uses the non-atomic ops, thereby avoiding paying for something that isn't being used.
Steve Jessop
2009-09-04 14:29:42
("Atomically" might be the wrong word there: doing it atomically would satisfy the requirement, but there might be techniques which don't involve an atomic operation and which nevertheless satisfy the concurrency requirement.)
Steve Jessop
2009-09-04 14:30:58
Reading and writing bytes is always going to be atomic. `java.util.concurrent.atomic.Atomic*` add operations that combine reading and writing atomically in some way. There is no problem with replacing multiple writes with a single 32/64-bit write, so long as the program really intends to write all of those bytes. I don't know what the current state of combining multiple reads/writes is (other than that `System.arraycopy` and some of `java.util.Arrays` are intrinsics).
Tom Hawtin - tackline
2009-09-04 15:36:13
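To illustrate the distinction drawn in the comment above, here is a small sketch of a combined read-and-write operation using `AtomicIntegerArray.incrementAndGet`. The class name `CompoundUpdate` and the method `countWithAtomicArray` are assumptions made for the example. Two threads increment the same slot, and no increment is lost because each read-modify-write is a single atomic step.

```java
import java.util.concurrent.atomic.AtomicIntegerArray;

public class CompoundUpdate {
    // Two threads each increment slot 0 perThread times; returns the final value.
    static int countWithAtomicArray(final int perThread) throws InterruptedException {
        final AtomicIntegerArray counters = new AtomicIntegerArray(1);
        Runnable work = new Runnable() {
            public void run() {
                for (int i = 0; i < perThread; i++) {
                    // Read, add one and write back as one atomic operation.
                    counters.incrementAndGet(0);
                }
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        // join() makes both threads' updates visible before the final read.
        a.join();
        b.join();
        return counters.get(0);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(countWithAtomicArray(100000)); // prints 200000
    }
}
```

A plain `byte[]` or `int[]` gives no such guarantee for `array[0]++`, since that is a separate read followed by a separate write. The original question does not need this, because each thread only writes its own elements.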
Sorry, I wasn't clear. I was trying to compare Java (having been JIT compiled) with assembler having the same effect. Clearly for single-threaded cases, assembler code could use non-atomic byte operations. "Reading and writing bytes is always atomic" is true in Java, but not in other languages. So am I right to conclude that reading and writing bytes in Java is slow, compared with other languages, in situations where concurrency needn't be safe?
Steve Jessop
2009-09-04 17:13:32
Of course in theory I should know all this, but it's some years since JVM internals were paying my rent, and even then I don't think I ever directly dealt with the byte-access implementation...
Steve Jessop
2009-09-04 17:16:12
I am not aware of any machine language where byte reads and writes are not atomic (ICL DAP?).
Tom Hawtin - tackline
2009-09-04 17:57:40
For some reason I was under the impression that reading a word, masking, and writing it back, was faster than storing a byte on x86, at least for some versions. If so, then in the absence of the atomicity requirement, that's what you'd do. Maybe not, I'm not an x86 assembly programmer.
Steve Jessop
2009-09-04 19:05:54
I don't think so. Some [old] processors may have problems. For instance, prior to the ARM7T, ARMs had no 16-bit write, so were completely stuffed when it came to `char` and `short` arrays (I guess use a pair of LDRBs and have the thread-switch mechanism check it wasn't interrupting).
Tom Hawtin - tackline
2009-09-05 06:03:30
A:
Oh this sounds like a good way to get a headache. :) I think I would go for one array per thread, and later join them.
crunchdog
2009-09-04 11:38:15
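The one-array-per-thread approach suggested above might look roughly like this. It is only a sketch: the class name `PerThreadArrays` and the methods `produceChunk` and `buildInChunks` are hypothetical, and the chunk contents are placeholder data. Each task returns its own array, and the calling thread joins the chunks after `Future.get()`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PerThreadArrays {
    // Hypothetical producer: chunk number index is filled with the byte value index.
    static byte[] produceChunk(int index, int length) {
        byte[] chunk = new byte[length];
        Arrays.fill(chunk, (byte) index);
        return chunk;
    }

    // Builds chunks in parallel, each in a private array, then concatenates them.
    static byte[] buildInChunks(int chunks, final int chunkLength) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(chunks);
        try {
            List<Future<byte[]>> futures = new ArrayList<Future<byte[]>>();
            for (int i = 0; i < chunks; i++) {
                final int index = i;
                futures.add(pool.submit(new Callable<byte[]>() {
                    public byte[] call() {
                        return produceChunk(index, chunkLength);
                    }
                }));
            }
            byte[] result = new byte[chunks * chunkLength];
            int offset = 0;
            for (Future<byte[]> future : futures) {
                // Only the calling thread writes to result, after each task is done.
                byte[] chunk = future.get();
                System.arraycopy(chunk, 0, result, offset, chunk.length);
                offset += chunk.length;
            }
            return result;
        } finally {
            pool.shutdown();
        }
    }
}
```

This costs one extra copy per chunk compared with writing directly into the shared array, in exchange for never having two threads touch the same array.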