This Paper (When the CRC and TCP checksum disagree) suggests that since the TCP checksumming algorithm is rather weak, there would occur an undetected error every 16 million to 10 billion packets using TCP.
Are there any application developers out there who protect the data against such kind of errors by adding checksums at the application level?
Are there any patterns available to protect against such errors while doing EJB remote method invocation (Java EE 5)? Or does Java already checksum serialized objects automatically (additionally to the underlying network protocol)?
Enterprise software has been running on computers doing not only memory ECC, but also doing error checking within the CPU at the registers etc (SPARC and others). Bit errors at storage systems (hard drives, cables, ...) can be prevented by using Solaris ZFS.
I was never afraid of network bit errors because of TCP - until I saw that article.
It might not be that much work to implement application level checksumming for some very few client server remote interfaces. But what about distributed enterprise software that runs on many machines in a single datacenter. There can be a really huge number of remote interfaces.
Is every Enterprise Software vendor like SAP, Oracle and others just ignoring this kind of problem? What about banks? What about stock exchange software?
Follow up: Thank you very much for all your answers! So it seems that it is pretty uncommon to check against undetected network data corruption - but they do seem to exist.
Couldn't I solve this problem simply by configuring the JEE Application Servers (or EJB deployment descriptors) to use RMI over TLS with the TLS configured to use MD5 or SHA1 and by configuring the Java SE clients to do the same? Would this be a way to get reliable transparent checksumming (although by overkill) so that I would not have to implement this at application level? Or am I completely confused network-stack wise?