We have a box that accumulates 10-20 TB of data each day, with individual files ranging from megabytes to gigabytes in size.
We want to send all of these files to a set of 'pizza boxes', which will consume and process them.
I can't seem to find anything built to handle this volume of data besides DistCp (Hadoop). Robocopy and similar tools won't do.
Does anyone know of a solution that can handle this kind of delegation (sharing the work among the pizza boxes) and transfers files reliably?
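To make the two requirements concrete, here is a rough Python sketch of the behaviour we are after, not an existing tool. The names `assign_files` and `sha256_of` and the worker labels are hypothetical, and the greedy "largest file to the least-loaded box" rule is only one assumed way to share the work. Reliability is approximated by comparing a checksum taken on the source with one taken on the destination.

```python
import hashlib
import heapq


def assign_files(file_sizes, workers):
    """Greedy split: hand each file (largest first) to the least-loaded worker.

    file_sizes maps a file name to its size in bytes; workers is a list of
    hypothetical worker labels. Returns {worker: [file names]}.
    """
    heap = [(0, w) for w in workers]  # (bytes assigned so far, worker)
    heapq.heapify(heap)
    plan = {w: [] for w in workers}
    # Sort by size descending, then by name so the result is deterministic.
    for name, size in sorted(file_sizes.items(), key=lambda kv: (-kv[1], kv[0])):
        load, worker = heapq.heappop(heap)
        plan[worker].append(name)
        heapq.heappush(heap, (load + size, worker))
    return plan


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so gigabyte-sized files are never fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

In this sketch a transfer would count as successful only if `sha256_of` returns the same digest on the source box and on the receiving pizza box; a mismatch would mean the file has to be sent again.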