The bigger, the better
As a general rule, the more data you submit per bulk transfer, the higher your USB throughput will be.
I think we hit the sweet spot with 2MB chunks.
The only limitation is the size of the buffer your host controller can handle.
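A minimal sketch of splitting a large payload into chunks of that size. The 2MB figure is just the value that worked for us; `chunks` is a made-up helper name, and how you submit each chunk depends on your USB stack, which is not shown here.

```python
CHUNK_BYTES = 2 * 1024 * 1024  # 2MB; an assumption, tune it for your host controller


def chunks(data, chunk_bytes=CHUNK_BYTES):
    """Yield successive slices of data, each at most chunk_bytes long."""
    view = memoryview(data)  # slicing a memoryview avoids copying the payload
    for offset in range(0, len(view), chunk_bytes):
        yield view[offset:offset + chunk_bytes]


if __name__ == "__main__":
    payload = bytes(5 * 1024 * 1024)
    for piece in chunks(payload):
        # Hand each piece to your bulk-write call here.
        print(len(piece))
```

If the host controller rejects transfers of this size, lower `CHUNK_BYTES` until it accepts them.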
A bit of why
The protocol divides bus time into 1ms frames (full speed) or 1/8ms microframes (high speed). In each one, anywhere from 0 to roughly 15 bulk packets can be sent (64 bytes each at full speed, 512 bytes each at high speed).
It also takes time to set up a USB transfer in the controller and to handle its completion.
An example of a 10-byte transfer at full speed:
ms0 - set up OHCI to transfer 10 bytes
ms1 - 10 bytes are transferred (this might actually happen in the next 1ms interval)
ms2 - interrupt to notify of completion
- 3ms to send 10 bytes
Example of a 640-byte transfer (10 packets of 64 bytes, which fit in one frame):
ms0 - set up OHCI
ms1 - transfer 640 bytes
ms2 - interrupt
- 3ms to send 640 bytes
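I guess you get the picture: the fixed overhead is the same, so the larger transfer moves 64 times as much data in the same 3ms.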
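To put rough numbers on the two examples above, here is a small Python sketch of that timing model. It is a simplification, not how OHCI really schedules things: it assumes one frame for setup, one frame for the completion interrupt, and the rough figure of 15 packets per frame mentioned earlier. `transfer_ms` and the constants are names made up for illustration.

```python
import math

FRAME_MS = 1.0          # full-speed frame length
PACKET_BYTES = 64       # full-speed bulk max packet size
PACKETS_PER_FRAME = 15  # rough figure from the text above; an assumption


def transfer_ms(nbytes, packets_per_frame=PACKETS_PER_FRAME):
    """Estimated time for one bulk transfer in this simplified model."""
    bytes_per_frame = PACKET_BYTES * packets_per_frame
    data_frames = max(1, math.ceil(nbytes / bytes_per_frame))
    # One frame to set up, the data frames, one frame for the interrupt.
    return (1 + data_frames + 1) * FRAME_MS


def bytes_per_ms(nbytes):
    """Effective throughput of a single transfer of nbytes."""
    return nbytes / transfer_ms(nbytes)


if __name__ == "__main__":
    for size in (10, 640, 2 * 1024 * 1024):
        print(size, transfer_ms(size), round(bytes_per_ms(size), 1))
```

In this model 10 bytes and 640 bytes both cost 3ms, while a 2MB transfer costs 2187ms, so the two overhead frames become negligible and throughput approaches the model's ceiling of 960 bytes per ms.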
The IO buffer size of the device does not change this. Even when the device can only accept a little data at a time, larger host-side transfers still avoid the per-transfer setup and completion overhead.
Example of a very slow device and a 256-byte transfer:
ms0 - set up OHCI
ms1 - send 64 bytes, then get NAKs
ms2 - send 64 bytes, then get NAKs
ms3 - send 64 bytes, then get NAKs
ms4 - send 64 bytes, then get NAKs
ms5 - interrupt
- 6ms to send 256 bytes
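The slow-device case can be sketched the same way, comparing one 256-byte transfer with four separate 64-byte transfers. This assumes the device accepts exactly one 64-byte packet per 1ms frame and NAKs the rest, and that every transfer pays one setup frame and one completion frame. `total_ms` is a hypothetical helper, not part of any USB API.

```python
import math

PACKET_BYTES = 64  # full-speed bulk max packet size


def total_ms(nbytes, transfer_bytes):
    """Frames (1ms each) needed to move nbytes in transfers of transfer_bytes.

    Assumes a slow device that takes one packet per frame and NAKs the rest.
    """
    total = 0
    remaining = nbytes
    while remaining > 0:
        size = min(transfer_bytes, remaining)
        # Setup frame + one frame per packet + completion interrupt frame.
        total += 1 + math.ceil(size / PACKET_BYTES) + 1
        remaining -= size
    return total


if __name__ == "__main__":
    print(total_ms(256, 256))  # one large transfer
    print(total_ms(256, 64))   # four small transfers
```

Under these assumptions the single 256-byte transfer takes 6ms and the four 64-byte transfers take 12ms, so the larger transfer still wins even though the device is the bottleneck.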
Hope this helps