What is the best algorithm to use for scheduling an application that will support 10K concurrent threads with heavy I/O but low CPU usage? Links to papers are appreciated.
Why wouldn't you use SCHED_RR? You said it yourself: low CPU usage. You could even nice the process when you expect to do some heavy I/O so you're scheduled less often than other processes.
In general, though, why not let the OS do what it's best at, and just worry about writing efficient code? The OS will know you're doing a blocking I/O call and will put your thread/task in a waitqueue and select another task to run. You don't need to worry about those details.
Actually I believe no scheduling mechanism will handle this number of threads flawlesly as the management tables in the kernel will become quite large.
If possible, I'd suggest rewriting the app to use asynchronous I/O, select() or something similar on the OS of your choice.
You will likely want SCHED_RR for this. You might be interested in reading this question regarding the difference between SCHED_FIFO and SCHED_RR.
As grover suggested, you could also use some Thread pooling mechanisms, which are less resource intensive and will solve your purpose at least to some reasonable extent, if not fully
Your problem is more related to I/O scheduling than Thread scheduling. The linux kernel offers various I/O scheduler mplementations. You can find a good article on this subject in this edition of LWN.