I've been investigating different scheduling algorithms for a thread pool I am implementing. Due to the nature of the problem I am solving I can assume that the tasks being run in parallel are independent and do not spawn any new tasks. The tasks can be of varying sizes.
I went immediately for the most popular scheduling algorithm "work stealing" using lock-free deques for the local job queues, and I am relatively happy with this approach. However I'm wondering whether there are any common cases where work-stealing is not the best approach.
For this particular problem I have a good estimate of the size of each individual task. Work-stealing does not make use of this information and I'm wondering if there is any scheduler which will give better load-balancing than work-stealing with this information (obviously with the same efficiency).
NB. This question ties up with a previous question.