I am working on an application in which thousands of tasks are associated with hundreds of devices. Each task requires < 5 ms to begin execution and takes, on average, 100 ms to complete.
The conditions are as follows:
- Each device can only process a single task at a time, i.e., one task must finish running on its assigned device before the next task for that device is processed.
- The scheduler should be efficient. Currently, processing a given device's work queue takes longer than the sum of its tasks' run times.
Here is a basic description of the current implementation:
Each device contains a work queue which is filled with tasks associated with that device.
When a task is enqueued, that device's work queue is placed into a global run queue (a queue of queues). The global run queue is consumed by a worker thread, which dequeues a device's work queue, processes one task from it, then places the device queue at the back of the global run queue. When that device has been dequeued again, the worker thread checks whether the task has completed; if so, the next task is executed. This process continues until all device queues in the global run queue have been depleted of tasks.
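The design described above can be sketched roughly as follows (Python, for illustration only). The names Task, Device, worker, and run_demo are hypothetical, tasks are simulated with timers rather than real device work, and the actual implementation may differ in its details.

```python
import collections
import queue
import threading
import time


class Task:
    """Hypothetical task: started asynchronously, then polled for completion."""

    def __init__(self, duration_s):
        self.duration_s = duration_s
        self.started_at = None

    def start(self):
        self.started_at = time.monotonic()

    def is_done(self):
        return (self.started_at is not None
                and time.monotonic() - self.started_at >= self.duration_s)


class Device:
    """Hypothetical device holding its own work queue."""

    def __init__(self, name):
        self.name = name
        self.tasks = collections.deque()  # per-device work queue
        self.current = None               # task currently running, if any
        self.completed = 0


def worker(run_queue):
    """Single worker thread: cycle through device queues until all are drained."""
    while True:
        try:
            device = run_queue.get_nowait()
        except queue.Empty:
            return  # nothing left in the global run queue
        # Check whether the task started on an earlier pass has completed.
        if device.current is not None and device.current.is_done():
            device.current = None
            device.completed += 1
        # A device runs one task at a time: start the next only when idle.
        if device.current is None and device.tasks:
            device.current = device.tasks.popleft()
            device.current.start()
        # Devices with running or pending work go to the back of the run queue.
        if device.current is not None or device.tasks:
            run_queue.put(device)


def run_demo(num_devices=3, tasks_per_device=2, duration_s=0.01):
    run_queue = queue.Queue()  # global run queue (a queue of device queues)
    devices = []
    for i in range(num_devices):
        device = Device(f"dev{i}")
        for _ in range(tasks_per_device):
            device.tasks.append(Task(duration_s))
        devices.append(device)
        run_queue.put(device)  # simplification: enqueue each device once
    thread = threading.Thread(target=worker, args=(run_queue,))
    thread.start()
    thread.join()
    return devices
```

In this sketch, a finished task is only noticed when its device next reaches the front of the global run queue, so the gap between one task ending and the next starting on the same device depends on how many other devices are queued ahead of it.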
Any suggestions for improvements? Have I stated this clearly? If not, please let me know, and I'll do my best to clarify.
Thanks for taking the time to look this over. Regards.