EDIT: I remember interesting interview definetely worth to view: Arun Kishan: Inside Windows 7 - Farewell to the Windows Kernel Dispatcher Lock
As @Steven Sudit stated I warn again: use it only as demo on how timer wheel works and some tasks you have to care about while implement it. Not as reference implementation. In real world you have to write far more complex logic to take into account available resources, scheduling logic and etc.
Here good points stated by Steven Sudit (read post comments for details):
1) Choose right structure to keep your jobs list (it depends as usually):
SortedList<> (or SortedDictionary<>) good on memory consumption and indexing but have to implement synchronized access
ConcurrentQueue<> will help you avoid locking but you have to implement ordering. It also very memory efficient
LinkedList<> is good on insert and retrieve (anyway we need head only) but requires synchronized access (thru it easily implemented via lock-free) and not so memory efficient as it stores two references (prev/next). But it become an issue when you have millions of jobs so all of them take significant amount of memory.
But:
I totally agree with @Steven:
It doesn't matter: neither one of these is a good fit. The right answer would be to use a regular queue and maintain its order yourself, since we most often need to access it only from the head or tail.
Generally, I would recommend using the most feature-full collection from the library, but that doesn't apply here because this is system-level code. We'd need to roll our own, either from scratch or on top of a less feature-rich collection
2) To simplify processing logic of simultaneous jobs you can add delegate list (e.g. via ConcurrentQueue to make it lock-free) into original Job class so when you need another job at same time you just add another delegate to start.
@Steven:
If two tasks are actually scheduled for the same time (whether actually or effectively), this is a normal case that does not require complicating our data structure. In other words, we don't need to group up simultaneous jobs so that we have to traverse two different collections; we can just make them adjacent
3) Start/stoping dispatcher not so straightful as it can be and so can lead to errors. Instead you can wait on an event while using a timeout.
@Steven:
This way, it will either wake up when the next job is ready or when a new job is inserted before the head. In the latter case, it could need to run it now or set a different wait. If presented with, say, 100 jobs all scheduled for the same instant, the best thing we can do is queue them all up.
If we needed to provide prioritization, that's a job for a priority dispatch queue and multiple pools in a producer/consumer relationship, but it still doesn't justify a start/stop dispatcher. The dispatcher should always be on, running in a single loop that sometimes cedes the core
4) About using ticks:
@Steven:
Sticking to one type of tick is fine, but mixing and matching gets ugly, particularly since it's hardware-dependent. I'm sure that ticks would be slightly faster than milliseconds, because it stores the former and has to divide by a constant to get the latter. Whether this operation winds up being costly is another matter, but I'm fine with using ticks to avoid the risk.
My thoughts:
Another good point, I agree with you. But sometimes division by constant becomes costly and it not so fast as it maybe seems to be. But when we talk about 100 000 of DateTimes it does not matter, you are right, thank you to point.
5) "Managing resources":
@Steven:
The problem I'm trying to highlight is that the call to GetAvailableThreads is expensive and naive; the answer is obsolete before you can even use it. If we really wanted to keep track, we could get initial values and keep a running count by calling the job from a wrapper that uses Interlocked.Increment/Decrement. Even then, it presumes that the rest of the program is not using the thread pool. If we really wanted fine control, then the right answer here is to roll our own thread pool
I absolutely agree that calling to GetAvailableThreads is naive method to monitor available resources thru CorGetAvailableThreads not so expensive. I want to demontrate there are needs to manage resources and seems to choose bad example.
By any means provided in source code example is must not be treated as right way to monitor available resources. I just want to demonstrate you have to think about it. Thru maybe coded no so good piece of code as example.
6) Using Interlocked.CompareExchange:
@Steven:
No, it's not a common pattern. The most common pattern is to briefly lock. Less common is to flag the variable as volatile. Much less common would be to use VolatileRead or MemoryBarrier. Using Interlocked.CompareExchange this way is obscure, even if Richter does it. using it without an explanatory comment is absolutely guaranteed to confuse, as the word "Compare" implies that we're doing a comparison, when in fact we're not.
You are right I have to point about its usage.
using System;
using System.Threading;
// Job.cs
// WARNING! Your jobs (tasks) have to be ASYNCHRONOUS or at least really short-living
// else it will ruin whole design and ThreadPool usage due to potentially run out of available worker threads in heavy concurrency
// BTW, amount of worker threads != amount of jobs scheduled via ThreadPool
// job may waits for any IO (via async call to Begin/End) at some point
// and so free its worker thread to another waiting runner
// If you can't achieve this requirements then just use usual Thread class
// but you will lose all ThreadPool's advantages and will get noticeable overhead
// Read http://msdn.microsoft.com/en-us/magazine/cc164139.aspx for some details
// I named class "Job" instead of "Task" to avoid confusion with .NET 4 Task
public class Job
{
public DateTime FireTime { get; private set; }
public WaitCallback DoAction { get; private set; }
public object Param { get; private set; }
// Please use UTC datetimes to avoid different timezones problem
// Also consider to _never_ use DateTime.Now in repeat tasks because it significantly slower
// than DateTime.UtcNow (due to using TimeZone and converting time according to it)
// Here we always work with with UTC
// It will save you a lot of time when your project will get jobs (tasks) posted from different timezones
public static Job At(DateTime fireTime, WaitCallback doAction, object param = null)
{
return new Job {FireTime = fireTime.ToUniversalTime(), DoAction = doAction, Param = param};
}
public override string ToString()
{
return string.Format("{0}({1}) at {2}", DoAction != null ? DoAction.Method.Name : string.Empty, Param,
FireTime.ToLocalTime().ToString("o"));
}
}
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading;
// Dispatcher.cs
// Take a look at System.Runtime IOThreadTimer.cs and IOThreadScheduler.cs
// in Microsoft Reference Source, its interesting reading
public class Dispatcher
{
// You need sorted tasks by fire time. I use Ticks as a key to gain some speed improvements during checks
// There are maybe more than one task in same time
private readonly SortedList<long, List<Job>> _jobs;
// Synchronization object to access _jobs (and _timer) and make it thread-safe
// See comment in ScheduleJob about locking
private readonly object _syncRoot;
// Queue (RunJobs method) is running flag
private int _queueRun;
// Flag to prevent pollute ThreadPool with many times scheduled JobsRun
private int _jobsRunQueuedInThreadPool;
// I'll use Stopwatch to measure elapsed interval. It is wrapper around QueryPerformanceCounter
// It does not consume any additional resources from OS to count
// Used to check how many OS ticks (not DateTime.Ticks!) elapsed already
private readonly Stopwatch _curTime;
// Scheduler start time. It used to build time delta for job
private readonly long _startTime;
// System.Threading.Timer to schedule next active time
// You have to implement syncronized access as it not thread-safe
// http://msdn.microsoft.com/en-us/magazine/cc164015.aspx
private readonly Timer _timer;
// Minimum timer increment to schedule next call via timer instead ThreadPool
// Read http://www.microsoft.com/whdc/system/pnppwr/powermgmt/Timer-Resolution.mspx
// By default it around 15 ms
// If you want to know it exactly use GetSystemTimeAdjustment via Interop ( http://msdn.microsoft.com/en-us/library/ms724394(VS.85).aspx )
// You want TimeIncrement parameter from there
private const long MinIncrement = 15 * TimeSpan.TicksPerMillisecond;
// Maximum scheduled jobs allowed per queue run (specify your own suitable value!)
// Scheduler will add to ThreadPool queue (and hence count them as processed) no more than this constant
// This is balance between how quick job will be scheduled after it time elapsed in one side, and
// how long JobsList will be blocked and RunJobs owns its thread from ThreadPool
private const int MaxJobsToSchedulePerCheck = 10;
// Queue length
public int Length
{
get
{
lock (_syncRoot)
{
return _jobs.Count;
}
}
}
public Dispatcher()
{
_syncRoot = new object();
_timer = new Timer(RunJobs);
_startTime = DateTime.UtcNow.Ticks;
_curTime = Stopwatch.StartNew();
_jobs = new SortedList<long, List<Job>>();
}
// Is dispatcher still working
// Warning! Queue ends its work when no more jobs to schedule but started jobs can be still working
public bool IsWorking()
{
return Interlocked.CompareExchange(ref _queueRun, 0, 0) == 1;
}
// Just handy method to get current jobs list
public IEnumerable<Job> GetJobs()
{
lock (_syncRoot)
{
// We copy original values and return as read-only collection (thread-safety reasons)
return _jobs.Values.SelectMany(list => list).ToList().AsReadOnly();
}
}
// Add job to scheduler queue (schedule it)
public void ScheduleJob(Job job)
{
// WARNING! This will introduce bottleneck if you have heavy concurrency.
// You have to implement lock-free solution to avoid botleneck but this is another complex topic.
// Also you can avoid lock by using Jeffrey Richter's ReaderWriterGateLock (http://msdn.microsoft.com/en-us/magazine/cc163532.aspx)
// But it can introduce significant delay under heavy load (due to nature of ThreadPool)
// I recommend to implement or reuse suitable lock-free algorithm.
// It will be best solution in heavy concurrency (if you have to schedule large enough job count per second)
// otherwise lock or maybe ReaderWriterLockSlim is cheap enough
lock (_syncRoot)
{
// We'll shift start time to quick check when it pasts our _curTime
var shiftedTime = job.FireTime.Ticks - _startTime;
List<Job> jobs;
if (!_jobs.TryGetValue(shiftedTime, out jobs))
{
jobs = new List<Job> {job};
_jobs.Add(shiftedTime, jobs);
}
else jobs.Add(job);
if (Interlocked.CompareExchange(ref _queueRun, 1, 0) == 0)
{
// Queue not run, schedule start
Interlocked.CompareExchange(ref _jobsRunQueuedInThreadPool, 1, 0);
ThreadPool.QueueUserWorkItem(RunJobs);
}
else
{
// else queue already up and running but maybe we need to ajust start time
// See detailed comment in RunJobs
long firetime = _jobs.Keys[0];
long delta = firetime - _curTime.Elapsed.Ticks;
if (delta < MinIncrement)
{
if (Interlocked.CompareExchange(ref _jobsRunQueuedInThreadPool, 1, 0) == 0)
{
_timer.Change(Timeout.Infinite, Timeout.Infinite);
ThreadPool.QueueUserWorkItem(RunJobs);
}
}
else
{
Console.WriteLine("DEBUG: Wake up time changed. Next event in {0}", TimeSpan.FromTicks(delta));
_timer.Change(delta/TimeSpan.TicksPerMillisecond, Timeout.Infinite);
}
}
}
}
// Job runner
private void RunJobs(object state)
{
// Warning! Here I block list until entire process done,
// maybe better will use ReadWriterLockSlim or somewhat (e.g. lock-free)
// as usually "it depends..."
// Here processing is really fast (a few operation only) so until you have to schedule many jobs per seconds it does not matter
lock (_syncRoot)
{
// We ready to rerun RunJobs if needed
Interlocked.CompareExchange(ref _jobsRunQueuedInThreadPool, 0, 1);
int availWorkerThreads;
int availCompletionPortThreads;
// Current thread stats
ThreadPool.GetAvailableThreads(out availWorkerThreads, out availCompletionPortThreads);
// You can check max thread limits by
// ThreadPool.GetMaxThreads(out maxWorkerThreads, out maxCompletionPortThreads);
int jobsAdded = 0;
while (jobsAdded < MaxJobsToSchedulePerCheck && availWorkerThreads > MaxJobsToSchedulePerCheck + 1 && _jobs.Count > 0)
{
// SortedList<> implemented as two arrays for keys and values so indexing on key/value will be fast
// First element
List<Job> curJobs = _jobs.Values[0];
long firetime = _jobs.Keys[0];
// WARNING! Stopwatch ticks are different from DateTime.Ticks
// so we use _curTime.Elapsed.Ticks instead of _curTime.ElapsedTicks
// Each tick in the DateTime.Ticks value represents one 100-nanosecond interval.
// Each tick in the ElapsedTicks value represents the time interval equal to 1 second divided by the Frequency.
if (_curTime.Elapsed.Ticks <= firetime) break;
while (curJobs.Count > 0 && jobsAdded < MaxJobsToSchedulePerCheck && availWorkerThreads > MaxJobsToSchedulePerCheck + 1)
{
var job = curJobs[0];
// Time elapsed and we ready to start job
if (job.DoAction != null)
{
// Schedule new run
// I strongly recommend to look at new .NET 4 Task class because it give superior solution for managing Tasks
// e.g. cancel run, exception handling, continuation, etc
ThreadPool.QueueUserWorkItem(job.DoAction, job);
++jobsAdded;
// It may seems that we can just decrease availWorkerThreads by 1
// but don't forget about started jobs they can also consume ThreadPool's threads
ThreadPool.GetAvailableThreads(out availWorkerThreads, out availCompletionPortThreads);
}
// Remove job from list of simultaneous jobs
curJobs.Remove(job);
}
// Remove whole list if its empty
if (curJobs.Count < 1) _jobs.RemoveAt(0);
}
if (_jobs.Count > 0)
{
long firetime = _jobs.Keys[0];
// Time to next event
long delta = firetime - _curTime.Elapsed.Ticks;
if (delta < MinIncrement)
{
// Schedule next queue check via ThreadPool (immediately)
// It may seems we start to consume all resouces when we run out of available threads (due to "infinite" reschdule)
// because we pass thru our while loop and just reschedule RunJobs
// but this is not right because before RunJobs will be started again
// all other thread will advance a bit and maybe even complete its task
// so it safe just reschedule RunJobs and hence wait when we get some resources
if (Interlocked.CompareExchange(ref _jobsRunQueuedInThreadPool, 1, 0) == 0)
{
_timer.Change(Timeout.Infinite, Timeout.Infinite);
ThreadPool.QueueUserWorkItem(RunJobs);
}
}
else // Schedule next check via timer callback
{
Console.WriteLine("DEBUG: Next event in {0}", TimeSpan.FromTicks(delta)); // just some debug output
_timer.Change(delta / TimeSpan.TicksPerMillisecond, Timeout.Infinite);
}
}
else // Shutdown the queue, no more jobs
{
Console.WriteLine("DEBUG: Queue ends");
Interlocked.CompareExchange(ref _queueRun, 0, 1);
}
}
}
}
Quick example of usage:
// Test job worker
static void SomeJob(object param)
{
var job = param as Job;
if (job == null) return;
Console.WriteLine("Job started: {0}, [scheduled to: {1}, param: {2}]", DateTime.Now.ToString("o"),
job.FireTime.ToLocalTime().ToString("o"), job.Param);
}
static void Main(string[] args)
{
var curTime = DateTime.UtcNow;
Console.WriteLine("Current time: {0}", curTime.ToLocalTime().ToString("o"));
Console.WriteLine();
var dispatcher = new Dispatcher();
// Schedule +10 seconds to future
dispatcher.ScheduleJob(Job.At(curTime + TimeSpan.FromSeconds(10), SomeJob, "+10 sec:1"));
dispatcher.ScheduleJob(Job.At(curTime + TimeSpan.FromSeconds(10), SomeJob, "+10 sec:2"));
// Starts almost immediately
dispatcher.ScheduleJob(Job.At(curTime - TimeSpan.FromMinutes(1), SomeJob, "past"));
// And last job to test
dispatcher.ScheduleJob(Job.At(curTime + TimeSpan.FromSeconds(25), SomeJob, "+25 sec"));
Console.WriteLine("Queue length: {0}, {1}", dispatcher.Length, dispatcher.IsWorking()? "working": "done");
Console.WriteLine();
foreach (var job in dispatcher.GetJobs()) Console.WriteLine(job);
Console.WriteLine();
Console.ReadLine();
Console.WriteLine(dispatcher.IsWorking()?"Dispatcher still working": "No more jobs in queue");
Console.WriteLine();
foreach (var job in dispatcher.GetJobs()) Console.WriteLine(job);
Console.ReadLine();
}
Hope it will be helpful.
@Steven Sudit point me some issues, so here I try to give my vision.
1) I wouldn't recommend using SortedList here, or anywhere else, as it's an obsolete .NET 1.1 class
SortedList<> by any means is not obsolete. It still exists in .NET 4.0 and introduced in .NET 2.0 when generics was introduced into language. I can't see any point to remove it from .NET.
But real question here I trying to answer: What data structure can store values in sorted order and will be efficient in storing and indexing. There are two suitable ready to use data structures: SortedDictionary<> and SortedList<>. Here some info about how to choose. I just don't want waste implementation with my own code and hide main algorithm. Here I can implement priority-array or something other but it takes more lines to code. I don't see any reason do not use SortedList<> here...
BTW, I can't understand why you not recommend it? What are reasons?
2) In general, there's no need to complicate the code with special cases for simultaneous events.
When @Jrud says he probably will have numerous task to schedule I think it they may have heavy concurrency, so I demonstrate how to solve it. But my point: even if you have low concurrency you stil have chance to get events in same time. Also this is easy possible in multithreaded evironment or when there are many sources want to schedule jobs.
Interlocked functions not so complicated, cheap and since .NET 4.0 inlined so there are no problem to add guard in such situation.
3) The IsWorking method should just use a memory barrier and then read the value directly.
Im not so sure here that you are right. I would recommend to read two nice articles: Part 4: Advanced Threading of Threading in C# by Joseph Albahari and How Do Locks Lock? by Jeff Moser. And of cause Chapter 28 (Primitive Thread Synchronization Constructs) of CLR via C# (3rd edition) by Jeffrey Richter.
Here some qoute:
The MemoryBarrier method doesn’t access memory but it forces any earlier programorder
loads and stores to be completed before the call to MemoryBarrier. And it also
forces any later program-order loads and stores to be completed after the call to
MemoryBarrier. MemoryBarrier is much less useful than the other two methods
Important I know that this can be very confusing, so let me summarize it as a simple rule:
When threads are communicating with each other via shared memory, write the last value by
calling VolatileWrite and read the first value by calling VolatileRead.
I would also recommend: Intel® 64 and IA-32 Architectures Software Developer's Manuals if you care about it seriously.
So I don't use VolatileRead/VolatileWrite in my code neither volatile keyword, I don't think Thread.MemoryBarrier will be better here. Maybe you can point me what I miss? Some articles or in-depth discussion?
4) The GetJobs method looks like it could lock for an extended period. Is it necessary?
First of all its just handy method, sometime it is necessary to get all tasks in queue at least for debugging.
But you are not right. As I mentioned in code comments SortedList<> implemented as two arrays you can check this by Reference Source or just by viewing in Reflector. Here some comments from reference source:
// A sorted list internally maintains two arrays that store the keys and
// values of the entries.
I got from .NET 4.0 but it not changed much since 2-3.5
So my code:
_jobs.Values.SelectMany(list => list).ToList().AsReadOnly();
involve following:
- iterate thru values in array of references to List. Indexing array is very fast.
- iterate thru each List (which is implemented internally as array too). It very fast too.
- build new List of references (via ToList()) which is very fast too (just dynamic array) (.NET has very solid and fast implementation)
- build read-only wrapper (no copy, just iterator wrapper)
so consequently we have just flatten read-only list of references to Job's objects. It very fast even you have millions of task. Try to measure yourself.
Any way I added it to show what happens during execution cycle (for debug purposes) but I think it can be useful.
5) A lock-free queue is available in .NET 4.0.
I would recommend to read Patterns of parallel programming by Stephen Toub and Thread-safe Collections in .NET Framework 4 and Their Performance Characteristics, also here many interesting articles.
So I quote:
ConcurrentQueue(T) is a data structure in .NET Framework 4 that provides thread-safe access to FIFO (First-In First-Out) ordered elements. Under the hood, ConcurrentQueue(T) is implemented using a list of small arrays and lock-free operations on the head and tail arrays, hence it is quite different than Queue(T) which is backed by an array and relies on the external use of monitors to provide synchronization. ConcurrentQueue(T) is certainly more safe and convenient than manual locking of a Queue(T) but some experimentation is required to determine the relative performance of the two schemes. In the remainder of this section, we will refer to a manually locked Queue(T) as a self-contained type called SynchronizedQueue(T).
It don't have any methods to maintain ordered queue. Neither any of new thread-safe collection, they all maintain unordered collection. But reading original @Jrud description I think we have to maintain ordered list of time when task need to be fired. Am I wrong?
6) I wouldn't bother starting and stopping the dispatcher; just let it sleep until the next job
Do you know good way to make sleep ThreadPool's thread? How you will implement it?
I think dispatcher goes "sleep" when he does not process any task and schedule job wake-up it. Anyway there are no special processing to put it sleep or wake up so in my thoughts this process equals "sleep".
If you told that I should just reschedule RunJobs via ThreadPool when no jobs available when you are wrong it will eat too many resources and can impact started jobs. Try yourself. Why to do unnecessary job when we can easily avoid it.
7) Rather than worrying about different kinds of ticks, you could just stick to milliseconds.
You are not right. Either you stick to ticks or you don't care about it entirely. Check DateTime implementation, each access to milliseconds property involve converting internal representaion (in ticks) to ms including division. This can hurt performance on old (Pentium class) compulters (I measure it myself and you can too).
In general I will agree with you. We don't care about representation here because it does not give us noticeable performance boost.
It just my habbit. I process billions of DateTime in recent project so coded accordingly to it. In my project there are noticeable difference between processing by ticks and by other components of DateTime.
8) The attempt to keep track of available threads does not seem likely to be effective
I just want to demonstrate you have to care about it. In real world you have to implement far from my straightful logic of scheduling and monitoring resources.
I want to demonstrate timer wheel algorithm and point to some problem that author have to think when implement it.
You are absolutely right I have to warn about it. I thought "quickly ptototype" would be enough. My solution in any means can't be used in production.