Simple, really: instead of a simple counter for each 'bucket', it's a linked list.
That is, instead of
0(1) 1(1) 3(1) 4(1) 6(3) 8(1)
You get
0(.) 1(.) 3(.) 4(.) 6(a,b,c) 8(.)
(here I use .
to denote some item in the bucket).
Then just dump them back into one sorted list:
0 1 3 4 6a 6b 6c 8
That is, when you find an item with key x
, knowing that it may have other information that distinguishes it from other items with the same key, you don't just increment a counter for bucket x
(which would discard all those extra information).
Instead, you have a linked list (or similarly ordered data structure with constant time amortized append) for each bucket, and you append that item to the end of the list for bucket x
as you scan the input left to right.
So instead of using O(k)
space for k
counters, you have O(k)
initially empty lists whose sum of lengths will be n
at the end of the "counting" portion of the algorithm. This variant of counting sort will still be O(n + k)
as before.