I wonder whether operating systems commonly implement a mechanism to minimize TLB flushes, for instance by grouping threads of the same process together in a "to be scheduled" list.
I think this is an important factor when choosing between processes and threads. If the OS doesn't care whether the next thread it schedules belongs to the same address space as the current one, the so-called advantage of threads, "minimizing TLB flushes", might be overrated. Is that the case?
Consider a system with hundreds of threads spread across tens of processes. If the scheduler makes no effort to run threads of the same process back to back, most context switches will cross an address-space boundary anyway, and the performance benefit we expect from threads may be smaller than assumed.
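
To make the concern concrete, here is a toy sketch of what I mean. It is not how any real scheduler works: the function names, the 10 processes with 20 threads each, and the idea that every switch between processes costs exactly one TLB flush are all assumptions for illustration. It only compares a run order grouped by process against a shuffled one.

```python
import random


def count_address_space_switches(schedule):
    """Count adjacent pairs in the run order that belong to different processes.

    Each entry in `schedule` is the process ID of the thread that runs next.
    Assumption: every change of process ID stands for one TLB flush.
    """
    return sum(1 for prev, cur in zip(schedule, schedule[1:]) if prev != cur)


def build_run_queue(num_processes, threads_per_process):
    """Return a run order in which threads of the same process are adjacent."""
    return [pid for pid in range(num_processes) for _ in range(threads_per_process)]


# Hypothetical workload: 200 threads spread over 10 processes.
grouped = build_run_queue(num_processes=10, threads_per_process=20)

# Same threads, but in an order that ignores which process each belongs to.
shuffled = grouped[:]
random.Random(0).shuffle(shuffled)

print("grouped: ", count_address_space_switches(grouped))
print("shuffled:", count_address_space_switches(shuffled))
```

In the grouped order only the 9 boundaries between processes count as switches, while the shuffled order crosses an address-space boundary on most steps. That gap is what I am asking about: do real schedulers do anything to stay closer to the grouped case?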
I can give examples if the question isn't clear.