Is there a way to see how many context switches each thread generates? (both in and out if possible) Either in X/s, or to let it run and give aggregated data after some time. (either on linux or on windows)
I have found only tools that give aggregated context-switching number for whole os or per process.
My program makes many context switches (50k/s), probably a lot not necessary, but I am not sure where to start optimizing, where do most of those happen.