Can we write a C program to find out the time spent in a context switch in Linux? Could you please share code if you have one? Thanks
Profiling the switching time is very difficult, but the in-kernel latency profiling tools, as well as oprofile (which can profile the kernel itself), will help you there.
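If all you need is a ballpark figure rather than a profile, a common trick (only a rough sketch, not something I have polished) is to bounce a byte between two processes over a pair of pipes: each round trip forces at least two context switches, so dividing the elapsed time by the iteration count gives an upper bound per switch (it also includes the pipe read/write syscall overhead). Something along these lines:

// Rough per-switch estimate via pipe ping-pong between parent and child.
// Compile with: gcc pingpong.c -o pingpong   (add -lrt on very old glibc)
#include <stdio.h>
#include <time.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void) {
    const long iterations = 100000;
    int p2c[2], c2p[2];                      // parent->child and child->parent pipes
    if (pipe(p2c) != 0 || pipe(c2p) != 0) { perror("pipe"); return 1; }

    pid_t pid = fork();
    if (pid == 0) {                          // child: echo every byte straight back
        char b;
        for (long i = 0; i < iterations; ++i) {
            read(p2c[0], &b, 1);
            write(c2p[1], &b, 1);
        }
        _exit(0);
    }

    struct timespec t0, t1;
    char b = 'x';
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (long i = 0; i < iterations; ++i) {  // parent: send a byte, wait for the echo
        write(p2c[1], &b, 1);
        read(c2p[0], &b, 1);
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);
    waitpid(pid, NULL, 0);

    double ns = (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
    printf("%.0f ns per round trip (at least two switches each)\n", ns / iterations);
    return 0;
}

Pin both processes to the same CPU (e.g. with taskset) if you want every round trip to really force a switch; otherwise the two may simply run on different cores.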
For benchmarking interactive application performance, I have written a small tool called latencybench that measures unexpected latency spikes:
// Compile with g++ latencybench.cc -o latencybench -lboost_thread-mt
// Should also work on MSVC and other platforms supported by Boost.
#include <boost/format.hpp>
#include <boost/lexical_cast.hpp>
#include <boost/thread/thread.hpp>
#include <boost/date_time.hpp>
#include <algorithm>
#include <cstdlib>
#include <csignal>
#include <iostream>
#include <map>
#include <string>
volatile bool m_quit = false;

extern "C" void sighandler(int) {
    m_quit = true;
}

std::string num(unsigned val) {
    if (val == 1) return "one occurrence";
    return boost::lexical_cast<std::string>(val) + " occurrences";
}
int main(int argc, char** argv) {
    using namespace boost::posix_time;
    std::signal(SIGINT, sighandler);
    std::signal(SIGTERM, sighandler);
    time_duration duration = milliseconds(10);
    if (argc > 1) {
        try {
            if (argc != 2) throw 1;
            unsigned ms = boost::lexical_cast<unsigned>(argv[1]);
            if (ms > 1000) throw 2;
            duration = milliseconds(ms);
        } catch (...) {
            std::cerr << "Usage: " << argv[0] << " milliseconds" << std::endl;
            return EXIT_FAILURE;
        }
    }
    typedef std::map<long, unsigned> Durations;
    Durations durations;  // histogram: actual delay (ms) -> number of occurrences
    unsigned samples = 0, wrongsamples = 0;
    unsigned max = 0;
    long last = -1;
    std::cout << "Measuring actual sleep delays when requesting " << duration.total_milliseconds() << " ms: (Ctrl+C when done)" << std::endl;
    ptime begin = boost::get_system_time();
    while (!m_quit) {
        ptime start = boost::get_system_time();
        boost::this_thread::sleep(start + duration);  // sleep until the absolute deadline
        long actual = (boost::get_system_time() - start).total_milliseconds();
        ++samples;
        unsigned num = ++durations[actual];
        if (actual != last) {
            std::cout << "\r " << actual << " ms " << std::flush;
            last = actual;
        }
        if (actual != duration.total_milliseconds()) {  // delay differs from the request: record a spike
            ++wrongsamples;
            if (num > max) max = num;
            std::cout << "spike at " << start - begin << std::endl;
            last = -1;
        }
    }
    if (samples == 0) return 0;
    std::cout << "\rTotal measurement duration: " << boost::get_system_time() - begin << "\n";
    std::cout << "Number of samples collected: " << samples << "\n";
    std::cout << "Incorrect delay count: " << wrongsamples << boost::format(" (%.2f %%)") % (100.0 * wrongsamples / samples) << "\n\n";
    std::cout << "Histogram of actual delays:\n\n";
    unsigned correctsamples = samples - wrongsamples;
    const unsigned line = 60;  // maximum bar width in characters
    double scale = 1.0;
    char ch = '+';
    if (max > line) {
        scale = double(line) / max;
        ch = '*';
    }
    double correctscale = 1.0;
    if (correctsamples > line) correctscale = double(line) / correctsamples;
    for (Durations::const_iterator it = durations.begin(); it != durations.end(); ++it) {
        std::string bar;
        if (it->first == duration.total_milliseconds()) bar = std::string(correctscale * it->second, '>');
        else bar = std::string(scale * it->second, ch);
        std::cout << boost::format("%5d ms | %s %d") % it->first % bar % it->second << std::endl;
    }
    std::cout << "\n";
    std::string indent(30, ' ');
    std::cout << indent << "+-- Legend ----------------------------------\n";
    std::cout << indent << "| > " << num(1.0 / correctscale) << " (of " << duration.total_milliseconds() << " ms delay)\n";
    if (wrongsamples > 0) std::cout << indent << "| " << ch << " " << num(1.0 / scale) << " (of any other delay)\n";
}
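(Run it as ./latencybench [milliseconds]; with no argument it measures 10 ms sleeps, at most 1000 ms can be requested, and Ctrl+C ends the run and prints the histogram.)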
Results on an Ubuntu 2.6.32-14-generic kernel. While measuring, I was compiling C++ code on four cores and playing a game with OpenGL graphics at the same time (to make it more interesting):
Total measurement duration: 00:01:45.191465
Number of samples collected: 10383
Incorrect delay count: 196 (1.89 %)
Histogram of actual delays:
10 ms | >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 10187
11 ms | *************************************************** 70
12 ms | ************************************************************ 82
13 ms | ********* 13
14 ms | ********* 13
15 ms | ** 4
17 ms | *** 5
18 ms | * 2
19 ms | **** 6
20 ms | 1
+-- Legend ----------------------------------
| > 169 occurrences (of 10 ms delay)
| * one occurrence (of any other delay)
With rt-patched kernels I get much better results, pretty much 10-12 ms only.
The legend in the printout appears to be suffering from a rounding error or something (and the source code pasted is not exactly the same version). I never really polished this application for a release...
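If I had to guess, the culprit is the implicit double-to-unsigned truncation in the num(1.0 / correctscale) call; rounding before converting (e.g. num(unsigned(1.0 / correctscale + 0.5))) would probably tidy the legend up, but I have not verified that against the original build.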
If you have superuser privileges, you can run a SystemTap program with probe points for context switches and print the current time at each one:
probe scheduler.ctxswitch {
printf("Switch from %d to %d at %d\n", prev_pid, next_pid, gettimeofday_us())
}
I'm not sure how reliable the output data is, but it's a quick and easy way to get some numbers.
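If you don't have root, you can at least count (rather than time) the context switches your own process incurs via getrusage(); the ru_nvcsw and ru_nivcsw fields give voluntary and involuntary switch counts. A minimal sketch:

// Print how many voluntary/involuntary context switches this process has incurred.
// Compile with: gcc ctxcount.c -o ctxcount
#include <stdio.h>
#include <sys/resource.h>

int main(void) {
    struct rusage ru;
    if (getrusage(RUSAGE_SELF, &ru) != 0) {
        perror("getrusage");
        return 1;
    }
    printf("voluntary context switches:   %ld\n", ru.ru_nvcsw);
    printf("involuntary context switches: %ld\n", ru.ru_nivcsw);
    return 0;
}

The same counters also show up in /proc/<pid>/status as voluntary_ctxt_switches and nonvoluntary_ctxt_switches.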