This was a rather intriguing question. I replaced the "if cond: pass" with "v = cond", but that did not eliminate the difference entirely. I am still not certain of the answer, but I found one plausible reason:
    switch (op) {
    case Py_LT: c = c <  0; break;
    case Py_LE: c = c <= 0; break;
    case Py_EQ: c = c == 0; break;
    case Py_NE: c = c != 0; break;
    case Py_GT: c = c >  0; break;
    case Py_GE: c = c >= 0; break;
    }
This is from the function convert_3way_to_object in Objects/object.c. Note that >= is the last case, so it alone needs no exit jump: its break statement is eliminated. That matches up with the 0 and 5 in shiki's disassembly. Since the break is an unconditional jump, branch prediction may well handle it, but eliminating it may also mean less code to load.
At this level, the difference is naturally going to be highly machine-specific. My measurements aren't very thorough, but this was the one point at the C level where I saw a bias between the operators. I probably got a larger bias from CPU speed scaling.
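For anyone who wants to repeat the comparison, here is a minimal timing sketch. It is not the script I originally used; the helper name time_operators, the operand values and the iteration counts are all assumptions chosen for illustration. It assumes a Python 3 interpreter, where the C code path taken may differ from the one quoted above, so treat it as a way to measure rather than as evidence for the explanation. Taking the minimum over several repeats is meant to dampen noise such as CPU speed scaling.

```python
import timeit

# The six rich-comparison operators discussed above.
OPERATORS = ["<", "<=", "==", "!=", ">", ">="]


def time_operators(number=200_000, repeat=5):
    """Time `v = a <op> b` for each operator and return the best run of each.

    The operand values below are arbitrary; they are an assumption of this
    sketch, not taken from the original question.
    """
    results = {}
    for op in OPERATORS:
        stmt = "v = a %s b" % op  # assignment instead of `if cond: pass`
        timings = timeit.repeat(
            stmt,
            setup="a = 1000; b = 2000",
            number=number,
            repeat=repeat,
        )
        # The minimum is the run least disturbed by other system activity.
        results[op] = min(timings)
    return results


if __name__ == "__main__":
    for op, seconds in time_operators().items():
        print("%-2s %.6f s" % (op, seconds))
```

Expect the gaps between operators to be small compared with run-to-run variation, so several invocations on an otherwise idle machine are worth the trouble.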