What is overhead? Are there multiple types of overhead, or only one? What are some examples?
Overhead is any use of a resource that is a side-effect of what you're actually trying to achieve. For example, struct padding is a form of memory overhead, pushing and popping function arguments on the stack is a form of processing overhead, and packet headers are a form of bandwidth overhead. Pick almost any resource and you can find some kind of overhead associated with it.
The business meaning of overhead cost explains it best. From Wikipedia:
The term overhead is usually used to group expenses that are necessary to the continued functioning of the business, but cannot be immediately associated with the products/services being offered (e.g. do not directly generate profits).
Overhead is a "cost" you incur in order to be able to perform an operation; you have to "invest" some resource just to make the operation possible.
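As a rough illustration of the packet-header (bandwidth) overhead mentioned above, here's a small sketch. The 20-byte IPv4 and 20-byte TCP header sizes are the option-free minimums, and the 100-byte payload is just an assumed figure; real traffic often adds options and link-layer framing on top:

#include <iostream>

int main ()
{
    // Minimum header sizes for a plain TCP/IPv4 packet (no options).
    const int ip_header  = 20;  // bytes
    const int tcp_header = 20;  // bytes
    const int payload    = 100; // bytes of data we actually want to send

    int total = ip_header + tcp_header + payload;
    double overhead = 100.0 * (total - payload) / total;

    // For a 100-byte payload, roughly 29% of the bytes on the wire
    // are headers rather than the data itself.
    std::cout << "Bytes sent: " << total
              << ", overhead: " << overhead << "%" << std::endl;
    return 0;
}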
Here's an example of size overhead for structs and classes:
#include <iostream>
using namespace std;

struct first {
    char letter1;   // 1 byte, then 3 bytes of padding so number starts on a 4-byte boundary
    int number;     // 4 bytes
    char letter2;   // 1 byte, then 3 bytes of tail padding
};

struct second {
    int number;     // 4 bytes
    char letter1;   // 1 byte
    char letter2;   // 1 byte, then 2 bytes of tail padding
};

int main ()
{
    cout << "Size of first: " << sizeof(first) << endl;
    cout << "Size of second: " << sizeof(second) << endl;
    return 0;
}
The result is:
Size of first: 12
Size of second: 8
The compiler must lay out each member at its natural alignment: here the four-byte int has to start at an address that is a multiple of four. In the first struct, the char before the int (one byte) forces the compiler to insert three bytes of padding so the int is aligned, and the trailing char is followed by three more bytes of tail padding so that arrays of the struct stay aligned. In the second struct, the two chars sit together after the int, so only two bytes of tail padding are needed.
Moral of the story: place similar-size data members next to each other (or order members from largest to smallest) to minimize padding.
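If you want to see exactly where the padding goes, a small sketch like this prints each member's offset with offsetof; the expected numbers in the comments assume a typical platform with a 4-byte, 4-byte-aligned int:

#include <cstddef>   // offsetof
#include <iostream>

struct first {
    char letter1;
    int number;
    char letter2;
};

int main ()
{
    // On a typical platform you'd expect: letter1 at offset 0,
    // number at offset 4, letter2 at offset 8, and sizeof(first) == 12
    // because of the tail padding.
    std::cout << "letter1 at offset " << offsetof(first, letter1) << std::endl;
    std::cout << "number  at offset " << offsetof(first, number)  << std::endl;
    std::cout << "letter2 at offset " << offsetof(first, letter2) << std::endl;
    std::cout << "sizeof(first) = " << sizeof(first) << std::endl;
    return 0;
}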
Here's an example of time overhead, related to better use of locality to exploit the cache:
#include <stdio.h>

#define SIZE 1024

double A[SIZE][SIZE], B[SIZE][SIZE], C[SIZE][SIZE];

int main ()
{
    int i, j, k;

    // i-j-k order: the innermost loop varies k, so each iteration reads
    // B[k][j] a full row (SIZE doubles) away from the previous access.
    for (i = 0; i < SIZE; i++) {
        for (j = 0; j < SIZE; j++) {
            for (k = 0; k < SIZE; k++) {
                C[i][j] += A[i][k] * B[k][j];
            }
        }
    }
    return 0;
}
Running this on my machine takes this much time:
real 0m35.137s
user 0m34.996s
sys 0m0.067s
Now I'll swap the j and k loop iterations:
#include <stdio.h>

#define SIZE 1024

double A[SIZE][SIZE], B[SIZE][SIZE], C[SIZE][SIZE];

int main ()
{
    int i, j, k;

    // i-k-j order: the innermost loop varies j, so C[i][j] and B[k][j]
    // are accessed at consecutive addresses.
    for (i = 0; i < SIZE; i++) {
        for (k = 0; k < SIZE; k++) { // this is the only change
            for (j = 0; j < SIZE; j++) {
                C[i][j] += A[i][k] * B[k][j];
            }
        }
    }
    return 0;
}
The runtime for this is:
real 0m5.489s
user 0m5.436s
sys 0m0.040s
It's much faster because the innermost loop now varies the last array index, j. C arrays are stored row by row (row-major order), so both C[i][j] and B[k][j] are now accessed at consecutive addresses as j increases, instead of jumping a full row of B on every iteration. Consecutive accesses are far more likely to hit data that is already in the cache.
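To make the access pattern concrete, here's a small sketch that just prints the distance in bytes between successive elements of B touched by the inner loop in each version (the 8 KB figure assumes SIZE = 1024 and 8-byte doubles, as above):

#include <iostream>

#define SIZE 1024

int main ()
{
    // With the k loop innermost, consecutive iterations touch B[k][j] and
    // B[k+1][j]: a jump of one full row of doubles.
    std::cout << "stride with k innermost: " << SIZE * sizeof(double)
              << " bytes" << std::endl;   // 8192 bytes with SIZE = 1024

    // With the j loop innermost, consecutive iterations touch B[k][j] and
    // B[k][j+1]: adjacent doubles in memory.
    std::cout << "stride with j innermost: " << sizeof(double)
              << " bytes" << std::endl;   // 8 bytes
    return 0;
}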