I would start by scaling the image down to something like 30K to 60K pixels. This not only speeds up the calculation, it also gets rid of minute, irrelevant changes. For instance, you don't want to detect a curtain near an open window being moved by the wind.
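As a rough illustration of the downscaling step, here is a minimal Python sketch that shrinks an image by averaging square blocks of pixels. The function name `downscale`, the `factor` parameter and the list-of-rows image layout are assumptions made for this example; a real program would more likely use the resize routine of its imaging library.

```python
def downscale(pixels, factor):
    """Shrink an image by averaging each factor x factor block of pixels.

    `pixels` is a list of rows, each row a list of (r, g, b) tuples.
    Rows and columns that do not fill a whole block are dropped.
    """
    out_h = len(pixels) // factor
    out_w = len(pixels[0]) // factor
    area = factor * factor
    small = []
    for by in range(out_h):
        row = []
        for bx in range(out_w):
            sums = [0, 0, 0]
            # Add up every pixel inside the current block, per channel.
            for y in range(by * factor, (by + 1) * factor):
                for x in range(bx * factor, (bx + 1) * factor):
                    for c in range(3):
                        sums[c] += pixels[y][x][c]
            # Integer average per channel.
            row.append(tuple(s // area for s in sums))
        small.append(row)
    return small
```

With a 640x480 frame, `factor = 3` would give 213x160 = 34,080 pixels, which falls inside the suggested range. Averaging (rather than just picking every third pixel) is what smooths out the small changes.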
Next, sum the squared differences of the pixel values. If you want to do it thoroughly, you can do it for R, G and B separately:
Delta := 0;
for y := 0 to Im1.Height - 1 do begin
  for x := 0 to Im1.Width - 1 do begin
    // squared difference per color channel
    DeltaR := Sqr( Im1[x,y].Red - Im2[x,y].Red );
    DeltaG := Sqr( Im1[x,y].Green - Im2[x,y].Green );
    DeltaB := Sqr( Im1[x,y].Blue - Im2[x,y].Blue );
    Delta := Delta + DeltaR + DeltaG + DeltaB;
  end;
end;
if (Delta > Threshold) then
  Say( 'Hi Bob!' );
(This is just pseudo-code, and written like this it would run rather slowly. Google for "scanline" if you want to process all pixels in an image quickly.)
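For readers who want something they can run, the same sum of squared differences might look like this in Python. It is only a sketch: the name `frame_delta` and the list-of-rows layout with (r, g, b) tuples are assumptions for this example, and it is not optimized for speed.

```python
def frame_delta(im1, im2):
    """Sum of squared per-channel differences between two equal-sized images.

    Each image is a list of rows, each row a list of (r, g, b) tuples.
    """
    if len(im1) != len(im2) or len(im1[0]) != len(im2[0]):
        raise ValueError("images must have the same dimensions")
    delta = 0
    for row1, row2 in zip(im1, im2):
        for p1, p2 in zip(row1, row2):
            # Red, green and blue are compared separately and added up.
            for c1, c2 in zip(p1, p2):
                delta += (c1 - c2) ** 2
    return delta
```

Identical frames give a delta of 0. Because the differences are squared, a few strongly changed pixels contribute more than many pixels that changed only slightly.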
You may want to determine the threshold value empirically: walk slowly past the camera, preferably wearing clothes that match the background color, and see what deltas you get.
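One possible way to turn such measurements into a number is sketched below in Python. It assumes you have logged two lists of deltas, one for the empty scene and one for the walk-by, and it simply picks the midpoint between the largest idle delta and the smallest walk-by delta. The midpoint rule and the name `pick_threshold` are assumptions of this sketch, not something the method prescribes.

```python
def pick_threshold(idle_deltas, motion_deltas):
    """Pick a threshold halfway between sensor noise and the weakest motion.

    idle_deltas:   deltas recorded while nothing moves in front of the camera
    motion_deltas: deltas recorded during the slow, camouflaged walk-by
    """
    highest_idle = max(idle_deltas)
    lowest_motion = min(motion_deltas)
    if lowest_motion <= highest_idle:
        # No single threshold separates the two sets of measurements.
        raise ValueError("motion deltas overlap with idle deltas")
    return (highest_idle + lowest_motion) / 2
```

If the function raises, the walk-by is not distinguishable from noise at the current settings, which suggests changing the downscaling or the frame spacing before trying again.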
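Skipping frames, that is, comparing each frame with one from several frames earlier rather than with the one directly before it, should increase sensitivity, because slow movement then has more time to add up to a noticeable difference.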
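A minimal Python sketch of comparing against an older frame follows. The names `detect_motion`, `delta_fn` and `skip` are made up for this example; `delta_fn` stands for whatever difference measure you use, such as the sum of squared differences.

```python
from collections import deque


def detect_motion(frames, delta_fn, threshold, skip=5):
    """Yield the index of every frame that differs from the frame
    `skip` steps earlier by more than `threshold`."""
    history = deque(maxlen=skip)  # holds the last `skip` frames
    for index, frame in enumerate(frames):
        if len(history) == skip:
            # history[0] is the frame seen `skip` frames ago.
            if delta_fn(history[0], frame) > threshold:
                yield index
        history.append(frame)
```

With `skip=1` this compares consecutive frames. Larger values trade reaction time for sensitivity to slow movement, and the threshold would need to be calibrated again for the chosen spacing.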