Paul R has a pretty good answer, but I'd like to expand on it a bit. If you think of the sound as a series of pulses (and it kind of is), then a higher pitch has more pulses per second (higher frequency) and a lower pitch has fewer (lower frequency). To lower the pitch of an existing sound, you have to spread those pulses out, which is what happens when you simply play the recording back more slowly. The duration of the sound increases as a result, because you haven't reduced the number of pulses; you've only moved them farther apart (fewer per second). The opposite happens if you raise the pitch: the pulses are packed closer together, so the sound gets shorter.
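To make the pulse picture concrete, here's a rough sketch in Python. It models a sound as nothing more than a list of pulse times in seconds, which is a big simplification of real audio, and the function names are made up for illustration.

```python
def shift_pitch_by_respacing(pulse_times, factor):
    """Respace the pulses: factor > 1 raises the pitch, factor < 1 lowers it.

    Every pulse is kept; only the spacing between pulses changes.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    return [t / factor for t in pulse_times]


def pulses_per_second(pulse_times):
    """Average pulse rate over the span from the first pulse to the last."""
    span = pulse_times[-1] - pulse_times[0]
    return (len(pulse_times) - 1) / span


# One second of sound at 100 pulses per second (101 pulses, 0.00 s to 1.00 s).
original = [i / 100 for i in range(101)]

lower = shift_pitch_by_respacing(original, 0.5)   # one octave down
higher = shift_pitch_by_respacing(original, 2.0)  # one octave up

print(pulses_per_second(original), original[-1])  # 100.0 pulses/s, ends at 1.0 s
print(pulses_per_second(lower), lower[-1])        # 50.0 pulses/s, ends at 2.0 s
print(pulses_per_second(higher), higher[-1])      # 200.0 pulses/s, ends at 0.5 s
```

The pulse count is the same in all three lists. Only the spacing changes, and the duration changes with it.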
If you want the duration to stay constant while the pitch changes, you have to either throw information away (when lowering the pitch, since the stretched sound now runs too long) or manufacture information (when raising the pitch, since the compressed sound now runs too short). This is where the fancy processing comes in. What can be safely discarded? What can be safely duplicated or constructed?
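As a deliberately naive illustration of that trade-off, the sketch below uses the same toy model (a sound as a list of pulse times in seconds) and restores the original duration in the crudest way possible: it chops off whatever runs past the end, or repeats the whole pattern to fill the gap. The function name is hypothetical, and this is not how a real pitch shifter decides what to drop or duplicate; real ones choose those pieces far more carefully so the cuts and repeats aren't audible.

```python
def shift_pitch_keep_duration(pulse_times, factor):
    """Change pitch by `factor` but keep the original start and end times.

    Naive on purpose: discards the tail when lowering the pitch and
    repeats the whole pattern when raising it.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    if len(pulse_times) < 2:
        raise ValueError("need at least two pulses")

    start, end = pulse_times[0], pulse_times[-1]
    # Respace the pulses around the start time (this alone changes duration).
    scaled = [start + (t - start) / factor for t in pulse_times]

    if factor <= 1:
        # Lower pitch: the pulses now run past the original end,
        # so throw away everything after it.
        return [t for t in scaled if t <= end]

    # Higher pitch: the pulses now finish early, so manufacture more
    # by repeating the scaled pattern until the original end is reached.
    gap = scaled[1] - scaled[0]
    block = (scaled[-1] - scaled[0]) + gap  # length of one repetition
    out = []
    offset = 0.0
    while True:
        for t in scaled:
            if t + offset > end:
                return out
            out.append(t + offset)
        offset += block


# One second of sound at 100 pulses per second.
original = [i / 100 for i in range(101)]

lower = shift_pitch_keep_duration(original, 0.5)   # about half the pulses survive
higher = shift_pitch_keep_duration(original, 2.0)  # roughly twice as many pulses

print(len(original), len(lower), len(higher))
print(lower[-1], higher[-1])  # both end at or just before 1.0 s
```

Both results still span about one second, but the lower one has lost half of its pulses and the higher one contains a copy of itself. Deciding which pulses to lose or copy without the listener noticing is the hard part.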