MFCCs combine aspects of human hearing (roughly logarithmic perception of frequency, modelled by the mel scale) with the physics of musical instruments (these systems often have well-defined harmonic overtones, which appear as a regular ripple in the spectrum; this is why MFCCs take a second transform of the log spectrum, in practice a DCT, loosely "the FFT of the FFT"). The result is a simplified representation of the timbre of an instrument, with the fundamental frequency and loudness largely factored out.
One could write endless pages on this topic, and many are available on the web, so a more specific question that clearly explains what you want to know would be helpful. The algorithm for calculating MFCCs is listed at the top of the Wikipedia page.
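As a rough illustration of the usual steps (window a frame, take its power spectrum, sum it through triangular mel-spaced filters, take the log, then apply a DCT), here is a minimal pure-Python sketch. The specifics are assumptions on my part rather than anything fixed by the definition: the `2595 * log10(1 + f/700)` mel formula, the Hamming window, 20 filters and 13 coefficients are just common conventions, and the function names are made up for this example. It uses a naive DFT to stay self-contained; real code would use an FFT library.

```python
import cmath
import math


def hz_to_mel(f):
    # One common mel-scale formula (an assumption; other variants exist).
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def power_spectrum(frame):
    """Power spectrum (bins 0..N/2) of a Hamming-windowed frame, naive DFT."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        s = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(windowed))
        spectrum.append(abs(s) ** 2)
    return spectrum


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose edges are equally spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2)
    mel_points = [top * i / (n_filters + 1) for i in range(n_filters + 2)]
    # Filter edges as fractional DFT bin positions.
    edges = [mel_to_hz(m) * n_fft / sample_rate for m in mel_points]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = edges[j - 1], edges[j], edges[j + 1]
        row = []
        for k in range(n_fft // 2 + 1):
            if lo < k <= mid:
                row.append((k - lo) / (mid - lo))   # rising slope
            elif mid < k < hi:
                row.append((hi - k) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def mfcc(frame, sample_rate, n_filters=20, n_coeffs=13):
    """MFCCs of a single frame: log mel-band energies followed by a DCT-II."""
    spectrum = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    # Small floor avoids log(0) for empty bands.
    log_energies = [math.log(sum(w * p for w, p in zip(row, spectrum)) + 1e-10)
                    for row in bank]
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / n_filters)
                for j, e in enumerate(log_energies))
            for c in range(n_coeffs)]


if __name__ == "__main__":
    sr = 8000
    # A 220 Hz tone with four harmonics, 256 samples long.
    tone = [sum(math.sin(2 * math.pi * 220 * h * i / sr) / h for h in (1, 2, 3, 4))
            for i in range(256)]
    print(mfcc(tone, sr))
```

In this sketch, scaling the input amplitude adds the same constant to every log band energy, so only coefficient 0 changes (up to the small floor added before the logarithm). That is the sense in which loudness is factored out.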