The results of a correctly implemented DFT are essentially identical to the results of a correctly implemented FFT (they differ only by rounding errors). As others have pointed out here, the major difference is that of performance. DFT has O(n^2) operations while the FFT has O(nlogn) operations.
The best, most readable publication I have ever found (the one I still refer to) is The Fast Fourier Transform and its Applications by E Oran Brigham. The first few chapters provide a very thorough overview of the continuous and discrete forms of the Fourier Transform. He then uses that to develop the fast version of the DFT based on the Cooley-Tukey Algorithm for the radix-2 (n is a power of 2) and mixed-radix cases (though the latter being somewhat more shallow treatise than the former).
The basic approach in the radix-2 algorithm to perform a linear time operation on the input X and to recursively split the result in half and perform a similar linear time operation on the two halves. The mixed radix case is similar, though you need to divide X into equal portions each time, so it helps if n doesn't have any large prime factors.