The essence of your question seems to be about image quality. A considerable literature exists on the subject, and its upshot is that image quality is a hard thing to measure.
Standard mathematical error measures like the signal-to-noise ratio (SNR) and mean-squared error (MSE) give a quantitative answer, but it is well known that they don’t correlate well with subjective viewer opinions, which must be our final authority. Other methods, including those founded on psycho-visual models of the viewer (e.g., S.A. Karunasekera and N.G. Kingsbury, “A distortion measure for blocking artifacts in images based on human visual sensitivity,” IEEE Trans. on Image Proc., vol. 4, no. 6, June 1995, pp. 713–724; and M. Miyahara, K. Kotani, and V. R. Algazi, “Objective picture quality scale (PQS) for image coding,” IEEE Trans. on Comm., vol. 46, no. 9, Sept. 1998, pp. 1215–1226), have not proven themselves to be better than SNR.
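For concreteness, here is a minimal sketch of how MSE and a peak-SNR figure might be computed. It assumes 8-bit grayscale images stored as plain lists of rows; the function names `mse` and `psnr`, the `peak` parameter, and the use of the peak variant of SNR are my own choices for illustration.

```python
import math

def mse(original, distorted):
    """Mean-squared error between two equally sized images (lists of rows)."""
    total = 0.0
    count = 0
    for row_a, row_b in zip(original, distorted):
        for a, b in zip(row_a, row_b):
            total += (a - b) ** 2
            count += 1
    return total / count

def psnr(original, distorted, peak=255.0):
    """Peak signal-to-noise ratio in dB (assumes pixel values range up to `peak`)."""
    err = mse(original, distorted)
    if err == 0:
        return math.inf  # identical images: no noise at all
    return 10.0 * math.log10(peak ** 2 / err)

original = [[10, 20], [30, 40]]
distorted = [[12, 20], [30, 36]]
print("MSE :", mse(original, distorted))
print("PSNR:", psnr(original, distorted), "dB")
```

Both numbers are easy to compute, which is much of their appeal, but neither knows anything about where in the image the error sits or how visible it is.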
Moreover, when you vary the type of imagery (line drawing, cartoon, photo, portrait, etc.), certain types of compression distortion become more evident. Mosquito noise might be objectionable in one image, while staircase noise might be the culprit in another.
In short, there is no pat answer to your question, "what would result in best image quality?"
That being said, we can say some things about the DCT that are of relevance. The coefficients in the DCT of a block go from low variation to high variation in a zig-zag pattern from the top left corner [(0,0)->(0,1)->(1,0)->(2,0)->(1,1)->(0,2)->etc.], as your triangle selection mirrors. The closer a coefficient is to the top left corner, the smoother the information contained therein [in fact, the (0,0) DCT value is proportional to the average of the whole block], and the farther away from that corner you get, the more "high frequency" details you'll get. Coefficients near the top row or left column of the block mostly represent detail that varies purely horizontally or purely vertically, and the closer a coefficient is to the diagonal of the block, the more diagonal detail it represents.
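The zig-zag ordering and the special role of the (0,0) value can be sketched in a few lines. This is an illustration only: it assumes the orthonormal DCT-II scaling (other scalings change the constant), indexes coefficients as (row, column), and uses helper names (`zigzag_order`, `dct_matrix`, `dct2`) that are my own.

```python
import math

def zigzag_order(n):
    """Return the (row, col) positions of an n x n block in zig-zag order."""
    order = []
    for s in range(2 * n - 1):  # s = row + col identifies one anti-diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        # Even diagonals are walked with the row decreasing, odd ones with it increasing.
        if s % 2 == 0:
            rows = reversed(rows)
        order.extend((r, s - r) for r in rows)
    return order

def dct_matrix(n):
    """Orthonormal DCT-II matrix: row u is the u-th cosine basis vector."""
    return [[math.sqrt((1.0 if u == 0 else 2.0) / n)
             * math.cos((2 * x + 1) * u * math.pi / (2 * n))
             for x in range(n)] for u in range(n)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dct2(block):
    """2-D DCT of a square block, computed as C * block * C^T."""
    c = dct_matrix(len(block))
    c_t = [list(row) for row in zip(*c)]
    return matmul(matmul(c, block), c_t)

print(zigzag_order(8)[:6])

block = [[r * 8 + c for c in range(8)] for r in range(8)]
mean = sum(sum(row) for row in block) / 64
dc = dct2(block)[0][0]
# With this scaling the (0,0) coefficient of an N x N block is N times the block mean.
print(dc, 8 * mean)
```

The first six positions printed match the sequence quoted above, and the last line shows the (0,0) coefficient agreeing with the scaled block average up to rounding.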
In brief, lossy compression usually entails throwing away some of the "details" that may not be perceptible to the eye. (Throwing away the "smoother" DCT values results in severe distortion.) The more DCT values you throw away, the greater your compression ratio will be, but also the greater distortion you'll induce.
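To see the trade-off in numbers, the sketch below transforms one 8x8 block, zeroes all but a chosen set of coefficients, inverts the transform, and measures the MSE. Everything here is an assumption made for illustration: the synthetic test block, the orthonormal DCT-II scaling, and the helper names. Real codecs quantize coefficients rather than simply zeroing them.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II matrix: row u is the u-th cosine basis vector."""
    return [[math.sqrt((1.0 if u == 0 else 2.0) / n)
             * math.cos((2 * x + 1) * u * math.pi / (2 * n))
             for x in range(n)] for u in range(n)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def dct2(block):
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))

def idct2(coeffs):
    # The matrix is orthonormal, so its transpose is its inverse.
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)

def zigzag_order(n):
    """(row, col) positions of an n x n block, smoothest coefficients first."""
    order = []
    for s in range(2 * n - 1):
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:
            rows = reversed(rows)
        order.extend((r, s - r) for r in rows)
    return order

def keep_only(coeffs, positions):
    """Zero every coefficient whose (row, col) is not in `positions`."""
    keep = set(positions)
    return [[v if (r, c) in keep else 0.0 for c, v in enumerate(row)]
            for r, row in enumerate(coeffs)]

def mse(a, b):
    n = len(a) * len(a[0])
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)) / n

# Synthetic block: a smooth ramp plus a small checkerboard of fine detail.
block = [[100 + 10 * r + 5 * c + (3 if (r + c) % 2 else -3) for c in range(8)]
         for r in range(8)]
coeffs = dct2(block)
order = zigzag_order(8)

# Keep only the first k coefficients in zig-zag order.
errors = {k: mse(block, idct2(keep_only(coeffs, order[:k]))) for k in (64, 32, 16, 4)}
for k, err in errors.items():
    print(f"keep {k:2d} of 64 coefficients -> MSE {err:.2f}")

# Do the opposite: discard the 10 smoothest coefficients and keep the rest.
mse_drop_smooth = mse(block, idct2(keep_only(coeffs, order[10:])))
print(f"drop the 10 smoothest coefficients -> MSE {mse_drop_smooth:.2f}")
```

On this block the error grows as fewer coefficients are kept, and discarding the smooth coefficients does far more damage than discarding the detailed ones, even though many more values are retained.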
As for block size, it all depends. The more variance and detail there is in a block, the more you'll lose by throwing away coefficients. Some compression algorithms adaptively use different block sizes within the same image so that high-detail regions receive more and smaller blocks and smooth regions receive fewer and larger blocks.
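One simple way to picture adaptive block sizes is a quadtree split driven by local variance. The sketch below is a toy version of that idea, not how any particular codec does it: the function names, the variance criterion, and the `min_size` and `threshold` values are all hypothetical.

```python
def variance(image, top, left, size):
    """Variance of the pixel values in one square region of the image."""
    values = [image[r][c]
              for r in range(top, top + size)
              for c in range(left, left + size)]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def split_blocks(image, top, left, size, min_size=4, threshold=25.0):
    """Recursively split a square region into quadrants while it is 'busy'.

    Returns a list of (top, left, size) tuples. `min_size` and `threshold`
    are hypothetical tuning knobs chosen for this example.
    """
    if size <= min_size or variance(image, top, left, size) <= threshold:
        return [(top, left, size)]
    half = size // 2
    blocks = []
    for dr in (0, half):
        for dc in (0, half):
            blocks += split_blocks(image, top + dr, left + dc, half,
                                   min_size, threshold)
    return blocks

# 16x16 test image: flat gray everywhere except a checkerboard in the top-left 8x8.
image = [[(255 if (r + c) % 2 else 0) if (r < 8 and c < 8) else 128
          for c in range(16)] for r in range(16)]

blocks = split_blocks(image, 0, 0, 16)
for top, left, size in blocks:
    print(f"block at ({top:2d},{left:2d}) size {size}x{size}")
```

Here the detailed quadrant ends up covered by four 4x4 blocks while each flat quadrant stays a single 8x8 block, which is the behavior described above in miniature.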
For algorithms that use a single block size, 8x8 is the most common choice (baseline JPEG and MPEG-1/2 both use 8x8 DCT blocks), and 16x16 and 32x32 are also used. A fixed block size takes less processing than an adaptive scheme, but the quality at a given compression ratio will generally be lower.