As a first pass I would try computing the entropy of the color histogram of the image. Cartoon-like images should have fewer shades of different colors, and thus a lower entropy.
This is similar to what NawaMan proposed, but this method goes one step further. The ratio of distinct colors to pixels may not be enough on its own. There may be JPEG artifacts, for instance, that artificially increase the number of colors in the image, but only for a few pixels. In that case most pixels in the image would still use only a few colors, which would correspond to low entropy.
Let's say you start with an RGB image. For each pixel the R, G, and B values range from 0 to 255.
You can divide this range into n bins, where n could be 16 for example. Then you would count how many pixels fall into each of these n³ three-dimensional bins. Then you would need to divide the values of the bins by
the total number of pixels, so that your histogram sums up to 1. Then compute the entropy, which
is - sum_i p_i * log(p_i), where p_i is the value of the ith bin.
Try it with different values for n, and see if you can separate the real images from cartoons.
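Here is a minimal sketch of that idea in Python with NumPy and Pillow. The function name, bin counts, and the file names "photo.jpg" / "cartoon.png" are just placeholders for illustration, not part of any established API:

    import numpy as np
    from PIL import Image

    def color_histogram_entropy(path, n_bins=16):
        """Entropy of the 3-D color histogram of an RGB image.

        n_bins is the number of bins per channel, so the histogram
        has n_bins**3 cells in total.
        """
        img = np.asarray(Image.open(path).convert("RGB"))
        pixels = img.reshape(-1, 3)

        # Map each 0-255 channel value to a bin index in [0, n_bins).
        bin_idx = (pixels.astype(np.int64) * n_bins) // 256

        # Flatten the 3-D bin indices into a single index and count pixels per bin.
        flat_idx = (bin_idx[:, 0] * n_bins + bin_idx[:, 1]) * n_bins + bin_idx[:, 2]
        counts = np.bincount(flat_idx, minlength=n_bins ** 3)

        # Normalize to probabilities and compute -sum_i p_i * log(p_i),
        # skipping empty bins (a bin with p_i = 0 contributes nothing).
        p = counts / counts.sum()
        p = p[p > 0]
        return -np.sum(p * np.log(p))

    # Hypothetical usage: compare a photo against a cartoon for a few bin counts.
    for n in (8, 16, 32):
        print(n, color_histogram_entropy("photo.jpg", n),
                 color_histogram_entropy("cartoon.png", n))

If the separation between the two classes is clear, a simple threshold on the entropy value may already be enough.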