Just off the top of my head: what if you compare each item's % occurrence against the % it would have if all items had an equal number of occurrences?
In your example above:
John, John, Jon, Jonny
50% John
25% Jon
25% Jonny
33.3% Normal? (I'm making up a word because I don't know what to call this. 3 items: 100%/3)
John's score = 50% - 33.3% = 16.7%
John, John, Jon, Jon
50% John
50% Jon
50% Normal (2 items, 100%/2)
John's score = 50% - 50% = 0%
If you had [John, John, John, Jon, Jon], then John's score would be 60% - 50% = 10%, which is lower than the first case but higher than the 2nd (hopefully that's the desired result; otherwise you'll need to clarify what the desired results should be).
In your first case [John, John, John, John, Jon] you'd get 80%-50% = 30%
For [John, John, John, John, Jon, Jonny] you'd get 66.7% - 33.3% ≈ 33.3% (exactly 2/3 - 1/3 = 1/3).
That may or may not be what you want.
Where the above might factor in more is if you had John*97 + Jon + Jonny + Johnny: that would give you 97% - 25% = 72%, but John*99 + Jon would only give you a score of 99% - 50% = 49%.
You'd need to figure out how you want to handle the degenerate case of them all being the same; otherwise you'd get a score of 100% - 100% = 0% for that, which is probably not what you want.
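For what it's worth, here's a rough Python sketch of the un-normalized score described above. The name `raw_score` is just something I made up, and it assumes a non-empty list of hashable values (strings here). It also scores the single most common value, which is how the examples above treat John.

```python
from collections import Counter

def raw_score(items):
    """Share of the most common value minus the equal-share baseline.

    The baseline is 100% divided by the number of distinct values,
    i.e. the share each value would have if all occurred equally often.
    Assumes `items` is non-empty.
    """
    counts = Counter(items)
    top_share = max(counts.values()) / len(items)  # e.g. 2/4 = 50% for John
    baseline = 1 / len(counts)                     # e.g. 1/3 = 33.3% for 3 distinct values
    return top_share - baseline

# Scores are fractions, so multiply by 100 for percentages.
print(raw_score(["John", "John", "Jon", "Jonny"]))  # about 0.167
print(raw_score(["John", "John", "Jon", "Jon"]))    # 0.0
print(raw_score(["John", "John", "John"]))          # 0.0, the degenerate all-same case
```

Note that the all-same list comes out as 0 here, the same as a perfectly even split, which is the case you'd have to decide how to handle.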
EDIT (okay I made lots of edits, but this one isn't just more examples :p)
To normalize the results, take the score as calculated above and divide it by the maximum possible score for that number of distinct values. (Okay, that sounds way more complicated than it needs to; example time.)
Example:
[John, John, Jon, Jonny]: 50% - 33.3% = 16.7%. That's the previous score, but with 3 distinct values the upper limit of your score would be 100% - 33.3% = 66.7%, so if we take that into account, we have 16.7/66.7 = 25%
[John, John, Jon, Jon] gives (50-50)/50 = 0%
[John, John, John, Jon, Jon] gives (60-50)/50 = 20%
[John, John, John, John, Jon] gives (80-50)/50 = 60%
[John, John, John, John, Jon, Jonny] gives (66.7-33.3)/66.7 ≈ 50%
[John*97, Jon, Jonny, Johnny] gives (97-25)/75 = 96%
[John*99, Jon] gives (99-50)/50 = 98%
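
If it helps, the normalized version can be sketched in Python roughly like this. The name `normalized_score` is mine, and returning `None` when there are fewer than two distinct values is only one possible choice: the upper limit is 0 in that case, so the division is undefined and you still have to decide what you want there.

```python
from collections import Counter

def normalized_score(items):
    """(top share - baseline) / (100% - baseline), as a fraction in [0, 1).

    baseline = 1 / number of distinct values.
    Returns None for the degenerate case (empty list, or all items the
    same), where the upper limit is 0. That choice is an assumption.
    """
    counts = Counter(items)
    n_distinct = len(counts)
    if n_distinct < 2:
        return None  # upper limit would be 0, so the score is undefined
    top_share = max(counts.values()) / len(items)
    baseline = 1 / n_distinct
    upper_limit = 1 - baseline  # max possible score for this many distinct values
    return (top_share - baseline) / upper_limit

print(normalized_score(["John", "John", "Jon", "Jonny"]))  # about 0.25
print(normalized_score(["John"] * 99 + ["Jon"]))           # about 0.98
print(normalized_score(["John", "John"]))                  # None
```

If you'd rather treat "everything is identical" as the strongest possible result, you could return 1.0 instead of `None` in that branch.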