input: English plaintext (A-Z) encrypted with a randomly generated substitution cipher.
output: key
ideas:
read the whole text, store the frequency of each character/bigram/trigram in arrays, and compare them to:
http://en.wikipedia.org/wiki/Letter_frequencies
http://en.wikipedia.org/wiki/Bigram
http://en.wikipedia.org/wiki/Trigram
cons: letters/bigrams/trigrams with close frequencies (like "c" and "u") are hard to tell apart
my software should guess as many characters as possible from the encrypted text (which is at least 2000 characters long).
I have to guess at least 18-20 letters correctly.
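To make the idea concrete, here is a minimal Python sketch of the counting step and a first guess at the key by frequency rank. The names `ENGLISH_ORDER`, `ngram_counts` and `guess_key` are mine, purely for illustration, and the reference ranking is an approximate English letter order that varies slightly between sources. It assumes the text contains only A-Z, so bigrams and trigrams are counted across what used to be word boundaries.

```python
from collections import Counter

# Approximate English letter ranking, most to least frequent.
# Assumption: exact order differs slightly between reference tables.
ENGLISH_ORDER = "ETAOINSHRDLCUMWFGYPBVKJXQZ"
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def ngram_counts(text, n):
    """Count overlapping n-grams over the A-Z letters of text."""
    letters = [c for c in text.upper() if "A" <= c <= "Z"]
    return Counter(
        "".join(letters[i:i + n]) for i in range(len(letters) - n + 1)
    )


def guess_key(ciphertext):
    """Map each cipher letter to a plaintext guess by frequency rank."""
    counts = ngram_counts(ciphertext, 1)
    # Sort by descending count; ties are broken alphabetically so the
    # result is deterministic. Letters that never occur count as 0.
    ranked = sorted(ALPHABET, key=lambda c: (-counts[c], c))
    return dict(zip(ranked, ENGLISH_ORDER))


if __name__ == "__main__":
    sample = "XXXYYZ"  # toy input, far too short for real analysis
    key = guess_key(sample)
    print(key["X"], key["Y"], key["Z"])
    print(ngram_counts(sample, 2).most_common(3))
```

This only gives a starting key: letters with close frequencies will often be swapped, which is exactly the problem described in the cons above. The bigram/trigram counts from `ngram_counts(text, 2)` and `ngram_counts(text, 3)` are what I would use to correct those swaps.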
questions:
is there a way/known algorithm to guess all the characters, i.e. to recover the full key?
or can you give me some useful references or advice on how I could improve the guessing process?
Thank you in advance.