ansaurus

Question

Answer 1

A:

I think you're asking a bit much in wanting a substitution whose output is also "coherent". Deciding what text is coherent is an AI problem for the encryption algorithm. Also, the longer your text is, the harder it will be to produce a "coherent" result, quickly approaching the point where you need a "key" as long as the text you are encrypting, which defeats the purpose of encrypting it at all.

SoapBox 2008-12-07 21:52:20

Note, I explicitly DON'T want to /generate/ the text. Both sides must be /found/, written by some human other than myself. The point is to find a text that will decrypt to the wrong plaintext.

BCS 2008-12-08 00:17:17

Answer 2

+1 A:

There are 26! different substitution ciphers. That works out to a bit over 88 bits of choice:

>>> import math
>>> math.log(math.factorial(26), 2)
88.381953327016262

The entropy of English text is something like 2 bits per character at least. At that rate, 88 bits of key are used up after roughly 88 / 2 = 44 characters, so it seems to me you can't reasonably expect to find passages of more than 45-50 characters that are accidentally equivalent under substitution.
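The estimate above amounts to dividing the key's roughly 88 bits by the per-character entropy of the text. A minimal sketch of that division follows; the function name max_accidental_length and the 1.5 bits/char alternative are assumptions added for illustration, and the result is a rough back-of-the-envelope figure, not a rigorous bound.

```python
import math

# Bits needed to pick one of the 26! substitution alphabets.
key_bits = math.log2(math.factorial(26))

def max_accidental_length(bits_per_char):
    """Rough passage length at which the text's entropy uses up the key bits.

    bits_per_char is an assumed entropy rate for English; the answer
    above uses about 2.
    """
    return key_bits / bits_per_char

# 2.0 is the figure from the answer; 1.5 is only an illustrative alternative.
for rate in (2.0, 1.5):
    print(rate, round(max_accidental_length(rate), 1))
```

A lower assumed entropy rate gives a longer estimate, so the 45-50 character figure is sensitive to which rate you believe.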

For the large corpus, there's Project Gutenberg and Wikipedia, for a start. You can download a dump of all of the English Wikipedia's XML files from their website.
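One way to search such a corpus, sketched here as an assumption rather than anything either poster specified: reduce each candidate passage to a letter pattern (each letter replaced by the order of its first appearance). Two passages are equivalent under some substitution exactly when their patterns match, so grouping by pattern finds candidates without trying keys. The names pattern, group_equivalent and derive_key are hypothetical, and how you cut the corpus into passages is left open.

```python
from collections import defaultdict

def pattern(text):
    """Canonical form: letters become first-appearance indices, the rest is kept."""
    seen = {}
    out = []
    for ch in text.lower():
        if ch.isalpha():
            out.append(seen.setdefault(ch, len(seen)))
        else:
            out.append(ch)  # spaces and punctuation must line up too
    return tuple(out)

def group_equivalent(passages):
    """Return groups of distinct passages that share a letter pattern."""
    groups = defaultdict(set)
    for p in passages:
        groups[pattern(p)].add(p.lower())
    return [sorted(g) for g in groups.values() if len(g) > 1]

def derive_key(src, dst):
    """Substitution mapping src onto dst; assumes both have the same pattern."""
    return {a: b for a, b in zip(src.lower(), dst.lower()) if a.isalpha()}

print(group_equivalent(["deed", "noon", "book", "feet", "tree"]))
print(derive_key("noon", "deed"))
```

On real text the passages would probably be sliding windows of whole words, which is where the length limit discussed above starts to bite.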

Darius Bacon 2008-12-08 01:16:15

Wikipedia! Good to know! Also, I like your analysis. To make the project fun I would only need texts about 13-26 letters long, so 30-40 would be more than good enough.

BCS 2008-12-08 04:08:22

How to find "equivalent" texts?
