How can I determine the statistical randomness of a binary string?
Ergo, how can I code my own test, and return a single value that corresponds to the statistical randomness, a value between 0 and 1.0 (0 being not random, 1.0 being random)?
The test would need to work on binary strings of any size.
When you do it with pen and paper, you might explore strings like this:
0 (arbitrary randomness, the only other choice is 1)
00 (not random, its a repeat and matches the size)
01 (better, two different values)
010 (less random, palindrome)
011 (less random, more 1's, still acceptable)
0101 (less random, pattern)
0100 (better, less ones, but any other distribution causes patterns)
Case examples:
Size: 1, Possibilities: 2
0: 1.0 (random)
1: 1.0 (random)
Size: 2, P:4
00: ?
01: 1.0 (random)
10: 1.0 (random)
11: ?
S:3, P:8
000: ? non-random
001: 1.0 (random)
010: ? less random
011: 1.0 (random)
100: 1.0 (random)
101: ? less random
110 1.0 (random)
111: ? non-random
And so on.
I feel that this may play a lot into breaking the string into all possible substrings and comparing frequencies, but it seems like this sort of groundwork should already have been done in the early days of computer science.