views:

215

answers:

5

Alright, I know this is going to sound bad, like I'm going to use this for un-ethical things, but you have my word that I am not.

I am writing a paper for my Computer and Information Security course and the topic I chose was hashing methods. One of the points that I go over in my paper is MD5 being only one-way and the only way to crack an MD5 hash is to continuously make strings and use an MD5 function, then compare it with the hash you want to crack.

I would like to build a really simple mock-up program to show alongside my paper (we do a presentation and this would be an awesome thing to have), so I wanted to work out an algorithm that makes a string with every possible character combination up to 8 characters. For example the output will be:

a, b, c, ..., aa, ab, ac, ... ba, bb, bc etc etc etc.

It need to include letters, numbers and symbols if possible.

I got partly through the algorithm for this, but unfortunately my programming skills are not up to the task. If anyone can provide a complete algorithm for this I'd be extremely thankful.

Again, if you think I'm a liar and I'm going to use this for hacking purposes you don't have to leave an answer.

Thank you. :)

+3  A: 

To implement this i would probably encode integers to base 36 (or more if you wanted symbols).

1 = 1 2 = 2 ... a = 10 b = 12 ..

and so on.

then you would have a number, like 38 and do some divisions, ie:

38/36 = 1 remaider 2 = 12 in base 36

then just run a for loop to your max number you want to encode, something very large and output your encoded numbers.

just for fun i wrote this for you: http://pastebin.antiyes.com/index.php?id=327

John Boker
Ah! This certainly is a interesting way to do it. I'll check back later to see if anyone else comes up with something, but this seems very likely. Thanks :)
Andy
+1  A: 

It is not true that "the only way to crack an MD5 hash" is to generate every possible string and look for collisions. In fact, if you have access to the original it is possible to modify it so that its MD5 matches that of another file you can create. This is described in a paper at infosec.edu.

Even if you cannot modify the original file, rainbow tables of MD5 checksums exist which can be used to generate collisions.

These facts make MD5 unsuitable for passwords or cryptography, and in fact the U.S. government has forbidden its continued use for secure applications.

Dour High Arch
Not true. The vulnerability of MD5 you describe is to a collision attack, which requires both the original and the impostor message to be under the control of an attacker. This makes MD5 unsuitable for some applications but fine for others. MD5 is not at this time known to have any practical preimage attack, which is what is required if you are *given* an arbitrary file not under your control, and you wish to create an impostor plaintext which matches the arbitrary file's MD5 hash. http://www.vpnc.org/hash.html
Jason S
Or if you are in fact saying the same thing I am, you should make it clearer. (anyway the paper cited is no longer available online, apparently.)
Jason S
That is indeed what I meant, apologies if it was not clear. And I have fixed the URL.
Dour High Arch
+6  A: 

In Python, itertools.product does almost all you require -- though it does it for just one "number of repeats", so you'll have to iterate from 1 to 8 (not hard;-). In essence:

import itertools
import string

# whatever you wish as alphabet (lower/upper, digits, punct, &c)
myalphabet = string.ascii_lowercase + string.ascii_digits

def prods(maxlen, alphabet=myalphabet):
  for i in range(1, maxlen+1):
    for s in itertools.product(alphabet, repeat=i):
      yield ''.join(s)

Of course, for an alphabet of length N and K repetitions (8 in your case) this does produce N + N^2 + ... + N^K possibilities (2,901,713,047,668 possibilities for N=36 and K=8), but, what's a few trillion outputs among friends!-)

Alex Martelli
A: 

If you already have access to the hashed version of the password, then MD5 is broken to begin with. That said, when it comes to breaking a hashed value, you'd likely be better off using Rainbow Tables, Dictionary Attacks, and Social Engineering over your brute force method. That said, since you asked for an algorithm to generate all the values, maybe the following will be beneficial (C#):

using System;
using System.Text;

namespace PossibiltyIterator
{
  class Program
  {
    static readonly char[] Symbols = {
      'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 
      'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 
      'I', 'J', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', 
      '1', '2', '3', '4', '5', '6', '7', '8', '9', '0', '!', '@', '#', '$', '%', '^', '&', 
      '*', '(', ')', '-', '_', '+', '=', '/', '\\', '[', ']', '{', '}', ';', ':', '\'', '"', 
      ',', '.', '<', '>', '?', '`', '~'
    };

    const int MaxLength = 8;

    static void BuildWord(int currentLength, int desiredLength, char[] word)
    {
      if (currentLength == desiredLength)
      {
        Console.WriteLine(word);
      }
      else
      {
        for (int value = 0; value < Symbols.Length; ++value)
        {
          word[currentLength] = Symbols[value];
          BuildWord(currentLength + 1, desiredLength, word);
        }
      }
    }

    static void Main(String[] args)
    {
      double totalValues = (Math.Pow(Symbols.Length, MaxLength + 1) - Symbols.Length)/(Symbols.Length - 1);
      Console.WriteLine("Warning! You are about to print: {0} values", totalValues);
      Console.WriteLine("Press any key to continue...");
      Console.ReadKey(true /* intercept */);

      for (int desiredLength = 1; desiredLength <= MaxLength; ++desiredLength)
      {
        BuildWord(0 /* currentLength */, desiredLength, new char[MaxLength]);
      }
    }

  }
}

To be completely honest, this can be optimized further. Because it builds all the "words" of length 1, then does that work a second time in building the words of length 2. It would be smarter to build the words of length MaxLength, then truncate one letter to build a word of MaxLength-1.

Here is the optimized version... note that it does NOT return the words in the order originally requested.

using System;
using System.Text;

namespace PossibiltyIterator
{
  class Program
  {
    static readonly char[] Symbols = {
      'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 
      'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 
      'I', 'J', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', 
      '1', '2', '3', '4', '5', '6', '7', '8', '9', '0', '!', '@', '#', '$', '%', '^', '&', 
      '*', '(', ')', '-', '_', '+', '=', '/', '\\', '[', ']', '{', '}', ';', ':', '\'', '"', 
      ',', '.', '<', '>', '?', '`', '~'
    };

    const int MaxLength = 8;

    static void BuildWord(int currentLength, int desiredLength, char[] word)
    {
      if (currentLength != desiredLength)
      {
        for (int value = 0; value < Symbols.Length; ++value)
        {
          word[currentLength] = Symbols[value];
          BuildWord(currentLength + 1, desiredLength, word);
        }
        word[currentLength] = '\0';
      }

      Console.WriteLine(word);
    }

    static void Main(String[] args)
    {
      double totalValues = (Math.Pow(Symbols.Length, MaxLength + 1) - Symbols.Length)/(Symbols.Length - 1);
      char[] word = new char[MaxLength];

      Console.WriteLine("Warning! You are about to print: {0} values", totalValues);
      Console.WriteLine("Press any key to continue...");
      Console.ReadKey(true /* intercept */);

      BuildWord(0 /* currentLength */, MaxLength, new char[MaxLength]);
    }

  }
}
Rob Rolnick
Would this code produce a stack overflow?
John Boker
The recursive stack should never exceed MaxLength (well maybe MaxLength+1), so that would be unlikely.
Rob Rolnick
Your comment about "MD5 is broken to begin with" is incorrect. Attacks on MD5 are collision attacks, not preimage attacks, and require an attacker to create both of the plaintext documents. See first paragraph of the "Vulnerability analysis" of the webpage you cite, or see http://vpnc.org/hash.html
Jason S
I stand by what I said, especially given what was posted by @Dour High Arch. "if you have access to the original it is possible to modify it so that its MD5 matches that of an arbitary file. This is described in a paper at infosec.edu."
Rob Rolnick
A: 

To complete the post with a Java example which will print out the Base64 encoded MD5's of all possible character combinations using only 0-9 and a-z characters:

MessageDigest digest = MessageDigest.getInstance("MD5");
int i = 0;
while (true)
{
    String raw = Integer.toString(i, Character.MAX_RADIX);
    byte[] md5 = digest.digest(raw.getBytes());
    String base64 = new BigInteger(1, md5).toString(16);
    System.out.println(raw + " = " + base64);
    i++;
}
mhaller