views:

216

answers:

3

How you could calculate size of brute force method dynamically? For example how many iterations and space would take if you printed all IPv6 addresses from 0:0:0:0:0:0:0:0 - ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff to file? The tricky parts are those when length of line varies. IP address is only example.

Idea is that you give the format and maximum lenghts of given parts. So if variable type is '%c' (char), and maxlen is 26 then iteration count is 26 and needed space in human format in text file is 26 + 26 (one char for separator)

def calculate(format, rules):

  end = format

  for i in rules:
    (vartype, maxlen) = rules[i]
    end = end.replace(i, vartype % maxlen)


  start = format

  for i in rules:
    (vartype, maxlen) = rules[i]
    minlen = 0
    start = start.replace(i, vartype % minlen)

  start_bytes = len(start)
  end_bytes = len(end)

  # how to add for example IPv4 calculations
  #         0.0.0.0 - 9.9.9.9
  #     10.10.10.10 - 99.99.99.99
  # 100.100.100.100 - 255.255.255.255

  iterations = 0

  for i in rules:
    if format.find(i) is not -1:
      (vartype, maxlen) = rules[i]
      if iterations == 0:
        iterations = int(maxlen) + 1
      else:
        iterations *= int(maxlen) + 1

  iterations -= 1

  needed_space = 0

  if start_bytes == end_bytes:
    # +1 for separator (space / new line)
    needed_space = (1 + start_bytes) * iterations
  else:
    needed_space = "How to calculate?"

  return [iterations, needed_space, start, end, start_bytes, end_bytes]


if __name__ == '__main__':

  # IPv4
  print calculate(
    "%a.%b.%c.%d",
    {
      '%a': ['%d', 255],
      '%b': ['%d', 255],
      '%c': ['%d', 255],
      '%d': ['%d', 255]
    },
  )

  # IPv4 zero filled version
  print calculate(
    "%a.%b.%c.%d",
    {
      '%a': ['%03d', 255],
      '%b': ['%03d', 255],
      '%c': ['%03d', 255],
      '%d': ['%03d', 255]
    },
  )


  # IPv6
  print calculate(
    "%a:%b:%c:%d:%e:%f:%g:%h",
    {
      '%a': ['%x', 65535],
      '%b': ['%x', 65535],
      '%c': ['%x', 65535],
      '%d': ['%x', 65535],
      '%e': ['%x', 65535],
      '%f': ['%x', 65535],
      '%g': ['%x', 65535],
      '%h': ['%x', 65535]
    },
  )

  # days in year, simulate with day numbers
  print calculate(
    "ddm%a", #ddmmyy
    {
      '%a': ['%03d', 365],
    },
  )

So for example:

  • 1.2.3.4 takes 7 bytes
  • 9.9.9.10 takes 8 bytes
  • 1.1.1.100 takes 9 bytes
  • 5.7.10.100 takes 10 bytes
  • 128.1.1.1 takes 9 bytes
  • and so on


Example 0.0.0.0 - 10.10.10.10:

  iterations = 0
  needed_space = 0

  for a in range(0, 11):
    for b in range(0, 11):
      for c in range(0, 11):
        for d in range(0, 11):
          line = "%d.%d.%d.%d\n" % (a, b, c, d)
          needed_space += len(line)
          iterations += 1

  print "iterations: %d needed_space: %d bytes" % (iterations, needed_space)

iterations: 14641 needed_space: 122452 bytes

Into

  print calculate(
    "%a.%b.%c.%d",
    {
      '%a': ['%d', 10],
      '%b': ['%d', 10],
      '%c': ['%d', 10],
      '%d': ['%d', 10]
    },
  )

Result: [14641, 122452]

+4  A: 

Using combinatorics and discrete math:

The IPv4 address space is 256*256*256*256 = 2^32 = 4,294,967,296 addresses.

IPv6 has 2^128 addresses (8 groups of 16*16*16*16).

An IPv4 address use 32 bits, so 32 bits * 4,294,967,296 addresses = 16 gigabytes if stored e.g. on disk.

An IPv6 address uses 128 bits, so 128 bits * 2^128 addresses = 5.07 * 10^30 gigabytes.

lost-theory
But this doesn't tell exactly how much space it would take in human format, only binary.
raspi
A: 
Greg Hewgill
Yeah, the trick is that one line is 15-40 characters and not fixed length :)
raspi
A: 

Breaking it up into components is probably the way to go; for IPv4 you've got four parts, separated by three dots, and each part can be 1, 2 or 3 characters (0-9, 10-99, 100-255) long. So your combinations are:

comp_length = {1: 10, 2: 90, 3: 156}

You can work out the total lengths by iterating through each combo:

def ipv4_comp_length(n=4):
    if n == 1:
        return comp_length
    res = {}
    for rest, restcnt in ipv4_comp_length(n-1).iteritems():
        for first, firstcnt in comp_length.iteritems():
             l = first + 1 + rest # "10" + "." + "0.0.127"
             res[l] = res.get(l,0) + (firstcnt * restcnt)
    return res

print sum( l*c for l,c in ipv4_comp_length().iteritems() )

Note that this doesn't take record separators into account (the space in "1.2.3.4 1.2.3.5").

Extending to IPv6 should be mostly straightforward -- unless you want to cope with abbreviated IPv6 addresses like 0::0 == 0:0:0:0:0:0:0:0.

Anthony Towns
IPv6 truncation can't be there because it should be able to calculate anything.
raspi