ansaurus

Question

Answer 1

+4 A:

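import urllib.request

# Fetch a random page through Yahoo's random-link redirect; read() returns the raw bytes.
html = urllib.request.urlopen('http://random.yahoo.com/bin/ryl').read()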

I think that pulling a random page is much easier to implement and will be far more random than anything you could program yourself. Any program designed to produce random pages will still have to adhere to whatever rules define the structure of HTML. Since humans are much better at breaking rules than machines, a random page from the web is more likely to contain structures you won't get from a randomizer.

You don't have to use Yahoo. There are probably other random link generators, or you could build your own.
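As a rough idea of what "build your own" could look like, here is a minimal sketch that picks a URL at random from a seed list and fetches it. The seed list and the names `SEED_URLS`, `pick_random_url` and `fetch_random_page` are hypothetical, made up for illustration; the sketch assumes Python 3.

```python
import random
import urllib.request

# Hypothetical seed list; replace with URLs you actually want to sample from.
SEED_URLS = [
    'http://example.com/',
    'http://example.org/',
    'http://example.net/',
]

def pick_random_url(urls, rng=random):
    """Return one URL chosen uniformly at random from a non-empty list."""
    return rng.choice(urls)

def fetch_random_page(urls=SEED_URLS, rng=random, timeout=10):
    """Fetch the raw bytes of a randomly chosen seed URL (needs network access)."""
    url = pick_random_url(urls, rng)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()
```

Calling `fetch_random_page()` returns the page as bytes. Passing a `random.Random(seed)` instance as `rng` makes the choice repeatable, which may help when a test fails on one particular page.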

mikerobi 2010-05-08 20:07:07

+1: Alternate response

sixtyfootersdude 2010-05-08 20:27:40

not random enough :)

karramba 2010-05-11 19:14:21

Answer 2

+1 A:

It's quite easy to roll your own random HTML generator that looks very much like a top-down parser: each function emits one rule of the grammar and calls the functions for its sub-rules. Here's a base!

import random
import string

def RandomHtml():
    # Top-level rule: one document wrapping a random body.
    yield '<html><body>'
    yield RandomBody()
    yield '</body></html>'

def RandomBody():
    # One section, then a 50% chance of recursing to add another.
    yield RandomSection()
    if random.randrange(2) == 0:
        yield RandomBody()

def RandomSection():
    # A heading followed by 5 to 19 sentences.
    yield '<h1>'
    yield RandomSentence()
    yield '</h1>'
    sentences = random.randrange(5, 20)
    for _ in range(sentences):
        yield RandomSentence()

def RandomSentence():
    # 5 to 14 random words, capitalized and ended with a period.
    words = random.randrange(5, 15)
    yield (' '.join(RandomWord() for _ in range(words)) + '.').capitalize()

def RandomWord():
    # 2 to 9 random lowercase letters.
    chars = random.randrange(2, 10)
    return ''.join(random.choice(string.ascii_lowercase) for _ in range(chars))

def Output(generator):
    # Print strings directly; recurse into nested generators.
    if isinstance(generator, str):
        print(generator)
    else:
        for g in generator:
            Output(g)

Output(RandomHtml())
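If you would rather get the document back as a single string than print it piece by piece, the same nested-generator idea can be flattened recursively. The following is only a sketch under assumptions: the names `flatten`, `random_word`, `random_paragraph`, `random_document` and `render` are hypothetical, the grammar is a cut-down stand-in for the one above, and a seeded `random.Random` is passed around so that a given seed always reproduces the same document.

```python
import random
import string

def flatten(node):
    """Recursively join strings and nested generators into one string."""
    if isinstance(node, str):
        return node
    return ''.join(flatten(child) for child in node)

def random_word(rng):
    # 2 to 9 random lowercase letters.
    length = rng.randrange(2, 10)
    return ''.join(rng.choice(string.ascii_lowercase) for _ in range(length))

def random_paragraph(rng):
    # One <p> element containing 5 to 14 random words.
    yield '<p>'
    yield ' '.join(random_word(rng) for _ in range(rng.randrange(5, 15)))
    yield '</p>'

def random_document(rng):
    # Top-level rule: 1 to 3 paragraphs inside <html><body>.
    yield '<html><body>'
    for _ in range(rng.randrange(1, 4)):
        yield random_paragraph(rng)
    yield '</body></html>'

def render(seed):
    """Build a whole document as a string from a fixed seed."""
    return flatten(random_document(random.Random(seed)))
```

`render(42)` returns the same string every time it is called, so a document that triggers a bug can be regenerated from its seed alone.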

Paul Hankin 2010-05-09 11:22:51

How to generate random html document
