I'm quite new to Python and GAE. Can anyone provide some help or sample code for the following simple task? I managed to read a simple file and output it as a webpage, but I need some slightly more complicated logic. Here is the pseudocode:

  open file;
  for each album block in file {
    store first line as album title;
    for each song in block {
      store first line as song title;
      store second line as song URL;
    }
  }
  output the parsed data as JSON;

The file format will be something like this

Album title1
song1 title
song1 url
song2 title
song2 url

Album title2
song1 title
song1 url
song2 title
song2 url
..

+3  A: 

Here's a generator-based solution with a few nice features:

  • Tolerates multiple blank lines between albums in text file
  • Tolerates leading/trailing blank lines in text file
  • Uses only an album's worth of memory at a time
  • Demonstrates a lot of neato things you can do with Python :)

albums.txt

Album title1
song1 title
song1 url
song2 title
song2 url

Album title2
song1 title
song1 url
song2 title
song2 url

Code

from django.utils import simplejson

def gen_groups(lines):
   """ Returns contiguous groups of lines in a file """

   group = []

   for line in lines:
      line = line.strip()
      if not line and group:
         yield group
         group = []
      elif line:
         group.append(line)

   # yield the final album when the file does not end with a blank line
   if group:
      yield group


def gen_albums(groups):
   """ Given groups of lines in an album file, returns albums  """

   for group in groups:
      title    = group.pop(0)
      songinfo = zip(*[iter(group)]*2)
      # distinct loop names matter here: in Python 2 a list comprehension's
      # loop variables leak out and would overwrite the album title
      songs    = [dict(title=t, url=u) for t, u in songinfo]
      album    = dict(title=title, songs=songs)

      yield album


input = open('albums.txt')
groups = gen_groups(input)
albums = gen_albums(groups)

print simplejson.dumps(list(albums))

Output

[{"songs": [{"url": "song1 url", "title": "song1 title"}, {"url": "song2 url", "title": "song2 title"}], "title": "Album title1"},
{"songs": [{"url": "song1 url", "title": "song1 title"}, {"url": "song2 url", "title": "song2 title"}], "title": "Album title2"}]

Album information could then be accessed in Javascript like so:

var url = albums[1].songs[0].url;

Lastly, here's a note about that tricky zip line.
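
For readers without the linked note at hand, the idiom can be unpacked in a small sketch. The helper name `pairs` is hypothetical and not part of the answer's code; `list(...)` is added only so the result looks the same on Python 3, where `zip` is lazy.

```python
def pairs(items):
    """Group a flat sequence into consecutive 2-tuples."""
    it = iter(items)  # a single shared iterator
    # [it] * 2 is a list holding the SAME iterator twice, so zip pulls
    # two consecutive items from it for every tuple it builds.
    return list(zip(*[it] * 2))


lines = ['song1 title', 'song1 url', 'song2 title', 'song2 url']
song_pairs = pairs(lines)
# song_pairs == [('song1 title', 'song1 url'), ('song2 title', 'song2 url')]

# A trailing item without a partner is silently dropped:
odd = pairs(lines + ['song3 title'])
# odd == song_pairs
```

The silent drop in the last two lines is why the input needs complete title/url pairs.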

Triptych
wow amazing upvoted, do I have to import anything for json.dumps to work in GAE?
erotsppa
Yeah just noted that GAE uses Python 2.5, which means you will have to import simplejson, which is nicely included with django, which is included with GAE :). I changed my code accordingly - should work as is.
Triptych
@erotsppa - a note about this solution: it DOES require that the input file is somewhat well-formed in that there are always complete song/url pairs.
Triptych
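
If the input cannot be trusted, one option is to check each group before pairing it up. This is only a sketch: `check_group` is a hypothetical helper, not part of the answer, and raising `ValueError` is one choice among several (skipping the album or padding the missing URL would also work).

```python
def check_group(group):
    """Return group unchanged, or raise ValueError if a song lacks its URL line."""
    # A well-formed group is one album title followed by title/url pairs,
    # so its length must be odd.
    if len(group) % 2 == 0:
        name = group[0] if group else '(empty group)'
        raise ValueError('incomplete song/url pair in album: %s' % name)
    return group
```

It could be slotted in between the two generators, for example as groups = (check_group(g) for g in gen_groups(input)).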
I just tried your code, I got a NameError: global name 'gen_groups' is not defined. I put your two methods inside my class MainHandler. Is this correct?
erotsppa
@erotsppa - putting those two methods in the global scope will probably fix your problem, but you may want to research scope in Python in general. You could also create a new module and dump those methods in there, then import the module as you would any other.
Triptych
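
The scope issue behind that NameError can be reproduced without GAE. In the sketch below every name is hypothetical: the two classes are plain stand-ins for a request handler, and `gen_upper` / `gen_lower` stand in for `gen_groups`.

```python
def gen_upper(lines):
    """Module-level generator: reachable by its bare name anywhere in this module."""
    for line in lines:
        yield line.upper()


class GoodHandler(object):  # stand-in for a request handler class
    def get(self):
        # The bare name is looked up in module (global) scope and found.
        return list(gen_upper(['a', 'b']))


class BadHandler(object):
    def gen_lower(self, lines):  # defined inside the class body
        for line in lines:
            yield line.lower()

    def get(self):
        # Names defined in a class body are not in scope inside its methods,
        # so the bare name raises NameError; self.gen_lower(...) would work.
        return list(gen_lower(['A', 'B']))
```

Calling GoodHandler().get() returns the uppercased lines, while BadHandler().get() fails with the same kind of NameError reported above.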
+1  A: 
from django.utils import simplejson

def albums(f):
  """ yields lists of strings which are the
     stripped lines for an album (blocks of
     nonblank lines separated by blocks of
     blank ones).
  """
  while True:
    # skip leading blank lines if any
    for line in f:
      line = line.strip()
      if line: break
    else:
      # EOF reached without finding another album
      return
    result = [line]
    # read up to next blank line or EOF
    for line in f:
      line = line.strip()
      if not line: break
      result.append(line)
    yield result

def songs(album):
  """ yields lists of 2 lines, one list per song.
  """
  for i in xrange(1, len(album), 2):
    yield (album[i:i+2] + ['??'])[:2]

result = dict()
f = open('thefile.txt')
for albumlines in albums(f):
  current = result[albumlines[0]] = []
  for songlines in songs(albumlines):
    current.append( {
      'songtitle': songlines[0],
      'songurl': songlines[1]
    } )

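# assumes this runs inside a webapp request handler method,
# where it would be written as self.response.out.write(...)
response.out.write(simplejson.dumps(result))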
Alex Martelli
Hmm - reading through the master's take now...
Triptych