Requirements: I have a Python project which parses data feeds from multiple sources in varying formats (Atom, valid XML, invalid XML, CSV, almost-garbage, etc.) and inserts the resulting data into a database. The catch is that the information required to parse each feed must also be stored in the database.

Current solution: My previous solution was to store small Python scripts which are eval'ed against the raw data and return a data object for the parsed data. I'd really like to get away from this method, as it obviously opens up a nasty security hole.

Ideal solution: What I'm looking for is what I'd describe as a template-driven feed parser for Python: I would write a template file for each of the feed formats, and this template file would be used to make sense of the various data formats.

I've had limited success finding something like this in the past, and was hoping someone may have a good suggestion.

Thanks everyone!

+1  A: 

Instead of eval'ing scripts, maybe you should consider making a package of them? Parsing CSV is one thing (the format is simple and regular); parsing XML requires a completely different approach. Considering you don't want to write every single parser from scratch, why not write a bunch of small modules that all expose an identical API, and use them? I believe using Python itself (not some templating DSL) is ideal for this sort of thing.

For example, this is an approach I've seen in one small torrent-fetching script I'm using:

Main program:

...
def import_plugin(name):
    # __import__('a.b.c') returns the top-level package 'a',
    # so walk the dotted path to reach the leaf module.
    mod = __import__(name)
    components = name.split('.')
    for comp in components[1:]:
        mod = getattr(mod, comp)
    return mod

...
feed_parser = import_plugin('parsers.%s' % feed['format'])
data = feed_parser(...)
...
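Incidentally, the dotted-attribute walk in `import_plugin` is exactly what the standard library's `importlib.import_module` does, so the helper can shrink to a one-liner. A minimal sketch (the `parsers.atom` name in the usage comment is an illustrative assumption):

```python
import importlib

def import_plugin(name):
    # importlib resolves dotted module paths and returns the
    # leaf module, so no manual getattr() walk is needed.
    return importlib.import_module(name)

# Usage, assuming a parsers/ package containing an atom.py module:
# feed_parser = import_plugin('parsers.atom')
# data = feed_parser.parse_feed(raw_data)
```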

parsers/csv.py:

#!/usr/bin/python
from __future__ import absolute_import

import urllib2
import csv

def parse_feed(...):
    ...
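To make the stub concrete, here is one possible `parse_feed` body for the CSV case. It is a sketch, not the answer's actual code: it operates on already-fetched text rather than a URL, and the column names in the example (`title`, `link`) are illustrative assumptions about what a feed might contain.

```python
import csv
import io

def parse_feed(raw_text):
    """Parse CSV feed text into a list of dicts, one per row.

    Assumes the first row is a header; the keys of each dict are
    whatever column names the feed provides.
    """
    reader = csv.DictReader(io.StringIO(raw_text))
    return [dict(row) for row in reader]
```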

If you don't particularly like dynamically loaded modules, you may consider writing, for example, a single module containing several parser classes (probably derived from some "abstract parser" base class).

class BaseParser(object):
    ...

class CSVParser(BaseParser):
    ...
register_feed_parser(CSVParser, ['text/plain', 'text/csv'])
...

parsers = get_registered_feed_parsers(feed['mime_type'])
data = None
for parser in parsers:
    try:
        data = parser(feed['data'])
        if data is not None: break
    except ParsingError:
        pass
...
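The snippet above leaves `register_feed_parser` and `get_registered_feed_parsers` undefined; a minimal registry can be a dict keyed by MIME type. The function names follow the snippet, but everything else here is a sketch:

```python
# Maps a MIME type string to the list of parsers registered for it.
_registry = {}

def register_feed_parser(parser, mime_types):
    # Register one parser under each of its supported MIME types,
    # preserving registration order so callers can try them in turn.
    for mime in mime_types:
        _registry.setdefault(mime, []).append(parser)

def get_registered_feed_parsers(mime_type):
    # Return all parsers for this MIME type (possibly an empty list).
    return list(_registry.get(mime_type, []))
```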
drdaeman
Thanks drdaeman, I really like that solution and may end up using it. The only place where it falls short is that the parsing scripts need to be stored in a database. The reason for the database requirement is that an administrator of this site would ideally be able to create and manage these parsing scripts (there are dozens of them) in a web interface, but even though administrators are trusted users, it's still undesirable to have them enter code that ends up getting eval'ed. I think it will come down to creating a new module or going with your suggestion. Thanks again!
Jon Biddle
Thanks. If the code needs to be accessible by end users, then maybe I was wrong, and creating a DSL or a sandbox that allows access only to trusted Python modules and operations is the way to go. Unfortunately I haven't developed anything like that, so I don't have many ideas. Maybe this link will be useful, though: http://pypi.python.org/pypi/RestrictedPython/
drdaeman
Thanks again for your advice. I may end up going with eval within the RestrictedPython sandbox. Alternatively, if I feel ambitious, I might try creating a Python module to do this.
Jon Biddle