Requirements: I have a Python project which parses data feeds from multiple sources in varying formats (Atom, valid XML, invalid XML, CSV, almost-garbage, etc.) and inserts the resulting data into a database. The catch is that the information required to parse each feed must also be stored in the database.
Current solution: My current approach is to store small Python scripts in the database; each one is eval'ed on the raw data and returns a data object for the parsed data. I'd really like to get away from this method, as it opens up a nasty security hole: anything stored in the database can execute arbitrary code.
Ideal solution: What I'm looking for is what I would describe as a template-driven feed parser for Python: I would write one template file per feed format, and the parser would use that template to make sense of the corresponding data.
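To illustrate the kind of template meant here, below is a minimal sketch, not an existing library. It assumes a template is a small JSON document that only describes where fields live (so it is data, never executed code), and it covers just well-formed XML and CSV using the standard library. The function name `parse_feed` and the template keys `format`, `record`, and `fields` are hypothetical, invented for this example; invalid XML or near-garbage input would need a more lenient parser than the one used here.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET


def parse_feed(raw, template_json):
    """Parse raw feed text into a list of dicts, guided by a JSON template.

    The template is plain data, so storing it in the database does not
    allow arbitrary code execution the way an eval'ed script does.
    """
    template = json.loads(template_json)
    fields = template["fields"]  # output field name -> path or column name

    if template["format"] == "xml":
        root = ET.fromstring(raw)
        # "record" is an ElementTree path selecting one element per record;
        # each field path is resolved relative to that element.
        return [
            {name: record.findtext(path) for name, path in fields.items()}
            for record in root.iterfind(template["record"])
        ]

    if template["format"] == "csv":
        # The first CSV row is assumed to be a header row.
        reader = csv.DictReader(io.StringIO(raw))
        return [
            {name: row.get(column) for name, column in fields.items()}
            for row in reader
        ]

    raise ValueError("unsupported template format: %r" % template["format"])


# Hypothetical templates, as they might be stored in a database column.
XML_TEMPLATE = json.dumps(
    {"format": "xml", "record": ".//item",
     "fields": {"title": "title", "link": "link"}}
)
CSV_TEMPLATE = json.dumps(
    {"format": "csv", "fields": {"title": "name", "link": "url"}}
)
```

With templates like these, `parse_feed(raw_text, XML_TEMPLATE)` would return one dict per `<item>` element, and the same call with `CSV_TEMPLATE` would return one dict per CSV row, both with the keys `title` and `link`.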
I've had limited success finding something like this so far, and I'm hoping someone has a good suggestion.
Thanks everyone!