I'm just getting back into coding after a hiatus of a few years, and I'm trying to model multi-tiered static forms in a way that lets me grab and perform operations on a specific form level or an entire sub-tree.

Example Form hierarchy:

  • MyForm
    • Question 1
    • Part 1
      • Question 1.1
    • Part 2
      • Question 2.1
      • SubPart 1
        • Question 2.1.1
        • Question 2.1.2
    • Question 2

Each Question will have multiple attributes (question text, whether it's a required field, etc.) and Questions can be at any level of the hierarchy.

I'd like to be able to do something like this:

>>> MyForm.getQuestionObjects()
[Question1, Question1_1, Question2_1, Question2_1_1, Question2_1_2, Question2]

>>> MyForm.Part2.getQuestionObjects()
[Question2_1, Question2_1_1, Question2_1_2]

and/or stuff like:

>>> # Get questions (return class members)
>>> MyForm.SubPart1.getQuestions()
(('2.1.1 text', otherAttributes), ('2.1.2 text', otherAttributes))

>>> # Get questions -- but replace an attribute on 2.1.2
>>> MyForm.Part2.getQuestions(replace_attr('Question_2_1_2', 'text', 'New text'))
(('2.1.1 text', otherAttributes), ('New text', otherAttributes))

I keep trying to do this with nested/inner classes, which are a big headache and not well-supported in Python. But even if I can figure out a solution using nested classes, I keep wondering whether there's a much better way of storing this form info to make it easier for non-coders to edit (probably a plain text template), and then loading the data at run-time, since it's static and I'll need it in memory quite often. The form data won't be updated more than, say, once per month. Regardless of how I store the data, I'd like to figure out a good data structure to represent, traverse, and operate on it.

  • Is there a way to make a tiered-attributes object like this?
  • Could I do something like multidimensional named tuples?
  • Any other ideas?

Thanks for any comments.

+1  A: 

I'd store such hierarchical data as XML on disk. You can use the xml.etree.ElementTree standard module to load such an XML file into a hierarchical data structure in Python, make changes to it, then save it back to a file. This way you don't have to bother with the actual data structure, since it is built by ElementTree automatically.

See xml.etree.ElementTree in the Python Manual. More information can be found here:

http://effbot.org/zone/element-index.htm

(There are other mature solutions in Python to load an XML file into various data structures. Just pick the one which is easiest to use for your task. Google is your friend. :-) )
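To make the load / change / save cycle concrete, here is a minimal sketch using only the standard library. The tag names, the `id` and `required` attributes, and the temporary file are all made up for illustration; they are not a layout anyone in this thread prescribed.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Hypothetical two-question form; tag and attribute names are assumptions.
SOURCE = """<form>
    <question id="q1" required="yes">Why?</question>
    <part name="Part 1">
        <question id="q1_1" required="no">Just guess</question>
    </part>
</form>"""

# Write the sample to a scratch file so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "form.xml")
with open(path, "w") as f:
    f.write(SOURCE)

# Load: ElementTree builds the hierarchy for us.
tree = ET.parse(path)
root = tree.getroot()

# Change: look up one question anywhere in the tree and edit it in place.
question = root.find(".//question[@id='q1_1']")
question.text = "New text"
question.set("required", "yes")

# Save, then reload to confirm the change survived the round trip.
tree.write(path)
reloaded = ET.parse(path).getroot()
texts = [q.text for q in reloaded.iter("question")]
```

Since the form data is static and rarely edited, parsing the file once at start-up and keeping `root` in memory is probably enough.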

fviktor
Thanks. I read up on lxml and am currently taking a look at http://codespeak.net/lxml/objectify.html which makes XML access behave like Python objects. Not sure if it'll fit all my needs but will have to play around.
David Marble
I agree. Trying lxml is a very good idea. I've used lxml a lot myself and it worked better than ElementTree from Python's standard library. This is especially true when it comes to XML namespace support.
fviktor
A: 

There's nothing headachey or ill-supported about nested classes in Python; it's just that they don't do anything. Don't expect to get a Java-inner-class-style link back to an owner instance automatically: nested classes are nothing but normal classes whose class object happens to be stored as an attribute of another class. They don't help you here.

Is there a way to make a tiered-attributes object like this?

Certainly, but you'd probably be better off extending Python's existing sequence classes to get the benefits of all the existing operations on them. For example, a form ‘part’ might simply be a list which also has a title:

class FormPart(list):
    def __init__(self, title, *args):
        list.__init__(self, *args)
        self.title= title
    def __repr__(self):
        return 'FormPart(%r, %s)' % (self.title, list.__repr__(self))

Now you can say form= FormPart('My form', [question, formpart...]) and access the questions and formparts inside it using normal list indexing and slicing.

Next, a question might be an immutable thing like a tuple, but perhaps you want the items in it to have nice property names. So add that to tuple:

import operator

class FormQuestion(tuple):
    def __new__(cls, title, details= '', answers= ()):
        return tuple.__new__(cls, (title, details, answers))
    def __repr__(self):
        return 'FormQuestion%s' % tuple.__repr__(self)

    title= property(operator.itemgetter(0))
    details= property(operator.itemgetter(1))
    answers= property(operator.itemgetter(2))

Now you can define your data like:

form= FormPart('MyForm', [
    FormQuestion('Question 1', 'Why?', ('Because', 'Why not?')),
    FormPart('Part 1', [
        FormQuestion('Question 1.1', details= 'just guess'),
    ]),
    FormPart('Part 2', [
        FormQuestion('Question 2.1'),
        FormPart('SubPart 1', [
            FormQuestion('Question 2.1.1', answers= ('Yes',)),
        ]),
    ]),
    FormQuestion('Question 2'),
])

And access it:

>>> form[0]
FormQuestion('Question 1', 'Why?', ('Because', 'Why not?'))
>>> form[1].title
'Part 1'
>>> form[2][1]
FormPart('SubPart 1', [FormQuestion('Question 2.1.1', '', ('Yes',))])

Now for your hierarchy-walking you can define on FormPart:

    def getQuestions(self):
        for child in self:
            for descendant in child.getQuestions():
                yield descendant

and on FormQuestion:

    def getQuestions(self):
        yield self

Now you've got a descendant generator returning FormQuestions:

>>> list(form[1].getQuestions())
[FormQuestion('Question 1.1', 'just guess', ())]
>>> list(form.getQuestions())
[FormQuestion('Question 1', 'Why?', ('Because', 'Why not?')), FormQuestion('Question 1.1', 'just guess', ()), FormQuestion('Question 2.1', '', ()), FormQuestion('Question 2.1.1', '', ('Yes',)), FormQuestion('Question 2', '', ())]
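The question also asked about named tuples and about swapping out an attribute while fetching questions. One possible variation on the classes above, sketched for Python 3.7+ (it relies on `yield from` and the `defaults=` argument of `namedtuple`): `Question`, `Part`, `walk` and the `overrides` mapping are hypothetical names, not part of the answer's code.

```python
from collections import namedtuple

# Hypothetical names throughout; this is a sketch, not the answer's API.
# namedtuple gives an immutable tuple with named fields and a _replace() method.
Question = namedtuple("Question", ["title", "details", "answers"],
                      defaults=("", ()))

class Part(list):
    """A list of Questions and nested Parts that also carries a title."""
    def __init__(self, title, children=()):
        super().__init__(children)
        self.title = title

def walk(node, overrides=None):
    """Yield every Question under node, depth first.

    overrides maps a question title to a dict of replacement field values.
    The stored questions are never modified; _replace() returns copies.
    """
    overrides = overrides or {}
    if isinstance(node, Question):
        yield node._replace(**overrides.get(node.title, {}))
    else:
        for child in node:
            yield from walk(child, overrides)

form = Part("MyForm", [
    Question("Question 1", "Why?", ("Because", "Why not?")),
    Part("Part 2", [
        Question("Question 2.1"),
        Part("SubPart 1", [
            Question("Question 2.1.1", answers=("Yes",)),
            Question("Question 2.1.2"),
        ]),
    ]),
])

part2 = form[1]
titles = [q.title for q in walk(part2)]
changed = list(walk(part2, {"Question 2.1.2": {"details": "New text"}}))
```

Because the questions are immutable, the override only affects what `walk` yields; the form held in memory stays as loaded.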
bobince
Thanks a ton for this. You got me thinking about extending the built-in data structures instead of trying to fit my problem to what's available already. Not sure I'm crazy about the particular format you came up with for defining the form, but it's a giant leap in opening up ideas to me. Thanks again!
David Marble
A: 

Thought I'd share a bit of what I've learned from doing this using ElementTree, specifically the lxml implementation of ElementTree and lxml.objectify with some XPath. The XML could also be simplified to <part> and <question> tags with names stored as attributes.

questions.xml

<myform>
    <question1>Question 1</question1>
    <part1 name="Part 1">
        <question1_1>Question 1.1</question1_1>
    </part1>
    <part2 name="Part 2">
        <question2_1 attribute="stuff">Question 2.1</question2_1>
        <subpart1 name="SubPart 1">
            <question2_1_1>Question 2.1.1</question2_1_1>
            <question2_1_2>Question 2.1.2</question2_1_2>
        </subpart1>
    </part2>
    <question2>Question 2</question2>
</myform>

questions.py

from lxml import etree
from lxml import objectify
# Objectify adds some Python object-like syntax and other features.
# Important note: find()/findall() in objectify uses ETXPath, which supports
# any XPath expression. The union operator, starts-with(), and local-name()
# expressions below don't work with etree.findall.

# Using etree features
tree = objectify.parse('questions.xml')
root = tree.getroot()

# Dump root to see nodes and attributes
# (etree.dump() writes to stdout itself and returns None, so don't print it)
etree.dump(root)

# Pretty print XML (encoding='unicode' returns text rather than bytes)
print(etree.tostring(root, pretty_print=True, encoding='unicode'))

# Get part2 & all of its descendants
part2_and_children = root.findall(".//part2 | //part2//*")

# Get all question descendants of Part 2
part2_children = root.findall(".//*[@name='Part 2']//*[starts-with(local-name(), 'question')]")

# Get attributes for Question 2.1
# (items() returns a list of (name, value) pairs; wrap it in dict() for a dictionary)
list_of_dict_of_attributes = root.find(".//question2_1")[0].items()

# Access nodes like python objects
# Get all part2 question descendants
part2_question_children = root.part2.findall(".//*[starts-with(local-name(), 'question')]")

# Get text of question 2.1
text2_1 = root.part2.question2_1.text

# Get attributes for Question 2.1, again as a list of (name, value) pairs
q2_1_attrs = root.part2.question2_1[0].items()
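For the simplified layout mentioned at the top (only <part> and <question> tags, names stored as attributes), here is a rough sketch using the standard library's ElementTree. With fixed tag names the starts-with()/local-name() tricks are no longer needed. The `name` and `required` attributes and the `questions_under` helper are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Simplified layout: only <part> and <question> tags, names kept in attributes.
# The attribute names used here are illustrative assumptions.
SIMPLIFIED = """<part name="MyForm">
    <question name="1">Question 1</question>
    <part name="Part 1">
        <question name="1.1">Question 1.1</question>
    </part>
    <part name="Part 2">
        <question name="2.1" required="yes">Question 2.1</question>
        <part name="SubPart 1">
            <question name="2.1.1">Question 2.1.1</question>
            <question name="2.1.2">Question 2.1.2</question>
        </part>
    </part>
    <question name="2">Question 2</question>
</part>"""

root = ET.fromstring(SIMPLIFIED)

def questions_under(root, part_name):
    """Return (text, attributes) pairs for every question below the named part.

    Only parts nested below root are searched, not root itself.
    """
    part = root.find(".//part[@name='%s']" % part_name)
    return [(q.text, dict(q.attrib)) for q in part.iter("question")]

subpart = questions_under(root, "SubPart 1")
part2 = questions_under(root, "Part 2")
```

The trade-off is that attribute-style access such as root.part2.question2_1 no longer applies; lookups go through the name attribute instead.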
David Marble