tags:

views:

115

answers:

5

I have a basic grasp of XML and python and have been using minidom with some success. I have run into a situation where I am unable to get the values I want from an XML file. Here is the basic structure of the pre-existing file.

<localization>
    <b n="Stats">
        <l k="SomeStat1">
            <v>10</v>
        </l>
        <l k="SomeStat2">
            <v>6</v>
        </l>
    </b>
    <b n="Levels">
        <l k="Level1">
            <v>Beginner Level</v>
        </l>
        <l k="Level2">
            <v>Intermediate Level</v>
        </l>
    </b>
</localization>

There are about 15 different <b> tags with dozens of children. What I'd like to do is, if given a level number(1), is find the <v> node for the corresponding level. I just have no idea how to go about this.

+2  A: 
#!/usr/bin/python

from xml.dom.minidom import parseString

xml = parseString("""<localization>
    <b n="Stats">
        <l k="SomeStat1">
            <v>10</v>
        </l>
        <l k="SomeStat2">
            <v>6</v>
        </l>
    </b>
    <b n="Levels">
        <l k="Level1">
            <v>Beginner Level</v>
        </l>
        <l k="Level2">
            <v>Intermediate Level</v>
        </l>
    </b>
</localization>""")

level = 1
blist = xml.getElementsByTagName('b')
for b in blist:
    if b.getAttribute('n') == 'Levels':
        llist = b.getElementsByTagName('l')
        l = llist.item(level)
        v = l.getElementsByTagName('v')
        print v.item(0).firstChild.nodeValue;
        #prints Intermediate Level
Amarghosh
A: 

If you really only care about searching for an <l> tag with a specific "k" attribute and then getting its <v> tag (that's how I understood your question), you could do it with DOM:

from xml.dom.minidom import parseString

xmlDoc = parseString("""<document goes here>""")
lNodesWithLevel2 = [lNode for lNode in xmlDoc.getElementsByTagName("l")
                    if lNode.getAttribute("k") == "Level2"]

matchingVNodes = map(lambda lNode: lNode.getElementsByTagName("v"), lNodesWithLevel2)

print map(lambda vNode: vNode.firstChild.nodeValue, matchingVNodes)
# Prints [u'Intermediate Level']

How that is what you meant.

AndiDog
I like this solution. I would not have even thought about doing it this way.
DewBoy3d
A: 
level = "Level"+raw_input("Enter level number: ")
content= open("xmlfile").read()
data= content.split("</localization>")
for item in data:
    if "localization" in item:
        s = item.split("</b>")
        for i in s:
           if """<b n="Levels">""" in i:
                for c in i.split("</l>"):
                    if "<l" in c and level in c:
                         for v in c.split("</v>"):
                            if "<v>" in v:
                                print v[v.index("<v>")+3:]
+3  A: 

You might consider using XPATH, a language for addressing parts of an xml document.

Here's the answer using lxml.etree and it's support for xpath.

>>> data = """
... <localization>
...     <b n="Stats">
...         <l k="SomeStat1">
...             <v>10</v>
...         </l>
...         <l k="SomeStat2">
...             <v>6</v>
...         </l>
...     </b>
...     <b n="Levels">
...         <l k="Level1">
...             <v>Beginner Level</v>
...         </l>
...         <l k="Level2">
...             <v>Intermediate Level</v>
...         </l>
...     </b>
... </localization>
... """
>>>
>>> from lxml import etree
>>>
>>> xmldata = etree.XML(data)
>>> xmldata.xpath('/localization/b[@n="Levels"]/l[@k=$level]/v/text()',level='Level1')
['Beginner Level']
MattH
just for grins I tried this because it seemed to be a tad more efficient than some of the other solutions. I have one question about this though, How do I get xpath to return the value with out [' '] (brackets and quotes)?
DewBoy3d
the `xpath` method is returning a `list` of string objects. The list would be zero-length if nothing matched the query or greater than 1 if there we're more than 1 matches. You should check the len of the return object or `result[0]` and be prepared to catch an `IndexError`. I'm not sure what to say to `without quotes` regarding a string object. Perhaps `print result[0]` ?
MattH
A: 

If you could use BeautifulSoup library (couldn't you?) you could end up with this dead-simple code:

from BeautifulSoup import BeautifulStoneSoup

def get_it(xml, level_n):
    soup = BeautifulStoneSoup(xml)
    l = soup.find('l', k="Level%d" % level_n)
    return l.v.string

if __name__ == '__main__':
    print get_it(1)

It prints Beginner Level for the example XML you provided.

nailxx
well this is definitely beautiful but I didn't really want to use another library for this project. It's almost finished and I would want to go back and change everything else to accommodate this new library and I don't have the time.
DewBoy3d