ansaurus

Question

How do I filter values from an XML file in Python?

Answer 1

+2 A:

#!/usr/bin/python

from xml.dom.minidom import parseString

xml = parseString("""<localization>
    <b n="Stats">
        <l k="SomeStat1">
            <v>10</v>
        </l>
        <l k="SomeStat2">
            <v>6</v>
        </l>
    </b>
    <b n="Levels">
        <l k="Level1">
            <v>Beginner Level</v>
        </l>
        <l k="Level2">
            <v>Intermediate Level</v>
        </l>
    </b>
</localization>""")

level = 1  # zero-based index into the <l> children, so 1 selects Level2
blist = xml.getElementsByTagName('b')
for b in blist:
    if b.getAttribute('n') == 'Levels':
        llist = b.getElementsByTagName('l')
        l = llist.item(level)
        v = l.getElementsByTagName('v')
        print v.item(0).firstChild.nodeValue
        # prints Intermediate Level

Amarghosh 2010-02-18 06:15:39

Answer 2

A:

If you really only care about searching for an <l> tag with a specific "k" attribute and then getting its <v> tag (that's how I understood your question), you could do it with DOM:

from xml.dom.minidom import parseString

xmlDoc = parseString("""<document goes here>""")
lNodesWithLevel2 = [lNode for lNode in xmlDoc.getElementsByTagName("l")
                    if lNode.getAttribute("k") == "Level2"]

# getElementsByTagName returns a NodeList, so take the first <v> under each matching <l>
matchingVNodes = map(lambda lNode: lNode.getElementsByTagName("v")[0], lNodesWithLevel2)

print map(lambda vNode: vNode.firstChild.nodeValue, matchingVNodes)
# Prints [u'Intermediate Level']

Hope that is what you meant.

AndiDog 2010-02-18 06:20:08

I like this solution. I would not have even thought about doing it this way.

DewBoy3d 2010-02-18 12:45:41
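
For readers on Python 3, where `map` returns an iterator and `print` is a function, the same DOM lookup can be sketched as below. This is only an illustration of the approach in this answer, not the answerer's code: the function name `values_for_key` and the constant `XML_TEXT` are made up for the example, and the XML is the sample from the question.

```python
from xml.dom.minidom import parseString

# Sample document from the question (hypothetical constant name).
XML_TEXT = """<localization>
    <b n="Stats">
        <l k="SomeStat1">
            <v>10</v>
        </l>
        <l k="SomeStat2">
            <v>6</v>
        </l>
    </b>
    <b n="Levels">
        <l k="Level1">
            <v>Beginner Level</v>
        </l>
        <l k="Level2">
            <v>Intermediate Level</v>
        </l>
    </b>
</localization>"""


def values_for_key(doc, key):
    """Collect the text of the first <v> under every <l> whose k attribute equals key."""
    values = []
    for l_node in doc.getElementsByTagName("l"):
        if l_node.getAttribute("k") != key:
            continue
        v_nodes = l_node.getElementsByTagName("v")
        # Skip <l> elements that have no <v> child or an empty <v>.
        if v_nodes and v_nodes[0].firstChild is not None:
            values.append(v_nodes[0].firstChild.nodeValue)
    return values


doc = parseString(XML_TEXT)
print(values_for_key(doc, "Level2"))
```

The function returns a list because nothing in the DOM API guarantees that a `k` value is unique; an unknown key simply yields an empty list.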

Answer 3

A:

level = "Level" + raw_input("Enter level number: ")
with open("xmlfile") as f:
    content = f.read()
data = content.split("</localization>")
for item in data:
    if "localization" in item:
        s = item.split("</b>")
        for i in s:
            if """<b n="Levels">""" in i:
                for c in i.split("</l>"):
                    if "<l" in c and level in c:
                        for v in c.split("</v>"):
                            if "<v>" in v:
                                print v[v.index("<v>")+3:]

2010-02-18 06:25:24

Answer 4

+3 A:

You might consider using XPath, a language for addressing parts of an XML document.

Here's the answer using lxml.etree and its support for XPath.

>>> data = """
... <localization>
...     <b n="Stats">
...         <l k="SomeStat1">
...             <v>10</v>
...         </l>
...         <l k="SomeStat2">
...             <v>6</v>
...         </l>
...     </b>
...     <b n="Levels">
...         <l k="Level1">
...             <v>Beginner Level</v>
...         </l>
...         <l k="Level2">
...             <v>Intermediate Level</v>
...         </l>
...     </b>
... </localization>
... """
>>>
>>> from lxml import etree
>>>
>>> xmldata = etree.XML(data)
>>> xmldata.xpath('/localization/b[@n="Levels"]/l[@k=$level]/v/text()',level='Level1')
['Beginner Level']

MattH 2010-02-18 09:47:14

Just for grins I tried this because it seemed to be a tad more efficient than some of the other solutions. I have one question about it, though: how do I get xpath to return the value without the [' '] (brackets and quotes)?

DewBoy3d 2010-02-18 15:24:00

The `xpath` method is returning a `list` of string objects. The list would be empty if nothing matched the query, or longer than one item if there were more than one match. You should check the `len` of the returned object, or use `result[0]` and be prepared to catch an `IndexError`. I'm not sure what to say about `without quotes` regarding a string object. Perhaps `print result[0]`?

MattH 2010-02-18 16:53:00
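
The "check for an empty list before indexing" advice above can be sketched with only the standard library. Note the swap: this uses `xml.etree.ElementTree` and its limited XPath subset rather than lxml, so there is no `$level` variable binding and the key is formatted into the path instead (which assumes the key contains no quote characters). The helper name `level_text` and the constant `DATA` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample document from the question (hypothetical constant name).
DATA = """<localization>
    <b n="Levels">
        <l k="Level1">
            <v>Beginner Level</v>
        </l>
        <l k="Level2">
            <v>Intermediate Level</v>
        </l>
    </b>
</localization>"""


def level_text(root, level_key, default=None):
    """Return the text of the first matching <v>, or `default` if nothing matches."""
    # ElementTree has no variable substitution, so the key goes straight into the path.
    path = "./b[@n='Levels']/l[@k='{0}']/v".format(level_key)
    matches = [v.text for v in root.findall(path)]  # always a list, possibly empty
    if not matches:
        return default
    return matches[0]  # a plain string, so printing it shows no brackets or quotes


root = ET.fromstring(DATA)
print(level_text(root, "Level1"))         # Beginner Level
print(level_text(root, "Level9", "n/a"))  # n/a
```

Returning a default instead of raising keeps the caller free of `try`/`except IndexError`; whether that is preferable to an exception depends on how a missing level should be treated.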

Answer 5

A:

If you could use the BeautifulSoup library (couldn't you?), you could end up with this dead-simple code:

from BeautifulSoup import BeautifulStoneSoup

def get_it(xml, level_n):
    soup = BeautifulStoneSoup(xml)
    l = soup.find('l', k="Level%d" % level_n)
    return l.v.string

if __name__ == '__main__':
    # get_it needs the XML text as its first argument
    xml_text = """<document goes here>"""
    print get_it(xml_text, 1)

It prints Beginner Level for the example XML you provided.

nailxx 2010-02-18 11:21:33

Well, this is definitely beautiful, but I didn't really want to use another library for this project. It's almost finished, and I would want to go back and change everything else to accommodate this new library, and I don't have the time.

DewBoy3d 2010-02-18 12:47:35
