views:

507

answers:

1

Hello,

I have an XML structure which looks similar to:

<Store>
   <foo>
      <book>
        <isbn>123456</isbn>
      </book>
      <title>XYZ</title>
      <checkout>no</checkout>
   </foo>

   <bar>
      <book>
        <isbn>7890</isbn>
      </book>
      <title>XYZ2</title>
      <checkout>yes</checkout>
   </bar>
</Store>

Using xml.dom.minidom only (restrictions) i would like to

1)traverse through the XML file

2)Search/Get for particular element, depending on its parent

Example: checkout element for author1, isbn for author2

3)Change/Set that element's value

4)Write the new XML structure to a file

Can anyone help here?

Thank you!

UPDATE:

This is what i have done till now

import xml.dom.minidom
checkout = "yes"

def getLoneChild(node, tagname):

  assert ((node is not None) and (tagname is not None))
  elem = node.getElementsByTagName(tagname)
  if ((elem is None) or (len(elem) != 1)):
    return None
  return elem

def getLoneLeaf(node, tagname):

  assert ((node is not None) and (tagname is not None))
  elem = node.getElementsByTagName(tagname)
  if ((elem is None) or (len(elem) != 1)):
    return None
  leaf = elem[0].firstChild
  if (leaf is None):
    return None
  return leaf.data


def setcheckout(node, tagname):

  assert ((node is not None) and (tagname is not None))
  child = getLoneChild(node, 'foo')
  Check = getLoneLeaf(child[0],'checkout')
  Check = tagname
  return Check

doc = xml.dom.minidom.parse('test.xml') 
root = doc.getElementsByTagName('Store')[0]
output = setcheckout(root, checkout)

tmp_config = '/tmp/tmp_config.xml' 
fw = open(tmp_config, 'w')
fw.write(doc.toxml())
fw.close()
+2  A: 

Hi,

I'm not entirely sure what you mean by "checkout". This script will find the element and alter the value of that element. Perhaps you can adapt it to your specific needs.

import xml.dom.minidom as DOM

# find the author as a child of the "Store"
def getAuthor(parent, author):
  # by looking at the children
  for child in [child for child  in parent.childNodes 
                if child.nodeType != DOM.Element.TEXT_NODE]:
    if child.tagName == author:
      return child
  return None

def alterElement(parent, attribute, newValue):
  found = False;
  # look through the child elements, skipping Text_Nodes 
  #(in your example these hold the "values"
  for child in [child for child  in parent.childNodes 
                if child.nodeType != DOM.Element.TEXT_NODE]:

    # if the child element tagName matches target element name
    if child.tagName == attribute:
      # alter the data, i.e. the Text_Node value, 
      # which is the firstChild of the "isbn" element
      child.firstChild.data = newValue
      return True

    else:
      # otherwise look at all the children of this node.
      found = alterElement(child, attribute, newValue)

    if found:
      break 

  # return found status
  return found

doc = DOM.parse("test.xml")
# This assumes that there is only one "Store" in the file
root = doc.getElementsByTagName("Store")[0]

# find the author
# this assumes that there are no duplicate author names in the file
author = getAuthor(root, "foo")
if not author:
  print "Author not found!"
else:
  # alter an element
  if not alterElement(author, "isbn", "987654321"):
    print "isbn not found"
  else:
    # output the xml
    tmp_config = '/tmp/tmp_config.xml'
    f = open(tmp_config, 'w')
    doc.writexml( f )
    f.close()

The general idea is that you match the name of the author against the tagNames of the children of the "Store" element, then recurse through the children of the author, looking for a match against a target element tagName. There are a lot of assumptions made in this solution, but it may get you started. It's painful to try and deal with hierarchical structures like XML without using recursion.

Cheers, Phil


In retrospect there was an error in the "alterElement" function. I've fixed this (note the "found" variable")

Phil Boltt
Thanks a lot Phil. This really helped!