views:

414

answers:

1

Hi,

I am working in STAF and STAX. Here python is used for coding . I am new to python. Basically my task is to parse a XML file in python using Document Factory Parser.

The XML file I am trying to parse is :

<?xml version="1.0" encoding="utf-8"?>
<operating_system>
  <unix_80sp1>
    <tests type="quick_sanity_test">
      <prerequisitescript>preparequicksanityscript</prerequisitescript>
      <acbuildpath>acbuildpath</acbuildpath>
      <testsuitscript>test quick sanity script</testsuitscript>
      <testdir>quick sanity dir</testdir>
    </tests>
    <machine_name>u80sp1_L004</machine_name>
    <machine_name>u80sp1_L005</machine_name>
    <machine_name>xyz.pxy.dxe.cde</machine_name>
    <vmware id="155.35.3.55">144.35.3.90</vmware>
    <vmware id="155.35.3.56">144.35.3.91</vmware>
  </unix_80sp1>
</operating_system>
  1. I need to read all the tags .
  2. For the tags machine_name i need to read them into a list say all machine names should be in a list machname. so machname should be [u80sp1_L004,u80sp1_L005,xyz.pxy.dxe.cde] after reading the tags.

  3. I also need all the vmware tags: all attributes should be vmware_attr =[155.35.3.55,155.35.3.56] all vmware values should be vmware_value = [ 144.35.3.90,155.35.3.56]

I am able to read all tags properly except vmware tags and machine name tags: I am using the following code:(i am new to xml and vmware).Help required.

The below code needs to be modified.

factory = DocumentBuilderFactory.newInstance();
factory.setValidating(1)
factory.setIgnoringElementContentWhitespace(0)
builder = factory.newDocumentBuilder()
document = builder.parse(xmlFileName)

vmware_value = None
vmware_attr = None
machname = None

# Get the text value for the element with tag name "vmware" 
nodeList = document.getElementsByTagName("vmware") 
for i in range(nodeList.getLength()): 
node = nodeList.item(i) 
if node.getNodeType() == Node.ELEMENT_NODE: 
children = node.getChildNodes() 
for j in range(children.getLength()): 
thisChild = children.item(j) 
if (thisChild.getNodeType() == Node.TEXT_NODE): 
vmware_value = thisChild.getNodeValue()
vmware_attr ==??? what method to use ?
# Get the text value for the element with tag name "machine_name" 
nodeList = document.getElementsByTagName("machine_name") 
for i in range(nodeList.getLength()): 
node = nodeList.item(i) 
if node.getNodeType() == Node.ELEMENT_NODE: 
children = node.getChildNodes() 
for j in range(children.getLength()): 
thisChild = children.item(j) 
if (thisChild.getNodeType() == Node.TEXT_NODE): 
machname = thisChild.getNodeValue()

Also how to check if a tag exists or not at all. I need to code the parsing properly.

A: 

You are need to instantiate vmware_value, vmware_attr and machname as lists not as strings, so instead of this:

vmware_value = None
vmware_attr = None
machname = None

do this:

vmware_value = []
vmware_attr = []
machname = []

Then, to add items to the list, use the append method on your lists. E.g.:

factory = DocumentBuilderFactory.newInstance();
factory.setValidating(1)
factory.setIgnoringElementContentWhitespace(0)
builder = factory.newDocumentBuilder()
document = builder.parse(xmlFileName)

vmware_value = []
vmware_attr = []
machname = []

# Get the text value for the element with tag name "vmware"
nodeList = document.getElementsByTagName("vmware")
for i in range(nodeList.getLength()):
    node = nodeList.item(i)
    vmware_attr.append(node.attributes["id"].value)
    if node.getNodeType() == Node.ELEMENT_NODE:
        children = node.getChildNodes()
        for j in range(children.getLength()):
            thisChild = children.item(j)
            if (thisChild.getNodeType() == Node.TEXT_NODE):
                vmware_value.append(thisChild.getNodeValue())

I've also edited the code to something I think should work to append the correct values to vmware_attr and vmware_value.

I had to make the assumption that STAX uses xml.dom syntax, so if that isn't the case, you will have to edit my suggestion appropriately.

Rob Carr