Get parent node? - python

I am writing a script to delete unwanted objects from huge datasets by their id prefix.
This is how these objects are structured:
<wfsext:Replace vendorId="AdV" safeToIgnore="false">
  <AX_Anschrift gml:id="DENWAEDA0000001G20161222T083308Z">
    <gml:identifier codeSpace="http://www.adv-online.de/">urn:adv:oid:DENWAEDA0000001G</gml:identifier>
    ...
  </AX_Anschrift>
  <ogc:Filter>
    <ogc:FeatureId fid="DENWAEDA0000001G20161222T083308Z" />
  </ogc:Filter>
</wfsext:Replace>
I would like to delete the full snippet within <wfsext:Replace>...</wfsext:Replace>.
Here is a code snippet from my script:
file = etree.parse(portion_file)
root = file.getroot()
nsmap = root.nsmap.copy()
nsmap['adv'] = nsmap.pop(None)
node = root.xpath(".//adv:geaenderteObjekte/wfs:Transaction", namespaces=nsmap)[0]
for t in node:
    for obj in t:
        objecttype = str(etree.QName(obj.tag).localname)
        if objecttype == 'Filter':
            pass
        else:
            objid = (obj.xpath('@gml:id', namespaces=nsmap))[0][:16]
            if debug:
                print('{} - {}'.format(objid[:16], objecttype))
            if objid[:6] != prefix:
                #parent = obj.getparent()
                t.remove(obj)
The t.remove(obj) call removes <AX_Anschrift>...</AX_Anschrift> but not the rest of the object. I tried to get the parent node using obj.getparent(), but that gives me an error. How can I solve this?

obj.getparent() is t, so you don't actually need to call getparent(); simply remove the entire object with:
node.remove(t)
or, if you want to remove the entire wfs:Transaction,
node.getparent().remove(node)
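As a minimal sketch of the first option (assuming prefix, debug and nsmap are defined as in the question), the loop could look like this; iterating over a copy of node avoids skipping elements while removing:

for t in list(node):                      # each wfsext:Replace element
    for obj in t:
        objecttype = etree.QName(obj.tag).localname
        if objecttype == 'Filter':
            continue
        objid = obj.xpath('@gml:id', namespaces=nsmap)[0]
        if debug:
            print('{} - {}'.format(objid[:16], objecttype))
        if objid[:6] != prefix:
            node.remove(t)                # drop the whole wfsext:Replace block
            break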

Related

Parsing an XML File to CSV without hardcoding values

I was wondering if there is a way to parse through an XML file and basically get all the tags (or as many as possible) and put them into columns without hardcoding.
For example, take the eventType tag in my XML. I would like the script to initially create a column named "eventType" and put the value underneath that column. Each "eventType" tag it parses would then go into that same column.
Roughly, I am trying to produce one row per log/alarm and one column per tag (longName, shortName, eventType, severity, and so on).
Here is the XML sample:
<?xml version="1.0" encoding="UTF-8"?>
<faults version="1" xmlns="urn:nortel:namespaces:mcp:faults" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="urn:nortel:namespaces:mcp:faults NortelFaultSchema.xsd ">
<family longName="1OffMsgr" shortName="OOM"/>
<family longName="ACTAGENT" shortName="ACAT">
<logs>
<log>
<eventType>RES</eventType>
<number>1</number>
<severity>INFO</severity>
<descTemplate>
<msg>Accounting is enabled upon this NE.</msg>
</descTemplate>
<note>This log is generated when setting a Session Manager's AM from <none> to a valid AM.</note>
<om>On all instances of this Session Manager, the <NE_Inst>:<AM>:STD:acct OM row in the StdRecordStream group will appear and start counting the recording units sent to the configured AM.
On the configured AM, the <NE_inst>:acct OM rows in RECSTRMCOLL group will appear and start counting the recording units received from this Session Manager's instances.
</om>
</log>
<log>
<eventType>RES</eventType>
<number>2</number>
<severity>ALERT</severity>
<descTemplate>
<msg>Accounting is disabled upon this NE.</msg>
</descTemplate>
<note>This log is generated when setting a Session Manager's AM from a valid AM to <none>.</note>
<action>If you do not intend for the Session Manager to produce accounting records, then no action is required. If you do intend for the Session Manager to produce accounting records, then you should set the Session Manager's AM to a valid AM.</action>
<om>On all instances of this Session Manager, the <NE_Inst>:<AM>:STD:acct OM row in the StdRecordStream group that matched the previous datafilled AM will disappear.
On the previously configured AM, the <NE_inst>:acct OM rows in RECSTRMCOLL group will disappear.
</om>
</log>
</logs>
</family>
<family longName="ACODE" shortName="AC">
<alarms>
<alarm>
<eventType>ADMIN</eventType>
<number>1</number>
<probableCause>INFORMATION_MODIFICATION_DETECTED</probableCause>
<descTemplate>
<msg>Configured data for audiocode server updated: $1</msg>
<param>
<num>1</num>
<description>AudioCode configuration data got updated</description>
<exampleValue>acgwy1</exampleValue>
</param>
</descTemplate>
<manualClearable></manualClearable>
<correctiveAction>None. Acknowledge/Clear alarm and deploy the audiocode server if appropriate.</correctiveAction>
<alarmName>Audiocode Server Updated</alarmName>
<severities>
<severity>MINOR</severity>
</severities>
</alarm>
<alarm>
<eventType>ADMIN</eventType>
<number>2</number>
<probableCause>CONFIG_OR_CUSTOMIZATION_ERROR</probableCause>
<descTemplate>
<msg>Deployment for audiocode server failed: $1. Reason: $2.</msg>
<param>
<num>1</num>
<description>AudioCode Name</description>
<exampleValue>audcod</exampleValue>
</param>
<param>
<num>2</num>
<description>AudioCode Deployment failed reason</description>
<exampleValue>Failed to parse audiocode configuration data</exampleValue>
</param>
</descTemplate>
<manualClearable></manualClearable>
<correctiveAction>Check the configuration of audiocode server. Acknowledge/Clear alarm and deploy the audiocode server if appropriate.</correctiveAction>
<alarmName>Audiocode Server Deploy Failed</alarmName>
<severities>
<severity>MINOR</severity>
<severity>MAJOR</severity>
</severities>
</alarm>
<alarm>
<eventType>COMM</eventType>
<number>2</number>
<probableCause>LOSS_OF_FRAME</probableCause>
<descTemplate>
<msg>Far end LOF (a.k.a., Yellow Alarm). Trunk (DS1 Number): $1.</msg>
<param>
<num>1</num>
<description>Trunk Number of Trunk with configuration problem</description>
<exampleValue>2</exampleValue>
</param>
</descTemplate>
<clearCondition>Far end is correctly configured for proper framing.</clearCondition>
<correctiveAction>Check that the far end is configured for the proper framing.</correctiveAction>
<alarmName>Far end LOF</alarmName>
<severities>
<severity>CRITICAL</severity>
</severities>
<note>This alarm indicates the Trunk Framing settings on the connected PSTN switch do not match those provisioned on the Audiocodes Mediant 2k.</note>
</alarm>
<alarm>
<eventType>COMM</eventType>
<number>3</number>
<probableCause>LOSS_OF_FRAME</probableCause>
<descTemplate>
<msg>Near end sending LOF Indication. Trunk (DS1 Number): $1.</msg>
<param>
<num>1</num>
<description>Trunk Number of Trunk with configuration problem</description>
<exampleValue>2</exampleValue>
</param>
</descTemplate>
<clearCondition>Gateway is correctly configured for proper framing.</clearCondition>
<correctiveAction>Check that the Audiocodes gateway is configured for the proper framing.</correctiveAction>
<alarmName>Near end sending LOF Indication</alarmName>
<severities>
<severity>CRITICAL</severity>
</severities>
</alarm>
</alarms>
</family>
</faults>
This is my code; as you can see, the tag names are hardcoded:
from xml.etree import ElementTree
import csv
import lxml.etree
import pandas as pd
from copy import copy
from pprint import pprint

tree = ElementTree.parse('FaultFamilies.xml')
sitescope_data = open('Out.csv', 'w', newline='', encoding='utf-8')
csvwriter = csv.writer(sitescope_data)

# Create all needed columns here in order and write them to the CSV file
col_names = ['longName', 'shortName', 'eventType', 'ProbableCause', 'Severity', 'alarmName', 'clearCondition',
             'correctiveAction', 'note', 'action', 'om']
csvwriter.writerow(col_names)

def recurse(root, props):
    # Finds every single tag in the xml file
    for child in root:
        #print(child.text)
        if child.tag == '{urn:nortel:namespaces:mcp:faults}family':
            # copy of the dictionary
            p2 = copy(props)
            # adds the longName and shortName to the dictionary
            p2['longName'] = child.attrib.get('longName', '')
            p2['shortName'] = child.attrib.get('shortName', '')
            recurse(child, p2)
        else:
            recurse(child, props)

    # FIND ALL NEEDED ALARMS INFORMATION
    for event in root.findall('{urn:nortel:namespaces:mcp:faults}alarm'):
        event_data = [props.get('longName', ''), props.get('shortName', '')]
        # Find eventType and append it
        event_id = event.find('{urn:nortel:namespaces:mcp:faults}eventType')
        if event_id != None:
            event_id = event_id.text
        event_data.append(event_id)
        # Find probableCause and append it
        probableCause = event.find('{urn:nortel:namespaces:mcp:faults}probableCause')
        if probableCause != None:
            probableCause = probableCause.text
        event_data.append(probableCause)
        # Find severities and append them
        severities = event.find('{urn:nortel:namespaces:mcp:faults}severities')
        if severities:
            severity_data = ','.join(
                [sv.text for sv in severities.findall('{urn:nortel:namespaces:mcp:faults}severity')])
            event_data.append(severity_data)
        else:
            event_data.append("")
        # Find alarmName and append it
        alarmName = event.find('{urn:nortel:namespaces:mcp:faults}alarmName')
        if alarmName != None:
            alarmName = alarmName.text
        event_data.append(alarmName)
        clearCondition = event.find('{urn:nortel:namespaces:mcp:faults}clearCondition')
        if clearCondition != None:
            clearCondition = clearCondition.text
        event_data.append(clearCondition)
        correctiveAction = event.find('{urn:nortel:namespaces:mcp:faults}correctiveAction')
        if correctiveAction != None:
            correctiveAction = correctiveAction.text
        event_data.append(correctiveAction)
        note = event.find('{urn:nortel:namespaces:mcp:faults}note')
        if note != None:
            note = note.text
        event_data.append(note)
        action = event.find('{urn:nortel:namespaces:mcp:faults}action')
        if action != None:
            action = action.text
        event_data.append(action)
        csvwriter.writerow(event_data)

    # FIND ALL LOGS INFORMATION
    for event in root.findall('{urn:nortel:namespaces:mcp:faults}log'):
        event_data = [props.get('longName', ''), props.get('shortName', '')]
        event_id = event.find('{urn:nortel:namespaces:mcp:faults}eventType')
        if event_id != None:
            event_id = event_id.text
        event_data.append(event_id)
        probableCause = event.find('{urn:nortel:namespaces:mcp:faults}probableCause')
        if probableCause != None:
            probableCause = probableCause.text
        event_data.append(probableCause)
        severities = event.find('{urn:nortel:namespaces:mcp:faults}severity')
        if severities != None:
            severities = severities.text
        event_data.append(severities)
        alarmName = event.find('{urn:nortel:namespaces:mcp:faults}alarmName')
        if alarmName != None:
            alarmName = alarmName.text
        event_data.append(alarmName)
        clearCondition = event.find('{urn:nortel:namespaces:mcp:faults}clearCondition')
        if clearCondition != None:
            clearCondition = clearCondition.text
        event_data.append(clearCondition)
        correctiveAction = event.find('{urn:nortel:namespaces:mcp:faults}correctiveAction')
        if correctiveAction != None:
            correctiveAction = correctiveAction.text
        event_data.append(correctiveAction)
        note = event.find('{urn:nortel:namespaces:mcp:faults}note')
        if note != None:
            note = note.text
        event_data.append(note)
        action = event.find('{urn:nortel:namespaces:mcp:faults}action')
        if action != None:
            action = action.text
        event_data.append(action)
        csvwriter.writerow(event_data)

root = tree.getroot()
recurse(root, {})  # root + empty dictionary
print("File successfully converted to CSV")
sitescope_data.close()
You could build a list of lists to represent the rows of the table. Whenever it's time for a new row, build a new list with all known columns defaulted to "" and append it to the bottom of the outer list. When a new column needs to be inserted, it's just a case of spinning through the existing inner lists and appending a default "" cell. Keep a map of known column names to their index in the row. Then, when you spin through the events, you use the tag name to find the column index and add the value to the latest row in the table.
It looks like you want the "log" and "alarm" tags, but I wrote the element selector to take any element that has an "eventType" child element. Since "longName" and "shortName" are common to all events under a given <family>, there is an outer loop to grab those and apply them to each new row of the table. I switched to xpath so that I could set up namespaces and write the selectors more tersely. Personal preference there, but I think it makes the xpath more readable.
import csv
import lxml.etree
from lxml.etree import QName
import operator

class ExpandingTable:
    """A 2 dimensional table where columns are expanded as new column
    types are discovered"""

    def __init__(self):
        """Create table that can expand rows and columns"""
        self.name_to_col = {}
        self.table = []

    def add_column(self, name):
        """Add column named `name` unless already included"""
        if name not in self.name_to_col:
            self.name_to_col[name] = len(self.name_to_col)
            for row in self.table:
                row.append('')

    def add_cell(self, name, value):
        """Add value to named column in the current row"""
        if value:
            self.add_column(name)
            self.table[-1][self.name_to_col[name]] = value.strip().replace("\r\n", " ")

    def new_row(self):
        """Create a new row and make it current"""
        self.table.append([''] * len(self.name_to_col))

    def header(self):
        """Gather discovered column names into a header list"""
        idx_1 = operator.itemgetter(1)
        return [name for name, _ in sorted(self.name_to_col.items(), key=idx_1)]

    def prepend_header(self):
        """Gather discovered column names into a header and
        prepend it to the list"""
        self.table.insert(0, self.header())

def events_to_table(elem):
    """Builds table from <family> child elements and their contained alarms and
    logs."""
    ns = {"f": "urn:nortel:namespaces:mcp:faults"}
    table = ExpandingTable()
    for family in elem.xpath("f:family", namespaces=ns):
        longName = family.get("longName")
        shortName = family.get("shortName")
        for event in family.xpath("*/*[f:eventType]", namespaces=ns):
            table.new_row()
            table.add_cell("longName", longName)
            table.add_cell("shortName", shortName)
            for cell in event:
                tag = QName(cell.tag).localname
                if tag == "severities":
                    tag = "severity"
                    text = ",".join(severity.text for severity in cell.xpath("*"))
                    print("severities", repr(text))
                else:
                    text = cell.text
                table.add_cell(tag, text)
    table.prepend_header()
    return table.table

def main(filename):
    doc = lxml.etree.parse(filename)
    table = events_to_table(doc.getroot())
    with open('test.csv', 'w', newline='', encoding='utf-8') as fileobj:
        csv.writer(fileobj).writerows(table)

main('test.xml')

lxml (etree) - Pretty Print attributes of root tag

Is it possible in Python to pretty-print the root element's attributes?
I used etree to extend the attributes of the child tags and then overwrote the existing file with the new content. However, when the XML was first generated we used a template in which the attributes of the root tag were listed one per line, and with etree I can't achieve the same result.
I found similar questions, but they all referred to the etree tutorial, which I find incomplete.
Hopefully someone has found a solution for this using etree.
EDIT: This is custom XML, so HTML Tidy (which was proposed in the comments) doesn't work for this.
Thanks!
generated_descriptors = list_generated_files(generated_descriptors_folder)
counter = 0
for g in generated_descriptors:
    if counter % 20 == 0:
        print "Extending Descriptor # %s out of %s" % (counter, len(descriptor_attributes))
    with open(generated_descriptors_folder + "\\" + g, 'r+b') as descriptor:
        root = etree.XML(descriptor.read(), parser=parser)
        # Go through every ContextObject to check if the block is mandatory
        for context_object in root.findall('ContextObject'):
            for attribs in descriptor_attributes:
                if attribs['descriptor_name'] == g[:-11] and context_object.attrib['name'] in attribs['attributes']['mandatoryobjects']:
                    context_object.set('allow-null', 'false')
                elif attribs['descriptor_name'] == g[:-11] and context_object.attrib['name'] not in attribs['attributes']['mandatoryobjects']:
                    context_object.set('allow-null', 'true')
        # Sort the ContextObjects based on allow-null and their name
        context_objects = root.findall('ContextObject')
        context_objects_sorted = sorted(context_objects, key=lambda c: (c.attrib['allow-null'], c.attrib['name']))
        root[:] = context_objects_sorted
        # Remove mandatoryobjects from Descriptor attributes and pretty print
        root.attrib.pop("mandatoryobjects", None)
        # paste new line here
        # Convert to string in order to write the enhanced descriptor
        xml = etree.tostring(root, pretty_print=True, encoding="UTF-8", xml_declaration=True)
        # Write the enhanced descriptor
        descriptor.seek(0)  # Set cursor at beginning of the file
        descriptor.truncate(0)  # Make sure that file is empty
        descriptor.write(xml)
        descriptor.close()
    counter += 1
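lxml's serializer has no option for writing attributes one per line, so one workaround is to post-process the serialized output and rewrite only the opening root tag. A rough sketch (not from the original post; it assumes the root tag name carries no namespace prefix and that attribute values are double-quoted):

import re

def split_root_attributes(xml_bytes, root_tag):
    """Put each attribute of the opening root tag on its own indented line."""
    text = xml_bytes.decode("UTF-8")
    match = re.search(r"<%s\b[^>]*>" % re.escape(root_tag), text)
    if match is None:
        return xml_bytes
    opening = match.group(0)
    # break before every name="value" pair and indent it
    rewritten = re.sub(r'\s+([\w:.-]+="[^"]*")', r'\n    \1', opening)
    return text.replace(opening, rewritten, 1).encode("UTF-8")

# after the existing etree.tostring(...) call:
xml = split_root_attributes(xml, root.tag)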

How to use the Typed unmarshaller in suds?

I have existing code that processes the output from suds.client.Client(...).service.GetFoo(). Now that part of the flow has changed: we are no longer using SOAP and instead receive the same XML through other channels. I would like to re-use the existing code by using the suds Typed unmarshaller, but so far I have not been successful.
I came 90% of the way using the Basic unmarshaller:
tree = suds.umx.basic.Basic().process(xmlroot)
This gives me the nice tree of objects with attributes, so that the pre-existing code can access tree[some_index].someAttribute, but the value will of course always be a string, rather than an integer or date or whatever, so the code can still not be re-used as-is.
The original class:
class SomeService(object):
    def __init__(self):
        self.soap_client = Client(some_wsdl_url)

    def GetStuff(self):
        return self.soap_client.service.GetStuff()
The drop-in replacement that almost works:
class SomeSourceUntyped(object):
    def __init__(self):
        self.url = some_url

    def GetStuff(self):
        xmlfile = urllib2.urlopen(self.url)
        xmlroot = suds.sax.parser.Parser().parse(xmlfile)
        if xmlroot:
            # because the parser creates a document root above the document root
            tree = suds.umx.basic.Basic().process(xmlroot)[0]
        else:
            tree = None
        return tree
My vain effort to understand suds.umx.typed.Typed():
class SomeSourceTyped(object):
    def __init__(self):
        self.url = some_url
        self.schema_file_name = \
            os.path.realpath(os.path.join(os.path.dirname(__file__), 'schema.xsd'))
        with open(self.schema_file_name) as f:
            self.schema_node = suds.sax.parser.Parser().parse(f)
        self.schema = suds.xsd.schema.Schema(self.schema_node, "", suds.options.Options())
        self.schema_query = suds.xsd.query.ElementQuery(('http://example.com/namespace/', 'Stuff'))
        self.xmltype = self.schema_query.execute(self.schema)

    def GetStuff(self):
        xmlfile = urllib2.urlopen(self.url)
        xmlroot = suds.sax.parser.Parser().parse(xmlfile)
        if xmlroot:
            unmarshaller = suds.umx.typed.Typed(self.schema)
            # I'm still running into an exception, so obviously something is missing:
            # " Exception: (document, None, ), must be qref "
            # Do I need to call the Parser differently?
            tree = unmarshaller.process(xmlroot, self.xmltype)[0]
        else:
            tree = None
        return tree
This is an obscure one.
Bonus caveat: Of course I am in a legacy system that uses suds 0.3.9.
EDIT: further evolution on the code, found how to create SchemaObjects.

docutils/sphinx custom directive creating sibling section rather than child

Consider a reStructuredText document with this skeleton:
Main Title
==========
text text text text text
Subsection
----------
text text text text text
.. my-import-from:: file1
.. my-import-from:: file2
The my-import-from directive is provided by a document-specific Sphinx extension, which is supposed to read the file provided as its argument, parse reST embedded in it, and inject the result as a section in the current input file. (Like autodoc, but for a different file format.) The code I have for that, right now, looks like this:
class MyImportFromDirective(Directive):
    required_arguments = 1

    def run(self):
        src, srcline = self.state_machine.get_source_and_line()
        doc_file = os.path.normpath(os.path.join(os.path.dirname(src),
                                                 self.arguments[0]))
        self.state.document.settings.record_dependencies.add(doc_file)

        doc_text = ViewList()
        try:
            doc_text = extract_doc_from_file(doc_file)
        except EnvironmentError as e:
            raise self.error(e.filename + ": " + e.strerror) from e

        doc_section = nodes.section()
        doc_section.document = self.state.document

        # report line numbers within the nested parse correctly
        old_reporter = self.state.memo.reporter
        self.state.memo.reporter = AutodocReporter(doc_text,
                                                   self.state.memo.reporter)
        nested_parse_with_titles(self.state, doc_text, doc_section)
        self.state.memo.reporter = old_reporter

        if len(doc_section) == 1 and isinstance(doc_section[0], nodes.section):
            doc_section = doc_section[0]

        # If there was no title, synthesize one from the name of the file.
        if len(doc_section) == 0 or not isinstance(doc_section[0], nodes.title):
            doc_title = nodes.title()
            doc_title.append(make_title_text(doc_file))
            doc_section.insert(0, doc_title)

        return [doc_section]
This works, except that the new section is injected as a child of the current section, rather than a sibling. In other words, the example document above produces a TOC tree like this:
Main Title
  Subsection
    File1
    File2
instead of the desired
Main Title
  Subsection
  File1
  File2
How do I fix this? The Docutils documentation is ... inadequate, particularly regarding control of section depth. One obvious thing I have tried is returning doc_section.children instead of [doc_section]; that completely removes File1 and File2 from the TOC tree (but does make the section headers in the body of the document appear to be for the right nesting level).
I don't think it is possible to do this by returning the section from the directive (without doing something along the lines of what Florian suggested), as it will get appended to the 'current' section. You can, however, add the section via self.state.section, as I do in the following (handling of options removed for brevity):
class FauxHeading(object):
    """
    A heading level that is not defined by a string. We need this to work with
    the mechanics of
    :py:meth:`docutils.parsers.rst.states.RSTState.check_subsection`.

    The important thing is that the length can vary, but it must be equal to
    any other instance of FauxHeading.
    """

    def __init__(self, length):
        self.length = length

    def __len__(self):
        return self.length

    def __eq__(self, other):
        return isinstance(other, FauxHeading)


class ParmDirective(Directive):

    required_arguments = 1
    optional_arguments = 0
    has_content = True
    option_spec = {
        'type': directives.unchanged,
        'precision': directives.nonnegative_int,
        'scale': directives.nonnegative_int,
        'length': directives.nonnegative_int}

    def run(self):
        variableName = self.arguments[0]
        lineno = self.state_machine.abs_line_number()
        secBody = None
        block_length = 0

        # added for some space
        lineBlock = nodes.line('', '', nodes.line_block())

        # parse the body of the directive
        if self.has_content and len(self.content):
            secBody = nodes.container()
            block_length += nested_parse_with_titles(
                self.state, self.content, secBody)

        # keeping track of the level seems to be required if we want to allow
        # nested content. Not sure why, but fits with the pattern in
        # :py:meth:`docutils.parsers.rst.states.RSTState.new_subsection`
        myLevel = self.state.memo.section_level
        self.state.section(
            variableName,
            '',
            FauxHeading(2 + len(self.options) + block_length),
            lineno,
            [lineBlock] if secBody is None else [lineBlock, secBody])
        self.state.memo.section_level = myLevel
        return []
I don't know how to do it directly inside your custom directive. However, you can use a custom transform to raise the File1 and File2 nodes in the tree after parsing. For example, see the transforms in the docutils.transforms.frontmatter module.
In your Sphinx extension, use the Sphinx.add_transform method to register the custom transform.
Update: You can also directly register the transform in your directive by returning one or more instances of the docutils.nodes.pending class in your node list. Make sure to call the note_pending method of the document in that case (in your directive you can get the document via self.state_machine.document).
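As a rough sketch of the transform approach (the class name and the 'my-import-from' node attribute used as a marker are invented for illustration, not part of the answer), the extension could look something like this:

import docutils.transforms
from docutils import nodes

class PromoteImportedSections(docutils.transforms.Transform):
    """Move sections produced by my-import-from up one level so they become
    siblings of the section they were parsed into."""
    default_priority = 700  # run after parsing, before the writers

    def apply(self):
        for section in list(self.document.traverse(nodes.section)):
            # hypothetical marker the directive would set on the sections it returns
            if section.get('my-import-from'):
                parent = section.parent
                grandparent = parent.parent
                if grandparent is not None:
                    parent.remove(section)
                    grandparent.append(section)

def setup(app):
    # Sphinx.add_transform registers the transform for every parsed document
    app.add_transform(PromoteImportedSections)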

XML Parsing in Python using document builder factory

I am working with STAF and STAX, where Python is used for scripting. I am new to Python.
Basically, my task is to parse an XML file in Python using a DocumentBuilderFactory parser.
The XML file I am trying to parse is:
<?xml version="1.0" encoding="utf-8"?>
<operating_system>
  <unix_80sp1>
    <tests type="quick_sanity_test">
      <prerequisitescript>preparequicksanityscript</prerequisitescript>
      <acbuildpath>acbuildpath</acbuildpath>
      <testsuitscript>test quick sanity script</testsuitscript>
      <testdir>quick sanity dir</testdir>
    </tests>
    <machine_name>u80sp1_L004</machine_name>
    <machine_name>u80sp1_L005</machine_name>
    <machine_name>xyz.pxy.dxe.cde</machine_name>
    <vmware id="155.35.3.55">144.35.3.90</vmware>
    <vmware id="155.35.3.56">144.35.3.91</vmware>
  </unix_80sp1>
</operating_system>
I need to read all the tags.
For the machine_name tags, I need to read the values into a list,
say machname, so that machname is [u80sp1_L004, u80sp1_L005, xyz.pxy.dxe.cde] after reading the tags.
I also need all the vmware tags:
all attributes should go into vmware_attr = [155.35.3.55, 155.35.3.56]
and all vmware values should go into vmware_value = [144.35.3.90, 144.35.3.91]
I am able to read all tags properly except the vmware and machine_name tags.
I am using the following code (I am new to XML and vmware); help is required,
and the code below needs to be modified:
factory = DocumentBuilderFactory.newInstance();
factory.setValidating(1)
factory.setIgnoringElementContentWhitespace(0)
builder = factory.newDocumentBuilder()
document = builder.parse(xmlFileName)

vmware_value = None
vmware_attr = None
machname = None

# Get the text value for the element with tag name "vmware"
nodeList = document.getElementsByTagName("vmware")
for i in range(nodeList.getLength()):
    node = nodeList.item(i)
    if node.getNodeType() == Node.ELEMENT_NODE:
        children = node.getChildNodes()
        for j in range(children.getLength()):
            thisChild = children.item(j)
            if (thisChild.getNodeType() == Node.TEXT_NODE):
                vmware_value = thisChild.getNodeValue()
                vmware_attr ==??? what method to use ?

# Get the text value for the element with tag name "machine_name"
nodeList = document.getElementsByTagName("machine_name")
for i in range(nodeList.getLength()):
    node = nodeList.item(i)
    if node.getNodeType() == Node.ELEMENT_NODE:
        children = node.getChildNodes()
        for j in range(children.getLength()):
            thisChild = children.item(j)
            if (thisChild.getNodeType() == Node.TEXT_NODE):
                machname = thisChild.getNodeValue()
Also, how do I check whether a tag exists at all? I need to code the parsing properly.
You need to initialize vmware_value, vmware_attr and machname as lists, not as None values, so instead of this:
vmware_value = None
vmware_attr = None
machname = None
do this:
vmware_value = []
vmware_attr = []
machname = []
Then, to add items to the list, use the append method on your lists. E.g.:
factory = DocumentBuilderFactory.newInstance();
factory.setValidating(1)
factory.setIgnoringElementContentWhitespace(0)
builder = factory.newDocumentBuilder()
document = builder.parse(xmlFileName)

vmware_value = []
vmware_attr = []
machname = []

# Get the text value for the element with tag name "vmware"
nodeList = document.getElementsByTagName("vmware")
for i in range(nodeList.getLength()):
    node = nodeList.item(i)
    vmware_attr.append(node.attributes["id"].value)
    if node.getNodeType() == Node.ELEMENT_NODE:
        children = node.getChildNodes()
        for j in range(children.getLength()):
            thisChild = children.item(j)
            if (thisChild.getNodeType() == Node.TEXT_NODE):
                vmware_value.append(thisChild.getNodeValue())
I've also edited the code to something I think should work to append the correct values to vmware_attr and vmware_value.
I had to make the assumption that STAX uses xml.dom syntax, so if that isn't the case, you will have to edit my suggestion appropriately.
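If STAX actually exposes the Java org.w3c.dom API (which the DocumentBuilderFactory usage suggests), the same thing can be done with DOM methods; a sketch under that assumption:

nodeList = document.getElementsByTagName("vmware")
for i in range(nodeList.getLength()):
    node = nodeList.item(i)
    if node.getNodeType() == Node.ELEMENT_NODE:
        vmware_attr.append(node.getAttribute("id"))    # attribute, e.g. "155.35.3.55"
        vmware_value.append(node.getTextContent())     # element text, e.g. "144.35.3.90"

machList = document.getElementsByTagName("machine_name")
for i in range(machList.getLength()):
    machname.append(machList.item(i).getTextContent())

To check whether a tag exists at all, you can test document.getElementsByTagName("some_tag").getLength() > 0 before reading any values.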
