Insert an ElementTree.Element as a SubElement - python

I'm using xml.etree.ElementTree to create some basic XML in Python. I have a block of XML that I need to access on its own, so I make it as a root Element:
import xml.etree.ElementTree as Tree

def correction_xml(self):
    correction = Tree.Element('ColourCorrection')
    sop_node = Tree.SubElement(correction, "SOPNode")
    slope = Tree.SubElement(sop_node, 'Slope')
    offset = Tree.SubElement(sop_node, 'Offset')
    power = Tree.SubElement(sop_node, 'Power')
    return correction
I also need to insert multiple instances of this block into a bigger XML document, so is there a way to insert my correction Element into another tree as a SubElement? Something like this, except that the SubElement factory only accepts a tag string, not an Element object:
def correction_list(self):
    list = Tree.Element("List")
    # Intended: insert correction_xml into list as a subelement, keeping its children intact.
    # This call does not work as written, because SubElement expects a tag string.
    item_1 = Tree.SubElement(list, self.correction_xml())
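One possible approach, not confirmed anywhere in this thread: the standard library's Element.append() accepts an existing Element, so the block can be built once per instance and attached to the parent with its children intact. The sketch below drops self to stay self-contained; the count parameter and the module-level result name are assumptions made for illustration.

```python
import xml.etree.ElementTree as Tree

def correction_xml():
    # Build the stand-alone ColourCorrection block (same shape as in the question).
    correction = Tree.Element('ColourCorrection')
    sop_node = Tree.SubElement(correction, 'SOPNode')
    for tag in ('Slope', 'Offset', 'Power'):
        Tree.SubElement(sop_node, tag)
    return correction

def correction_list(count=2):
    # 'count' is a hypothetical parameter, not part of the original question.
    parent = Tree.Element('List')
    for _ in range(count):
        # append() takes an Element object and keeps its children intact.
        parent.append(correction_xml())
    return parent

result = correction_list()
print(Tree.tostring(result).decode())
```

Calling correction_xml() once per instance gives each list item its own Element. Appending one and the same Element object several times would presumably make every item share later modifications.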

Related

How can I implement doubly linked lists into my tree structure in Python?

So I'm trying to represent an XML file using python. I already managed to do that. But each child node in my tree has to be doubly linked and I don't know how to do that. I've found a few examples of code online but they all use classes and the professor doesn't want us using classes. Here's my code:
from xml.etree.ElementTree import ElementTree
from xml.etree.ElementTree import Element
import xml.etree.ElementTree as etree

def create_tree():  # This function creates the root element of my tree
    n = input("Enter your root Element: ")
    root = Element(n)
    tree = ElementTree(root)
    print(etree.tostring(root))
    return root

def add_child():
    root = create_tree()
    new = True
    while new == True:
        ask = input("If you wish to add a new child type 'yes', else type something else: ")
        if ask == 'yes':
            # This block of code creates a child and appends it to the root element
            n = input("Enter the name of the node: ")
            node = Element(n)
            root.append(node)
            print(etree.tostring(root))
        else:
            break
    return etree.tostring(root)

add_child()
For those of you wondering, the purpose of this project is to create a rooted tree with unbounded branching. I hope that once I can implement the doubly linked list I will be able to add a child node within a child node.
You should be able to implement a linked list using plain lists. You can define each node as a list holding the previous node, the next node, and the element itself, in that order. If the head node is stored in head (and the list is not empty), the code to insert newElement at the beginning of the list would be as follows:
newNode = [None,head,newElement]
head[0] = newNode
head = newNode
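A slightly fuller sketch of the same idea, using only lists and functions (no classes). The helper names push_front and to_list and the index constants are assumptions made for illustration. Unlike the three lines above, this version also handles an empty list, where head is None.

```python
# Node layout from the answer above: [previous, next, element]
PREV, NEXT, ELEM = 0, 1, 2

def push_front(head, element):
    """Insert element at the front and return the new head node."""
    node = [None, head, element]
    if head is not None:
        head[PREV] = node  # link the old head back to the new node
    return node

def to_list(head):
    """Walk forward from head and collect the stored elements."""
    items = []
    node = head
    while node is not None:
        items.append(node[ELEM])
        node = node[NEXT]
    return items

head = None
for name in ("c", "b", "a"):
    head = push_front(head, name)

forward = to_list(head)
print(forward)
```

Because every node stores both neighbours, the list can be walked backwards the same way by following node[PREV] from the last node.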

(Python) How to get the level of a node in xml?

Is there an efficient way to get the level of each node in xml, using the python xml parser? The level of a node can also be determined by counting the closing tags while parsing the xml top-down, however, I need help in that too :). Thanks!
Maybe this could be an initial idea:
from xml.etree import ElementTree
from xml.etree.ElementTree import iterparse

level = 0
for (event, node) in iterparse( fileName, ['start', 'end', 'start-ns', 'end-ns'] ):
    if event == 'end':
        level -= 1
    if event == 'start':
        level += 1
    # do something with node ...
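A self-contained variant of the same idea, as a sketch: it parses an in-memory sample instead of a file and records the depth of each element when its start event arrives. The sample XML and the names xml_bytes, levels, and depth are made up for illustration.

```python
import io
import xml.etree.ElementTree as ET

xml_bytes = b"<a><b><c/></b><d/></a>"  # made-up sample document

levels = []  # (tag, depth) pairs in document order
depth = 0
for event, elem in ET.iterparse(io.BytesIO(xml_bytes), events=("start", "end")):
    if event == "start":
        levels.append((elem.tag, depth))  # the root element gets depth 0
        depth += 1
    else:  # "end": the element is closed, so step back up one level
        depth -= 1

print(levels)
```

Recording the depth before incrementing makes the root level 0. Swapping the two lines would make it level 1, which is the convention the loop above follows.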
I'm assuming by "python xml parser" you mean xml.etree. There are other libraries as well (such as lxml) which can make this job easier, but according to this answer there's no easy way to access the parent elements using this library. However, the workaround presented there (creating a mapping of parents to the whole tree) is enough to determine the level of any given element:
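import xml.etree.ElementTree as ET

tree = ET.parse('country_data.xml')
root = tree.getroot()
# root = ET.fromstring(...) # Alternative

# Map every child element to its parent. iter() replaces getiterator(),
# which was deprecated and later removed from ElementTree.
parent_map = {c: p for p in root.iter() for c in p}

def level(node):
    # The root has no entry in parent_map, so it is level 0.
    return level(parent_map[node]) + 1 if node in parent_map else 0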
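For a quick check without a file on disk, the parent-map approach can be sketched against an in-memory document. The sample XML is made up for illustration, and level() is written iteratively here, which is a design choice of this sketch rather than part of the answer above.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<a><b><c/></b><d/></a>")  # made-up sample document

# Map each child element to its parent element.
parent_map = {child: parent for parent in root.iter() for child in parent}

def level(node):
    # Climb towards the root, counting the steps taken.
    depth = 0
    while node in parent_map:
        node = parent_map[node]
        depth += 1
    return depth

levels = {elem.tag: level(elem) for elem in root.iter()}
print(levels)
```

The loop avoids deep recursion on heavily nested documents. The map itself has to be rebuilt if the tree is modified afterwards.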
I always use minidom and recommend it for this kind of interaction with XML files. You can use the parentNode attribute of an element for this purpose.
import xml.dom.minidom as parser

tree = parser.parse('xml.xml')
# let's say we want the first instance of a tag with a specific name
x = tree.getElementsByTagName('tagname')[0]
level = 0
# Walk up until parentNode is None. The count includes the Document node,
# so the root element itself reports level 2.
while x:
    x = x.parentNode
    level += 1
print(level)
See the minidom documentation for details.

Python - lxml - how to 'move' around the tree when building the tree

Basic question - how do you 'move' around in a tree when you are building a tree.
I can populate the first level:
import lxml.etree as ET

def main():
    root = ET.Element('baseURL')
    root.attrib["URL"]='www.com'
    root.attrib["title"]='Level Title'
    myList = [["www.1.com","site 1 Title"],["www.2.com","site 2 Title"],["www.3.com","site 3 Title"]]
    for i in range(len(myList)):
        ET.SubElement(root, "link_"+str(i), URL=myList[i][0], title=myList[i][1])
This gives me something like:
baseURL:
    link_0
    link_1
    link_2
From there, I want to add a subtree to each of the new nodes so it looks something like:
baseURL:
    link_0:
        link_A
        link_B
        link_C
    link_1
    link_2
I can't see how to 'point' the subElement call to the next node down - I tried:
myList2 = [["www.A.com","site A Title"],["www.B.com","site B Title"],["www.C.com","site C Title"]]
for i in range(len(myList2)):
    ET.SubElement('link_0', "link_"+str(i), URL=myList2[i][0], title=myList2[i][1])
But that throws the error:
TypeError: Argument '_parent' has incorrect type (expected lxml.etree._Element, got str)
as I am giving the SubElement call a string, not an element reference. I also tried it as a variable (i.e. link_0 rather than "link_0"), and that fails with a missing global variable error, so my reference is obviously incorrect.
How do I 'point' my lxml builder to a child as a parent, and write a new child?
ET.SubElement(parent_node,type) creates a new XML element node as a child of parent_node. It also returns this new node.
So you could do this:
import lxml.etree as ET

def main():
    root = ET.Element('baseURL')
    myList = [1,2,3]
    children = []
    for x in myList:
        children.append( ET.SubElement(root, "link_"+str(x)) )
    for y in myList:
        ET.SubElement( children[0], "child_"+str(y) )
But keeping track of the children is probably excessive, since lxml already provides many ways to get to them.
Here's a way using lxml's built-in child indexing:
node = root[0]
for y in myList:
    ET.SubElement( node, "child_"+str(y) )
Here's a way using XPath on the tree from the question (possibly better if your XML is getting ugly):
node = root.xpath("/baseURL/link_0")[0]
for y in myList:
    ET.SubElement( node, "child_"+str(y) )
Found the answer. I should be using Python's list-style indexing, root[n], rather than trying to reach the node via the string "link_0".

Turning ElementTree findall() into a list

I'm using ElementTree findall() to find elements in my XML which have a certain tag. I want to turn the result into a list. At the moment, I'm iterating through the elements, picking out the .text for each element, and appending to the list. I'm sure there's a more elegant way of doing this.
#!/usr/bin/python2.7
#
from xml.etree import ElementTree
import os

myXML = '''<root>
    <project project_name="my_big_project">
        <event name="my_first_event">
            <location>London</location>
            <location>Dublin</location>
            <location>New York</location>
            <month>January</month>
            <year>2013</year>
        </event>
    </project>
</root>
'''

tree = ElementTree.fromstring(myXML)
for node in tree.findall('.//project'):
    for element in node.findall('event'):
        event_name=element.attrib.get('name')
        print event_name
        locations = []
        if element.find('location') is not None:
            for events in element.findall('location'):
                locations.append(events.text)
        # Could I use something like this instead?
        # locations.append(''.join.text(*events) for events in element.findall('location'))
        print locations
This outputs the following, which is correct, but I'd like to assign the findall() results directly to a list, in text format, if possible:
my_first_event
['London', 'Dublin', 'New York']
You can try this - it uses a list comprehension to generate the list without having to create a blank one and then append.
if element.find('location') is not None:
    locations = [events.text for events in element.findall('location')]
With this, you can also get rid of the locations definition above, so your code would be:
tree = ElementTree.fromstring(myXML)
for node in tree.findall('.//project'):
    for element in node.findall('event'):
        event_name=element.attrib.get('name')
        print event_name
        if element.find('location') is not None:
            locations = [events.text for events in element.findall('location')]
        print locations
One thing you will want to be wary of is what you are doing with locations - it won't be defined if location doesn't exist, so you will get a NameError if you try to print it and it doesn't exist. If that is an issue, you can retain the locations = [] definition - if the matching element isn't found, the result will just be an empty list.
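Since findall() returns an empty list when nothing matches, the guard can be dropped entirely. Below is a minimal Python 3 sketch of that variant. The second event (no_locations) and the locations_by_event dictionary are made up for illustration and are not part of the original question.

```python
from xml.etree import ElementTree

my_xml = """<root>
  <project project_name="my_big_project">
    <event name="my_first_event">
      <location>London</location>
      <location>Dublin</location>
      <location>New York</location>
    </event>
    <event name="no_locations"/>
  </project>
</root>"""

tree = ElementTree.fromstring(my_xml)

locations_by_event = {}
for event in tree.findall('.//project/event'):
    # findall() yields [] for events without <location> children,
    # so no separate "is not None" check is needed.
    locations_by_event[event.get('name')] = [
        loc.text for loc in event.findall('location')
    ]

print(locations_by_event)
```

With this shape, locations is always defined for every event, which avoids the NameError case described above.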

Iterate through XML to get all child nodes text value

I have an XML file with the following data. I need to get the value of each driver element and all the other attributes. I wrote some Python code, but it only returns the first driver value.
My xml :
<volume name="sp" type="span" operation="create">
    <driver>HDD1</driver>
    <driver>HDD2</driver>
    <driver>HDD3</driver>
    <driver>HDD4</driver>
</volume>
My script:
import xml.etree.ElementTree as ET

doc = ET.parse("vol.xml")
root = doc.getroot() #Returns the root element for this tree.
root.keys() #Returns the elements attribute names as a list. The names are returned in an arbitrary order
root.attrib["name"]
root.attrib["type"]
root.attrib["operation"]
print root.get("name")
print root.get("type")
print root.get("operation")
for child in root:
    #print child.tag, child.attrib
    print root[0].text
My output:
sr-query:~# python volume_check.py aaa
sp
span
create
HDD1
sr-queryC:~#
I am not getting HDD2, HDD3, and HDD4. How do I iterate through this XML to get all the values? Is there an optimized way? I think a for loop can do that, but I am not familiar with Python.
In your for loop, it should be
child.text
not
root[0].text
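Put together, a minimal Python 3 sketch of the corrected loop could look like this. It parses the XML from a string instead of vol.xml so that it runs on its own; the names xml_text, attributes, and drivers are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

xml_text = """<volume name="sp" type="span" operation="create">
    <driver>HDD1</driver>
    <driver>HDD2</driver>
    <driver>HDD3</driver>
    <driver>HDD4</driver>
</volume>"""

root = ET.fromstring(xml_text)

# All attributes of the root element as a plain dict.
attributes = dict(root.attrib)

# Use the loop variable (child), not root[0], to reach every driver.
drivers = [child.text for child in root.findall('driver')]

print(attributes)
print(drivers)
```

Iterating with findall('driver') instead of a bare loop over root skips any children with a different tag, should the file contain them.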
