Python - lxml - how to 'move' around the tree when building the tree - python

Basic question: how do you 'move' around in a tree while you are building it?
I can populate the first level:
import lxml.etree as ET
def main():
    root = ET.Element('baseURL')
    root.attrib["URL"] = 'www.com'
    root.attrib["title"] = 'Level Title'
    myList = [["www.1.com", "site 1 Title"], ["www.2.com", "site 2 Title"], ["www.3.com", "site 3 Title"]]
    for i in range(len(myList)):
        ET.SubElement(root, "link_" + str(i), URL=myList[i][0], title=myList[i][1])
This gives me something like:
baseURL:
    link_0
    link_1
    link_2
From there, I want to add a subtree under each of the new nodes, so that it looks something like:
baseURL:
    link_0:
        link_A
        link_B
        link_C
    link_1
    link_2
I can't see how to 'point' the subElement call to the next node down - I tried:
myList2 = [["www.A.com", "site A Title"], ["www.B.com", "site B Title"], ["www.C.com", "site C Title"]]
for i in range(len(myList2)):
    ET.SubElement('link_0', "link_" + str(i), URL=myList2[i][0], title=myList2[i][1])
But that throws the error:
TypeError: Argument '_parent' has incorrect type (expected lxml.etree._Element, got str)
as I am giving the SubElement call a string, not an element reference. I also tried it as a bare variable name (i.e. link_0 rather than "link_0"), and that fails because no global variable with that name exists, so my reference is obviously incorrect.
How do I 'point' my lxml builder to a child as a parent, and write a new child?

ET.SubElement(parent_node, tag) creates a new XML element node as a child of parent_node and returns that new node.
So you could do this:
import lxml.etree as ET
def main():
    root = ET.Element('baseURL')
    myList = [1, 2, 3]
    children = []
    for x in myList:
        children.append(ET.SubElement(root, "link_" + str(x)))
    for y in myList:
        ET.SubElement(children[0], "child_" + str(y))
But keeping track of the children is probably excessive since lxml already provides you with many ways to get to them.
Here's a way using lxml's built-in child indexing:
node = root[0]
for y in myList:
    ET.SubElement(node, "child_" + str(y))
Here's a way using XPath (possibly better if your XML is getting ugly):
node = root.xpath("/baseURL/link_0")[0]
for y in myList:
    ET.SubElement(node, "child_" + str(y))
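The approaches above can be combined into one small runnable sketch. It is only an illustration: it uses the standard library's xml.etree.ElementTree so that it runs without lxml installed (Element, SubElement and index access behave the same way in lxml.etree), and the helper name build_tree and the shortened site lists are made up for this example.

```python
import xml.etree.ElementTree as ET  # stdlib stand-in; lxml.etree offers the same calls


def build_tree():
    """Build a two-level tree, keeping the reference that SubElement returns."""
    root = ET.Element("baseURL", URL="www.com", title="Level Title")
    sites = [("www.1.com", "site 1 Title"), ("www.2.com", "site 2 Title")]
    subsites = [("www.A.com", "site A Title"), ("www.B.com", "site B Title")]
    for i, (url, title) in enumerate(sites):
        # SubElement returns the node it just created, so it can act as a parent
        link = ET.SubElement(root, "link_%d" % i, URL=url, title=title)
        if i == 0:
            for letter, (sub_url, sub_title) in zip("AB", subsites):
                ET.SubElement(link, "link_" + letter, URL=sub_url, title=sub_title)
    return root


root = build_tree()
print([child.tag for child in root])     # ['link_0', 'link_1']
print([child.tag for child in root[0]])  # ['link_A', 'link_B']
print(root.find("link_0") is root[0])    # True
```

Keeping the returned element is usually the simplest option while the tree is being built; root[n] or a path lookup is handy when the node has to be found again later.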

Found the answer. I should be using Python's index access, root[n], rather than trying to reach the node via the name link_0.

Related

Insert an ElementTree.Element as a SubElement

I'm using xml.etree.ElementTree to create some basic XML in Python. I have a block of XML that I need to access on its own, so I make it as a root Element:
import xml.etree.ElementTree as Tree
def correction_xml(self):
    correction = Tree.Element('ColourCorrection')
    sop_node = Tree.SubElement(correction, "SOPNode")
    slope = Tree.SubElement(sop_node, 'Slope')
    offset = Tree.SubElement(sop_node, 'Offset')
    power = Tree.SubElement(sop_node, 'Power')
    return correction
I also need to insert multiple instances of this block into a bigger XML document, so is there a way to insert my correction Element into another tree as a SubElement? Something like this, except that the SubElement factory only accepts a single string, not an Element object:
def correction_list(self):
    list = Tree.Element("List")
    # insert correction_xml into list as a subelement, keeping its children intact
    item_1 = Tree.SubElement(list, self.correction_xml())
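One possible approach, sketched here rather than taken from the thread, is to attach the already-built Element with the parent's append() method instead of going through SubElement. The sketch assumes plain functions instead of the methods shown above, and the count parameter is invented for illustration.

```python
import xml.etree.ElementTree as Tree


def correction_xml():
    """Build a standalone ColourCorrection element with its SOPNode children."""
    correction = Tree.Element("ColourCorrection")
    sop_node = Tree.SubElement(correction, "SOPNode")
    for tag in ("Slope", "Offset", "Power"):
        Tree.SubElement(sop_node, tag)
    return correction


def correction_list(count):
    """Build a List element holding `count` separate ColourCorrection subtrees."""
    corrections = Tree.Element("List")
    for _ in range(count):
        # append() attaches an existing Element and keeps its children intact
        corrections.append(correction_xml())
    return corrections


result = correction_list(2)
print(Tree.tostring(result).decode())
```

Each call to correction_xml() builds a fresh subtree, so the instances in the list can be modified independently of each other.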

How can I implement doubly linked lists into my tree structure in Python?

So I'm trying to represent an XML file using python. I already managed to do that. But each child node in my tree has to be doubly linked and I don't know how to do that. I've found a few examples of code online but they all use classes and the professor doesn't want us using classes. Here's my code:
from xml.etree.ElementTree import ElementTree
from xml.etree.ElementTree import Element
import xml.etree.ElementTree as etree
def create_tree():  # This function creates the root element of my tree
    n = input("Enter your root Element: ")
    root = Element(n)
    tree = ElementTree(root)
    print(etree.tostring(root))
    return root
def add_child():
    root = create_tree()
    new = True
    while new == True:
        ask = input("If you wish to add a new child type 'yes', else type something else: ")
        if ask == 'yes':
            # This block of code creates a child and appends it to the root element
            n = input("Enter the name of the node: ")
            node = Element(n)
            root.append(node)
            print(etree.tostring(root))
        else:
            break
    return etree.tostring(root)
add_child()
For those of you wondering, the purpose of this project is to create a rooted tree with unbounded branching. I hope that once I can implement the doubly linked list I will be able to add a child node within a child node.
You should be able to implement a doubly linked list using plain lists. Represent each node as a three-item list holding the previous node, the next node, and the element itself, in that order. If the head of the list is stored in head, the code to insert newElement at the beginning of the list would be as follows:
newNode = [None,head,newElement]
head[0] = newNode
head = newNode
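A slightly fuller sketch of the same idea, without classes, might look like this. The constant and function names (PREV, NEXT, VALUE, prepend, forward, backward) are made up for the example, and it adds a guard for the empty-list case, where head is None and head[0] cannot be assigned.

```python
PREV, NEXT, VALUE = 0, 1, 2  # positions inside each three-item node list


def prepend(head, value):
    """Insert value before head and return the new head node."""
    new_node = [None, head, value]
    if head is not None:  # an empty list has no old head to relink
        head[PREV] = new_node
    return new_node


def forward(head):
    """Collect the values from head to tail by following NEXT links."""
    values = []
    node = head
    while node is not None:
        values.append(node[VALUE])
        node = node[NEXT]
    return values


def backward(tail):
    """Collect the values from tail to head by following PREV links."""
    values = []
    node = tail
    while node is not None:
        values.append(node[VALUE])
        node = node[PREV]
    return values


head = None
for item in ("c", "b", "a"):
    head = prepend(head, item)

tail = head
while tail[NEXT] is not None:  # walk to the last node
    tail = tail[NEXT]

print(forward(head))   # ['a', 'b', 'c']
print(backward(tail))  # ['c', 'b', 'a']
```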

Adding subelement with ElementTree

The code below, which should add a subelement to a given XML element, gives the error:
xml.SubElement(new,xml.Element(self.XMLEntriesList['RiverCallPower']))
TypeError: must be xml.etree.ElementTree.Element, not None
But when I check, the element in question is confirmed to be an Element, and not None.
self.XMLEntriesList['RiverCallPower']
Out[3]: <Element 'RiverCallPower' at 0x04B83420>
What am I doing wrong?
import xml.etree.ElementTree as xml
self.tree = xml.parse('strategies.xml')
self.root = self.tree.getroot()
...
new=self.root.append(xml.Element('newElement'))
xml.SubElement(new,xml.Element(self.XMLEntriesList['RiverCallPower']))
I suspect the problem is not the XMLEntriesList['RiverCallPower'] part but the new variable, which is None. That happens because append() simply adds the new element to the root element's list of children and doesn't return anything. Try it this way:
.......
new = xml.Element('newElement')
self.root.append(new)
xml.SubElement(new,xml.Element(self.XMLEntriesList['RiverCallPower']))
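The difference between the two calls can be checked in isolation. The following is a minimal sketch, not the asker's actual code: the strategies root and the existing element are stand-ins for the parsed file and for self.XMLEntriesList['RiverCallPower']. It also assumes the goal is to attach the existing element itself, which is done with append(), because SubElement expects a tag name as its second argument.

```python
import xml.etree.ElementTree as ET

root = ET.Element("strategies")          # stand-in for the parsed root element
existing = ET.Element("RiverCallPower")  # stand-in for the stored Element

# append() mutates the parent in place and returns None
returned = root.append(ET.Element("ignored"))
print(returned)  # None

# SubElement creates the child, attaches it, and returns it
new = ET.SubElement(root, "newElement")

# An Element that already exists is attached with append()
new.append(existing)
print(ET.tostring(root).decode())
```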

(Python) How to get the level of a node in xml?

Is there an efficient way to get the level of each node in xml, using the python xml parser? The level of a node can also be determined by counting the closing tags while parsing the xml top-down, however, I need help in that too :). Thanks!
Maybe this could be an initial idea:
from xml.etree import ElementTree
from xml.etree.ElementTree import iterparse
level = 0
for (event, node) in iterparse(fileName, ['start', 'end', 'start-ns', 'end-ns']):
    if event == 'end':
        level -= 1
    if event == 'start':
        level += 1
    # do something with node ...
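As a self-contained illustration of this counting idea, the sketch below parses a tiny in-memory document and records the depth at each start event. The function name element_levels and the sample XML are made up for the example, and it counts the root as level 0, which is a choice rather than anything the snippet above prescribes.

```python
import io
import xml.etree.ElementTree as ET


def element_levels(source):
    """Return (tag, depth) pairs in document order, with the root at depth 0."""
    levels = []
    depth = 0
    for event, node in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            levels.append((node.tag, depth))
            depth += 1  # children of this element are one level deeper
        else:
            depth -= 1  # closing tag: back up to the parent's level
    return levels


sample = io.BytesIO(b"<a><b><c/></b><d/></a>")
print(element_levels(sample))  # [('a', 0), ('b', 1), ('c', 2), ('d', 1)]
```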
I'm assuming by "python xml parser" you mean xml.etree. There are other libraries as well (such as lxml) which can make this job easier, but according to this answer there's no easy way to access the parent elements using this library. However, the workaround presented there (creating a mapping of parents to the whole tree) is enough to determine the level of any given element:
import xml.etree.ElementTree as ET
tree = ET.parse('country_data.xml')
root = tree.getroot()
# root = ET.fromstring(...) # Alternative
parent_map = {c: p for p in root.iter() for c in p}
def level(node):
    return level(parent_map[node]) + 1 if node in parent_map else 0
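The parent-map workaround can be tried on a small in-memory document. This is a sketch under the assumption that the tree is built with ET.fromstring from a made-up sample string; it uses an iterative loop instead of recursion, which behaves the same way but avoids deep call stacks on very deep trees.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<a><b><c/></b><d/></a>")  # hypothetical sample document

# Map every child element to its parent; the root has no entry
parent_map = {child: parent for parent in root.iter() for child in parent}


def level(node):
    """Count how many ancestors node has; the root is at level 0."""
    depth = 0
    while node in parent_map:
        node = parent_map[node]
        depth += 1
    return depth


for element in root.iter():
    print(element.tag, level(element))
```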
I always use minidom and recommend it for this kind of interaction with XML files. You can use the parentNode attribute of an element for this purpose.
import xml.dom.minidom as parser
tree = parser.parse('xml.xml')
# let's say we want the first instance of a tag with a specific name
x = tree.getElementsByTagName('tagname')[0]
level = 0
while x:
    x = x.parentNode
    level += 1
print(level)
See the xml.dom.minidom documentation for details.
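A variant of the parentNode walk is sketched below on an in-memory document. The loop above appears to count the Document node as well, so the root element ends up at level 2; this sketch instead stops at the first non-element ancestor and reports the root as level 0. The sample XML and the function name level are assumptions made for the example.

```python
from xml.dom import minidom


def level(node):
    """Count element ancestors of node; the document's root element is level 0."""
    depth = 0
    parent = node.parentNode
    # Stop at the Document node, which is not an element
    while parent is not None and parent.nodeType == parent.ELEMENT_NODE:
        depth += 1
        parent = parent.parentNode
    return depth


doc = minidom.parseString("<a><b><c/></b><d/></a>")
for tag in ("a", "b", "c", "d"):
    print(tag, level(doc.getElementsByTagName(tag)[0]))
```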

Iterate through XML to get all child nodes text value

I have an XML file with the following data. I need to get the value of each driver element and all the other attributes. I wrote a Python script, but it only returns the first driver value.
My xml :
<volume name="sp" type="span" operation="create">
<driver>HDD1</driver>
<driver>HDD2</driver>
<driver>HDD3</driver>
<driver>HDD4</driver>
</volume>
My script:
import xml.etree.ElementTree as ET
doc = ET.parse("vol.xml")
root = doc.getroot() #Returns the root element for this tree.
root.keys() #Returns the elements attribute names as a list. The names are returned in an arbitrary order
root.attrib["name"]
root.attrib["type"]
root.attrib["operation"]
print(root.get("name"))
print(root.get("type"))
print(root.get("operation"))
for child in root:
    # print(child.tag, child.attrib)
    print(root[0].text)
My output:
sr-query:~# python volume_check.py aaa
sp
span
create
HDD1
sr-queryC:~#
I am not getting HDD2, HDD3 and HDD4. How do I iterate through this XML to get all the values? Is there an optimized way? I think a for loop can do it, but I am not familiar with Python.
In your for loop, it should be
child.text
not
root[0].text
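Put together, the corrected loop might look like the following sketch. It parses the XML from a string instead of vol.xml so that it runs on its own; the VOLUME_XML constant is introduced only for this example.

```python
import xml.etree.ElementTree as ET

VOLUME_XML = """<volume name="sp" type="span" operation="create">
    <driver>HDD1</driver>
    <driver>HDD2</driver>
    <driver>HDD3</driver>
    <driver>HDD4</driver>
</volume>"""

root = ET.fromstring(VOLUME_XML)

# Attributes of the root element
for key in ("name", "type", "operation"):
    print(root.get(key))

# Use the loop variable, not root[0], to reach every child in turn
drivers = [child.text for child in root]
print(drivers)  # ['HDD1', 'HDD2', 'HDD3', 'HDD4']
```

If the volume element could also contain other kinds of children, root.findall("driver") would restrict the loop to the driver elements.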
