Append an XML element using Python

I have an XML file named user_data.xml that contains:
<?xml version="1.0"?>
<users>
<user name="Rocky" id="1" age="38"/>
<user name="Steve" id="2" age="50"/>
<user name="Melinda" id="3" age="38"/>
</users>
and I want to add a new element inside users, like this:
<?xml version="1.0"?>
<users>
<user name="Rocky" id="1" age="38"/>
<user name="Steve" id="2" age="50"/>
<user name="Melinda" id="3" age="38"/>
<user name="Yondu" id="4" age="55"/>
</users>
and I've tried to do it using this Python code:
class add_user:
    root_new = ET.Element("users")
    root_new.append((ET.fromstring('<user name="Yondu" id="4" age="55"/>')))
    tree = ET.ElementTree(root_new)
    tree.write(sys.stdout)
    for c in root_new:
        print(root_new)
but it's not working.
Any idea how I can do it?

Parse the input XML file or content with etree.fromstring().
This gives you the root element object.
Then use etree.Element() to create the new user element.
Since the root element is users, append the new element to it with the append() method.
Demo:
>>> from lxml import etree
>>> input_data = """<?xml version="1.0"?>
... <users>
... <user name="Rocky" id="1" age="38"/>
... <user name="Steve" id="2" age="50"/>
... <user name="Melinda" id="3" age="38"/>
... </users>"""
>>> root = etree.fromstring(input_data)
>>> new_user = etree.Element("user", {"name":"Yondu", "id":"4", "age": "55"})
>>> root.tag
'users'
>>> root.append(new_user)
>>> print etree.tostring(root, method="xml", pretty_print=True)
<users>
<user name="Rocky" id="1" age="38"/>
<user name="Steve" id="2" age="50"/>
<user name="Melinda" id="3" age="38"/>
<user age="55" name="Yondu" id="4"/></users>
>>>
Note: add exception handling as needed.
lxml Documentation Link
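The same steps can also be sketched with the standard library's xml.etree.ElementTree, which the question appears to use. This is only a sketch, not the answer's lxml code: the helper name add_user and the in-memory string standing in for user_data.xml are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Stand-in for the contents of user_data.xml (assumption: same data as the question).
XML_TEXT = """<?xml version="1.0"?>
<users>
<user name="Rocky" id="1" age="38"/>
<user name="Steve" id="2" age="50"/>
<user name="Melinda" id="3" age="38"/>
</users>"""


def add_user(root, name, user_id, age):
    """Append a new <user> element to the <users> root (hypothetical helper)."""
    # Attribute values must be strings, so numbers are converted explicitly.
    attributes = {"name": name, "id": str(user_id), "age": str(age)}
    new_user = ET.SubElement(root, "user", attributes)
    # Give the new element a trailing newline, like its siblings.
    new_user.tail = "\n"
    return new_user


root = ET.fromstring(XML_TEXT)
add_user(root, "Yondu", 4, 55)
print(ET.tostring(root, encoding="unicode"))
```

To change the file itself, one option is to load it with ET.parse("user_data.xml"), modify tree.getroot() in the same way, and save it with tree.write("user_data.xml", encoding="utf-8", xml_declaration=True).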

Related

Parsing XML in Python with ElementTree

I'm using the documentation here to try to get only the values (name, ip, netmask) for certain elements.
This is an example of the structure of my XML:
<?xml version="1.0" ?>
<rpc-reply xmlns="urn:ietf:params:xml:ns:netconf:base:1.0" xmlns:nc="urn:ietf:params:xml:ns:netconf:base:1.0" message-id="urn:uuid:5cf32451-91af-4f71-a0bd-ead244b81b1f">
<data>
<interfaces xmlns="urn:ietf:params:xml:ns:yang:ietf-interfaces">
<interface>
<name>GigabitEthernet1</name>
<type xmlns:ianaift="urn:ietf:params:xml:ns:yang:iana-if-type">ianaift:ethernetCsmacd</type>
<enabled>true</enabled>
<ipv4 xmlns="urn:ietf:params:xml:ns:yang:ietf-ip">
<address>
<ip>192.168.40.30</ip>
<netmask>255.255.255.0</netmask>
</address>
</ipv4>
<ipv6 xmlns="urn:ietf:params:xml:ns:yang:ietf-ip"/>
</interface>
<interface>
<name>GigabitEthernet2</name>
<type xmlns:ianaift="urn:ietf:params:xml:ns:yang:iana-if-type">ianaift:ethernetCsmacd</type>
<enabled>true</enabled>
<ipv4 xmlns="urn:ietf:params:xml:ns:yang:ietf-ip">
<address>
<ip>10.10.10.1</ip>
<netmask>255.255.255.0</netmask>
</address>
</ipv4>
<ipv6 xmlns="urn:ietf:params:xml:ns:yang:ietf-ip"/>
</interface>
</interfaces>
</data>
</rpc-reply>
Python code (this code returns nothing):
import xml.etree.ElementTree as ET
tree = ET.parse("C:\\Users\\Redha\\Documents\\test_network\\interface1234.xml")
root = tree.getroot()
namespaces = {'interfaces': 'urn:ietf:params:xml:ns:yang:ietf-interfaces' }
for elem in root.findall('.//interfaces:interfaces', namespaces):
    s0 = elem.find('.//interfaces:name',namespaces)
    name = s0.text
    print(name)
interface = ET.parse('interface2.xml')
interface_root = interface.getroot()
for interface_attribute in interface_root[0][0]:
    print(f"{interface_attribute[0].text}, {interface_attribute[3][0][0].text}, {interface_attribute[3][0][1].text}")
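A namespace-aware variant of the lookup above might look like the following sketch. It assumes the same document structure as the question, shortened here to the relevant elements, and the prefixes "if" and "ip" are local aliases chosen for this example; only the namespace URIs have to match the document. Note that ip and netmask belong to the ietf-ip namespace, not to the ietf-interfaces one.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the question's XML (assumption: same structure, fewer elements).
XML_TEXT = """<rpc-reply xmlns="urn:ietf:params:xml:ns:netconf:base:1.0">
<data>
<interfaces xmlns="urn:ietf:params:xml:ns:yang:ietf-interfaces">
<interface>
<name>GigabitEthernet1</name>
<ipv4 xmlns="urn:ietf:params:xml:ns:yang:ietf-ip">
<address><ip>192.168.40.30</ip><netmask>255.255.255.0</netmask></address>
</ipv4>
</interface>
<interface>
<name>GigabitEthernet2</name>
<ipv4 xmlns="urn:ietf:params:xml:ns:yang:ietf-ip">
<address><ip>10.10.10.1</ip><netmask>255.255.255.0</netmask></address>
</ipv4>
</interface>
</interfaces>
</data>
</rpc-reply>"""

# Prefix -> namespace URI mapping used only inside the search paths below.
NS = {
    "if": "urn:ietf:params:xml:ns:yang:ietf-interfaces",
    "ip": "urn:ietf:params:xml:ns:yang:ietf-ip",
}

root = ET.fromstring(XML_TEXT)
rows = []
for interface in root.findall(".//if:interface", NS):
    name = interface.findtext("if:name", namespaces=NS)
    # The address elements live in the ietf-ip namespace.
    ip = interface.findtext("ip:ipv4/ip:address/ip:ip", namespaces=NS)
    netmask = interface.findtext("ip:ipv4/ip:address/ip:netmask", namespaces=NS)
    rows.append((name, ip, netmask))
    print(name, ip, netmask)
```

Looking elements up by qualified name rather than by position keeps the code working if the order of the child elements changes.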

Parsing XML: Python ElementTree, find elements and its parent elements without other elements in same parent

I am using Python's ElementTree library to parse an XML file with the following structure. I am trying to get the XML string corresponding to the entity with id = 192, together with all its parents (folders) but without the other entities.
<catalog>
<folder name="entities">
<entity id="102">
</entity>
<folder name="newEntities">
<entity id="192">
</entity>
<entity id="2982">
</entity>
</folder>
</folder>
</catalog>
The required result should be
<catalog>
<folder name="entities">
<folder name="newEntities">
<entity id="192">
</entity>
</folder>
</folder>
</catalog>
Assuming the first XML string is stored in a variable called xml_string:
tree = ET.fromstring(xml_string)
id = 192
required_element = tree.find(".//entity[@id='" + str(id) + "']")
This gets the XML element for the required entity but not its parent folders. Is there a quick fix for this?
The challenge here is that ElementTree elements carry no reference to their parent. The solution is to build a parent_map, a dictionary that maps each child to its parent:
import copy
import xml.etree.ElementTree as ET
import xml.dom.minidom as minidom
xml = '''<catalog>
<folder name="entities">
<entity id="102">
</entity>
<folder name="newEntities">
<entity id="192">
</entity>
<entity id="2982">
</entity>
</folder>
</folder>
</catalog>'''
def prettify(elem):
    """Return a pretty-printed XML string for the Element."""
    rough_string = ET.tostring(elem, 'utf-8')
    reparsed = minidom.parseString(rough_string)
    return reparsed.toprettyxml(indent="\t")
root = ET.fromstring(xml)
# Map every child element to its parent.
parent_map = {c: p for p in root.iter() for c in p}
_id = 192
required_element = root.find(".//entity[@id='" + str(_id) + "']")
# Collect copies of the entity and all of its ancestors, innermost first.
_path = [copy.deepcopy(required_element)]
while True:
    parent = parent_map.get(required_element)
    if parent is not None:
        _path.append(copy.deepcopy(parent))
        required_element = parent
    else:
        break
# Empty each ancestor copy and re-attach only the next element on the path.
idx = len(_path) - 1
while idx >= 1:
    _path[idx].clear()
    _path[idx].append(_path[idx-1])
    idx -= 1
print(prettify(_path[-1]))
output
<?xml version="1.0" ?>
<catalog>
<folder>
<folder>
<entity id="192">
</entity>
</folder>
</folder>
</catalog>
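In the output above the folder elements have lost their name attributes, because Element.clear() removes attributes as well as children. If the attributes should be kept, as in the result the question asks for, one possible variation is to build fresh ancestor elements from each tag and attribute dictionary instead of clearing deep copies. The function name extract_with_ancestors below is hypothetical, and this is a sketch under that assumption rather than the answer's method.

```python
import copy
import xml.etree.ElementTree as ET

XML_TEXT = """<catalog>
<folder name="entities">
<entity id="102"></entity>
<folder name="newEntities">
<entity id="192"></entity>
<entity id="2982"></entity>
</folder>
</folder>
</catalog>"""


def extract_with_ancestors(root, target):
    """Return a new tree holding a copy of `target` wrapped in copies of its ancestors."""
    # ElementTree has no parent pointers, so build a child -> parent lookup.
    parent_map = {child: parent for parent in root.iter() for child in parent}
    node = copy.deepcopy(target)
    current = target
    while current in parent_map:
        current = parent_map[current]
        # Copy only the tag and attributes of the ancestor, not its other children.
        wrapper = ET.Element(current.tag, dict(current.attrib))
        wrapper.append(node)
        node = wrapper
    return node


root = ET.fromstring(XML_TEXT)
entity = root.find(".//entity[@id='192']")
pruned = extract_with_ancestors(root, entity)
print(ET.tostring(pruned, encoding="unicode"))
```

The original tree is left unmodified, since only copies are assembled into the new one.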

How to use Python etree objects as additional input documents for XSLT

Using Python's lxml.etree module I want to refer to an additional input document in a stylesheet like this
<xsl:copy-of select="document($abc)//order"/>
where $abc is an XML document that I already have as an etree.Element object in my code. I know how to refer to a file document, but I would like to use the (parsed) XML object directly, without having to serialize it as an extra step (and having XSLT parse it again).
Is this possible? If yes, how?
I have tried to find a way to pass an element as a parameter but I have not found one. What has worked for me is setting up an extension function that simply returns the element:
from lxml import etree
sheet = etree.XML('''\
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0" xmlns:mf="http://example.com/mf" exclude-result-prefixes="mf">
<xsl:template match="root">
<xsl:copy>
<xsl:copy-of select="foo[bar = mf:node_set()/item/bar]"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
''')
doc1 = etree.XML('''\
<root>
<foo>
<bar>a</bar>
</foo>
<foo>
<bar>b</bar>
</foo>
<foo>
<bar>c</bar>
</foo>
</root>''')
doc2 = etree.XML('''\
<items>
<item>
<bar>a</bar>
</item>
<item>
<bar>c</bar>
</item>
</items>''')
def node_set(context, node):
    return node
ns = etree.FunctionNamespace('http://example.com/mf')
ns.prefix = "mf"
ns['node_set'] = lambda _: doc2
transformer = etree.XSLT(sheet)
result = transformer(doc1)
print(str(result))
Output is
<?xml version="1.0"?>
<root><foo>
<bar>a</bar>
</foo><foo>
<bar>c</bar>
</foo></root>
You can then change the function as needed with e.g.
ns['node_set'] = lambda _: etree.XML('<items><item><bar>b</bar></item></items>')
result = transformer(doc1)
print(str(result))
and get
<?xml version="1.0"?>
<root><foo>
<bar>b</bar>
</foo></root>
Of course, using an extension function opens up different data representation strategies. Here is an example that instead puts the additional documents into a list and uses an extension function that takes an index into that list to select a document:
from lxml import etree
sheet = etree.XML('''\
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0" xmlns:mf="http://example.com/mf" exclude-result-prefixes="mf">
<xsl:output indent="yes"/>
<xsl:template match="root">
<xsl:copy>
<example nr="1">
<xsl:copy-of select="foo[bar = mf:get_doc(1)/item/bar]"/>
</example>
<example nr="2">
<xsl:copy-of select="foo[bar = mf:get_doc(2)/item/bar]"/>
</example>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
''')
doc1 = etree.XML('''\
<root>
<foo>
<bar>a</bar>
</foo>
<foo>
<bar>b</bar>
</foo>
<foo>
<bar>c</bar>
</foo>
</root>''')
doc2 = etree.XML('''\
<items>
<item>
<bar>a</bar>
</item>
<item>
<bar>c</bar>
</item>
</items>''')
doc3 = etree.XML('<items><item><bar>b</bar></item></items>')
param_docs = [doc2, doc3]
def get_doc(context, a):
    return param_docs[int(a) - 1]
ns = etree.FunctionNamespace('http://example.com/mf')
ns.prefix = "mf"
ns['get_doc'] = get_doc
transformer = etree.XSLT(sheet)
result = transformer(doc1)
print(str(result))
Here is the output:
<root>
<example nr="1">
<foo>
<bar>a</bar>
</foo>
<foo>
<bar>c</bar>
</foo>
</example>
<example nr="2">
<foo>
<bar>b</bar>
</foo>
</example>
</root>

how to replace elements in xml using python

Sorry for my poor English, but I need your help.
I have two XML files.
one.xml is:
<root>
<data name="aaaa">
<value>"old value1"</value>
<comment>"this is an old value1 of aaaa"</comment>
</data>
<data name="bbbb">
<value>"old value2"</value>
<comment>"this is an old value2 of bbbb"</comment>
</data>
</root>
two.xml is:
<root>
<data name="aaaa">
<value>"value1"</value>
<comment>"this is a value 1 of aaaa"</comment>
</data>
<data name="bbbb">
<value>"value2"</value>
<comment>"this is a value2 of bbbb"</comment>
</data>
<data name="cccc">
<value>"value3"</value>
<comment>"this is a value3 of cccc"</comment>
</data>
</root>
one.xml should be updated from two.xml,
so afterwards one.xml should look like this.
one.xml (after):
<root>
<data name="aaaa">
<value>"value1"</value>
<comment>"this is a value1 of aaaa"</comment>
</data>
<data name="bbbb">
<value>"value2"</value>
<comment>"this is a value2 of bbbb"</comment>
</data>
</root>
The entry with name="cccc" does not exist in one.xml, so it is ignored.
What I actually want to do is:
download two.xml (the whole list) from the database
update my one.xml (it contains only the data entries that the app uses) from two.xml
Can anyone help me, please?
Thanks!
==============================================================
xml.etree.ElementTree
Your code works with the example, but I found a problem with the real XML file.
The real one.xml contains:
<?xml version="1.0" encoding="utf-8"?>
<root>
<resheader name="resmimetype">
<value>text/microsoft-resx</value>
</resheader>
<resheader name="version">
<value>2.0</value>
</resheader>
<resheader name="reader">
<value>System.Resources.ResXResourceReader, System.Windows.Forms, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089</value>
</resheader>
<resheader name="writer">
<value>System.Resources.ResXResourceWriter, System.Windows.Forms, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089</value>
</resheader>
<data name="NotesLabel" xml:space="preserve">
<value>Hinweise:</value>
<comment>label for input field</comment>
</data>
<data name="NotesPlaceholder" xml:space="preserve">
<value>z . Milch kaufen</value>
<comment>example input for notes field</comment>
</data>
<data name="AddButton" xml:space="preserve">
<value>Neues Element hinzufügen</value>
<comment>this string appears on a button to add a new item to the list</comment>
</data>
</root>
It seems that resheader causes trouble.
Do you have any idea how to fix it?
You can use xml.etree.ElementTree. While there are probably more elegant ways, this should work on files that fit in memory, provided the names are unique in two.xml:
import xml.etree.ElementTree as ET
tree_one = ET.parse('one.xml')
root_one = tree_one.getroot()
tree_two = ET.parse('two.xml')
root_two = tree_two.getroot()
data_two=dict((e.get("name"), e) for e in root_two.findall("data"))
for eo in root_one.findall("data"):
    name=eo.get("name")
    tail=eo.tail
    eo.clear()
    eo.tail=tail
    en=data_two[name]
    for k,v in en.items():
        eo.set(k,v)
    eo.extend(en.findall("*"))
    eo.text=en.text
tree_one.write("one.xml")
If your files do not fit in memory you can still use xml.dom.pulldom as long as single data entries do fit.
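The lookup data_two[name] in the code above raises a KeyError when an entry of one.xml has no counterpart in two.xml. A more defensive variation is sketched below. It is an assumption about what the asker needs, not a confirmed fix for the resheader problem: it skips entries missing from two.xml and copies only the text of value and comment, so attributes already present in one.xml stay untouched. The sample strings and the entry named only_in_one are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-ins for one.xml and two.xml.
ONE = """<root>
<data name="aaaa"><value>old value1</value><comment>old comment1</comment></data>
<data name="bbbb"><value>old value2</value><comment>old comment2</comment></data>
<data name="only_in_one"><value>kept</value></data>
</root>"""

TWO = """<root>
<data name="aaaa"><value>value1</value><comment>comment1</comment></data>
<data name="bbbb"><value>value2</value><comment>comment2</comment></data>
<data name="cccc"><value>value3</value><comment>comment3</comment></data>
</root>"""

root_one = ET.fromstring(ONE)
root_two = ET.fromstring(TWO)

# Index the new entries by name (assumes names are unique in two.xml).
data_two = {e.get("name"): e for e in root_two.findall("data")}

for old in root_one.findall("data"):
    new = data_two.get(old.get("name"))
    if new is None:
        continue  # no counterpart in two.xml: leave the entry unchanged
    for tag in ("value", "comment"):
        old_child = old.find(tag)
        new_child = new.find(tag)
        if old_child is not None and new_child is not None:
            old_child.text = new_child.text

print(ET.tostring(root_one, encoding="unicode"))
```

Entries that exist only in two.xml, such as cccc, are never visited, because the loop runs over the entries of one.xml.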

Python: Find and Remove Child from XML file and output to new file

I have the XML file below. I want to delete all plan children whose name is not john and write the result to a new file.
<data>
<plan_main>
<plan>
<name>John</name>
<id>1</id>
</plan>
<plan>
<name>Charlie</name>
<id>2</id>
</plan>
</plan_main>
<location>
<country>
<code>GB</code>
</country>
<country>
<code>DE</code>
</country>
</location>
</data>
I've tried the following code but get a ValueError (not in list):
for plan in root.findall('./plan_main/plan'):
    name = plan.find('name').text
    if name =! "john":
        root.remove(plan)
tree.write('output.xml')
I want my output file to look like this:
<data>
<plan_main>
<plan>
<name>John</name>
<id>1</id>
</plan>
</plan_main>
<location>
<country>
<code>GB</code>
</country>
<country>
<code>DE</code>
</country>
</location>
</data>
However I get the following error:
ValueError: list.remove(x): x not in list
Assuming the =! is just a copy/paste mistake, the issue is that you are trying to remove the element from the root node with Element.remove(), but .remove() only removes elements that are direct children of the element it is called on (here, root).
If you want to stick with ElementTree, you can change the XPath to loop over all the plan_main elements and then, for each plan_main element, loop over its children and remove any child whose name is not john. Example:
for plan_main in root.findall('./plan_main'):
    # Iterate over a copy so that removing a child does not skip its next sibling.
    for plan in list(plan_main):
        name = plan.find('name').text
        if name.lower() != "john":
            plan_main.remove(plan)
Demo -
>>> import xml.etree.ElementTree as ET
>>> s = """ <data>
... <plan_main>
... <plan>
... <name>John</name>
... <id>1</id>
... </plan>
... <plan>
... <name>Charlie</name>
... <id>2</id>
... </plan>
... </plan_main>
... <location>
... <country>
... <code>GB</code>
... </country>
... <country>
... <code>DE</code>
... </country>
... </location>
... </data>"""
>>> root = ET.fromstring(s)
>>> for plan_main in root.findall('./plan_main'):
...     for plan in list(plan_main):
...         name = plan.find('name').text
...         if name.lower() != "john":
...             plan_main.remove(plan)
...
>>> print(ET.tostring(root).decode('utf-8'))
<data>
<plan_main>
<plan>
<name>John</name>
<id>1</id>
</plan>
</plan_main>
<location>
<country>
<code>GB</code>
</country>
<country>
<code>DE</code>
</country>
</location>
</data>
If you can use lxml.etree, you can make a small change to your code to get it working: get the direct parent of the child you want to remove with the .getparent() method. Example:
for plan in root.findall('./plan_main/plan'):
    name = plan.find('name').text
    if name.lower() != "john":
        plan.getparent().remove(plan)
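The whole task, including writing the result out, can be sketched with the standard library alone. The sample below adds a third plan (Dave, made up for this example) to show that two adjacent non-matching plans are both removed, and an in-memory buffer stands in for the output file. Both are assumptions for illustration.

```python
import io
import xml.etree.ElementTree as ET

XML_TEXT = """<data>
<plan_main>
<plan><name>John</name><id>1</id></plan>
<plan><name>Charlie</name><id>2</id></plan>
<plan><name>Dave</name><id>3</id></plan>
</plan_main>
<location><country><code>GB</code></country></location>
</data>"""

tree = ET.ElementTree(ET.fromstring(XML_TEXT))
root = tree.getroot()

for plan_main in root.findall("plan_main"):
    # findall() returns a list, so removing while looping does not skip siblings.
    for plan in plan_main.findall("plan"):
        if plan.findtext("name", default="").lower() != "john":
            # remove() must be called on the direct parent of the element.
            plan_main.remove(plan)

# Stand-in for the new file; a path such as 'output.xml' could be passed instead.
buffer = io.BytesIO()
tree.write(buffer, encoding="utf-8")
print(buffer.getvalue().decode("utf-8"))
```

Using findtext() with a default also avoids an AttributeError for a plan that has no name element.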
