Script does not run correctly. No output - python

I'm writing a script that should read an XML file, but it produces no output and I can't find the problem. I think it has something to do with my laptop.
I've already tried to run it in the normal python3 shell and in there it works perfectly fine.
import os
import xml.etree.ElementTree as ET
tree = ET.parse('people.xml')
root = tree.getroot()
root[0].attrib
It should output {'name': 'Samy'}, and it does in the python3 shell, but it prints nothing when run as a script.
The XML file looks like this:
<?xml version="1.0"?>
<PEOPLE>
<Person name="Samy">
<age>99</age>
<number>0176293747238</number>
</Person>
<Person name="Alkoholik">
<age>20</age>
<number>0176234923482</number>
</Person>
</PEOPLE>

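Adding the print() function gives you the wanted output. The interactive shell echoes the value of a bare expression such as root[0].attrib, but a script does not, so the value has to be printed explicitly.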
import xml.etree.ElementTree as ET
XML = 'people.xml'
tree = ET.parse(XML)
root = tree.getroot()
print(root[0].attrib) #{'name': 'Samy'}
for i in root:
    print(i.attrib)
# {'name': 'Samy'}
# {'name': 'Alkoholik'}
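The difference between the shell and a script can be seen in a small self-contained sketch. It parses the XML from an inline string, which is an assumption made here so that it runs without people.xml on disk; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# Inline copy of people.xml so the sketch runs without a file on disk.
PEOPLE_XML = """<?xml version="1.0"?>
<PEOPLE>
  <Person name="Samy"><age>99</age></Person>
  <Person name="Alkoholik"><age>20</age></Person>
</PEOPLE>"""

root = ET.fromstring(PEOPLE_XML)

# A bare expression is evaluated and discarded when run as a script;
# only the interactive shell echoes its value.
root[0].attrib

# print() is what actually writes to standard output.
names = [person.get("name") for person in root]
print(names)
```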

Related

Change the key value in XML using Python

I need to modify the value of the online_hostname attribute in an XML file using Python. I tried xml.etree.ElementTree, but it does not work.
import xml.etree.ElementTree as ET
xml_tree = ET.parse('test.xml')
root = xml_tree.getroot()
root[0][0] = "requiredvalue"
test.xml file is as below:
<?xml version="1.0" encoding="UTF-8" ?>
<bzinfo>
<myidentity online_hostname="testdevice-air_2022_01_25"
bzlogin="me#abc.com" />
</bzinfo>
Error:
IndexError: child assignment index out of range
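root[0] is the myidentity element, which has no child elements, so assigning to root[0][0] raises the IndexError. Besides, online_hostname is an attribute, not a child element. It is much better to search for the required node explicitly (find will do in this case) instead of indexing into the root node: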
import xml.etree.ElementTree as ET
xml_tree = ET.parse('test.xml')
root = xml_tree.getroot()
myidentity_node = root.find('myidentity')
myidentity_node.attrib['online_hostname'] = 'required_value'
xml_tree.write('modified.xml')
modified.xml after running this code:
<bzinfo>
<myidentity bzlogin="me#abc.com" online_hostname="required_value" />
</bzinfo>
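As a minimal sketch of the same idea without touching the file system, the XML can be parsed from an inline string (an assumption made for the example). It also guards against find() returning None, which happens when no matching element exists.

```python
import xml.etree.ElementTree as ET

# Inline copy of test.xml so the sketch runs without a file on disk.
TEST_XML = """<bzinfo>
  <myidentity online_hostname="testdevice-air_2022_01_25" bzlogin="me#abc.com" />
</bzinfo>"""

root = ET.fromstring(TEST_XML)
node = root.find("myidentity")

# find() returns None when nothing matches, so check before using the result.
if node is None:
    raise ValueError("no <myidentity> element found")

# myidentity has no child elements, which is why root[0][0] = ... fails.
child_count = len(node)

# Attributes are changed with set(), not by index assignment.
node.set("online_hostname", "required_value")
new_value = root.find("myidentity").get("online_hostname")
print(child_count, new_value)
```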

Parsing XML Attributes with Python

I am trying to parse out all the attributes highlighted in green (some sensitive values have been blacked out). I have a bunch of XML files with similar formats, and I already know how to loop through them; what I am having trouble with is parsing out the specific attributes.
XML Document
I need the text of these attributes:
name="text1" from
<project logLevel="verbose" version="2.0" mainModule="Main" name="text1">
destinationDir="/text2" from
<put label="Put Files" destinationDir="/Trigger/FPDMMT_INBOUND">
destDir="/text3" from
<copy disabled="false" version="1.0" label="Archive Files" destDir="/text3" suffix="">
I am using
import csv
import os
import re
import xml.etree.ElementTree as ET
tree = ET.parse(XMLfile_path)
item = tree.getroot()[0]
root = tree.getroot()
print (item.get("name"))
print (root.get("name"))
This outputs:
Main
text1
item.get reads from the element at index [0] under the root, which is <module
root.get reads from the root element itself, which is <project
I know there's a way to search for exactly the right part of the root/tree with something like:
test = root.find('./project/module/ftp/put')
print (test.get("destinationDir"))
I need to be able to jump directly to the thing I need and output the attributes I need.
Any help would be appreciated
Thanks.
Simplified copy of your XML:
xml = '''<project logLevel="verbose" version="2.0" mainModule="Main" name="hidden">
<module name="Main">
<createWorkspace version="1.0"/>
<ftp version="1.0" label="FTP connection to PRD">
<put label="Put Files" destinationDir="destination1">
</put>
</ftp>
<ftp version="1.0" label="FTP connection to PRD">
<put label="Put Files" destinationDir="destination2">
</put>
</ftp>
<copy disabled="false" destDir="destination3">
</copy>
</module>
</project>
'''
# solution using ETree
from xml.etree import ElementTree as ET
root = ET.fromstring(xml)
name = root.get('name')
ftp_destination_dir1 = root.findall('./module/ftp/put')[0].get('destinationDir')
ftp_destination_dir2 = root.findall('./module/ftp/put')[1].get('destinationDir')
copy_destination_dir = root.find('./module/copy').get('destDir')
print(name)
print(ftp_destination_dir1)
print(ftp_destination_dir2)
print(copy_destination_dir)
# solution using lxml
from lxml import etree as et
root = et.fromstring(xml)
name = root.get('name')
ftp_destination_dirs = root.xpath('./module/ftp/put/@destinationDir')
copy_destination_dir = root.xpath('./module/copy/@destDir')[0]
print(name)
print(ftp_destination_dirs[0])
print(ftp_destination_dirs[1])
print(copy_destination_dir)
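If the exact path to each element is not known in advance, one alternative is to select every element that carries a given attribute. The sketch below uses the [@attribute] predicate supported by ElementTree's limited XPath; the inline XML is a shortened copy of the sample above and the variable names are illustrative.

```python
from xml.etree import ElementTree as ET

PROJECT_XML = """<project logLevel="verbose" version="2.0" mainModule="Main" name="hidden">
  <module name="Main">
    <ftp version="1.0" label="FTP connection to PRD">
      <put label="Put Files" destinationDir="destination1"/>
    </ftp>
    <ftp version="1.0" label="FTP connection to PRD">
      <put label="Put Files" destinationDir="destination2"/>
    </ftp>
    <copy disabled="false" destDir="destination3"/>
  </module>
</project>"""

root = ET.fromstring(PROJECT_XML)

# .//*[@attr] matches descendants at any depth that have the attribute.
destination_dirs = [
    el.get("destinationDir") for el in root.findall(".//*[@destinationDir]")
]
dest_dirs = [el.get("destDir") for el in root.findall(".//*[@destDir]")]

print(root.get("name"), destination_dirs, dest_dirs)
```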

python xml.etree.ElementTree parse

The SOAP envelope looks like the following. I need to parse it in Python. I tried the following without any luck.
import xml.etree.ElementTree as ET
tree = ET.parse('sample.xml')
root = mytree.getroot()
I tried Namespaces and XPath
Can someone help as I am new to python? Thanks.
<?xml version="1.0" encoding="UTF-8"?>
<env:Envelope xmlns:env="http://schemas.xmlsoap.org/soap/envelope/">
<env:Body>
<dp:response xmlns:dp="http://www.datapower.com/schemas/management">
<dp:timestamp>2008-03-18T17:48:22+01:00</dp:timestamp>
<dp:config>
<User xmlns:env="http://www.w3.org/2003/05/soap-envelope" name="xyz"
<mAdminState read-only="true">enabled</mAdminState>
<userSummary>admin</UserSummary>
</User>
<User xmlns:env="http://www.w3.org/2003/05/soap-envelope" name="abc"
<mAdminState read-only="true">enabled</mAdminState>
<userSummary>admin</UserSummary>
</User>
</dp:config>
</dp:response>
</env:Body>
</env:Envelope>
The XML is not well-formed: you're missing a > at the end of each opening User tag. Also, the closing tag </UserSummary> does not match the case of the opening tag <userSummary>. After these fixes, you can parse the file as follows.
import xml.etree.ElementTree as ET
tree = ET.parse('sample.xml')
root = tree.getroot()
namespaces = {
    'soap': 'http://schemas.xmlsoap.org/soap/envelope/',
    'dp': 'http://www.datapower.com/schemas/management'
}
users = root.findall('./soap:Body/dp:response/dp:config/User/userSummary', namespaces=namespaces)
for user in users:
    print(user.text)
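The same lookup can be tried without a file, as a rough sketch. The inline XML below is a trimmed, corrected version of the envelope (an assumption for the example). Note that the prefix used in the namespaces dictionary does not have to match the prefix in the document; only the URI matters.

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed version of the envelope from the question.
SOAP_XML = """<env:Envelope xmlns:env="http://schemas.xmlsoap.org/soap/envelope/">
  <env:Body>
    <dp:response xmlns:dp="http://www.datapower.com/schemas/management">
      <dp:config>
        <User name="xyz"><userSummary>admin</userSummary></User>
        <User name="abc"><userSummary>admin</userSummary></User>
      </dp:config>
    </dp:response>
  </env:Body>
</env:Envelope>"""

# 'soap' is our own prefix; the document calls the same URI 'env'.
namespaces = {
    "soap": "http://schemas.xmlsoap.org/soap/envelope/",
    "dp": "http://www.datapower.com/schemas/management",
}

root = ET.fromstring(SOAP_XML)

# User and userSummary are in no namespace, so they need no prefix.
users = {
    user.get("name"): user.findtext("userSummary")
    for user in root.findall("./soap:Body/dp:response/dp:config/User", namespaces)
}
print(users)
```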

How to edit xml config file in python 3?

I have an XML config file and need to update a particular attribute value.
<?xml version="1.0" encoding="utf-8"?>
<configuration>
<testCommnication>
<connection intervalInSeconds="50" versionUpdates="15"/>
</testCommnication>
</configuration>
I just need to update the "versionUpdates" value to "10".
How can I achieve this in Python 3?
I have tried xml.etree and minidom but was not able to achieve it.
Please use xml.etree.ElementTree to modify the XML:
Edit: If you want to retain the attribute order, use lxml instead. To install it, run pip install lxml
# import xml.etree.ElementTree as ET
from lxml import etree as ET
tree = ET.parse('sample.xml')
root = tree.getroot()
# modifying an attribute
for elem in root.iter('connection'):
    elem.set('versionUpdates', '10')
tree.write('modified.xml') # you can write 'sample.xml' as well
Content now in modified.xml:
<configuration>
<testCommnication>
<connection intervalInSeconds="50" versionUpdates="10" />
</testCommnication>
</configuration>
You can use xml.etree.ElementTree in Python 3 to handle XML :
import xml.etree.ElementTree
config_file = xml.etree.ElementTree.parse('your_file.xml')
config_file.findall(".//connection")[0].set('versionUpdates', '10')  # attribute values must be strings
config_file.write('your_new_file.xml')
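One detail worth illustrating is why the new value is passed as a string. In the sketch below, which works on an inline copy of the config (an assumption for the example), set() accepts an integer without complaint, but serializing the tree then fails with a TypeError; converting with str() avoids that.

```python
import xml.etree.ElementTree as ET

CONFIG_XML = """<configuration>
  <testCommnication>
    <connection intervalInSeconds="50" versionUpdates="15"/>
  </testCommnication>
</configuration>"""

root = ET.fromstring(CONFIG_XML)
connection = root.find(".//connection")

# set() stores any object, but the serializer only handles strings.
connection.set("versionUpdates", 10)
try:
    ET.tostring(root)
    int_value_failed = False
except TypeError:
    int_value_failed = True

# Converting to str first makes the tree serializable again.
connection.set("versionUpdates", str(10))
serialized = ET.tostring(root, encoding="unicode")

print(int_value_failed)
print(serialized)
```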

How to keep the xml-stylesheet?

I want to keep the xml-stylesheet processing instruction, but it is lost when I write the file.
I use Python to modify the XML in order to deploy Hadoop automatically.
XML:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
    <name>fs.default.name</name>
    <value>hdfs://c11:9000</value>
  </property>
</configuration>
Code:
from xml.etree.ElementTree import ElementTree as ET
def modify_core_site(namenode_hostname):
    tree = ET()
    tree.parse("pkg/core-site.xml")
    root = tree.getroot()
    for p in root.iter("property"):
        name = p.find("name").text
        if name == "fs.default.name":
            text = "hdfs://%s:9000" % namenode_hostname
            p.find("value").text = text
    tree.write("pkg/tmp.xml", encoding="utf-8", xml_declaration=True)

modify_core_site("c80")
Result:
<?xml version='1.0' encoding='utf-8'?>
<configuration>
<property>
    <name>fs.default.name</name>
    <value>hdfs://c80:9000</value>
  </property>
</configuration>
The xml-stylesheet line disappears.
How can I keep it?
One solution is to use lxml. Once you have parsed the XML, walk back from the root element through its preceding siblings until you reach the xml-stylesheet node. Quick sample below:
>>> import lxml.etree
>>> doc = lxml.etree.parse('C:/downloads/xmltest.xml')
>>> root = doc.getroot()
>>> xslnode=root.getprevious().getprevious()
>>> xslnode
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
Make sure you add some exception handling and check that the node actually exists. You can check whether the node is an XSLT processing instruction with:
>>> isinstance(xslnode, lxml.etree._XSLTProcessingInstruction)
True
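If installing lxml is not an option, a different technique that may work is the standard library's xml.dom.minidom, which keeps processing instructions and comments that appear before the root element. The sketch below is not the lxml approach shown above; it operates on an inline copy of the file, and the function signature is an assumption made for the example.

```python
from xml.dom import minidom

CORE_SITE_XML = """<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://c11:9000</value>
  </property>
</configuration>"""


def modify_core_site(xml_text, namenode_hostname):
    """Return xml_text with fs.default.name pointed at namenode_hostname."""
    doc = minidom.parseString(xml_text)
    for prop in doc.getElementsByTagName("property"):
        name = prop.getElementsByTagName("name")[0].firstChild.data
        if name == "fs.default.name":
            value_text = prop.getElementsByTagName("value")[0].firstChild
            value_text.data = "hdfs://%s:9000" % namenode_hostname
    # toxml() serializes the whole document, including nodes before the root.
    return doc.toxml()


result = modify_core_site(CORE_SITE_XML, "c80")
print(result)
```

The returned string can then be written to pkg/tmp.xml in the usual way.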
