How to add information to an XML file in Python - python

I'm using minidom to create an XML file, and I would like to add a header to it, i.e.:
<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE root SYSTEM "file.dtd">
How can I add it?
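One possible approach, as a minimal sketch: minidom's DOM implementation object can create a DOCTYPE node, and passing an encoding to toxml() makes the XML declaration carry that encoding. The root name "root" and the system identifier "file.dtd" come from the question; the "item" child element is a placeholder added only for illustration.

```python
from xml.dom.minidom import getDOMImplementation

impl = getDOMImplementation()

# DOCTYPE with a SYSTEM identifier only (publicId is None).
doctype = impl.createDocumentType("root", None, "file.dtd")

# The qualified name of the root element should match the DOCTYPE name.
doc = impl.createDocument(None, "root", doctype)

# "item" is a placeholder child, not part of the original question.
doc.documentElement.appendChild(doc.createElement("item"))

# Passing an encoding returns bytes and puts the encoding in the declaration.
xml_bytes = doc.toxml(encoding="UTF-8")
print(xml_bytes.decode("UTF-8"))
```

minidom chooses its own quoting and spacing for the declaration and the DOCTYPE, so the output may not match the target header character for character, but it is equivalent XML.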

Related

Parsing XML to CSV or YAML - Error "CData section not finished"

I'm trying to parse data collected from a Checkpoint firewall, which is in XML format. Ideally, I would like the XML data converted to CSV or YAML format. Once I'm there, I'm good.
Top of the XML file:
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml-stylesheet type="text/xsl" href="../network_objects.xsl"?>
This is the code I'm trying, which I got from other posts:
from lxml import etree
file = "network_objects.xml"
parser = etree.XMLParser(encoding = "iso-8859-1")
tree = etree.parse(file, parser)
This is the error I'm getting. I don't know anything about XML or what CDATA is. Can someone provide some guidance?
lxml.etree.XMLSyntaxError: CData section not finished
Petroleum Traders public tunnel IP뽨Ä뻼, line 6787, column 651
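One way to start investigating, sketched here with the standard library only, is to look at the raw bytes near the position the parser reports (line 6787, column 651 in the message above). The helper name context_at and the sample bytes are hypothetical; this only helps inspect the data and does not by itself fix the parse error.

```python
def context_at(data: bytes, line_no: int, col_no: int, width: int = 20) -> bytes:
    """Return the raw bytes around a 1-based line and column position."""
    lines = data.split(b"\n")
    if not 1 <= line_no <= len(lines):
        return b""
    raw = lines[line_no - 1]
    start = max(col_no - 1 - width, 0)
    return raw[start:col_no - 1 + width]


# Hypothetical sample: a CDATA section containing stray control bytes.
sample = b"<a>\n<b><![CDATA[tunnel IP\x00\x01 broken</b>\n</a>"
snippet = context_at(sample, 2, 26)

# repr() makes non-printable bytes visible.
print(repr(snippet))
```

For the real file, the bytes could be read with pathlib.Path("network_objects.xml").read_bytes() and passed in together with the line and column from the error message.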

Modify xml file with extra namespace

I want to modify an existing xml file.
The layout of the existing file:
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<Document xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="urn:iso:std:iso:20022:tech:xsd:pain.001.001.03">
After I modify a field in the XML, I want to write out the new XML file, but the modified file differs from the original:
<?xml version='1.0' encoding='UTF-8'?>
<Document xmlns="urn:iso:std:iso:20022:tech:xsd:pain.001.001.03">
The difference between the files:
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" // this is missing in the modified file
What I did:
ET.register_namespace('xsi', 'http://www.w3.org/2001/XMLSchema-instance')
ET.register_namespace('', "urn:iso:std:iso:20022:tech:xsd:pain.001.001.03")
#parse the data
tree = ET.parse(self.sepa_xml.path)
root = tree.getroot()
#add a subelement
body = ET.SubElement(root, "{http://www.w3.org/2001/XMLSchema-instance}")
The final result:
<?xml version='1.0' encoding='UTF-8'?>
<Document xmlns="urn:iso:std:iso:20022:tech:xsd:pain.001.001.03" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<CstmrCdtTrfInitn>
<PmtInf>
<PmtInfId>20220929085842-36645</PmtInfId>
<PmtMtd>TRF</PmtMtd>
<PmtTpInf>
<SvcLvl>
<Cd>SEPA</Cd>
</SvcLvl>
<CtgyPurp>
<Cd>SALA</Cd>
</CtgyPurp>
</PmtTpInf>
<ReqdExctnDt>2022-09-29</ReqdExctnDt>
<Dbtr>
<Nm>test name</Nm>
</Dbtr>
</DbtrAgt>
<ChrgBr>SLEV</ChrgBr>
<CdtTrfTxInf>
<PmtId>
<EndToEndId>20220929085842-36645/1</EndToEndId>
</PmtId>
</CdtTrfTxInf>
</PmtInf>
</CstmrCdtTrfInitn>
<xsi: /></Document>  // how can I delete this xsi tag?
The problem now is that I get an extra tag at the end of the XML file:
<xsi: />. I assume this is because I added a subelement. How can I delete this last tag?
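A possible workaround, as a sketch rather than a definitive fix: ElementTree only writes namespace declarations for namespaces that are actually used, so instead of adding a dummy subelement, the declaration can be set on the root as a literal attribute. The shortened document and the replacement value "CHK" are placeholders for illustration.

```python
import xml.etree.ElementTree as ET

PAIN_NS = "urn:iso:std:iso:20022:tech:xsd:pain.001.001.03"
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"

# Serialize the pain.001 namespace as the default (unprefixed) namespace.
ET.register_namespace("", PAIN_NS)

# Shortened stand-in for the original file.
source = (
    f'<Document xmlns:xsi="{XSI_NS}" xmlns="{PAIN_NS}">'
    "<CstmrCdtTrfInitn><PmtInf><PmtMtd>TRF</PmtMtd></PmtInf></CstmrCdtTrfInitn>"
    "</Document>"
)
root = ET.fromstring(source)

# Modify a field, addressing it by its namespace-qualified tag.
root.find(f".//{{{PAIN_NS}}}PmtMtd").text = "CHK"  # placeholder value

# Re-add the declaration as a literal attribute instead of a dummy element.
root.set("xmlns:xsi", XSI_NS)

result = ET.tostring(root, encoding="unicode")
print(result)
```

If an element with the empty xsi tag has already been appended, root.remove(body) takes it out again. If the document really uses an xsi: attribute somewhere, ElementTree writes the declaration on its own, so the literal attribute should be skipped in that case to avoid declaring the prefix twice.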

How to access the tag below another tag in xml using xml.dom.minidom in python?

I am using Python 3.10.4 and I am new to parsing XML files.
For example, let the XML file be named "test.xml":
<?xml version="1.0" encoding="UTF-8"?>
<xml>
<tag1 name="1">
<tag2 name="a"></tag2>
</tag1>
<tag1 name = "2">
<tag2 name = "b"></tag2>
</tag1>
</xml>
Python code:
import xml.dom.minidom
file = xml.dom.minidom.parse('test.xml')
list = []
tags = file.getElementsByTagName("tag1")
for tag in tags:
    if tag.getAttribute("name") == "1":
        print(tag.getAttribute("tag2"))
So here I want to access the tag2 of tag1 with name="1". How can I do it?
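A minimal sketch of one way to do this: tag2 is a child element, not an attribute, so getAttribute("tag2") returns an empty string. Calling getElementsByTagName on the matching tag1 element searches only inside that element. The sample below assumes the file is wrapped in a single <xml> root element so that it is well-formed.

```python
import xml.dom.minidom

# Assumed well-formed version of the sample: one <xml> root element.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<xml>
<tag1 name="1">
<tag2 name="a"></tag2>
</tag1>
<tag1 name="2">
<tag2 name="b"></tag2>
</tag1>
</xml>"""

doc = xml.dom.minidom.parseString(SAMPLE)

names = []
for tag1 in doc.getElementsByTagName("tag1"):
    if tag1.getAttribute("name") == "1":
        # Called on an element, this only searches that element's descendants.
        for tag2 in tag1.getElementsByTagName("tag2"):
            names.append(tag2.getAttribute("name"))

print(names)
```

With a file on disk, xml.dom.minidom.parse('test.xml') can replace parseString(SAMPLE).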

How to find and extract certain string from XML and replace in another XML document

Can someone help me extract a certain string from one XML file and replace it in another? I want to extract only the mediaId from the second line of 1.xml:
- <sample_settings_config version="23" mediaId="0x6868">
and replace it in the 2.xml document (the contents of both XML files are the same except for the mediaId).
The location of 1.xml is dynamic (C:\users\xxxx\Documents), so I am looking for a way to find, extract and replace the value. :(
I am open to either Python or a batch script.
I am struggling with this batch script:
setlocal EnableDelayedExpansion
(Set UPD=C:\Users)
(Set UDS=\RAP\Documents\MB)
For /F "delims=" %%a in ('findstr /I /L "mediaId" 1.xml') do (
set "line=%%a"
set "line=!line:*<string>=!"
for /F "delims=<" %%b in ("!line!") do echo %%b
But it is not working. Any help is appreciated, thank you.
from xml.etree.ElementTree import ElementTree
tree = ElementTree()
# Read the mediaId attribute from 1.xml.
tree.parse("1.xml")
root = tree.getroot()
media_id = root.find('sample_settings_config').attrib['mediaId']
# Load 2.xml into the same tree object and overwrite its mediaId.
tree.parse("2.xml")
root = tree.getroot()
root.find('sample_settings_config').attrib['mediaId'] = media_id
# Write the modified content back to 2.xml.
tree.write('2.xml', xml_declaration=True)
You can use xml.etree.ElementTree to read and write XML in Python.
The code imports ElementTree and creates an instance named tree. It parses 1.xml into tree, gets the root, finds the sample_settings_config tag, and reads the value of its mediaId attribute.
It then repeats the parsing for 2.xml, gets the root, and finds the same tag. The mediaId key of the attribute dictionary is updated with the value stored in the variable media_id.
Finally, the modified content is written back to the XML file.
1.xml:
<?xml version='1.0' encoding='us-ascii'?>
<root>
<sample_settings_config version="123" mediaId="0x6868">
"XML File 1"
</sample_settings_config>
</root>
2.xml:
<?xml version='1.0' encoding='us-ascii'?>
<root>
<sample_settings_config version="456" mediaId="0x0">
"XML File 2"
</sample_settings_config>
</root>
2.xml modified:
<?xml version='1.0' encoding='us-ascii'?>
<root>
<sample_settings_config mediaId="0x6868" version="456">
"XML File 2"
</sample_settings_config>
</root>
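The same idea can be sketched as a reusable function that takes both paths as arguments, which may help when the location of 1.xml changes. The function name copy_media_id and the temporary files are hypothetical. Using iter() instead of find() is a deliberate choice here: iter() also yields the root itself, so the lookup works whether sample_settings_config is the root element or nested under another element.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def copy_media_id(src: Path, dst: Path) -> str:
    """Copy the mediaId attribute from src into dst and rewrite dst."""
    src_root = ET.parse(str(src)).getroot()
    # iter() includes the root element itself in the search.
    media_id = next(src_root.iter("sample_settings_config")).get("mediaId")

    dst_tree = ET.parse(str(dst))
    target = next(dst_tree.getroot().iter("sample_settings_config"))
    target.set("mediaId", media_id)
    dst_tree.write(str(dst), xml_declaration=True)
    return media_id


# Demonstration with throwaway files standing in for 1.xml and 2.xml.
with tempfile.TemporaryDirectory() as tmp:
    first = Path(tmp) / "1.xml"
    second = Path(tmp) / "2.xml"
    first.write_text('<sample_settings_config version="23" mediaId="0x6868"/>')
    second.write_text('<sample_settings_config version="23" mediaId="0x0"/>')
    copied = copy_media_id(first, second)
    updated = second.read_text()

print(updated)
```

For a per-user location, the source path could be built from Path.home(), assuming the file sits under the current user's profile directory.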

Making Excel file with desired name from XML in Python

I have an XML file with a structure like this:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE raml SYSTEM 'raml20.dtd'>
<raml version="2.0" xmlns="raml20.xsd">
<cmData type="actual">
<header>
<log dateTime="2017-04-26T12:03:30" action="created" appInfo="ActualExporter">UIValues are used</log>
</header>
After processing the data, I need to save a pandas DataFrame to a file (.csv or .xlsx), and the filename needs to be the value of the log dateTime attribute.
So the desired output filename would be: 2017-04-26T12:03:30.xlsx or 2017-04-26T12:03:30.csv.
Does anyone have any idea?
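A minimal sketch of one approach, assuming the elements live in the default namespace "raml20.xsd" as the sample suggests: read the dateTime attribute with ElementTree and build the filename from it. The closing tags in the sample are added here to make it parseable. Colons are not allowed in Windows file names, so this sketch replaces them with hyphens, which deviates from the exact filename requested; on other systems the replacement can be dropped.

```python
import xml.etree.ElementTree as ET

# The sample from the question, closed so that it parses.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE raml SYSTEM 'raml20.dtd'>
<raml version="2.0" xmlns="raml20.xsd">
<cmData type="actual">
<header>
<log dateTime="2017-04-26T12:03:30" action="created" appInfo="ActualExporter">UIValues are used</log>
</header>
</cmData>
</raml>"""

root = ET.fromstring(SAMPLE)

# Tags carry the default namespace, so the lookup must include it.
log = root.find(".//{raml20.xsd}log")
stamp = log.get("dateTime")

# ':' is not permitted in Windows file names.
safe_stamp = stamp.replace(":", "-")
xlsx_name = f"{safe_stamp}.xlsx"
csv_name = f"{safe_stamp}.csv"

# With the asker's DataFrame (called df here, not defined in this sketch):
# df.to_excel(xlsx_name, index=False) or df.to_csv(csv_name, index=False)
print(xlsx_name, csv_name)
```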
