I haven't done this kind of processing with XML before.
I have these empty folders, named 125, 127 and 128,
and I have this XML:
<?xml version="1.0" encoding="ASCII"?>
<Metadata version="1.0">
<CODE_OK>510</CODE_OK>
<DeliveryDate>13/08/2018</DeliveryDate>
</Metadata>
I want to replace the number inside <CODE_OK>510</CODE_OK> with each folder's name (125, 127 and 128) and save each new XML file in the corresponding folder.
This is one approach.
import os
import xml.etree.ElementTree as ET

sampleXML = """<?xml version="1.0" encoding="ASCII"?>
<Metadata version="1.0">
<CODE_OK>510</CODE_OK>
<DeliveryDate>13/08/2018</DeliveryDate>
</Metadata>
"""
base = "YourPath"  # directory that contains the folders 125, 127, 128
tree = ET.ElementTree(ET.fromstring(sampleXML))
for folder in os.listdir(base):  # iterate over the directory
    if not os.path.isdir(os.path.join(base, folder)):
        continue  # skip plain files
    tree.find("CODE_OK").text = folder  # put the folder name into the XML
    # Pass a path rather than a text-mode file object: write() emits bytes here
    tree.write(os.path.join(base, folder, "yourxml.xml"))
Related
I want to modify an existing xml file.
The layout of the existing file:
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<Document xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="urn:iso:std:iso:20022:tech:xsd:pain.001.001.03">
After I modify a field in the XML, I want to write out the new XML file, but the modified file differs from the original:
<?xml version='1.0' encoding='UTF-8'?>
<Document xmlns="urn:iso:std:iso:20022:tech:xsd:pain.001.001.03">
The difference between the files:
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" // this is missing in the modified file
So this is what I did:
ET.register_namespace('xsi', 'http://www.w3.org/2001/XMLSchema-instance')
ET.register_namespace('', "urn:iso:std:iso:20022:tech:xsd:pain.001.001.03")
#parse the data
tree = ET.parse(self.sepa_xml.path)
root = tree.getroot()
#add a subelement
body = ET.SubElement(root, "{http://www.w3.org/2001/XMLSchema-instance}")
The final result:
<?xml version='1.0' encoding='UTF-8'?>
<Document xmlns="urn:iso:std:iso:20022:tech:xsd:pain.001.001.03" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<CstmrCdtTrfInitn>
<PmtInf>
<PmtInfId>20220929085842-36645</PmtInfId>
<PmtMtd>TRF</PmtMtd>
<PmtTpInf>
<SvcLvl>
<Cd>SEPA</Cd>
</SvcLvl>
<CtgyPurp>
<Cd>SALA</Cd>
</CtgyPurp>
</PmtTpInf>
<ReqdExctnDt>2022-09-29</ReqdExctnDt>
<Dbtr>
<Nm>test name</Nm>
</Dbtr>
</DbtrAgt>
<ChrgBr>SLEV</ChrgBr>
<CdtTrfTxInf>
<PmtId>
<EndToEndId>20220929085842-36645/1</EndToEndId>
</PmtId>
</CdtTrfTxInf>
</PmtInf>
</CstmrCdtTrfInitn>
<xsi: /></Document> // how can I delete this xsi tag?
The problem now is that I get an extra tag at the end of the XML file: <xsi: />.
I assume this is because I added a subelement. How can I delete this last tag?
I am using Python 3.10.4. I am new to parsing XML files.
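The extra tag does come from the ET.SubElement call: the tag name "{http://www.w3.org/2001/XMLSchema-instance}" has a namespace but no local name, so it is serialized as <xsi: />. Below is a minimal sketch of one way to drop it, assuming the standard-library ElementTree. The short <Document> string is a stand-in for the real file, and the root.set(...) line is a workaround that relies on ElementTree writing attribute names as given, so treat it as an assumption to check against your own data.

```python
import xml.etree.ElementTree as ET

NS_XSI = "http://www.w3.org/2001/XMLSchema-instance"
NS_PAIN = "urn:iso:std:iso:20022:tech:xsd:pain.001.001.03"
ET.register_namespace("xsi", NS_XSI)
ET.register_namespace("", NS_PAIN)

# Stand-in document (assumption): the real file would be loaded with ET.parse(...)
root = ET.fromstring('<Document xmlns="%s"><CstmrCdtTrfInitn/></Document>' % NS_PAIN)
ET.SubElement(root, "{%s}" % NS_XSI)  # reproduces the stray <xsi: /> element

# Remove every direct child whose tag is the bare namespace with no local name
for child in list(root):
    if child.tag == "{%s}" % NS_XSI:
        root.remove(child)

# ElementTree only writes declarations for namespaces that are actually used,
# so the xsi declaration is added here as a plain attribute (workaround)
root.set("xmlns:xsi", NS_XSI)

result = ET.tostring(root, encoding="unicode")
print(result)
```

If the xmlns:xsi declaration is not actually required by whoever consumes the file, the root.set(...) line can simply be left out.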
For example, let the XML file be named "test.xml":
<?xml version="1.0" encoding="UTF-8"?>
<xml>
  <tag1 name="1">
    <tag2 name="a"></tag2>
  </tag1>
  <tag1 name="2">
    <tag2 name="b"></tag2>
  </tag1>
</xml>
Python code:
import xml.dom.minidom
file = xml.dom.minidom.parse('test.xml')
list = []
tags = file.getElementsByTagName("tag1")
for tag in tags:
    if tag.getAttribute("name") == "1":
        print(tag.getAttribute("tag2"))
So here I want to access the tag2 of tag1 with name="1". How can I do it?
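tag2 is a child element of tag1, not an attribute, so getAttribute("tag2") returns an empty string. A minimal sketch of one way to reach it with minidom, calling getElementsByTagName on the matching tag1 element; the inline string stands in for test.xml and the variable names are illustrative:

```python
from xml.dom.minidom import parseString

# Inline stand-in for test.xml (assumption: the file has a single root element)
doc = parseString(
    '<xml>'
    '<tag1 name="1"><tag2 name="a"></tag2></tag1>'
    '<tag1 name="2"><tag2 name="b"></tag2></tag1>'
    '</xml>'
)

names = []
for tag1 in doc.getElementsByTagName("tag1"):
    if tag1.getAttribute("name") == "1":
        # Search only inside this tag1 element
        for tag2 in tag1.getElementsByTagName("tag2"):
            names.append(tag2.getAttribute("name"))

print(names)  # ['a']
```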
I am trying to create a .docx file, using the .docx template, .xml and .xslt file. I want to fill the placeholders in the .docx template file, with the data in the .xml file, and then generate a new word file, containing the data.
The template.docx file looks like this:
The data.xml file looks like this:
<root>
<person>
<Name>John</Name>
<profession>dentist</profession>
<city>Miami</city>
</person>
<person>
<Name>Mia</Name>
<profession>teacher</profession>
<city>London</city>
</person>
</root>
The parser.xslt file that I came up with looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <xsl:for-each select="root/person">
      <xsl:value-of select="Name"/>
      <xsl:value-of select="profession"/>
      <xsl:value-of select="city"/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
The output result.docx file should look like this:
My python code that I came up with looks like this:
import lxml.etree as ET
dom = ET.parse('data.xml')
xslt = ET.parse(r'parser.xslt')
transform = ET.XSLT(xslt)
newdom = transform(dom)
I don't know what the .xslt file must contain in order to work, or how to create result.docx.
Any kind of help would be appreciated.
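One point worth noting: an XSLT 1.0 transform yields XML or text, while a .docx file is a ZIP archive of XML parts, so the transform alone will not produce result.docx. Below is a minimal sketch of a possible first step, assuming the records are pulled out in Python with the standard library and then handed to whichever docx-writing tool is chosen. That second step is not shown, and the names data_xml and people are illustrative.

```python
import xml.etree.ElementTree as ET

# Inline copy of data.xml so the sketch is self-contained
data_xml = """<root>
  <person>
    <Name>John</Name>
    <profession>dentist</profession>
    <city>Miami</city>
  </person>
  <person>
    <Name>Mia</Name>
    <profession>teacher</profession>
    <city>London</city>
  </person>
</root>"""

# One dict per <person>, keyed by child tag name
people = [
    {field.tag: field.text for field in person}
    for person in ET.fromstring(data_xml).findall("person")
]

for person in people:
    print(person["Name"], person["profession"], person["city"])
```

Each dict can then be used to fill one set of placeholders in the template.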
Here is the code but the exported xml appears badly formatted.
import xml.etree.ElementTree as ET
import os
sampleXML = """<?xml version="1.0" encoding="ASCII"?>
<Metadata version="1.0">
<CODE_OK>510</CODE_OK>
<DeliveryDate>13/08/2018</DeliveryDate>
</Metadata>
"""
tree = ET.ElementTree(ET.fromstring(sampleXML))
for folder in os.listdir("YourPath"):  # iterate over the directory
    tree.find("CODE_OK").text = folder  # put the folder name into the XML
    tree.write(open(os.path.join(r"YourPath", folder, "newxml.xml"), "wb"))  # write the XML
How to make the exported xml appear normally formatted?
The docs show that the xml package includes an implementation of the Document Object Model interface (xml.dom.minidom). Here is a simple example:
from xml.dom.minidom import parseString
example = parseString(sampleXML)  # your string
# write to file
with open('file.xml', 'w') as file:
    example.writexml(file, indent='\n', addindent=' ')
Output:
<?xml version="1.0" ?>
<Metadata version="1.0">
<CODE_OK>510</CODE_OK>
<DeliveryDate>13/08/2018</DeliveryDate>
</Metadata>
Update
You can also write it like this:
example = parseString(sampleXML).toprettyxml()
with open('file.xml', 'w') as file:
    file.write(example)
Output:
<?xml version="1.0" ?>
<Metadata version="1.0">
<CODE_OK>510</CODE_OK>
<DeliveryDate>13/08/2018</DeliveryDate>
</Metadata>
Update 2
I copied all your code and only added the indent function from this site, and it works correctly for me:
import xml.etree.ElementTree as ET
import os
sampleXML = "your xml"
path = "YourPath"  # placeholder: directory containing the folders
tree = ET.ElementTree(ET.fromstring(sampleXML))
indent(tree.getroot())  # the only line I added; indent() is the helper copied from that site
for folder in os.listdir(path):
    tree.find("CODE_OK").text = folder
    tree.write(open(os.path.join(path, folder, "newxml.xml"), "wb"))
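On Python 3.9 or newer, a copied helper is not strictly needed, because the standard library ships xml.etree.ElementTree.indent. A minimal sketch, assuming that Python version; the value "125" stands in for a folder name, and the result is kept in a string instead of being written to disk:

```python
import xml.etree.ElementTree as ET

sampleXML = """<?xml version="1.0" encoding="ASCII"?>
<Metadata version="1.0">
<CODE_OK>510</CODE_OK>
<DeliveryDate>13/08/2018</DeliveryDate>
</Metadata>
"""

root = ET.fromstring(sampleXML)
root.find("CODE_OK").text = "125"  # stand-in for a folder name
ET.indent(root, space="  ")        # rewrites whitespace in place (Python 3.9+)

pretty = ET.tostring(root, encoding="unicode")
print(pretty)
```

In the loop above, the same ET.indent call can be made once on the tree before the tree.write calls.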
I have a sample.xml file as shown below. I need to find the "revision" attribute of the project "kernel/msm" and print the part after "refs/heads/". The sample.xml input and the expected output are below. I can figure out the Python part later; can anyone suggest how this can be done?
INPUT:-
Assume there is a variable project like below
project='kernel/msm'
sample.xml
<?xml version="1.0" encoding="utf-8"?>
<project name="platform/vendor/google/proprietary/code"
path="vendor/widevine"
revision="refs/heads/ab_mr"
x-grease-customer="none"
x-quic-dist="none"
x-ship="none" />
<!-- test Projects -->
<project name="kernel/msm"
path="kernel"
revision="refs/heads/msm-3.4"
x-grease-customer="none"
x-quic-dist="la"
x-ship="oss" />
......
EXPECTED OUTPUT:-
msm-3.4
Sample code:-
project='kernel/msm'
#open xml file
with open('./test.xml', 'r') as f:
    #get the branch and project
    for line in project :
        if line in 'revision':
            branch = line.split('/')[-1]
            print branch
Thanks
import xml.etree.ElementTree as ET
import re
temp = 'refs/heads/'
name = 'kernel/msm'
pattern = re.compile('%s(.*)' % temp)  # capture everything after the prefix
tree = ET.parse('sample.xml')
root = tree.getroot()
# attribute predicates use @, not #
project = root.find("./project[@name='%s']" % name)
revision = project.get('revision')
res = pattern.match(revision)
print(res.group(1))  # msm-3.4 for the sample input
You have to wrap your XML data in a single root node, for example <data>, or parsing will raise an error.
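To make the root-node requirement concrete, here is a minimal self-contained sketch. It assumes the two project entries are wrapped in a <data> element, and it strips the prefix with plain string slicing instead of a regular expression; the helper name branch_for is hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample, wrapped in a <data> root (assumption)
sample = """<data>
  <project name="platform/vendor/google/proprietary/code"
           path="vendor/widevine"
           revision="refs/heads/ab_mr" />
  <project name="kernel/msm"
           path="kernel"
           revision="refs/heads/msm-3.4" />
</data>"""


def branch_for(xml_text, name, prefix="refs/heads/"):
    """Return the revision of the named project without the prefix, or None."""
    root = ET.fromstring(xml_text)
    project = root.find("./project[@name='%s']" % name)
    if project is None:
        return None  # no project with that name
    revision = project.get("revision", "")
    if revision.startswith(prefix):
        return revision[len(prefix):]
    return revision


print(branch_for(sample, "kernel/msm"))  # msm-3.4
```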