Read XML in Python

<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<soap:Body>
<LoginResponse xmlns="http://tempuri.org/">
<LoginResult>true</LoginResult>
<aSessionID>AF-6A-51-FD-E6-8D-C8-12-AB-7E-C1-BD-50-7A-43-D0-AA-27-15-CA</aSessionID>
</LoginResponse>
</soap:Body>
</soap:Envelope>
This XML is returned by a SOAP API, and I want to read the aSessionID value from it. Please help me do this in Python.

list_test.xml:
<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<soap:Body>
<LoginResponse xmlns="http://tempuri.org/">
<LoginResult>true</LoginResult>
<aSessionID>AF-6A-51-FD-E6-8D-C8-12-AB-7E-C1-BD-50-7A-43-D0-AA-27-15-CA</aSessionID>
<aSessionID>54F-6A-51-FD-E6-8D-C8-45-AB-7E-C1-BD-50-7A-43-D0-AA-27-15-65</aSessionID>
</LoginResponse>
</soap:Body>
</soap:Envelope>
and then:
from xml.dom import minidom
doc = minidom.parse("list_test.xml")
sessionList = doc.getElementsByTagName('aSessionID')
for sess in sessionList:
    print(sess.firstChild.nodeValue)
OUTPUT:
AF-6A-51-FD-E6-8D-C8-12-AB-7E-C1-BD-50-7A-43-D0-AA-27-15-CA
54F-6A-51-FD-E6-8D-C8-45-AB-7E-C1-BD-50-7A-43-D0-AA-27-15-65
EDIT:
To read the XML from a string rather than from a file, you can use:
minidom.parseString(xml_str)
Hence:
from xml.dom import minidom
xml_str = '''<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<soap:Body>
<LoginResponse xmlns="http://tempuri.org/">
<LoginResult>true</LoginResult>
<aSessionID>AF-6A-51-FD-E6-8D-C8-12-AB-7E-C1-BD-50-7A-43-D0-AA-27-15-CA</aSessionID>
<aSessionID>54F-6A-51-FD-E6-8D-C8-45-AB-7E-C1-BD-50-7A-43-D0-AA-27-15-65</aSessionID>
</LoginResponse>
</soap:Body>
</soap:Envelope>'''
doc = minidom.parseString(xml_str)
sessionList = doc.getElementsByTagName('aSessionID')
for sess in sessionList:
    print(sess.firstChild.nodeValue)
OUTPUT:
AF-6A-51-FD-E6-8D-C8-12-AB-7E-C1-BD-50-7A-43-D0-AA-27-15-CA
54F-6A-51-FD-E6-8D-C8-45-AB-7E-C1-BD-50-7A-43-D0-AA-27-15-65
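
If a namespace-aware lookup is preferred, the standard-library xml.etree.ElementTree module can do the same job. This is only a minimal sketch, assuming the response arrives as bytes and the element stays in the http://tempuri.org/ namespace shown in the question; the prefix t and the variable names are illustrative, not part of the API response:

```python
import xml.etree.ElementTree as ET

# Sample response copied from the question (trimmed to the namespaces actually used).
soap_response = b"""<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
<soap:Body>
<LoginResponse xmlns="http://tempuri.org/">
<LoginResult>true</LoginResult>
<aSessionID>AF-6A-51-FD-E6-8D-C8-12-AB-7E-C1-BD-50-7A-43-D0-AA-27-15-CA</aSessionID>
</LoginResponse>
</soap:Body>
</soap:Envelope>"""

# Map an arbitrary prefix to the default namespace declared on LoginResponse.
namespaces = {"t": "http://tempuri.org/"}

root = ET.fromstring(soap_response)
# Collect the text of every aSessionID element, wherever it sits in the tree.
session_ids = [el.text for el in root.iterfind(".//t:aSessionID", namespaces)]
print(session_ids)
```

Unlike getElementsByTagName, this lookup only matches elements in the stated namespace, so it would return an empty list if the service ever changed that namespace.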

Related

How to insert a processing instruction in XML file?

I want to add an xml-stylesheet processing instruction before the root element in my XML file using ElementTree (Python 3.8).
Below is the code I use to create the XML file:
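import xml.etree.ElementTree as ET  # cElementTree is deprecated and was removed in Python 3.9

def Export_star_xml(self):
    star_element = ET.Element("STAR", **{'xmlns:xsi': 'http://www.w3.org/2001/XMLSchema-instance'})
    element_node = ET.SubElement(star_element, "STAR_1")
    element_node.text = "Mario adam"
    # Wrap the root element in a tree so that it can be written to disk
    tree = ET.ElementTree(star_element)
    tree.write("star.xml", encoding="utf-8", xml_declaration=True)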
Output:
<?xml version="1.0" encoding="windows-1252"?>
<STAR xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<STAR_1> Mario adam </STAR_1>
</STAR>
Output Expected:
<?xml version="1.0" encoding="windows-1252"?>
<?xml-stylesheet type="text/xsl" href="ResourceFiles/form_star.xsl"?>
<STAR xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<STAR_1> Mario adam </STAR_1>
</STAR>
I cannot figure out how to do this with ElementTree. Here is a solution that uses lxml, which provides an addprevious() method on elements.
from lxml import etree as ET
# Note the use of nsmap. The syntax used in the question is not accepted by lxml
star_element = ET.Element("STAR", nsmap={'xsi': 'http://www.w3.org/2001/XMLSchema-instance'})
element_node = ET.SubElement(star_element ,"STAR_1")
element_node.text = "Mario adam"
# Create the PI and insert it before the root element
pi = ET.ProcessingInstruction("xml-stylesheet", text='type="text/xsl" href="ResourceFiles/form_star.xsl"')
star_element.addprevious(pi)
ET.ElementTree(star_element).write("star.xml", encoding="utf-8",
                                   xml_declaration=True, pretty_print=True)
Result:
<?xml version='1.0' encoding='UTF-8'?>
<?xml-stylesheet type="text/xsl" href="ResourceFiles/form_star.xsl"?>
<STAR xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<STAR_1>Mario adam</STAR_1>
</STAR>
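
For readers who cannot install lxml, one possible workaround with the standard library alone is to serialise the declaration, the processing instruction and the root element separately and join them. This is a sketch under that assumption, not an ElementTree feature for placing nodes before the root; the variable names are illustrative:

```python
import xml.etree.ElementTree as ET

# Build the same small tree as in the question.
star_element = ET.Element(
    "STAR", {"xmlns:xsi": "http://www.w3.org/2001/XMLSchema-instance"}
)
ET.SubElement(star_element, "STAR_1").text = "Mario adam"

# ElementTree can create and serialise a processing instruction on its own.
pi = ET.ProcessingInstruction(
    "xml-stylesheet", 'type="text/xsl" href="ResourceFiles/form_star.xsl"'
)

# Join declaration, PI and root by hand, because the PI cannot be attached
# in front of the root element within a plain ElementTree tree.
document = "\n".join([
    '<?xml version="1.0" encoding="utf-8"?>',
    ET.tostring(pi, encoding="unicode"),
    ET.tostring(star_element, encoding="unicode"),
])
print(document)

# To save it, encode with the same encoding named in the declaration:
# with open("star.xml", "w", encoding="utf-8") as f:
#     f.write(document)
```

The declaration is written by hand here, so the encoding it names has to match the encoding actually used when the string is saved.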

merging two xml files and appending elements for similar elements and moving elements that aren't present in one file in python

I want to merge two XML files. I read many solutions but they are specific to those files. I am using xml.etree.ElementTree as well as lxml for parsing, comparing the files, getting the differences. I understand my next step is:
for element in file2.xml:
    if element present in file1.xml:
        append to output_file.xml
    else:
        copy element to the output_file
but I haven't worked much on XML, and the tools to merge are licensed, so I need to write a generic script to merge to the format I want.
file1.xml:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<great_grands>
<great_grandpa_name_one>great_grandpa_name</great_grandpa_name_one>
<grandpa>
<grandpa_name>grandpa_name_one_1</grandpa_name>
</grandpa>
<grandpa>
<grandpa_name>grandpa_name_two_1</grandpa_name>
</grandpa>
<grandma>
<grandma_name>grandma_name_one_1</grandma_name>
</grandma>
<grandma>
<grandma_name>grandma_name_two_1</grandma_name>
</grandma>
</great_grands>
file2.xml:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<great_grands>
<great_grandpa_name_two>great_grandpa_name</great_grandpa_name_two>
<grandpa>
<grandpa_name_2>grandpa_name_one_2</grandpa_name_2>
</grandpa>
<grandma>
<grandma_name_2>grandma_name_one_2</grandma_name_2>
</grandma>
</great_grands>
Required output:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<great_grands>
<great_grandpa_name_one>great_grandpa_name</great_grandpa_name_one>
<great_grandma_name_two>great_grandma_name</great_grandma_name_two>
<grandpa>
<grandpa_name>grandpa_name_one_1</grandpa_name>
</grandpa>
<grandpa>
<grandpa_name>grandpa_name_two_1</grandpa_name>
</grandpa>
<grandpa>
<grandpa_name_2>grandpa_name_one_2</grandpa_name_2>
</grandpa>
<grandma>
<grandma_name>grandma_name_one_1</grandma_name>
</grandma>
<grandma>
<grandma_name>grandma_name_two_1</grandma_name>
</grandma>
<grandma>
<grandma_name_2>grandma_name_one_2</grandma_name_2>
</grandma>
</great_grands>
Consider XSLT, the special-purpose declarative language designed to transform XML files (a sibling of XPath). Its document() function can read external XML files from paths relative to the stylesheet. Python's lxml module can run XSLT 1.0 scripts.
Because XSLT scripts are themselves well-formed XML, you can parse them from a file or from an embedded string. The example below assumes all files and scripts are saved in the same directory:
XSLT script (save as XSLTScript.xsl; note that only file2.xml is referenced)
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output version="1.0" encoding="UTF-8" indent="yes" />
<xsl:strip-space elements="*"/>
<xsl:template match="/great_grands">
<xsl:copy>
<xsl:copy-of select="great_grandpa_name_one"/>
<xsl:copy-of select="document('file2.xml')/great_grands/great_grandpa_name_two"/>
<xsl:copy-of select="grandpa"/>
<xsl:copy-of select="document('file2.xml')/great_grands/grandpa"/>
<xsl:copy-of select="grandma"/>
<xsl:copy-of select="document('file2.xml')/great_grands/grandma"/>
</xsl:copy>
</xsl:template>
</xsl:transform>
Python Script (notice only file1.xml is referenced)
from lxml import etree
xml = etree.parse('file1.xml')
xsl = etree.parse('XSLTScript.xsl')
transform = etree.XSLT(xsl)
newdom = transform(xml)
# Save the transformed result to a file
with open('Output.xml', 'wb') as f:
    f.write(newdom)
Output
<?xml version="1.0" encoding="UTF-8"?>
<great_grands>
<great_grandpa_name_one>great_grandpa_name</great_grandpa_name_one>
<great_grandpa_name_two>great_grandpa_name</great_grandpa_name_two>
<grandpa>
<grandpa_name>grandpa_name_one_1</grandpa_name>
</grandpa>
<grandpa>
<grandpa_name>grandpa_name_two_1</grandpa_name>
</grandpa>
<grandpa>
<grandpa_name_2>grandpa_name_one_2</grandpa_name_2>
</grandpa>
<grandma>
<grandma_name>grandma_name_one_1</grandma_name>
</grandma>
<grandma>
<grandma_name>grandma_name_two_1</grandma_name>
</grandma>
<grandma>
<grandma_name_2>grandma_name_one_2</grandma_name_2>
</grandma>
</great_grands>
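
If XSLT is not an option, the pseudocode from the question can also be approximated with xml.etree.ElementTree alone. The sketch below is an assumption about what "generic" could mean here: each top-level child of the second file is placed after the last child of the first file that has the same tag, and children with an unseen tag are appended at the end. The function name merge_children and the shortened sample documents are hypothetical:

```python
import xml.etree.ElementTree as ET

def merge_children(base_root, other_root):
    """Place each child of other_root after the last same-tag child of base_root."""
    for child in other_root:
        same_tag = [i for i, el in enumerate(base_root) if el.tag == child.tag]
        if same_tag:
            # Keep elements with the same tag grouped together.
            base_root.insert(same_tag[-1] + 1, child)
        else:
            # No counterpart in the base file: copy the element to the end.
            base_root.append(child)
    return base_root

# Shortened stand-ins for file1.xml and file2.xml.
file1 = ET.fromstring(
    "<great_grands>"
    "<grandpa><grandpa_name>grandpa_name_one_1</grandpa_name></grandpa>"
    "<grandma><grandma_name>grandma_name_one_1</grandma_name></grandma>"
    "</great_grands>"
)
file2 = ET.fromstring(
    "<great_grands>"
    "<great_grandpa_name_two>great_grandpa_name</great_grandpa_name_two>"
    "<grandpa><grandpa_name_2>grandpa_name_one_2</grandpa_name_2></grandpa>"
    "</great_grands>"
)

merged = merge_children(file1, file2)
tags = [el.tag for el in merged]
print(tags)
```

Note that this rule puts great_grandpa_name_two at the end rather than next to great_grandpa_name_one, so it does not reproduce the required output exactly; the XSLT above controls the order explicitly.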

Python suds XML element from nested complex type is not created in SOAP envelope

I'm using suds-jurko (a fork of suds) to handle SOAP services and am running into some issues that seem related to the presence of nested complex types in the WSDL.
Here's the offending service as defined in the wsdl:
<?xml version="1.0" encoding="UTF-8"?>
<definitions name="FormHandler" targetNamespace="http://grid.agnis.net/FormHandler">
<import namespace="http://security.introduce.cagrid.nci.nih.gov/ServiceSecurity" location="FormHandler?wsdl=ServiceSecurity.wsdl">
</import>
<types>
<schema attributeFormDefault="unqualified" elementFormDefault="qualified" targetNamespace="http://grid.agnis.net/FormHandler">
<import namespace="gme://forms.AGNIS/2.0/net.agnis.forms" schemaLocation="FormHandler?xsd=net.agnis.forms.xsd"/>
<element name="SubmitFormRevisionRequest">
<complexType>
<sequence>
<element name="formRevision">
<complexType>
<sequence>
<element maxOccurs="1" minOccurs="1" ref="ns0:FormRevision"/>
So there is basically a <ns0:FormRevision> element nested inside a <formRevision> element (which is taken from the current namespace).
I should be getting this:
<?xml version="1.0" encoding="UTF-8"?>
<SOAP-ENV:Envelope xmlns:ns0="gme://forms.AGNIS/2.0/net.agnis.forms" xmlns:ns1="http://schemas.xmlsoap.org/soap/envelope/" xmlns:ns2="http://grid.agnis.net/FormHandler" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
<SOAP-ENV:Header/>
<ns1:Body>
<ns2:SubmitFormRevisionRequest>
<ns2:formRevision>
<ns0:FormRevision>
<ns0:form publicId="4637831" version="1.0">
<ns0:originator uniqueName="cibmtr_center_number:XXX"/>
</ns0:form>
But when I print out the envelope, I get the following output:
<?xml version="1.0" encoding="UTF-8"?>
<SOAP-ENV:Envelope xmlns:ns0="gme://forms.AGNIS/2.0/net.agnis.forms" xmlns:ns1="http://schemas.xmlsoap.org/soap/envelope/" xmlns:ns2="http://grid.agnis.net/FormHandler" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
<SOAP-ENV:Header/>
<ns1:Body>
<ns2:SubmitFormRevisionRequest>
<ns2:formRevision>
<form publicId="4637831" version="1.0">
<originator uniqueName="cibmtr_center_number:XXX"/>
</form>
Notice the missing <ns0:FormRevision> element (along with the missing ns0 namespace prefix on the other elements)?
Can anyone help me fix this issue?
Thanks !!
JP

XSLT transformation gives only root element python lxml

I am currently doing an XML-XSLT transformation using the following code.
from lxml import etree
xmlRoot = etree.parse('path/abc.xml')
xslRoot = etree.parse('path/abc.xsl')
transform = etree.XSLT(xslRoot)
newdom = transform(xmlRoot)
print(etree.tostring(newdom, pretty_print=True))
This code runs without errors, but the output contains only the root element rather than the whole XML content. When I run the same XML and XSL files through Altova, the transformation works as expected. Is the syntax for printing the whole XML different, or is there an error here that you can spot?
XML content :
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<root>
<slide name="slide7.xml" nav_lvl_1="Solutions" nav_lvl_2="Value Map" page_number="7">
<title>Retail Value Map</title>
<Subheadline>Retail </Subheadline>
</slide>
</root>
XSL content:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output encoding="UTF-8" indent="yes" method="xml" standalone="yes" version="1.0"/>
<xsl:template match="/">
<p:sld xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:p="http://schemas.openxmlformats.org/presentationml/2006/main">
<xsl:for-each select="root/slide">
<xsl:choose>
<xsl:when test="@nav_lvl_1='Solutions'">
<xsl:if test="@nav_lvl_2='Value Map'">
<p:txBody>
<a:p>
<a:r>
<a:rPr lang="en-US" dirty="0" smtClean="0"/>
<a:t>
<xsl:value-of select="title"/>
</a:t>
</a:r>
<a:endParaRPr lang="en-US" dirty="0"/>
</a:p>
</p:txBody>
</xsl:if>
</xsl:when>
</xsl:choose>
</xsl:for-each>
</p:sld>
</xsl:template>
</xsl:stylesheet>
Current output :
<p:sld xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:p="http://schemas.openxmlformats.org/presentationml/2006/main"/>
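
An empty p:sld element means the xsl:for-each produced nothing, so one way to narrow the problem down is to check the same attribute conditions outside XSLT. The following is a minimal diagnostic sketch using the standard-library ElementTree on the XML posted above; it assumes the real input file is identical to that sample, and the variable names are illustrative:

```python
import xml.etree.ElementTree as ET

# The XML content from the question, as bytes because it carries a declaration.
xml_source = b"""<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<root>
<slide name="slide7.xml" nav_lvl_1="Solutions" nav_lvl_2="Value Map" page_number="7">
<title>Retail Value Map</title>
<Subheadline>Retail </Subheadline>
</slide>
</root>"""

root = ET.fromstring(xml_source)

# Same conditions as the xsl:when / xsl:if tests, written as path predicates.
matching = root.findall("slide[@nav_lvl_1='Solutions'][@nav_lvl_2='Value Map']")
titles = [slide.findtext("title") for slide in matching]
print(titles)
```

If this list is empty for the real file, the input differs from the posted sample (for example in attribute values or namespaces); if it is not empty, the cause is more likely in the stylesheet or in how it is loaded.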

Unable to ingest data using python suds

I am trying to ingest data into Attivio (Active Intelligence Engine) using suds in Python. The document ID and field name are ingested successfully, but the fieldValue value is not populated. Here's the Python code:
from suds.client import Client
url = "http://localhost:17000/ws/bean.attivioIngestWebService/com.attivio.webservice.service.IngestApi?wsdl"
client = Client(url)
cfg = client.factory.create('sessionConfig')
cfg.commitInterval = 1
sessionId = client.service.connect(cfg)
doc = client.factory.create('attivioDocument')
doc._id = "Doc1"
text = client.factory.create('Field')
text._name = "text"
textval = client.factory.create('fieldValue')
textval.value = "Test document text"
text.values = [textval]
doc.fields = [text]
print(doc)
try:
    client.service.feed(sessionId, [doc])
    client.service.commit(sessionId)
except Exception as e:
    print(e)
UPDATE:
See the difference between the .NET and Python SOAP requests below. Because <value> is defined as anyType, suds doesn't put the type information on it, and AIE can't handle it correctly. This may be the root of the problem. Any idea how to fix it?
.net
<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/">
<s:Body xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<feed xmlns="http://webservice.attivio.com/">
<sessionId xmlns="">704#10.7.3.7_r437gs-f879-4c58-9da5-45erf7as88ex</sessionId>
<docs readOnly="false" id="dotnet" xmlns="">
<fields name="title">
<values>
**<value xsi:type="xsd:string">Hello dot net</value>**
</values>
</fields>
</docs>
</feed>
</s:Body>
</s:Envelope>
POST /ws/bean.attivioIngestWebService/com.attivio.webservice.service.IngestApi HTTP/1.1
python
<?xml version="1.0" encoding="UTF-8"?>
<SOAP-ENV:Envelope xmlns:ns0="http://schemas.xmlsoap.org/soap/envelope/" xmlns:ns1="http://webservice.attivio.com/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
<SOAP-ENV:Header/>
<ns0:Body>
<ns1:feed>
<sessionId>401#192.168.1.100_fnwsf5b5218-9159-4a8f-987a</sessionId>
<docs id="Doc1">
<fields name="text">
<values>
**<value>Test document text</value>**
</values>
<metadata/>
</fields>
<mode/>
</docs>
</ns1:feed>
</ns0:Body>
</SOAP-ENV:Envelope>
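
One possible direction, offered as an untested sketch rather than a confirmed fix, is to post-process the outgoing envelope so that every <value> element carries the xsi:type="xsd:string" attribute seen in the .NET request. The helper below uses only the standard library; the function name add_string_type is hypothetical, and wiring it into suds (for example through a message plugin that rewrites the envelope before it is sent) is an assumption that has not been verified against AIE:

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"
XSD = "http://www.w3.org/2001/XMLSchema"

def add_string_type(envelope: bytes) -> bytes:
    """Return the envelope with xsi:type="xsd:string" on each untyped <value>."""
    ET.register_namespace("xsi", XSI)
    root = ET.fromstring(envelope)
    # The xsd prefix only appears inside an attribute value, so ElementTree
    # would not declare it by itself; declare it explicitly on the root.
    root.set("xmlns:xsd", XSD)
    for value in root.iter("value"):
        if value.get(f"{{{XSI}}}type") is None:
            value.set(f"{{{XSI}}}type", "xsd:string")
    return ET.tostring(root, encoding="utf-8")

# Shortened version of the Python request shown above.
sample = (
    b'<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" '
    b'xmlns:ns1="http://webservice.attivio.com/">'
    b"<SOAP-ENV:Body><ns1:feed><docs id=\"Doc1\"><fields name=\"text\"><values>"
    b"<value>Test document text</value>"
    b"</values></fields></docs></ns1:feed></SOAP-ENV:Body></SOAP-ENV:Envelope>"
)

patched = add_string_type(sample)
print(patched.decode("utf-8"))
```

ElementTree regenerates namespace prefixes when it serialises, so the prefixes in the patched envelope may differ from the ones suds produced, while the namespaces themselves stay the same.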
