SOAP - Create an element with zeep (python) - python

I've to use a SOAP API for a project.
For a specific method I've to send a complex type.
This complex type is declared like that:
<complexType name="specialList">
<sequence>
<element name=data" minOccurs="0"maxOccurs="unbounded">
<complexType>
<simpleContent>
<extension base="string">
<attribute name="key" type="string" use="required"/>
</extension>
</simpleContent>
</complexType>
</element>
</sequence>
</complexType>
This is an example:
<my_action type="specialList">
<data key="myKey">MyValue</data>
<data key="myOtherKey">MyOtherValue</data>
</my_action>
To access to the SOAP API, I use zeep (I tried with suds).
The first think I do is retrieve my "specialList".
special_list = client.get_type('ns1:specialList')
my_action = special_list(data=[data_1, data_2])
However I've a problem with the type "data". Indeed this type "data" is not declared. I cannot do a client.get_type("ns1:data").
I tried several time to create an simple element but without success.
Do you have an idea how to create this "special" data ?
In advance, thank you.
Sylvain

can you try using AnyObject as indicated in their documentation: http://docs.python-zeep.org/en/master/datastructures.html
so in your code:
from zeep import xsd
special_list = client.get_type('ns1:specialList')
my_action = xsd.AnyObject(special_list, special_list(data=[data_1, data_2]))

Related

Remove namespace with xmltodict in Python

xmltodict converts XML to a Python dictionary. It supports namespaces. I can follow the example on the homepage and successfully remove a namespace. However, I cannot remove the namespace from my XML and cannot identify why? Here is my XML:
<?xml version="1.0" encoding="UTF-8"?>
<status xmlns:mystatus="http://localhost/mystatus">
<section1
mystatus:field1="data1"
mystatus:field2="data2" />
<section2
mystatus:lineA="outputA"
mystatus:lineB="outputB" />
</status>
And using:
xmltodict.parse(xml,process_namespaces=True,namespaces={'http://localhost/mystatus':None})
I get:
OrderedDict([(u'status', OrderedDict([(u'section1', OrderedDict([(u'#http://localhost/mystatus:field1', u'data1'), (u'#http://localhost/mystatus:field2', u'data2')])), (u'section2', OrderedDict([(u'#http://localhost/mystatus:lineA', u'outputA'), (u'#http://localhost/mystatus:lineB', u'outputB')]))]))])
instead of:
OrderedDict([(u'status', OrderedDict([(u'section1', OrderedDict([(u'field1', u'data1'), (u'field2', u'data2')])), (u'section2', OrderedDict([(u'lineA', u'outputA'), (u'#lineB', u'outputB')]))]))])
Am I making some simple mistake, or is there something about my XML that prevents the process_namespace modification from working correctly?
xmltodict is based on expat, so namespaces should applied to the class name, not attribute names:
<?xml version="1.0" encoding="UTF-8"?>
<status xmlns:mystatus="http://localhost/mystatus">
<mystatus:section1 field1="data1" field2="data2" />
<mystatus:section2 lineA="outputA" lineB="outputB" />
</status>
When parsed with:
foo = xmltodict.parse(xml,
process_namespaces=True,
namespaces={'http://localhost/mystatus':None})
outputs:
OrderedDict([(u'status', OrderedDict([(u'section1', OrderedDict([(u'#field1', u'data1'), (u'#field2', u'data2')])), (u'section2', OrderedDict([(u'#lineA', u'outputA'), (u'#lineB', u'outputB')]))]))])
Accessing it is easy:
# Get attribute 'lineA' from class 'section2' from class 'status'
>>> foo.get('status').get('section2').get('#lineA')
u'outputA'
Attribute namespaces are only required when you have multiple attributes of the same name (e.g. multiple id's or multiple prices, etc), in which case, I couldn't get expat or xmltodict to parse it correctly. YMMV though.

python suds wrong namespace prefix in SOAP request

I use python/suds to implement a client and I get wrong namespace prefixes in the sent SOAP header for a spefic type of parameters defined by element ref= in the wsdl.
The .wsdl is referencing a data types .xsd file, see below. The issue is with the function GetRecordAttributes and its first argument of type gbt:recordReferences.
File: browse2.wsdl
<xsd:schema targetNamespace="http://www.grantadesign.com/10/10/Browse" xmlns="http://www.grantadesign.com/10/10/Browse" xmlns:gbt="http://www.grantadesign.com/10/10/GrantaBaseTypes" elementFormDefault="qualified" attributeFormDefault="qualified">
<xsd:import schemaLocation="grantabasetypes2.xsd" namespace="http://www.grantadesign.com/10/10/GrantaBaseTypes"/>
<xsd:element name="GetRecordAttributes">
<xsd:complexType>
<xsd:sequence>
<xsd:element ref="gbt:recordReferences">
</xsd:element>
Referenced File : grantabasetypes2.xsd
<element name="recordReferences">
<complexType>
<sequence>
<element name="record" minOccurs="0" maxOccurs="unbounded" type="gbt:MIRecordReference"/>
</sequence>
</complexType>
</element>
SOAP Request sent by suds:
<SOAP-ENV:Envelope xmlns:ns0="http://www.grantadesign.com/10/10/GrantaBaseTypes" xmlns:ns1="http://schemas.xmlsoap.org/soap/envelope/" xmlns:ns2="http://www.grantadesign.com/10/10/Browse" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
<SOAP-ENV:Header/>
<ns1:Body>
<ns2:GetRecordAttributes>
<ns2:recordReferences>
<ns0:record>
</ns0:record>
</ns2:recordReferences>
</ns2:GetRecordAttributes>
</ns1:Body>
</SOAP-ENV:Envelope>
Problem : <ns2:recordReferences> has wrong prefix, should be <ns0:recordReferences> since it belongs to the namespace ...GrantaBaseTypes defined in the .xsd.
This happens for all arguments defined by ref= in the wsdl. How can this be automatically fixed?
Note: I checked that the "good" prefix is accepted by the service by manually sending the xml SOAP request via curl.
UPDATE
I meddled with SUDS source code and the following empirical fix forces all elements with ref= attribute to assume the ref-ed namespace (previously, they take on the schema root namespace or whatever tns is):
File: /suds/xsd/sxbase.py
class SchemaObject(object):
....
def namespace(self, prefix=None):
ns = self.schema.tns
#FIX BEGIN
if self.ref and self.ref in self.schema.elements.keys():
ns = self.ref
#FIX END
Works with my service, but I'm not sure if it'll break other things. I would prefer a smarter solution that does not change SUDS source code.
Thanks,
Alex
Write a Suds plugin to modify the XML before it is sent.
from suds.client import Client
from suds.plugin import MessagePlugin
class MyPlugin(MessagePlugin):
def marshalled(self, context):
#modify this line to reliably find the "recordReferences" element
context.envelope[1][0][0].setPrefix('ns0')
client = Client(WSDL_URL, plugins=[MyPlugin()])
Quoting Suds documentation:
marshalled()
Provides the plugin with the opportunity to inspect/modify the envelope Document before it is sent.
I had the exact same problem when using suds to access a BizTalk/IIS SOAP service.
From what I can tell from the WSDL it occurs when there is a "complexType" that is not part of the "targetNamespace" (it has it's own), which has a child that is also a complexType, but with no namespace set. In BizTalk this means that the child should belong to the same namespace as the parent, but Suds seem to think that it then should be part of the targetNamespace ....
The fix in the source-code solved the thing "correctly", but since I want to be able to upgrade without applying the fix every time I went for another solution....
My solution was to skip Suds and just copy the raw XML, use that as a template and copy the values into it ... Not beautiful, but at least simple.
The solution to add a plugin is in my opinion equally hardcoded and perhaps even harder to maintain.
You could build soap message yourself and use SoapClient to send the message :
sc = SoapClient(cli.service.XXXMethod.client,cli.service.XXXMethod.method)
sc.send(some_soap_doc)
I prefer regular expressions :)
import re
class EnvelopeFixer(MessagePlugin):
def sending(self, context):
# rimuovi i prefissi
context.envelope = re.sub( 'ns[0-9]:', '', context.envelope )
return context.envelope

parsing XML configuration file using Etree in python

Please help me parse a configuration file of the below prototype using lxml etree. I tried with for event, element with tostring. Unfortunately I don't need the text, but the XML between
<template name>
<config>
</template>
for a given attribute.
I started with this code, but get a key error while searching for the attribute since it scans from start
config_tree = etree.iterparse(token_template_file)
for event, element in config_tree:
if element.attrib['name']=="ad auth":
print ("attrib reached. get XML before child ends")
Since I am a newbie to XML and python, I am not sure how to go about it. Here is the config file:
<Templates>
<template name="config1">
<request>
<password>pass</password>
<userName>username</userName>
<appID>someapp</appID>
</request>
</template>
<template name="config2">
<request>
<password>pass1</password>
<userName>username1</userName>
<appID>someapp</appID>
</request>
</template>
</Templates>
Thanks in advance!
Expected Output:
Say the user requests the config2- then the output should look like:
<request>
<password>pass1</password>
<userName>username1</userName>
<appID>someapp</appID>
</request>
(I send this XML using httplib2 to a server for initial authentication)
FINAL CODE:
thanks to FC and Constantnius. Here is the final code:
config_tree = etree.parse(token_template_file)
for template in config_tree.iterfind("template"):
if template.get("name") == "config2":
element = etree.tostring(template.find("request"))
print (template.get("name"))
print (element)
output:
config2
<request>
<password>pass1</password>
<userName>username1</userName>
<appID>someapp</appID>
</request>
You could try to iterate over all template elements in the XML and parse them with the following code:
for template in root.iterfind("template"):
name = template.get("name")
request = template.find(requst)
password = template.findtext("request/password")
username = ...
...
# Do something with the values
You could try using get('name', default='') instead of ['name']
To get the text in the tag use .text

lxml parse xsd file without Schema URL

I am using lxml to parse an xsd file and am looking for an easy way to remove the URL namespace attached to each element name. Here's the xsd file:
<?xml version="1.0" encoding="utf-8"?>
<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" version="2.0" xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="rootelement">
<xs:complexType>
<xs:choice maxOccurs="unbounded">
<xs:element minOccurs="1" maxOccurs="1" name="element1">
<xs:complexType>
<xs:all>
<xs:element name="subelement1" type="xs:string" />
<xs:element name="subelement2" type="xs:integer" />
<xs:element name="subelement3" type="xs:dateTime" />
</xs:all>
<xs:attribute name="id" type="xs:integer" use="required" />
</xs:complexType>
</xs:element>
</xs:choice>
<xs:attribute fixed="2.0" name="version" type="xs:decimal" use="required" />
</xs:complexType>
</xs:element>
</xs:schema>
and using this code:
from lxml import etree
parser = etree.XMLParser()
data = etree.parse(open("testschema.xsd"),parser)
root = data.getroot()
rootelement = root.getchildren()[0]
rootelementattribute = rootelement.getchildren()[0].getchildren()[1]
print "root element tags"
print rootelement[0].tag
print rootelementattribute.tag
elements = rootelement.getchildren()[0].getchildren()[0].getchildren()
elements_attribute = elements[0].getchildren()[0].getchildren()[1]
print "element tags"
print elements[0].tag
print elements_attribute.tag
subelements = elements[0].getchildren()[0].getchildren()[0].getchildren()
print "subelements"
print subelements
I get the following output
root element tags
{http://www.w3.org/2001/XMLSchema}complexType
{http://www.w3.org/2001/XMLSchema}attribute
element tags
{http://www.w3.org/2001/XMLSchema}element
{http://www.w3.org/2001/XMLSchema}attribute
subelements
[<Element {http://www.w3.org/2001/XMLSchema}element at 0x7f2998fb16e0>, <Element {http://www.w3.org/2001/XMLSchema}element at 0x7f2998fb1780>, <Element {http://www.w3.org/2001/XMLSchema}element at 0x7f2998fb17d0>]
I don't want "{http://www.w3.org/2001/XMLSchema}" to appear at all when I pull the tag data (altering the xsd file is not an option). The reason I need the xsd tag info is that I am using this to validate column names from a series of flat files. On the "element" level there are multiple elements that I'm pulling, as well as subelements, which I am using a dictionary to validate columns. Also, any suggestions on improving the code above would be greatly, such as a way to use fewer "getchildren" calls, or just make it more organized.
I'd use:
print elem.tag.split('}')[-1]
But you could also use the xpath function local-name():
print elem.xpath('local-name()')
As for fewer getchildren() calls: just leave them out. getchildren() is a deprecated way of making a list of the direct children (you should just use list(elem) instead if you actually want this).
You can iterate over, or use an index on an element directly. For example: rootelement[0] will give you the first child element of rootelement (but more efficient than if you were use rootelement.getchildren()[0], because this would act like list(rootelement) and create a new list first)
I wonder why etree.XMLParser(ns_clean=True) doesn't work. It had not worked for me so did it getting namespace from root.nsmap between brackets and replacing it with empty string
print rootelement[0].tag.replace('{%s}' %root.nsmap['xs'], '')
The easiest thing to do is to just use string slicing to remove namespace prefix:
>>> print rootelement[0].tag[34:]
complexType
If the URI might change in the future (for some unknown reason or you're truly paranoid), consider the following:
print "root element tags"
tag, nsmap, prefix = rootelement[0].tag, rootelement[0].nsmap, rootelement[0].prefix
tag = tag[len(nsmap[prefix]) + 2:]
print tag
This is a very unlikely case, but who knows?

How to validate XML with multiple namespaces in Python?

I'm trying to write some unit tests in Python 2.7 to validate against some extensions I've made to the OAI-PMH schema: http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd
The problem that I'm running into is business with multiple nested namespaces is caused by this specification in the above mentioned XSD:
<complexType name="metadataType">
<annotation>
<documentation>Metadata must be expressed in XML that complies
with another XML Schema (namespace=#other). Metadata must be
explicitly qualified in the response.</documentation>
</annotation>
<sequence>
<any namespace="##other" processContents="strict"/>
</sequence>
</complexType>
Here's a snippet of the code I'm using:
import lxml.etree, urllib2
query = "http://localhost:8080/OAI-PMH?verb=GetRecord&by_doc_ID=false&metadataPrefix=nsdl_dc&identifier=http://www.purplemath.com/modules/ratio.htm"
schema_file = file("../schemas/OAI/2.0/OAI-PMH.xsd", "r")
schema_doc = etree.parse(schema_file)
oaischema = etree.XMLSchema(schema_doc)
request = urllib2.Request(query, headers=xml_headers)
response = urllib2.urlopen(request)
body = response.read()
response_doc = etree.fromstring(body)
try:
oaischema.assertValid(response_doc)
except etree.DocumentInvalid as e:
line = 1;
for i in body.split("\n"):
print "{0}\t{1}".format(line, i)
line += 1
print(e.message)
I end up with the following error:
AssertionError: http://localhost:8080/OAI-PMH?verb=GetRecord&by_doc_ID=false&metadataPrefix=nsdl_dc&identifier=http://www.purplemath.com/modules/ratio.htm
Element '{http://www.openarchives.org/OAI/2.0/oai_dc/}oai_dc': No matching global element declaration available, but demanded by the strict wildcard., line 22
I understand the error, in that the schema is requiring that the child element of the metadata element be strictly validated, which the sample xml does.
Now I've written a validator in Java that works - however it would be helpful for this to be in Python, since the rest of the solution I'm building is Python based. To make my Java variant work, I had to make my DocumentFactory namespace aware, otherwise I got the same error. I've not found any working example in python that performs this validation correctly.
Does anyone have an idea how I can get an XML document with multiple nested namespaces as my sample doc validate with Python?
Here is the sample XML document that i'm trying to validate:
<?xml version="1.0" encoding="UTF-8"?>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/
http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">
<responseDate>2002-02-08T08:55:46Z</responseDate>
<request verb="GetRecord" identifier="oai:arXiv.org:cs/0112017"
metadataPrefix="oai_dc">http://arXiv.org/oai2</request>
<GetRecord>
<record>
<header>
<identifier>oai:arXiv.org:cs/0112017</identifier>
<datestamp>2001-12-14</datestamp>
<setSpec>cs</setSpec>
<setSpec>math</setSpec>
</header>
<metadata>
<oai_dc:dc
xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/
http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>Using Structural Metadata to Localize Experience of
Digital Content</dc:title>
<dc:creator>Dushay, Naomi</dc:creator>
<dc:subject>Digital Libraries</dc:subject>
<dc:description>With the increasing technical sophistication of
both information consumers and providers, there is
increasing demand for more meaningful experiences of digital
information. We present a framework that separates digital
object experience, or rendering, from digital object storage
and manipulation, so the rendering can be tailored to
particular communities of users.
</dc:description>
<dc:description>Comment: 23 pages including 2 appendices,
8 figures</dc:description>
<dc:date>2001-12-14</dc:date>
</oai_dc:dc>
</metadata>
</record>
</GetRecord>
</OAI-PMH>
Found this in lxml's doc on validation:
>>> schema_root = etree.XML('''\
... <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
... <xsd:element name="a" type="xsd:integer"/>
... </xsd:schema>
... ''')
>>> schema = etree.XMLSchema(schema_root)
>>> parser = etree.XMLParser(schema = schema)
>>> root = etree.fromstring("<a>5</a>", parser)
So, perhaps, what you need is this? (See last two lines.):
schema_doc = etree.parse(schema_file)
oaischema = etree.XMLSchema(schema_doc)
request = urllib2.Request(query, headers=xml_headers)
response = urllib2.urlopen(request)
body = response.read()
parser = etree.XMLParser(schema = oaischema)
response_doc = etree.fromstring(body, parser)

Categories

Resources