Consume SOAP webservices using escaped xml as attribute - python

I am using suds to consume SOAP web services like this:
from suds.client import Client
url = "http://www.example.com?wsdl"
client = Client(url)
client.service.example(xml_argument)
If I call the method with this XML, it works:
<?xml version="1.0" encoding="UTF-8"?><a><b description="Foo Bar"></b></a>
But if I add an escaped quote like this:
<?xml version="1.0" encoding="UTF-8"?><a><b description="Foo &quot; Bar"></b></a>
I get the following error (from the webservice):
Attribute name "Bar" associated with an element type "b" must be
followed by the ' = ' character.
I am using version: 0.4 GA build: R699-20100913
Am I not using suds.client in the proper way? Any suggestions?
UPDATE:
I have already contacted customer support and emailed them my escaped XML. They told me that it works for them, so the problem is probably caused by the way I am using suds on my side. I'll give PySimpleSOAP a try.

Mine is mostly a guess, but the error you are quoting seems to be generated from the XML well-formedness checker on the machine providing the service.
It seems that on that side of the cable they are getting something like:
<a><b description="Foo" Bar"></b></a>
(&quot; converted to ") and thus they are telling you that you should instead send something like:
<a><b description="Foo" Bar="..."></b></a>
which is clearly not what you want.
AFAIK your XML is well formed (just tested here for extra safety), so either there is a bug in suds (which would surprise me, given the magnitude of the bug and the maturity of the package) or there is a bug on the server providing the service (possibly a "too early conversion" from XML entities to regular chars).
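The well-formedness check mentioned above can be reproduced locally with the standard library. This is only a sketch: ElementTree stands in for whatever parser the server really uses (which is unknown), and is_well_formed is a helper name made up for this example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def is_well_formed(xml_text):
    """Return True if ElementTree can parse the text (hypothetical helper)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# The document as sent: the quote inside the attribute value is escaped.
escaped_doc = '<a><b description="Foo &quot; Bar"></b></a>'
# The same document after a premature entity-to-character conversion.
broken_doc = '<a><b description="Foo " Bar"></b></a>'

# Parsing the escaped form yields the literal quote in the attribute value.
description = ET.fromstring(escaped_doc).find("b").get("description")

# escape() only touches &, < and > by default; the extra mapping adds the quote.
re_escaped = escape(description, {'"': "&quot;"})
```

If the escaped form parses and the converted form does not, that is consistent with the entity being decoded somewhere between the client and the server's parser, though it does not show where.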
Again: lot of speculation and few hard facts here, but I still HTH! :)

Related

How to get unparsed XML from a suds response, and best django model field to use for storage

I am using suds to request data from a 3rd party using a wsdl. I am only saving some of the data returned for now, but I am paying for the data that I get so I would like to keep all of it. I have decided that the best way to save this data is by capturing the raw xml response into a database field both for future use should I decide that I want to start using different parts of the data and as a paper trail in the event of discrepancies.
So I have a two part question:
Is there a simple way to output the raw received XML from the suds.client object? In my searches I have learned this can be done through logging, but I was hoping not to have to dig that information back out of the logs to put it into the database field.
I have also looked into the MessagePlugin.received() hook, but could not figure out how to access this information after it has been parsed; I can only override that function and get the raw XML as it is being parsed, which is before I have decided whether it is worth saving.
I have also explored the retxml option, but I would like to use the parsed version as well, and making two separate calls (one with retxml, one parsed) would cost me twice.
I was hoping for a simple function built into the suds client (like response.as_xml() or something equally simple) but have not found anything like that yet. The option bubbling around in my head is to extend the client object using the .received() plugin hook so that it saves the XML as an object attribute before it is parsed, to be referenced later. The execution seems a little tricky to me right now, and I have a hard time believing that the suds client doesn't have this built in somewhere already, so I thought I would ask first.
The other part to my question is: What type of django model field would be best suited to handle up to ~100 kb of text data as raw xml? I was going to simply use a simple CharField with a stupidly long max_length, but that feels wrong.
Thanks in advance.
I solved this by using the flag retxml on client initialization:
client = Client(settings.WSDL_ADDRESS, retxml=True)
raw_reply = client.service.PersonSearch(soapified_search_object)
I was then able to save raw_reply as the raw xml into a django models.TextField()
and then inject the raw xml to get a suds parsed result without having to re-submit my search, like so:
parsed_result = client.service.PersonSearch(__inject={'reply': raw_reply})
I suppose if I had wanted to strip off the suds envelope stuff from raw reply I could have used a python xml library for further usage of the reply, but as my existing code was already taking the information I wanted from the suds client result I just used that.
Hope this helps someone else.
I have used kyrayzk's solution for a while, but have always found it a bit hackish, as I had to create a separate dummy client just for when I needed to process the raw XML.
So I sort of reimplemented .last_received() and .last_sent() methods (which were (IMHO, mistakenly) removed in suds-jurko 0.4.1) through a MessagePlugin.
Hope it helps someone:
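from suds.plugin import MessagePlugin

class MyPlugin(MessagePlugin):
    def __init__(self):
        self.last_sent_raw = None
        self.last_received_raw = None

    def sending(self, context):
        # context.envelope holds the outgoing SOAP envelope
        self.last_sent_raw = str(context.envelope)

    def received(self, context):
        # context.reply holds the raw reply before suds parses it
        self.last_received_raw = str(context.reply)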
Usage:
plugin = MyPlugin()
client = Client(TRTH_WSDL_URL, plugins=[plugin])
client.service.SendSomeRequest()
print(plugin.last_sent_raw)
print(plugin.last_received_raw)
And as an extra, if you want a nicely indented XML, try this:
from lxml import etree
def xmlpprint(xml):
    return etree.tostring(etree.fromstring(xml), pretty_print=True)

Parsing a SOAP response in Python

I'm trying to parse a SOAP response from a server. I'm 100% new to SOAP and pretty new to communicating using HTTP/HTTPS. I'm using Python 2.7 on Ubuntu 12.04.
It looks like SOAP is very much like XML. However, I seem to be unable to parse it as such. I've tried to use ElementTree but keep getting errors. From searches I've been able to conclude that there may be issues with the SOAP tags. (I could be way off here...let me know if I am.)
So, here is an example of the SOAP message I have and what I'm trying to do to parse it (this is an actual server response from Link Point Gateway, in case that's relevant).
import xml.etree.ElementTree as ET
soap_string = '<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"><SOAP-ENV:Body><fdggwsapi:FDGGWSApiOrderResponse xmlns:fdggwsapi="http://secure.linkpt.net/fdggwsapi/schemas_us/fdggwsapi"><fdggwsapi:CommercialServiceProvider/><fdggwsapi:TransactionTime>Wed Jul 25 10:26:40 2012</fdggwsapi:TransactionTime><fdggwsapi:TransactionID/><fdggwsapi:ProcessorReferenceNumber/><fdggwsapi:ProcessorResponseMessage/><fdggwsapi:ErrorMessage>SGS-002303: Invalid credit card number.</fdggwsapi:ErrorMessage><fdggwsapi:OrderId>1</fdggwsapi:OrderId><fdggwsapi:ApprovalCode/><fdggwsapi:AVSResponse/><fdggwsapi:TDate/><fdggwsapi:TransactionResult>FAILED</fdggwsapi:TransactionResult><fdggwsapi:ProcessorResponseCode/><fdggwsapi:ProcessorApprovalCode/><fdggwsapi:CalculatedTax/><fdggwsapi:CalculatedShipping/><fdggwsapi:TransactionScore/><fdggwsapi:FraudAction/><fdggwsapi:AuthenticationResponseCode/></fdggwsapi:FDGGWSApiOrderResponse></SOAP-ENV:Body></SOAP-ENV:Envelope>'
targetTree = ET.fromstring(soap_string)
This yields the following error:
unbound prefix: line 1, column 0
From another stackoverflow post I've concluded that SOAP-ENV:Body may be causing a namespace problem. (I could be wrong.)
I've done other searches to find a good solution for parsing SOAP but most of them are from 3+ years ago. It seems that suds is pretty highly recommended. I wanted to get "updated" recommendations before I got too far down a path.
Can anyone recommend a solid (and easy) way to parse a SOAP response like the one I received above? It would be appreciated if you could provide a simple example to get me started (as I said above, I'm completely new to SOAP).
I was unable to find a straightforward approach using Python, so I decided to use PHP instead.
Much like the following:
Python:
import subprocess
# Pass the SOAP string as its own argument so no shell quoting is involved
command = ['php', '/path/to/script.php', soap_string]
process = subprocess.Popen(command, stderr=subprocess.PIPE, stdout=subprocess.PIPE)
# communicate() waits for the process; calling wait() first can deadlock on full pipes
output = process.communicate()[0]
(error, result, order_id) = output.split(',')
PHP:
#!/usr/bin/php
<?php
$soap_response = $argv[1];
$doc = simplexml_load_string($soap_response);
$doc->registerXPathNamespace('fdggwsapi', 'http://secure.linkpt.net/fdggwsapi/schemas_us/fdggwsapi');
$nodes = $doc->xpath('//fdggwsapi:FDGGWSApiOrderResponse/fdggwsapi:ErrorMessage');
$error = strval($nodes[0]);
$nodes = $doc->xpath('//fdggwsapi:FDGGWSApiOrderResponse/fdggwsapi:TransactionResult');
$result = strval($nodes[0]);
$nodes = $doc->xpath('//fdggwsapi:FDGGWSApiOrderResponse/fdggwsapi:OrderId');
$order_id = strval($nodes[0]);
$array = array($error, $result, $order_id);
$response = implode(',', $array);
echo $response;
This code only parses specific aspects of this particular SOAP response. It should be enough to get you going to solve your problem.
I'm a complete newbie when it comes to PHP (I've used Perl a bit so that helped). I must give credit to #scoffey for his solution to parsing SOAP in a way that finally made sense to me.
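For comparison, the same three fields can be read with Python's own ElementTree by passing a namespace map. This is a sketch under assumptions: parse_order_response and the "f" prefix are names invented here, the sample is a shortened copy of the response in the question, and it only works when the string carries its xmlns declarations (an "unbound prefix" error means a prefix is used without one).

```python
import xml.etree.ElementTree as ET

# Prefix chosen for this example; only the URI has to match the response.
NS = {"f": "http://secure.linkpt.net/fdggwsapi/schemas_us/fdggwsapi"}


def parse_order_response(soap_string):
    """Return (error, result, order_id) from an order response (hypothetical helper)."""
    root = ET.fromstring(soap_string)
    response = root.find(".//f:FDGGWSApiOrderResponse", NS)
    if response is None:
        raise ValueError("no FDGGWSApiOrderResponse element found")
    fields = ("ErrorMessage", "TransactionResult", "OrderId")
    # findtext returns "" for elements that are missing or empty
    return tuple(response.findtext("f:" + name, "", NS) for name in fields)


# Shortened version of the response shown in the question.
sample = (
    '<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">'
    '<SOAP-ENV:Body>'
    '<fdggwsapi:FDGGWSApiOrderResponse '
    'xmlns:fdggwsapi="http://secure.linkpt.net/fdggwsapi/schemas_us/fdggwsapi">'
    '<fdggwsapi:ErrorMessage>SGS-002303: Invalid credit card number.</fdggwsapi:ErrorMessage>'
    '<fdggwsapi:OrderId>1</fdggwsapi:OrderId>'
    '<fdggwsapi:TransactionResult>FAILED</fdggwsapi:TransactionResult>'
    '</fdggwsapi:FDGGWSApiOrderResponse>'
    '</SOAP-ENV:Body></SOAP-ENV:Envelope>'
)

error, result, order_id = parse_order_response(sample)
```

The map keys are local to the calling code, so the prefix used in the lookup does not need to match the fdggwsapi prefix in the document.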
EDITED:
Working with SOAP in Python is really fun: most tools have not been maintained for years. In terms of features, ZSI is maybe the leader, but it has lots of bugs when it comes to supporting more complex XSD schemas (just one example: it doesn't support unions and complex types based on extensions, where the extended type is not a base type).
Suds is very easy to use, but not as powerful as ZSI: it has worse support for some complex XSD constructs.
There is an interesting tool - generateDS, which works with XSD and not directly with WSDL - you have to implement the methods yourself. But it does a pretty good job actually.

simplejson dumps and multi lines

I have a little question.
I use simplejson to dump a string.
This string contains some newline characters ( \n ),
so when I print it on the server side, I get something like this:
toto
tata
titi
I want it to be displayed the same way on the client side (HTML).
So I simply did:
return json.dumps(data.replace('\n','<br />'))
It works, but I don't think it's the right way to do it.
Is there another method?
Thanks.
I don't know the specifics of your situation, so maybe this is fine, but in general I'd recommend that you replace \n in the client, not on the server side. If someone wants to use your JSON API for a non-HTML client, having <br> will be pretty annoying, and they'll just have to parse that back out. The server should convey the actual data, and the client should be responsible for turning that into information relevant to their user, including changing the formatting or markup if necessary.
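A small sketch of that split, with assumptions stated: the standard json module is used in place of simplejson, and to_html is a made-up function that stands in for whatever the real client (most likely JavaScript) would do.

```python
import json
from html import escape


def to_html(text):
    """Client-side stand-in: escape markup, then turn newlines into <br />."""
    return escape(text).replace("\n", "<br />")


data = "toto\ntata\ntiti"

payload = json.dumps(data)      # server: send the data unchanged
received = json.loads(payload)  # client: the newlines survive the round trip
html_text = to_html(received)   # client: presentation decided at display time
```

The newline travels as the two-character JSON escape \n inside the payload, so nothing is lost by leaving the formatting to the client.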

ElementTree in Python 2.6.2 Processing Instructions support?

I'm trying to create XML using the ElementTree object structure in python. It all works very well except when it comes to processing instructions. I can create a PI easily using the factory function ProcessingInstruction(), but it doesn't get added into the elementtree. I can add it manually, but I can't figure out how to add it above the root element where PI's are normally placed. Anyone know how to do this? I know of plenty of alternative methods of doing it, but it seems that this must be built in somewhere that I just can't find.
Try the lxml library: it follows the ElementTree api, plus adds a lot of extras. From the compatibility overview:
ElementTree ignores comments and processing instructions when parsing XML, while etree will read them in and treat them as Comment or ProcessingInstruction elements respectively. This is especially visible where comments are found inside text content, which is then split by the Comment element.
You can disable this behaviour by passing the boolean remove_comments and/or remove_pis keyword arguments to the parser you use. For convenience and to support portable code, you can also use the etree.ETCompatXMLParser instead of the default etree.XMLParser. It tries to provide a default setup that is as close to the ElementTree parser as possible.
Not in the stdlib, I know, but in my experience the best bet when you need stuff that the standard ElementTree doesn't provide.
With the lxml API it couldn't be easier, though it is a bit "underdocumented":
If you need a top-level processing instruction, create it like this:
from lxml import etree
root = etree.Element("anytagname")
root.addprevious(etree.ProcessingInstruction("anypi", "anypicontent"))
The resulting document will look like this:
<?anypi anypicontent?>
<anytagname />
They certainly should add this to their FAQ because IMO it is another feature that sets this fine API apart.
Yeah, I don't believe it's possible, sorry. ElementTree provides a simpler interface to (non-namespaced) element-centric XML processing than DOM, but the price for that is that it doesn't support the whole XML infoset.
There is no apparent way to represent the content that lives outside the root element (comments, PIs, the doctype and the XML declaration), and these are also discarded at parse time. (Aside: this appears to include any default attributes specified in the DTD internal subset, which makes ElementTree strictly-speaking a non-compliant XML processor.)
You can probably work around it by subclassing or monkey-patching the Python native ElementTree implementation's write() method to call _write on your extra PIs before _writeing the _root, but it could be a bit fragile.
If you need support for the full XML infoset, probably best stick with DOM.
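A less invasive variant of that workaround is to serialize the processing instruction and the root separately and join the pieces, rather than patching write(). This is a sketch, not a built-in feature: serialize_with_pi is a name invented here, and encoding="unicode" assumes Python 3.

```python
import xml.etree.ElementTree as ET


def serialize_with_pi(root, target, text):
    """Serialize root with one processing instruction above it (hypothetical helper)."""
    pi = ET.ProcessingInstruction(target, text)
    parts = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        ET.tostring(pi, encoding="unicode"),    # the PI on its own line
        ET.tostring(root, encoding="unicode"),  # then the element tree
    ]
    return "\n".join(parts)


root = ET.Element("anytagname")
document = serialize_with_pi(root, "anypi", "anypicontent")
```

Because the declaration is written by hand, the caller has to encode the result as UTF-8 when saving it to a file.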
I don't know much about ElementTree. But it is possible that you might be able to solve your problem using a library I wrote called "xe".
xe is a set of Python classes designed to make it easy to create structured XML. I haven't worked on it in a long time, for various reasons, but I'd be willing to help you if you have questions about it, or need bugs fixed.
It has the bare bones of support for things like processing instructions, and with a little bit of work I think it could do what you need. (When I started adding processing instructions, I didn't really understand them, and I didn't have any need for them, so the code is sort of half-baked.)
Take a look and see if it seems useful.
http://home.avvanta.com/~steveha/xe.html
Here's an example of using it:
import xe
doc = xe.XMLDoc()
prefs = xe.NestElement("prefs")
prefs.user_name = xe.TextElement("user_name")
prefs.paper = xe.NestElement("paper")
prefs.paper.width = xe.IntElement("width")
prefs.paper.height = xe.IntElement("height")
doc.root_element = prefs
prefs.user_name = "John Doe"
prefs.paper.width = 8
prefs.paper.height = 10
c = xe.Comment("this is a comment")
doc.top.append(c)
If you ran the above code and then ran print doc here is what you would get:
<?xml version="1.0" encoding="utf-8"?>
<!-- this is a comment -->
<prefs>
<user_name>John Doe</user_name>
<paper>
<width>8</width>
<height>10</height>
</paper>
</prefs>
If you are interested in this but need some help, just let me know.
Good luck with your project.
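f = open(r'D:\Python\XML\test.xml', 'r+')  # raw string, otherwise \t is read as a tab
old = f.read()
f.seek(44, 0)  # place cursor after the xml declaration (assumes it is exactly 44 characters long)
f.write(r'<?xml-stylesheet type="text/xsl" href="C:\Stylesheets\expand.xsl"?>' + old[44:])
f.close()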
I was facing the same problem and came up with this crude solution. Inserting the PI with one of the Element methods (root.insert(0, PI) in my case) did not put it in the correct place in the .xml file, and my attempts to cut and paste the inserted PI into the right location only deleted data from unexpected places.

Decoding a WBXML SyncML message from an S60 device

I'm trying to decode a WBXML encoded SyncML message from a Nokia N95.
My first attempt was to use the python pywbxml module which wraps calls to libwbxml. Decoding the message with this gave a lot of <unknown> tags and a big chunk of binary within a <Collection> tag. I tried running the contents of the <Collection> through by itself but it failed. Is there something I'm missing?
Also, does anyone know of a pure python implementation of a wbxml parser? Failing that a command line or online tool to decode these messages would be useful -- it would make it a lot easier for me to write my own...
Funnily enough I've been working on the same problem. I'm about halfway through writing my own pure-Python WBXML parser, but it's not yet complete enough to be useful, and I have very little time to work on it right now.
Those <Unknown> tags might be because pywbxml / libwbxml doesn't have the right tag vocabulary loaded. WBXML represents tags by an index number to avoid transmitting the same tag name hundreds of times, and the table that maps index numbers to tag names has to be supplied separately from the WBXML document itself. From a vague glance at the libwbxml source it seems like libwbxml has a bunch of tag tables hard coded. It has tables for SyncML 1.0-1.2; I think my Nokia E71 sends SyncML 1.3 (if so, your N95 probably does too), which it looks like libwbxml doesn't support yet.
Getting it to work might be as simple as adding a SyncML 1.3 table to libwbxml. That said, last time I tried, pywbxml doesn't compile against the vanilla libwbxml source, so you have to apply some patches first... so "simple" may be a relative term.
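The table lookup described above can be sketched in a few lines. The bit layout follows the WBXML specification (low six bits for the tag identity, bit 6 for content, bit 7 for attributes), but EXAMPLE_TABLE is made up for illustration and is not the real SyncML code page, and describe_tag is a hypothetical helper.

```python
TAG_IDENTITY_MASK = 0x3F  # low six bits: index into the current tag table
HAS_CONTENT = 0x40        # bit 6: the element has content
HAS_ATTRIBUTES = 0x80     # bit 7: the element has attributes

# Made-up table for illustration only; NOT the real SyncML code page.
EXAMPLE_TABLE = {0x05: "Example", 0x06: "Child"}


def describe_tag(token, table):
    """Return (name, has_attributes, has_content) for one tag token."""
    identity = token & TAG_IDENTITY_MASK
    # A decoder without the right table can only report the tag as unknown.
    name = table.get(identity, "unknown")
    return name, bool(token & HAS_ATTRIBUTES), bool(token & HAS_CONTENT)


known = describe_tag(0x45, EXAMPLE_TABLE)    # identity 0x05, content bit set
missing = describe_tag(0x47, EXAMPLE_TABLE)  # identity 0x07 is not in the table
```

With a table that lacks the entries a device actually sends, every such token falls through to the "unknown" branch, which matches the symptom in the question.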
I ended up writing a python parser myself. I managed to do it by following the spec here:
http://www.w3.org/TR/wbxml/
And then taking the code tables from the horde.org cvs.
The Open Mobile Alliance's site and documentation are terrible; this was a very trying project :(
I used pywbxml.
It needed just one patch in pywbxml.pyx: in the function wbxml2xml, around line 25, set params.lang to:
params.lang = WBXML_LANG_UNKNOWN
It works like a charm. Changing the base class of WBXMLParseError to Exception also helps:
class WBXMLParseError(Exception):
