I'm trying to consume a web service with Python Zeep that has a parameter of type xsd:base64Binary; the technical documentation specifies the type as Byte[].
Errors are:
urllib3.exceptions.HeaderParsingError: [StartBoundaryNotFoundDefect(), MultipartInvariantViolationDefect()], unparsed data: ''
and in the reply I get: Generic error "data at the root level is invalid".
I can't find the correct way to do it.
My code is:
content=open(fileName,"r").read()
encodedContent = base64.b64encode(content.encode('ascii'))
myParameter=dict(param=dict(XMLFile=encodedContent))
client.service.SendFile(**myParameter)
thanks everyone for the comments.
Mike
This is what the built-in Base64Binary type looks like in zeep:
class Base64Binary(BuiltinType):
    accepted_types = [str]
    _default_qname = xsd_ns("base64Binary")

    @check_no_collection
    def xmlvalue(self, value):
        return base64.b64encode(value)

    def pythonvalue(self, value):
        return base64.b64decode(value)
As you can see, it's doing the encoding and decoding by itself. You don't need to encode the file content, you have to send it as it is and zeep will encode it before putting it on the wire.
Most likely this is causing the issue. When the message element is decoded, an array of bytes is expected but another base64 string is found there.
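The double encoding described above can be reproduced with the standard library alone. This is a minimal sketch, not zeep itself: the XML bytes are a made-up stand-in for the file content, and it assumes the file is read in binary mode ("rb") so that the value is already bytes.

```python
import base64

# Hypothetical file content standing in for open(fileName, "rb").read().
raw = b"<root><item>1</item></root>"

# What zeep's Base64Binary.xmlvalue() does once, on your behalf.
encoded_once = base64.b64encode(raw)

# What ends up on the wire if the caller has already encoded the content.
encoded_twice = base64.b64encode(encoded_once)

# The receiver decodes once. A single encoding round-trips to the XML bytes...
received_ok = base64.b64decode(encoded_once)
# ...while a double encoding leaves it holding base64 text instead of XML.
received_bad = base64.b64decode(encoded_twice)

print(received_ok == raw)
print(received_bad == encoded_once)

# With zeep, the call would then pass the raw bytes directly, e.g.:
# client.service.SendFile(param={"XMLFile": raw})
```

If the server tries to parse received_bad as XML, an error about invalid data at the root level would be a plausible result, since base64 text does not start with an XML element.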
I am working on a program that reads the content of a RESTful API from ImportIO. The connection works, and data is returned, but it's a jumbled mess. I'm trying to clean it so that it returns only the ASINs.
I have tried using split with a delimiter, with no success.
stuff = requests.get('https://data.import.io/extractor***')
stuff.content
I get the content, but I want to extract only Asins.
While .content gives you access to the raw bytes of the response payload, you will often want to convert them into a string using a character encoding such as UTF-8. The response will do that for you when you access .text.
response.text
Because the decoding of bytes to str requires an encoding scheme, requests will try to guess the encoding based on the response’s headers if you do not specify one. You can provide an explicit encoding by setting .encoding (for example, response.encoding = 'utf-8') before accessing .text.
If you take a look at the response, you’ll see that it is actually serialized JSON content. To get a dictionary, you could take the str you retrieved from .text and deserialize it using json.loads(). However, a simpler way to accomplish this task is to use .json():
response.json()
The type of the return value of .json() is a dictionary, so you can access values in the object by key.
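The relationship between .content, .text and .json() can be sketched without a network call. This is only an approximation using the standard library: the payload below is invented, and the comments name the requests attributes that each step roughly corresponds to.

```python
import json

# Hypothetical payload standing in for response.content (raw bytes).
content = b'{"asin": "B000000001", "title": "Caf\xc3\xa9 mug"}'

# Roughly what response.text gives you once an encoding is known.
text = content.decode("utf-8")

# Roughly what response.json() gives you.
data = json.loads(text)

print(type(content).__name__, type(text).__name__, type(data).__name__)
print(data["title"])
```

In real code you would call response.json() directly; the manual steps are shown only to make the bytes, str and dict stages visible.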
You can do a lot with status codes and message bodies. But, if you need more information, like metadata about the response itself, you’ll need to look at the response’s headers.
For More Info: https://realpython.com/python-requests/
What format is the returned information in? RESTful APIs typically return data as JSON, so you will likely have luck parsing it as a JSON object.
https://realpython.com/python-requests/#content
stuff_dictionary = stuff.json()
With that, the content is returned as a dictionary and you will have a much easier time.
EDIT:
Since I don't have the full URL to test, I can't give an exact answer. Given the content type is CSV, using a pandas DataFrame is pretty easy. With a quick StackOverflow search, I found the following answer: https://stackoverflow.com/a/43312861/11530367
So I tried the following in the terminal and got a dataframe from it
from io import StringIO
import pandas as pd
pd.read_csv(StringIO("HI\r\ntest\r\n"))
So you should be able to perform the following
from io import StringIO
import pandas as pd
# StringIO needs a str, so use .text rather than the bytes in .content
df = pd.read_csv(StringIO(stuff.text))
If that doesn't work, consider dropping the first three bytes of your response: b'\xef\xbb\xbf'. Check the answer from Mark Tolonen to see how to parse this.
After that, selecting the ASIN (your second column) from your dataframe should be easy.
asins = df.loc[:, 'ASIN']
asins_arr = asins.array
The response is the byte string of CSV content encoded in UTF-8. The first three escaped byte codes are a UTF-8-encoded BOM signature. So stuff.content.decode('utf-8-sig') should decode it. stuff.text may also work if the encoding was returned correctly in the response headers.
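As a rough illustration of the BOM handling described above, here is a sketch that uses only the standard library. The payload is an invented stand-in for stuff.content; the ASIN column name comes from the question, while the other column and the values are assumptions.

```python
import csv
from io import StringIO

# Hypothetical payload standing in for stuff.content:
# a UTF-8 BOM followed by CSV text.
raw = b"\xef\xbb\xbfTitle,ASIN\r\nWidget,B000000001\r\nGadget,B000000002\r\n"

# Plain 'utf-8' keeps the BOM as a leading U+FEFF character,
# which would end up glued to the first column name.
bom_kept = raw.decode("utf-8").startswith("\ufeff")

# 'utf-8-sig' strips the BOM while decoding.
text = raw.decode("utf-8-sig")

# Read the rows and keep only the ASIN column.
reader = csv.DictReader(StringIO(text, newline=""))
asins = [row["ASIN"] for row in reader]

print(bom_kept)
print(asins)
```

The same decoded text can be handed to pd.read_csv(StringIO(text)) if you prefer to stay with pandas.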
I am writing a program (Python 3.5.2) that uses an HTTPSConnection to get a JSON object as a response. I have it working using some example code, but am not sure where a method comes from.
My question is this: In the code below, the decode('utf-8') method doesn't exist in the documentation at https://docs.python.org/3.4/library/http.client.html#http.client.HTTPResponse under "21.12.2. HTTPResponse Objects". How would I know that the return value from the method "response.read()" has the method "decode('utf-8')" available?
Do Python objects inherit from a base class like C# objects do or am I missing something?
import json
from http.client import HTTPSConnection
http = HTTPSConnection(get_hostname(token))
http.request('GET', uri_path, headers=get_authorization_header(token))
response = http.getresponse()
print(response.status, response.reason)
feed = json.loads(response.read().decode('utf-8'))
Thank you for your help.
The read method of the response object always returns a byte string (in Python 3, which I presume you are using as you use the print function). The byte string does indeed have a decode method, so there should be no problem with this code. Of course it makes the assumption that the response is encoded in UTF-8, which may or may not be correct.
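To see where decode comes from, it can help to inspect the value itself. The sketch below uses a made-up bytes literal in place of response.read(); it shows that the method belongs to the built-in bytes type, and that bytes (like every Python 3 class) ultimately inherits from object.

```python
import json

# Hypothetical stand-in for response.read(): the body arrives as bytes.
body = b'{"feed": ["caf\xc3\xa9"]}'

# The type tells you which documentation to read: bytes, not HTTPResponse.
type_name = type(body).__name__
has_decode = hasattr(body, "decode")

# Every class in Python 3 has object at the end of its method resolution order.
mro = bytes.__mro__

# Decode to str first, then parse the JSON text.
feed = json.loads(body.decode("utf-8"))

print(type_name, has_decode)
print(mro)
print(feed)
```

In an interactive session, help(bytes.decode) or dir(body) gives the same information for any value you are unsure about.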
[Technical note: email is a very difficult medium to handle: messages can be made up of different parts, each of which is differently encoded. At least with web traffic you stand a chance of reading the Content-Type header's charset attribute to find the correct encoding].
I am reading an email file stored on my machine. I am able to extract the headers of the email, but unable to extract the body.
# The following part works: it opens the file and reads the headers.
import email
from email.parser import HeaderParser
with open(passedArgument1+filename,"r",encoding="ISO-8859-1") as f:
    msg=email.message_from_file(f)
    print('message',msg.as_string())
parser = HeaderParser()
h = parser.parsestr(msg.as_string())
print (h.keys())
# The following snippet gives an error
msgBody=msg.get_body('text/plain')
Is there a proper way to extract only the message body? I'm stuck at this point.
For reference the email file can be downloaded from
https://drive.google.com/file/d/0B3XlF206d5UrOW5xZ3FmV3M3Rzg/view
The 3.6 email lib uses an API that is compatible with Python 3.2 by default and that is what is causing you this problem.
Note the default policy in the declaration below from the docs:
email.message_from_file(fp, _class=None, *, policy=policy.compat32)
If you want to use the "new" API that you see in the 3.6 docs, you have to create the message with a different policy.
import email
from email import policy
...
msg=email.message_from_file(f, policy=policy.default)
This will give you the new API that you see in the docs, which includes the very useful get_body() method.
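The difference between the two policies can be checked on a small in-memory message. This is a sketch only: the message text and addresses are invented, and message_from_string is used instead of message_from_file so that no file is needed.

```python
import email
from email import policy

# Hypothetical single-part message used in place of the file on disk.
raw = (
    "From: alice@example.com\n"
    "To: bob@example.com\n"
    "Subject: Hello\n"
    'Content-Type: text/plain; charset="utf-8"\n'
    "\n"
    "This is the body.\n"
)

# Default (compat32) policy: a legacy Message object without get_body().
legacy = email.message_from_string(raw)
legacy_has_get_body = hasattr(legacy, "get_body")

# policy.default: an EmailMessage object that provides get_body().
modern = email.message_from_string(raw, policy=policy.default)
body_part = modern.get_body(preferencelist=("plain",))
body_text = body_part.get_content()

print(type(legacy).__name__, legacy_has_get_body)
print(type(modern).__name__)
print(body_text)
```

Note that get_body() takes a preference list of subtypes such as ("plain",) or ("html", "plain") and returns a message part, so get_content() is still needed to obtain the text.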
Update
If you are having the AttributeError: 'Message' object has no attribute 'get_body' error, you might want to read what follows.
I did some tests, and it seems the doc is indeed erroneous compared to the current library implementation (July 2017).
What you might be looking for is actually the get_payload() method, which seems to do what you want to achieve:
The conceptual model provided by an EmailMessage object is that of an
ordered dictionary of headers coupled with a payload that represents
the RFC 5322 body of the message, which might be a list of
sub-EmailMessage objects
get_payload() is not in current July 2017 Documentation, but the help() says the following:
get_payload(i=None, decode=False) method of email.message.Message instance
Return a reference to the payload.
The payload will either be a list object or a string. If you mutate
the list object, you modify the message's payload in place. Optional
i returns that index into the payload.
Optional decode is a flag indicating whether the payload should be decoded or not, according to the Content-Transfer-Encoding
header (default is False).
When True and the message is not a multipart, the payload will be decoded if this header's value is 'quoted-printable' or 'base64'. If some other encoding is used, or the header is missing, or if the payload has bogus data (i.e. bogus base64 or uuencoded data), the payload is returned as-is.
If the message is a multipart and the decode flag is True, then None is returned.
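The behaviour described in that help text can be tried on a small example. The base64 message below is invented for illustration, and the empty MIMEMultipart object is just a convenient way to get a multipart message.

```python
import email
from email.mime.multipart import MIMEMultipart

# Hypothetical non-multipart message whose body is base64-encoded.
raw = (
    "Subject: Encoded\n"
    'Content-Type: text/plain; charset="utf-8"\n'
    "Content-Transfer-Encoding: base64\n"
    "\n"
    "SGVsbG8sIHdvcmxkIQ==\n"
)
msg = email.message_from_string(raw)

# Without decode, the payload is returned as stored in the message.
as_is = msg.get_payload()

# With decode=True, the Content-Transfer-Encoding is undone and bytes come back.
decoded = msg.get_payload(decode=True)
text = decoded.decode("utf-8")

# For a multipart message, decode=True returns None.
multipart_result = MIMEMultipart().get_payload(decode=True)

print(repr(as_is))
print(text)
print(multipart_result)
```

For multipart messages you would instead iterate over the parts (for example with msg.walk()) and call get_payload(decode=True) on each non-multipart part.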
I'm trying to interpret data from the Twitch API with Python. This is my code:
from twitch.api import v3
import json
streams = v3.streams.all(limit=1)
list = json.loads(streams)
print(list)
Then, when running, I get:
TypeError, "the JSON object must be str, not 'dict'"
Any ideas? Also, is this a method in which I would actually want to use data from an API?
Per the documentation json.loads() will parse a string into a json hierarchy (which is often a dict). Therefore, if you don't pass a string to it, it will fail.
json.loads(s, encoding=None, cls=None, object_hook=None,
parse_float=None, parse_int=None, parse_constant=None,
object_pairs_hook=None, **kw) Deserialize s (a str instance containing
a JSON document) to a Python object using this conversion table.
The other arguments have the same meaning as in load(), except
encoding which is ignored and deprecated.
If the data being deserialized is not a valid JSON document, a
JSONDecodeError will be raised.
From the Twitch API we see that the object being returned by all() is a V3Query. Looking at the source and documentation for that, we see it is meant to return a list. Thus, you should treat that as a list rather than a string that needs to be decoded.
Specifically, the V3Query is a subclass of ApiQuery, in turn a subclass of JsonQuery. That class explicitly runs the query and passes a function over the results, get_json. That source explicitly calls json.loads()... so you don't need to! Remember: never be afraid to dig through the source.
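The point can be illustrated without the Twitch library. In this sketch, streams is a hand-written stand-in for data that has already been parsed; its keys and values are assumptions, not the real API response.

```python
import json

# Hypothetical stand-in for the already-parsed result of the query.
streams = [{"channel": {"name": "example"}, "viewers": 42}]

# Passing parsed data to json.loads() fails, because it expects JSON text.
error = None
try:
    json.loads(streams)
except TypeError as exc:
    error = exc

# The data is ordinary Python lists and dicts, so index it directly.
first = streams[0]
name = first["channel"]["name"]

print(type(error).__name__)
print(name)
```

If you are unsure what a library returned, print(type(streams)) is usually the quickest check before deciding whether any decoding is needed.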
After
streams = v3.streams.all(limit=1)
try using
streams = json.dumps(streams)
because json.loads() expects streams to be a JSON string of the form:
'{"key":value}'
instead of just dict form:
{"key":value}
I am trying to use the requests library in Python to push data (a raw value) to a firebase location.
Say, I have urladd (the url of the location with authentication token). At the location, I want to push a string, say International. Based on the answer here, I tried
data = {'.value': 'International'}
p = requests.post(urladd, data = sjson.dumps(data))
I get <Response [400]>. p.text gives me:
u'{\n "error" : "Invalid data; couldn\'t parse JSON object, array, or value. Perhaps you\'re using invalid characters in your key names."\n}\n'
It appears that the key .value is invalid. But that is what the answer linked above suggests. Any idea why this may not be working, or how I can do this through Python? There are no problems with connection or authentication because the following works. However, that pushes an object instead of a raw value.
data = {'name': 'International'}
p = requests.post(urladd, data = sjson.dumps(data))
Thanks for your help.
The answer you've linked is a special case for when you want to assign a priority to a value. In general, '.value' is an invalid name and will throw an error.
If you want to write just "International", you should write the stringified-JSON version of that data. I don't have a python example in front of me, but the curl command would be:
curl -X POST -d "\"International\"" https://...
Andrew's answer above works. In case someone else wants to know how to do this using the requests library in Python, I thought this would be helpful.
import requests
import simplejson as sjson
data = sjson.dumps("International")
p = requests.post(urladd, data = data)
For some reason I had thought that the data had to be in a dictionary format before it is converted to stringified JSON version. That is not the case, and a simple string can be used as an input to sjson.dumps().
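To make the difference between the two request bodies visible, here is a small sketch. It uses the standard-library json module rather than simplejson (an assumption made so that it runs without extra packages) and only builds the strings, without sending anything.

```python
import json

# A bare string serializes to a quoted JSON string: this is the raw value.
raw_value_body = json.dumps("International")

# A dict serializes to a JSON object: this pushes an object instead.
object_body = json.dumps({"name": "International"})

print(raw_value_body)
print(object_body)

# Round trip: the raw value decodes back to a plain Python str.
round_trip = json.loads(raw_value_body)
```

The first string matches what the curl command above sends with -d "\"International\"".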