I am attempting to extract some information from a website that requires a POST to an ajax script.
I am trying to create an automated script, however I am consistently running into an HTTP 500 error. This is in contrast to a different data pull I did earlier that worked. Here is my code:
url = 'http://www.ise.com/ExchangeDataService.asmx/Get_ISE_Dividend_Volume_Data/'
paramList = ''
paramList += '"' + 'dtStartDate' + '":07/25/2014"'
paramList += ','
paramList += '"' + 'dtEndDate' + '":07/25/2014"';
paramList = '{' + paramList + '}';
response = requests.post(url, headers={
'Content-Type': 'application/json; charset=UTF-8',
'data': paramList,
'dataType':'json'
})
I was wondering if anyone had any recommendations as to what is happening. This isn't proprietary data, as they allow you to manually download it in Excel format.
The input you're generating is not valid JSON. It looks like this:
{"dtStartDate":07/25/2014","dtEndDate":07/25/2014"}
If you look carefully, you'll notice a missing " before the first 07.
This is one of many reasons you shouldn't be trying to generate JSON by string concatenation. Either build a dict and use json.dumps, or, if you must, use a multi-line string as a template for str.format or %.
Also, as bruno desthuilliers points out, you almost certainly want to be sending the JSON as the POST body, not as a data header in an empty POST. Doing it the wrong way does happen to work with some back-ends, but only by accident, and that's certainly not something you should be relying on. And if the server you're talking to isn't one of those back-ends, then you're sending the empty string as your JSON data, which is just as invalid.
So, why does this give you a 500 error? Probably because the backend is some messy PHP code that doesn't have an error handler for invalid JSON, so it just bails with no information on what went wrong, so the server can't do anything better than send you a generic 500 error.
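Putting both fixes together, a minimal sketch; the URL and dates come from the question, but whether the endpoint accepts exactly this payload shape is an assumption:

import json
import requests

url = 'http://www.ise.com/ExchangeDataService.asmx/Get_ISE_Dividend_Volume_Data/'

# Build the parameters as a dict and let json.dumps handle the quoting.
params = {'dtStartDate': '07/25/2014', 'dtEndDate': '07/25/2014'}

# Send the JSON as the POST body (the data= argument), not as a header.
response = requests.post(
    url,
    data=json.dumps(params),
    headers={'Content-Type': 'application/json; charset=UTF-8'},
)
print(response.status_code)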
If that's a copy/paste from your actual code, 'data' is probably not supposed to be part of the request headers. As a side note: you don't "post to an ajax script", you post to a URL. The fact that this URL is called via an asynchronous request from some JavaScript on some page of the site is totally irrelevant.
It sounds like a server error, so what you're posting could be breaking their API due to its formatting.
Or their API could be down.
http://pcsupport.about.com/od/findbyerrormessage/a/500servererror.htm
I am doing a straightforward request as follows.
import requests

def user_transactions():
    url = 'https://webapi.coinfloor.co.uk/v2/bist/XBT/GBP/user_transactions/'
    data = {'key': 'value'}
    r = requests.post(url, data=data, auth=("some_username", "some_password"))
    print(r.status_code)
    print(r.text)
    return
Even though data= is optional according to the documentation:
https://www.w3schools.com/python/ref_requests_post.asp
If I comment out the data variable then the routine returns a status_code=415 error.
If I include the data variable then the routine returns a status_code=200 success.
I have tried to look this up, for example here:
Python request gives 415 error while post data, but with no answer.
The question is: why does the first case (without data) fail, while the second (with data) works?
Yes, data is optional on the Python side. The requests library will happily send an empty request to the server, as you can see. If the argument were not optional, the program would crash before sending a request, so there would be no status code.
However, the server needs to be able to process the request. If it does not like what you sent for whatever reason, it might send back a 4xx status code, or otherwise not do what you expect.
In this case, it throws an error that the data is in an invalid format. How can an empty request be in an invalid format? Because the format is specified in a header. If you supply a data argument, requests will send the data in urlencoded format and specify in the header what format the data is in. If the data is empty, the body will be empty but the header will still be there. This site apparently requires the header to specify a data format it knows.
You can solve this in two ways: either by giving an empty object:
r = requests.post(url, data={}, auth=("some_username", "some_password") )
Or by explicitly specifying the header:
r = requests.post(url, auth=(...), headers={'Content-Type': 'application/x-www-form-urlencoded'})
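If you want to see what was actually sent, the response object keeps a reference to the prepared request; a small sketch reusing the question's URL:

import requests

url = 'https://webapi.coinfloor.co.uk/v2/bist/XBT/GBP/user_transactions/'
r = requests.post(url, data={}, auth=("some_username", "some_password"))
# r.request is the PreparedRequest that actually went over the wire.
print(r.status_code)
print(r.request.headers.get('Content-Type'))
print(repr(r.request.body))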
Side note: You should not be using W3Schools as a source. It is frequently inaccurate and often recommends bad practices.
I think you are mistaking the documentation of the requests.post function signature for the API documentation. It is saying that data is a keyword argument, not that the API optionally takes data.
It depends on the API endpoint you are trying to use. That endpoint must require data to be sent with the request. If you look at the documentation for the API you are using, it will mention what needs to be sent for a valid request.
EDIT:
In a similar vein, when I now try to log into their account with a POST request, what is returned is none of the errors they suggest on their site but a "JSON exception". Is there any way to debug this, or is an error code 500 completely impossible to deal with?
I'm well aware this question has been asked before. Sadly, when trying the proposed answers, none worked. I have an extremely simple Python project using urllib, and I've never done web programming in Python before, nor am I even a regular Python user. My friend needs to get access to content from this site, but their user-friendly front-end is down, and I learned that they have a public API to access their content. Not knowing what I'm doing, but glad to try to help and interested in the challenge, I have slowly set out.
Note that it is necessary for me to only use standard Python libraries, so that any finished project could easily be emailed to their computer and just work.
The following works completely fine except for the "originalLanguage" query, which the API documents as an array value. No matter whether I comma-separate values, or write "originalLanguage[0]" or "originalLanguage0" or anything else I've seen online, the server returns an error message along the lines of: "Array value expected but string detected".
Is there any way for me to get this working? Because it clearly can work, otherwise the API wouldn't document it. Many thanks.
In case it helps: when using "[]" or "<>" or "{}" or any delimiter I could think of, my IDE didn't recognise it as part of the URL.
import urllib.request as request
import urllib.parse as parse

def make_query(url, params):
    url += "?"
    for i in range(len(params)):
        url += list(params)[i]
        url += '='
        url += list(params.values())[i]
        if i < len(params) - 1:
            url += '&'
    return url

base = "https://api.mangadex.org/manga"
params = {
    "limit": "50",
    "originalLanguage": "en"
}

url = make_query(base, params)
req = request.Request(url)
response = request.urlopen(req)
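For what it's worth, the standard library can build the query string for you (the urllib.parse import above is otherwise unused). A sketch under one assumption: that the API expects the common PHP-style "originalLanguage[]=en" array syntax, which is a guess, not something confirmed here:

import urllib.request as request
import urllib.parse as parse

base = "https://api.mangadex.org/manga"
# urlencode handles the escaping; the "[]" suffix is a widespread convention
# for array-valued parameters and is an assumption about this particular API.
query = parse.urlencode([("limit", "50"), ("originalLanguage[]", "en")])
response = request.urlopen(request.Request(base + "?" + query))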
I'm trying to make an FQL query with the following code:
import requests

def get_postData(to_post, access_token):
    postData = {}
    postData["method"] = "fql.query"
    postData["query"] = to_post
    postData["access_token"] = access_token
    return postData

def make_request(url, to_post, access_token):
    postData = get_postData(to_post, access_token)
    return requests.post(url, data=postData).json()[u'data']
POST requests are not particularly well documented in the docs, and I'm unable to get this to work. With either "fql.query" or "fql" specified under method (taken from the JavaScript-specific example here: How can I execute a FQL query with Facebook Graph API), I get the response:
{u'error': {u'message': u'Unsupported method, fql.query', u'code': 100, u'type': u'GraphMethodException'}}
Which is, of course, not covered in the docs. Without that method specification, I get back:
{u'error': {u'message': u'Unsupported post request.', u'code': 100, u'type': u'GraphMethodException'}}
Which is also not covered in the docs. I'm not able to use a GET request here (which would be trivial), as I'm making a rather large query that at the moment doesn't overflow the GET request limits but very well could in the near future.
Thanks for any help you may be able to give with regards to solving this problem.
EDIT: Should note I'm making the request to:
https://graph.facebook.com
First of all, what URL are you trying to access? I mean, why do you need a POST request for FQL? FQL is for fetching data, not for posting.
According to the docs (https://developers.facebook.com/docs/technical-guides/fql/), your request should look like this:
https://graph.facebook.com/fql?q=QUERY&access_token=TOKEN - where QUERY - is your urlencoded query to FQL, TOKEN - your valid access token.
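Translated into requests, that documented GET form would look something like this; the query and token are placeholders:

import requests

params = {
    'q': 'SELECT uid, name FROM user WHERE uid = me()',  # placeholder query
    'access_token': 'TOKEN',                              # placeholder token
}
r = requests.get('https://graph.facebook.com/fql', params=params)
print(r.json())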
All you need to do is understand how your requests are built; if you understand that, then the errors will make more sense to you.
postData = {}
postData["method"] = "fql.query"
postData["query"] = to_post
postData["access_token"] = access_token
requests.post(url, data = postData).json()[u'data']
Without even running this, I know the request looks like
POST https://graph.facebook.com/?method=fql.query&query=THE_QUERY&access_token=THE_TOKEN
Which does not match the relative URL method/fql.query shown in the batch doc you referenced: https://developers.facebook.com/docs/reference/api/batch/
Removing the method specification (I don't know why you would want to do this) will obviously result in an unknown error, since this is now the request you are making:
POST https://graph.facebook.com/?query=THE_QUERY&access_token=THE_TOKEN
The correct request will be
GET https://api-read.facebook.com/restserver.php?method=fql.query&query=THE_QUERY&access_token=THE_TOKEN
or
GET https://api.facebook.com/method/fql.query?query=THE_QUERY&access_token=THE_TOKEN
I'm not entirely sure what endpoint the batch API uses that allows an HTTP POST to method/fql.query, so I wouldn't rely on it unless you are actually doing batch requests.
In the end, using fql.query may not be the best way to go, since it's on its way to deprecation.
I'm still unsure how your query could be so long that it exceeds the GET request limit. Consider re-evaluating how you structure your query, as a multi-query or in a batch.
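As a rough illustration of the multi-query route: the Graph API's /fql endpoint accepted a JSON object of named queries, but FQL is deprecated, so treat the details below as assumptions:

import json
import requests

# Named queries; #q1 lets the second query reference the first's result.
queries = {
    "q1": "SELECT uid FROM user WHERE uid = me()",
    "q2": "SELECT name FROM user WHERE uid IN (SELECT uid FROM #q1)",
}
r = requests.get('https://graph.facebook.com/fql',
                 params={'q': json.dumps(queries), 'access_token': 'TOKEN'})
print(r.json())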
Is there a way to easily extract the json data portion in the body of a POST request?
For example, if someone posts to www.example.com/post with a form body containing JSON data, my GAE server will receive the request by calling:
jsonstr = self.request.body
However, when I look at jsonstr, I get something like:
str: \r\n----------------------------8cf1c255b3bd7f2\r\nContent-Disposition: form-data;
name="Actigraphy"\r\n Content-Type: application/octet-
stream\r\n\r\n{"Data":"AfgCIwHGAkAB4wFYAZkBKgHwAebQBaAD.....
I just want to be able to call a function to extract the json part of the body which starts at the {"Data":...... section.
Is there an easy function I can call to do this?
There is a misunderstanding: the string you show us is not JSON data, it looks like a multipart POST body. You have to parse the body with something like cgi.parse_multipart.
Then you can parse the JSON as answered by aschmid00, but on the extracted field only, instead of the whole body.
Here you can find working code that shows how to use cgi.FieldStorage for parsing the POST body.
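A small sketch of that idea inside a GAE webapp handler; the field name "Actigraphy" comes from the question's dump, and the request attributes used are assumptions about the webapp/WebOb request object:

import cgi
import json
from google.appengine.ext import webapp

class UploadHandler(webapp.RequestHandler):
    def post(self):
        # FieldStorage reads the multipart boundary from CONTENT_TYPE in
        # the WSGI environ and splits the body into its parts.
        fs = cgi.FieldStorage(fp=self.request.body_file,
                              environ=self.request.environ)
        payload = json.loads(fs['Actigraphy'].value)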
It depends on how it was encoded on the browser side before submitting, but normally you would get the POST data like this:
jsonstr = self.request.POST["Data"]
If that's not working you might want to give us some info on how "Data" was encoded into the POST data on the client side.
you can try:
import json
values = 'random stuff .... \r\n {"data":{"values":[1,2,3]}} more random things'
json_value = json.loads(values[values.index('{'):values.rindex('}') + 1])
print json_value['data'] # {u'values': [1, 2, 3]}
print json_value['data']['values'] # [1, 2, 3]
but this is dangerous and makes a fair number of assumptions, and I'm not sure which framework you are using (Bottle, Flask, there are many). Please use the appropriate call to retrieve POST values based on the framework, if indeed you are using one.
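For instance, with Flask (one of the frameworks named above), the framework parses the multipart body for you; the route and field name below are hypothetical, reusing "Actigraphy" from the question:

import json
from flask import Flask, request

app = Flask(__name__)

@app.route('/post', methods=['POST'])
def handle_post():
    # Flask has already parsed the multipart form body; no slicing needed.
    raw = request.form['Actigraphy']
    return json.loads(raw)['Data']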
I think you mean to do this: self.request.get("Data"), if you are using GAE by itself.
https://developers.google.com/appengine/docs/python/tools/webapp/requestclass#Request_get
https://developers.google.com/appengine/docs/python/tools/webapp/requestclass#Request_get_all
I'm trying to extract the response header of a URL request. When I use firebug to analyze the response output of a URL request, it returns:
Content-Type text/html
However when I use the python code:
urllib2.urlopen(URL).info()
the resulting output returns:
Content-Type: video/x-flv
I am new to Python, and to web programming in general; any helpful insight is much appreciated. Also, if more info is needed, please let me know.
Thanks in advance for reading this post
Try to request as Firefox does. You can see the request headers in Firebug, so add them to your request object:
import urllib2
request = urllib2.Request('http://your.tld/...')
request.add_header('User-Agent', 'some fake agent string')
request.add_header('Referer', 'fake referrer')
...
response = urllib2.urlopen(request)
# check content type:
print response.info().getheader('Content-Type')
There's also HTTPCookieProcessor, which can help if the site's response depends on cookies, but I don't think you'll need it in most cases. Have a look at Python's documentation:
http://docs.python.org/library/urllib2.html
Content-Type text/html
Really, like that, without the colon?
If so, that might explain it: it's an invalid header, so it gets ignored, so urllib guesses the content-type instead, by looking at the filename. If the URL happens to have ‘.flv’ at the end, it'll guess the type should be video/x-flv.
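That filename-based guess is the same idea the mimetypes module implements; a tiny illustration (whether '.flv' is in the default map varies by Python version, so it's registered explicitly here):

import mimetypes

# Register the mapping so the guess is deterministic for the demo.
mimetypes.add_type('video/x-flv', '.flv')
print(mimetypes.guess_type('http://example.com/clip.flv'))
# -> ('video/x-flv', None)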
This peculiar discrepancy might be explained by different headers (maybe ones of the Accept kind) being sent by the two requests -- can you check that? Or, if JavaScript is running in Firefox (which I assume you're using when you're running Firebug?) -- since it's definitely NOT running in the Python case -- "all bets are off", as they say ;-).
Keep in mind that a web server can return different results for the same URL based on differences in the request. For example, with content-type negotiation, the requester can specify a list of content types it will accept, and the server can return different results to try to accommodate different needs.
Also, you may be getting an error page for one of your requests, for example, because it is malformed, or you don't have cookies set that authenticate you properly, etc. Look at the response itself to see what you are getting.
According to http://docs.python.org/library/urllib2.html there is only a get_header() method and nothing about getheader.
I'm asking because your code works fine for
response.info().getheader('Set-Cookie')
but once I execute
response.info().get_header('Set-Cookie')
I get:
Traceback (most recent call last):
File "baza.py", line 11, in <module>
cookie = response.info().get_header('Set-Cookie')
AttributeError: HTTPMessage instance has no attribute 'get_header'
edit:
Moreover, response.headers.get('Set-Cookie') works fine as well, though it's not mentioned in the urllib2 docs...
For getting the raw header data in Python 2, a little bit of a hack, but it works:
"".join(urllib2.urlopen("http://google.com/").info().__dict__["headers"])
basically "".join(list) will the list of headers, which all include "\n" at the end.
__dict__ is a built in python variable for all dicts, basically you can select a list out of a 2d array with it.
and ofcourse ["headers"] is selecting the list value from the .info() response value dict
Hope this helped you learn a few easy Python tricks :)