python requests.put() fails when urllib3 http.request('PUT', ...) succeeds. What gives? - python

I am trying to hit the Atlassian Confluence REST API using python requests.
I've successfully called a GET API, but when I call the PUT to update a Confluence page, it returns 200 yet doesn't actually update the page.
I used Chrome's YARC extension to verify that the API itself was working properly (it was). After a while spent trying to debug it, I fell back to urllib3, which worked just fine.
I'd really like to use requests, but I can't for the life of me figure this one out after hours and hours of debugging, Googling, etc.
I'm running Mac/Python3:
$ uname -a
Darwin mylaptop.local 16.7.0 Darwin Kernel Version 16.7.0: Thu Jun 15 17:36:27 PDT 2017; root:xnu-3789.70.16~2/RELEASE_X86_64 x86_64
$ python3 --version
Python 3.6.1
Here's my code that shows all three ways I'm trying this (two requests and one urllib3):
def update(self, spaceKey, pageTitle, newContent, contentType='storage'):
    if contentType not in ('storage', 'wiki', 'plain'):
        raise ValueError("Invalid contentType={}".format(contentType))

    # Get current page info
    self._refreshPage(spaceKey, pageTitle)  # I retrieve it before I update it.
    orig_version = self.version

    # Content already same as requested content. Do nothing
    if self.wiki == newContent:
        return

    data_dict = {
        'type': 'page',
        'version': {'number': self.version + 1},
        'body': {
            contentType: {
                'representation': contentType,
                'value': str(newContent)
            }
        }
    }
    data_json = json.dumps(data_dict).encode('utf-8')

    put = 'urllib3'  # for now until I figure out why requests.put() doesn't work
    enable_http_logging()
    if put == 'requests':
        r = self._cs.api.content(self.id).PUT(json=data_dict)
        r.raise_for_status()
    elif put == 'urllib3':
        urllib3.disable_warnings()  # I know, you can quit your whining now!!!
        headers = {'Content-Type': 'application/json;charset=utf-8'}
        auth_header = urllib3.util.make_headers(basic_auth=":".join(self._cs.session.auth))
        headers = {**headers, **auth_header}
        http = urllib3.PoolManager()
        r = http.request('PUT', str(self._cs.api.content(self.id)), body=data_json, headers=headers)
    else:
        raise ValueError("Huh? Unknown put type: {}".format(put))
    enable_http_logging(False)

    # Verify page was updated
    self._refreshPage(spaceKey, pageTitle)  # Check for changes
    if self.version != orig_version + 1:
        raise RuntimeError("Page not updated. Still at version {}".format(self.version))
    if self.wiki != newContent:
        raise RuntimeError("Page version updated, but not content.")
Any help would be great.
Update 1: Adding request dump
-----------START-----------
PUT http://confluence.myco.com/rest/api/content/101904815
User-Agent: python-requests/2.18.4
Accept-Encoding: gzip, deflate
Accept: */*
Connection: keep-alive
Content-Length: 141
Content-Type: application/json
Authorization: Basic <auth-token-here>==
b'{"type": "page", "version": {"number": 17}, "body": {"storage": {"representation": "storage", "value": "new body here version version 17"}}}'

requests never went back to PUT (Bug???)
What you're observing is requests behaving consistently with web browsers: it reacts to an HTTP 302 redirect by reissuing the request as a GET.
From Wikipedia:
The user agent (e.g. a web browser) is invited by a response with this code to make a second, otherwise identical, request to the new URL specified in the location field.
(...)
Many web browsers implemented this code in a manner that violated this standard, changing the request type of the new request to GET, regardless of the type employed in the original request (e.g. POST)
(...)
As a consequence, the update of RFC 2616 changes the definition to allow user agents to rewrite POST to GET.
So this behaviour is consistent with RFC 2616. I don't think we can say which of the two libraries behaves "more correctly".
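If you want to see (or stop) that behaviour yourself, you can disable automatic redirect handling and look at what the server sent back. This is only a sketch; the URL and payload are placeholders for your Confluence endpoint:

import requests

# Minimal sketch (placeholder URL): stop requests from following the redirect,
# so the PUT is not silently replayed as a GET, then look at where the server
# wanted to send us.
r = requests.put("http://confluence.example.com/rest/api/content/12345",
                 json={"type": "page"}, allow_redirects=False)
print(r.status_code)               # e.g. 302
print(r.headers.get("Location"))   # likely the https:// URL to call directly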

Looks like a difference in how the requests and urllib3 modules deal with switching from http to https (see @Kos's answer above). Here's what I found when I checked the debug logs.
I got to thinking after @JonClements suggested I send him the response dump. After some research I found the magic runes to enable debugging for requests and urllib3 (see here).
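For reference, a sketch of the usual recipe for turning that logging on (the standard approach, not necessarily the exact enable_http_logging() helper used in the question):

import logging
import http.client

def enable_http_logging(enabled=True):
    # Turn on wire-level debug output for requests/urllib3 traffic.
    http.client.HTTPConnection.debuglevel = 1 if enabled else 0
    logging.basicConfig()
    level = logging.DEBUG if enabled else logging.WARNING
    logging.getLogger().setLevel(level)
    logging.getLogger("urllib3").setLevel(level)
    logging.getLogger("urllib3").propagate = enabled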
In looking at the diffs from both, I noticed that they were being redirected from http to https for my company's Confluence site:
urllib3:
DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): confluence.myco.com
DEBUG:urllib3.connectionpool:http://confluence.myco.com:80 "PUT /rest/api/content/101906196 HTTP/1.1" 302 237
DEBUG:urllib3.util.retry:Incremented Retry for (url='http://confluence.myco.com/rest/api/content/101906196'): Retry(total=2, connect=None, read=None, redirect=None, status=None)
INFO:urllib3.poolmanager:Redirecting
http://confluence.myco.com/rest/api/content/101906196 ->
https://confluence.myco.com/rest/api/content/101906196
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): confluence.myco.com
DEBUG:urllib3.connectionpool:https://confluence.myco.com:443 "PUT /rest/api/content/101906196 HTTP/1.1" 200 None
while requests tried the PUT and then, after the redirect, switched to a GET:
DEBUG:urllib3.connectionpool:http://confluence.myco.com:80 "PUT /rest/api/content/101906196 HTTP/1.1" 302 237
DEBUG:urllib3.connectionpool:https://confluence.myco.com:443 "GET /rest/api/content/101906196 HTTP/1.1" 200 None
requests never went back to PUT
I changed my initial url from http: to https: and everything worked fine.

Related

How to make Python Requests follow a POST redirect?

Using the following code, the server responds with a 301 redirect, but the client changes POST to GET, which is useless behavior because that GET endpoint does not exist. Using curl -L -X POST works properly. The behavior is the same on python2 and python3 and on several versions of Raspbian.
>>> import requests
>>> url = "https://registry.micronets.in/mud/v1/register-device/DAWG/AgoNDQcDDgg/aabbccddeeffgg"
>>> response = requests.post(url)
>>> response
<Response [404]>
# Server Log: (Note - both endpoints are on the same server, using virtual hosts)
redirecting to: https://hotdawg.micronets.in/registry/devices/register-device/AgoNDQcDDgg/aabbccddeeffgg
POST /registry/v1/register-device/DAWG/AgoNDQcDDgg/aabbccddeeffgg 301 16.563 ms - 122
{
  "status": 404
}
GET /vendors//register-device/AgoNDQcDDgg/aabbccddeeffgg 404 0.604 ms - 14
# CURL version (succeeds)
curl -L -X POST "https://registry.micronets.in/mud/v1/register-device/DAWG/AgoNDQcDDgg/aabbccddeeffgg"
Device registered (insert): {
  "model": "AgoNDQcDDgg",
  "pubkey": "aabbccddeeffgg",
  "timestamp": "2019-12-27 15:44:14 UTC",
  "_id": "HBlQzXfBnoB3N4fN"
}
# Server Log: (from CURL)
redirecting to: https://hotdawg.micronets.in/registry/devices/register-device/AgoNDQcDDgg/aabbccddeeffgg
POST /registry/v1/register-device/DAWG/AgoNDQcDDgg/aabbccddeeffgg 301 0.364 ms - 122
POST /vendors//register-device/AgoNDQcDDgg/aabbccddeeffgg 200 1.745 ms - 157
I'd rather accept a better answer, but otherwise I plan to work around the problem as follows:
response = requests.post(url, allow_redirects=False)
if response.status_code == 301:
    response = requests.post(response.headers['Location'])
or
response = requests.post(url, allow_redirects=False)
i = 10
while i > 0 and response.status_code == 301:
    response = requests.post(response.headers['Location'], allow_redirects=False)
    i -= 1
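A slightly more general version of that second workaround, wrapped in a helper (just a sketch along the same lines, not an accepted solution):

import requests

def post_following_redirects(url, max_hops=10, **kwargs):
    # Re-issue the POST manually on each redirect so the method is never
    # downgraded to GET, similar to what curl -L -X POST does.
    response = requests.post(url, allow_redirects=False, **kwargs)
    for _ in range(max_hops):
        if response.status_code not in (301, 302, 307, 308):
            break
        response = requests.post(response.headers['Location'],
                                 allow_redirects=False, **kwargs)
    return response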
Check your nginx configuration: Reference
Your POST request has already reached the server successfully, and the server handles it according to the rules you assigned in your configuration.
The issue is that the redirect your configuration issues causes the final URL to be requested with GET.
You're blaming requests for not handling the redirect, but it has already handled it correctly. You're comparing it to curl, which you ran with -L (i.e. curl --location -X POST "https://registry.micronets.in/mud/v1/register-device/DAWG/AgoNDQcDDgg/aabbccddeeffgg") together with -X to force the method. From curl's man page:
-X, --request will be used for all requests, which if you for example use -L, --location may cause unintended side-effects when curl doesn't change request method according to the HTTP 30x response codes - and similar.
Pay attention to the part about curl not changing the request method: with -X, curl keeps using POST all the way through the redirect, while for requests it's a completely different thing.
import requests

with requests.Session() as ses:
    r = ses.post(
        "https://registry.micronets.in/mud/v1/register-device/DAWG/AgoNDQcDDgg/aabbccddeeffgg",
        allow_redirects=True)
    print(r.history[0].headers['Location'])
Output:
https://hotdawg.micronets.in/registry/devices/register-device/AgoNDQcDDgg/aabbccddeeffgg
In the end, I do believe this nginx server is sitting behind Linode servers, and that's a common issue.
The evidence is here:
import requests

with requests.Session() as ses:
    r = ses.post(
        "https://registry.micronets.in/mud/v1/register-device/DAWG/AgoNDQcDDgg/aabbccddeeffgg",
        allow_redirects=True)
    print(r.history, r.history[0].reason)
    print(r.status_code, r.reason)
Output:
[<Response [301]>] Moved Permanently
404 Not Found
But POST to it directly!
r = requests.post(
    "https://hotdawg.micronets.in/registry/devices/register-device/AgoNDQcDDgg/aabbccddeeffgg")
print(r)
Output:
<Response [200]>
Which confirms that you sent a POST request and it was completely switched to a GET after the redirect.
With that said, as explained above, curl with -X forces the specified method (POST here) to be used all the way to the final endpoint.
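If you really want requests to behave like curl -L -X POST, one option is to stop the session from rewriting the method on redirects. This is a sketch that leans on requests internals, so treat the hook name as an assumption about the current implementation; note also that requests may still drop the request body on a 301/302, so it is mainly useful when the POST carries no body, as here:

import requests

class KeepMethodSession(requests.Session):
    # rebuild_method() is the hook requests uses to rewrite POST to GET on
    # 301/302 responses; overriding it with a no-op keeps the original method.
    def rebuild_method(self, prepared_request, response):
        pass

with KeepMethodSession() as ses:
    r = ses.post("https://registry.micronets.in/mud/v1/register-device/DAWG/AgoNDQcDDgg/aabbccddeeffgg")
    print(r.status_code)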

Python request sessions is not working for put/post in next call

Hi, I am trying to access a REST API that can only be accessed after login. I was using the code below but am getting 401, access denied. I am sure that if the same cookies were applied to the next PUT call it would not be denied, but the python session is not reusing the same cookies; instead it's adding new ones. Thanks.
with requests.Session() as s:
    logging.info("Trying to login")
    response1 = s.post("https://localhost:8080/api/authentication?j_username=admin&j_password=admin", verify=False)
    for cookie in s.cookies:
        logging.info(str(cookie.name) + " : " + str(cookie.value))
    logging.info("logged in successfully " + str(response1.status_code))

    url = url1 % (params['key'])
    logging.info("inspector profile inspect api : " + url)
    response = s.put(url, verify=False)
    for cookie in s.cookies:
        logging.info(str(cookie.name) + " :: " + str(cookie.value))
    logging.info("code:-->" + str(response.status_code))
Output is
CSRF-TOKEN : c3ea875b-3df9-4bd4-992e-2b976c150ea6
JSESSIONID : M3WWdp0PO95ENQSJciqiEbiHZR6ge7O8HkKDkY6R
logged in successfully 200
profile api : localhost:8080/api/test/283
CSRF-TOKEN :--> e5b64a66-5402-430b-8f51-d8d7549fd84e
JSESSIONID :--> JUZBHKmqsitvlrPvWuaqfTJNH1PIJcEXPTkPYPKk
CSRF-TOKEN :--> c3ea875b-3df9-4bd4-992e-2b976c150ea6
JSESSIONID :--> M3WWdp0PO95ENQSJciqiEbiHZR6ge7O8HkKDkY6R
code:401
Looks like the next API call is not using the cookies; please help me out.
Just finished debugging the same issue.
By RFC 2965:
The term effective host name is related to host name. If a host name
contains no dots, the effective host name is that name with the
string .local appended to it. Otherwise the effective host name is
the same as the host name. Note that all effective host names
contain at least one dot.
The Python Requests module uses the http.cookiejar module to handle cookies. It verifies received cookies before applying them to a session.
Use the following code to get debug output:
import logging
import http.cookiejar
logging.basicConfig(level=logging.DEBUG)
http.cookiejar.debug = True
Here is an example, when received cookie is not applied:
DEBUG:http.cookiejar:add_cookie_header
DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): localhost
DEBUG:urllib3.connectionpool:http://localhost:80 "POST /api/login HTTP/1.1" 200 6157
DEBUG:http.cookiejar:extract_cookies: Date: Thu, 30 Apr 2020 15:45:11 GMT
Server: Werkzeug/0.14.1 Python/3.5.3
Content-Type: application/json
Content-Length: 6157
Set-Cookie: token=1234; Domain=localhost; Path=/
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
DEBUG:http.cookiejar: - checking cookie token=1234
DEBUG:http.cookiejar: non-local domain .localhost contains no embedded dot
For requests sent to localhost, the cookie jar expects the web server to set the domain part of the cookie to localhost.local.
Here is an example, when received cookie was applied correctly:
DEBUG:http.cookiejar:add_cookie_header
DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): localhost
DEBUG:urllib3.connectionpool:http://localhost:80 "POST /api/login HTTP/1.1" 200 6157
DEBUG:http.cookiejar:extract_cookies: Date: Thu, 30 Apr 2020 15:52:08 GMT
Server: Werkzeug/0.14.1 Python/3.5.3
Content-Type: application/json
Content-Length: 6157
Set-Cookie: token=1234; Domain=localhost.local; Path=/
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
DEBUG:http.cookiejar: - checking cookie token=1234
DEBUG:http.cookiejar: setting cookie: <Cookie token=1234 for .localhost.local/>
If you cannot fix the web server, use 127.0.0.1 instead of localhost in your request:
response1 = s.post("https://127.0.0.1:8080/api/authentication?j_username=admin&j_password=admin", verify=False)
This code worked for me:
from requests import Session
s = Session()
s.auth = ('username', 'password')
s.get('http://host'+'/login/page/')
response = s.get('http://host'+'/login-required-pages/')
You did not actually authenticate successfully to the website despite having CSRF-TOKEN and JSESSIONID cookies. The session data, including whether or not you're authenticated, are stored on the server side, and those cookies you're getting are only keys to such session data.
One problem I see with the way you're authenticating is that you're posting the username and password as a query string, which is usually only used for GET requests.
Try posting with proper payload instead:
response1 = s.post("https://localhost:8080/api/authentication", data={'j_username': 'admin', 'j_password': 'admin'}, verify=False)

JSON in post request works in HttpRequester but not in python Requests

I'm stuck web scraping a page using Python. Basically, the following is the request from HttpRequester (in Mozilla), and it gives me the right response.
POST https://www.hpe.com/h20195/v2/Library.aspx/LoadMore
Content-Type: application/json
{"sort": "csdisplayorder", "hdnOffset": "1", "uniqueRequestId": "d6da6a30bdeb4d77b0e607a6b688de1e", "test": "", "titleSearch": "false", "facets": "wildcatsearchcategory#HPE,cshierarchycategory#No,csdocumenttype#41,csproducttype#18964"}
-- response --
200 OK
Cache-Control: private, max-age=0
Content-Length: 13701
Content-Type: application/json; charset=utf-8
Server: Microsoft-IIS/7.5
X-AspNet-Version: 4.0.30319
X-Powered-By: ASP.NET
Date: Sat, 28 May 2016 04:12:57 GMT
Connection: keep-alive
The exact same operation in python 2.7.1 using Requests fails with an error. The following is the code snippet:
jsonContent = {"sort": "csdisplayorder", "hdnOffset": "1", "uniqueRequestId": "d6da6a30bdeb4d77b0e607a6b688de1e", "test": "", "titleSearch": "false", "facets": "wildcatsearchcategory#HPE,cshierarchycategory#No,csdocumenttype#41,csproducttype#18964"}
catResponse = requests.post('https://www.hpe.com/h20195/v2/Library.aspx/LoadMore', json = jsonContent)
The following is the error that I get:
{"Message":"Value cannot be null.\r\nParameter name: source","StackTrace":" at
System.Linq.Enumerable.Contains[TSource](IEnumerable`1 source, TSource value, I
EqualityComparer`1 comparer)\r\n
More information:
The POST request that I'm looking for is fired upon:
opening this web page: https://www.hpe.com/h20195/v2/Library.aspx?doctype=41&doccompany=HPE&footer=41&filter_doctype=no&filter_doclang=no&country=&filter_country=no&cc=us&lc=en&status=A&filter_status=rw#doctype-41&doccompany-HPE&prodtype_oid-18964&status-a&sortorder-csdisplayorder&teasers-off&isRetired-false&isRHParentNode-false&titleCheck-false
Clicking on the "Load more" grey button at the end of the page
I'm capturing the exact set of request headers and response from the browser operation and trying to mimic that in Postman, Python code and HttpRequester (Mozilla).
It flags the same error (mentioned above) with Postman and Python, but works with no headers set on my part with HttpRequester.
Can anyone think of an explanation for this?
If both Postman and requests are receiving an error, then there is more context than what HttpRequester is showing. There are a number of headers that I'd expect to be set almost always, including User-Agent and Content-Length, that are missing here.
The usual suspects are cookies (look for Set-Cookie headers in earlier requests, preserve those by using a requests.Session() object), the User-Agent header and perhaps a Referrer header, but do look for other headers like anything starting with Accept, for example.
Have HttpRequester post to http://httpbin.org/post instead for example, and inspect the returned JSON, which tells you what headers were sent. This won't include cookies (those are domain-specific), but anything else could potentially be something the server looks for. Try such headers one by one if cookies are not helping.
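As a concrete starting point, here is a sketch that replays the request with a session so cookies from the library page are kept, plus a browser-like User-Agent (both assumptions about what this server checks):

import requests

session = requests.Session()
session.headers.update({
    "User-Agent": "Mozilla/5.0",  # assumption: the server rejects the default python-requests agent
})
# Open the library page first so any Set-Cookie headers are captured by the session.
session.get("https://www.hpe.com/h20195/v2/Library.aspx")
jsonContent = {"sort": "csdisplayorder", "hdnOffset": "1",
               "uniqueRequestId": "d6da6a30bdeb4d77b0e607a6b688de1e",
               "test": "", "titleSearch": "false",
               "facets": "wildcatsearchcategory#HPE,cshierarchycategory#No,csdocumenttype#41,csproducttype#18964"}
r = session.post("https://www.hpe.com/h20195/v2/Library.aspx/LoadMore", json=jsonContent)
print(r.status_code)
print(r.text[:200])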

python-requests making a GET instead of POST request

I have a daily cron which handles some of the recurring events in my app, and from time to time I notice a weird error pop up in the logs. The cron, among other things, validates some codes, and it uses the webapp running on the same server, so the validation request is made via a POST request with some data.
url = 'https://example.com/validate/'
payload = {'pin': pin, 'sku': sku, 'phone': phone, 'AR': True}
validation_post = requests.post(url, data=payload)
So, this makes the actual request and I log the response. From time to time, and recently for up to 50% of the requests, the response contains the following message from nginx:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<title>405 Method Not Allowed</title>
<h1>Method Not Allowed</h1>
<p>The method GET is not allowed for the requested URL.</p>
So, the actual request was made using the GET method, not POST as instructed in the code. In the nginx access.log I can see this entry:
123.123.123.123 - - [18/Feb/2015:12:26:50 -0500] "GET /validate/ HTTP/1.1" 405 182 "-" "python-requests/2.2.1 CPython/2.7.6 Linux/3.13.0-37-generic"
And the uwsgi log for the app shows the similar thing:
[pid: 6888|app: 0|req: 1589/58763] 123.123.123.123 () {40 vars in 613 bytes} [Mon Apr 6 11:42:41 2015] GET /validate/ => generated 182 bytes in 1 msecs (HTTP/1.1 405) 4 headers in 234 bytes (1 switches on core 0)
So, everything indicates that the actual request was not made using POST. The app route that handles this code is simple, and this is an excerpt:
@app.route('/validate/', methods=['POST'])
@login_required
def validate():
    if isinstance(current_user.user, Sales):
        try:
            # do the stuff here
        except Exception, e:
            app.logger.exception(str(e))
            return 0
    abort(403)
The app route can fail, and there are some returns inside the try block, but even if those fail or there is an exception, there is nothing that could produce the 405 error code in this block, only 403, which rarely happens since I construct and log in the user manually from the cron.
I have found a similar thing here, but the solution there was a redirect from the HTTP to the HTTPS version of the site. I also have that redirect present on the server, but the URL the request is being made to already has HTTPS in it, so I doubt this is the cause.
The stack I am running this on is uwsgi+nginx+flask. Can anyone see what might be causing this? To repeat, it's not happening always, so sometimes it works as expected, sometimes not. I recently migrated from apache and mod_wsgi to this new stack and from that point I started encountering this error; I can't recall ever seeing it in the apache environment.
Thanks!
The only time we ever change a POST request to a GET is when we're handling a redirect. Depending on the redirect code, we will change the request method. If you want to be sure that we don't follow redirects, you need to pass allow_redirects=False. That said, you need to figure out why your application is generating redirects (including whether it's redirecting to HTTP or to a different domain, or using a specific status code).
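A quick way to check which of those is happening (just a diagnostic sketch using the same url and payload as above):

validation_post = requests.post(url, data=payload, allow_redirects=False)
# A 301/302 here means something (nginx trailing-slash handling, an
# http->https bounce, a www rewrite, ...) is redirecting the POST, and the
# follow-up GET is what produces the 405.
print(validation_post.status_code)
print(validation_post.headers.get('Location'))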
Not sure if it's by design, but removing the forward slash at the end of the URL fixed it for me:
url = 'https://example.com/validate'  # trailing slash removed
payload = {'pin': pin, 'sku': sku, 'phone': phone, 'AR': True}
validation_post = requests.post(url, data=payload)

Django view sending empty reply with proper headers

I have a Django project on a Dreamhost server which has several views that return a JSON response. Yesterday I ported my Django project from my local machine (localhost) to the Dreamhost server running Apache. Now if I call my Django view through jQuery for
http://www.abc.com/projects/
it should return all the projects that I have in my MongoDB database, but instead it returns:
On Firefox - just headers with no response
Connection Keep-Alive
Content-Type application/json
Date Thu, 19 Jan 2012 09:03:34 GMT
Keep-Alive timeout=2, max=100
Server Apache
Status 200 OK
Transfer-Encoding chunked
On Chrome - no headers and no response data. It throws an error:
XMLHttpRequest cannot load http://abc.com/Projects/. Origin null is not allowed by Access-Control-Allow-Origin.
If I just access http://www.abc.com/projects/ through my web browser it returns results in JSON format, but not if I use JavaScript/jQuery.
Earlier I was using this middleware to allow other domains to request and get a response on my local machine with the Django built-in server. But now that I am running on the Apache server it stopped working, so I removed it from the Settings.py file.
I don't know why this error is coming up. Please help.
EDIT:
As @burhan suggested I used JSONP on the client side and now my server is returning JSON, but the browser is giving an error before parsing it. The error is: unexpected token
The JSON I am getting in reply is:
{"projects": [{"projectName": "carmella", "projectId": "4f13c7475fcff30710000000"}, {"projectName": "SeaMonkey", "projectId": "4f1677b75fcff37c03000001"}]}
You are running into the same origin policy sandbox. Since your server is www.abc.com and you are accessing abc.com - the origin is not the same, which is why the script is not executing.
You have a few options:
Make sure the URL matches exactly - to avoid the same origin policy sandbox.
Use jsonp in your javascript library (see the sketch below).
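For option 2, a minimal Django-side sketch of what serving JSONP could look like (the view name and data are illustrative, not your actual code):

import json
from django.http import HttpResponse

def projects(request):
    data = {"projects": [{"projectName": "carmella",
                          "projectId": "4f13c7475fcff30710000000"}]}
    callback = request.GET.get("callback")
    if callback:
        # JSONP: wrap the JSON payload in the callback jQuery appends as ?callback=...
        body = "{}({})".format(callback, json.dumps(data))
        return HttpResponse(body, content_type="application/javascript")
    return HttpResponse(json.dumps(data), content_type="application/json")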
