I want to be able to send an HTTP request with Python without a slash "/" in the path.
Simply put, here is what the request should look like:
GET test HTTP/1.1
Host: example.com
Connection: keep-alive
Cache-Control: max-age=0
What I want to do is GET test HTTP/1.1 rather than GET /test HTTP/1.1
I am able to send the request using request-repeating tools, but I am not sure how to do that with Python.
To clarify further: I don't want the request path to start with "/".
I am looking for the equivalent of this in Python.
Thanks!
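Since a URL-based client like requests will always send a path beginning with "/", one way to sketch this is with a raw socket, which gives full control over the request line. This is only a minimal sketch using the host from the example above; it uses Connection: close instead of keep-alive so the read loop below terminates.
import socket

host = "example.com"
request = (
    "GET test HTTP/1.1\r\n"        # request line without the leading slash
    f"Host: {host}\r\n"
    "Connection: close\r\n"        # close so the server ends the response stream
    "Cache-Control: max-age=0\r\n"
    "\r\n"
)

with socket.create_connection((host, 80)) as sock:
    sock.sendall(request.encode("ascii"))
    response = b""
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            break
        response += chunk

print(response.decode("iso-8859-1", errors="replace"))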
import requests
print(requests.get("http://zfxxgk.nea.gov.cn/2022-01/17/c_1310427545.htm").url)
#returns 'http://zfxxgk.nea.gov.cn/2022-01/17/c_1310427545.htm'
The code above returns the same URL I passed in, but that URL actually redirects to another URL.
How can I get the URL after the redirect?
Thank you
With a curl call
curl http://zfxxgk.nea.gov.cn/2022-01/17/c_1310427545.htm
you get the page source, which contains the JavaScript redirect as furas mentioned:
<script language="javascript" type="text/javascript">window.location.href="http://zfxxgk.nea.gov.cn/2021-12/31/c_1310427545.htm";</script>
If you get a "real" redirect, you will find the new location in the headers:
curl -I http://something.somewhere
HTTP/1.1 301 Moved Permanently
Content-Type: text/html
Content-Length: 185
Connection: keep-alive
Location: https://inder.net
If you need to automate this with Python, you should look at response.history, as described in this answer to a similar question, to find all the redirects your call triggered: "Python Requests library redirect new url"
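Here is a short sketch combining both checks; the regular expression is only an assumption about how this particular page embeds its JavaScript redirect.
import re
import requests

resp = requests.get("http://zfxxgk.nea.gov.cn/2022-01/17/c_1310427545.htm")

# HTTP-level redirects (301/302) are recorded in resp.history
for hop in resp.history:
    print(hop.status_code, hop.url)
print("URL after HTTP redirects:", resp.url)

# A JavaScript redirect is invisible to requests, so pull it out of the HTML
match = re.search(r'window\.location\.href\s*=\s*"([^"]+)"', resp.text)
if match:
    print("JavaScript redirect target:", match.group(1))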
I'm trying to authenticate a bot which posts on Stocktwits (see API) with Python/pycurl. As I understand it, first I need to request that the application be authorized to use Stocktwits user data via oauth/authorize, which returns an authorization code. Then I need to confirm permission via oauth/token, which would send back an access token that can be used to post to Stocktwits.
The problem I'm running into is that after I make a POST request to oauth/authorize, the response returned is
Host: api.stocktwits.com
Authorization: Basic bmljay5vc2hpbm92QGdtYWlsLmNvbTp6YWRuaWsxMg==
User-Agent: PycURL/7.43.0.2 libcurl/7.60.0 OpenSSL/1.1.0h zlib/1.2.11 c-
ares/1.14.0 WinIDN libssh2/1.8.0 nghttp2/1.32.0
Accept: */*
Content-Type: application/x-www-form-urlencoded
Expect: 100-continue
< HTTP/1.1 100 Continue
and the program just stalls, waiting. How can I handle the HTTP 100 response?
Code for the API call is below:
def get_auth_token(self):
    auth_data = ""
    c = pycurl.Curl()
    c.setopt(c.CAINFO, self.CA_CERTS)
    c.setopt(pycurl.POST, 1)
    c.setopt(c.URL, uri_to_ouath_authorize_call)
    c.setopt(c.FOLLOWLOCATION, True)
    c.setopt(c.WRITEDATA, auth_data)
    c.setopt(c.USERPWD, 'bot_account_username:password')
I'm guessing you need to use POSTFIELDS or READFUNCTION to supply the POST data; it doesn't look like you have anything configured for the request body. See https://curl.haxx.se/libcurl/c/CURLOPT_POST.html.
Using pycurl.POST is a common mistake; normally POSTFIELDS should be used instead.
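For example, a minimal sketch of a POSTFIELDS-based call; the endpoint URL and form field values here are placeholders rather than the exact Stocktwits parameters, and the response buffer is a file-like object (WRITEDATA does not accept a plain string).
import pycurl
from io import BytesIO
from urllib.parse import urlencode

response_body = BytesIO()

# Placeholder OAuth form fields; adapt to the oauth/token call you are making.
post_data = urlencode({
    "client_id": "YOUR_CLIENT_ID",
    "client_secret": "YOUR_CLIENT_SECRET",
    "code": "AUTHORIZATION_CODE",
    "grant_type": "authorization_code",
})

c = pycurl.Curl()
c.setopt(c.URL, "https://api.stocktwits.com/api/2/oauth/token")  # placeholder URL
c.setopt(c.WRITEDATA, response_body)   # must be a writable file-like object
c.setopt(c.POSTFIELDS, post_data)      # implies a POST and supplies the body
c.perform()

print(c.getinfo(c.RESPONSE_CODE))
print(response_body.getvalue().decode("utf-8"))
c.close()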
Why does Django ignore the HTTP_X_FORWARDED_PROTO header when it comes over the wire?
I added the following config to settings.py:
# make sure we know we are secure when we are behind a proxy
SECURE_PROXY_SSL_HEADER = ('HTTP_X_FORWARDED_PROTO', 'https')
I wrote a test to verify this:
def testHttpSupport(self):
    url = reverse('configuration-list')
    response = self.client.get(url, HTTP_X_FORWARDED_PROTO='https')
    cfg = response.data[0]
    cfg_url = cfg['url']
    self.assertTrue(cfg_url.startswith('https'))
This works fine; the URL of the returned object starts with https.
However, if I try:
curl -v -H 'HTTP_X_FORWARDED_PROTO: https' http://localhost:8000/api/users/
...
> GET /api/users/ HTTP/1.1
> Host: localhost:8000
> User-Agent: curl/7.51.0
> Accept: */*
> HTTP_X_FORWARDED_PROTO: https
>
* HTTP 1.0, assume close after body
< HTTP/1.0 200 OK
< Date: Mon, 03 Jul 2017 16:22:04 GMT
< Server: WSGIServer/0.2 CPython/3.6.1
< Content-Type: application/json
< Allow: GET, POST, OPTIONS
< Vary: Accept, Cookie
< X-Frame-Options: SAMEORIGIN
< Content-Length: 197
<
* Curl_http_done: called premature == 0
* Closing connection 0
[{"url":"http://localhost:8000/api/users/1/",...
How come it does not return 'https://'-based URLs like in my unit test?
The issue is the header name. When accessing Django through a WSGI server, you should use the X-Forwarded-Proto header instead of HTTP_X_FORWARDED_PROTO:
curl -v -H 'X-Forwarded-Proto: https' http://localhost:8000/api/users/
The WSGI protocol states that the relevant CGI specifications must be followed, which say:
Meta-variables with names beginning with 'HTTP_' contain values read
from the client request header fields, if the protocol used is HTTP.
The HTTP header field name is converted to upper case, has all
occurrences of "-" replaced with "_" and has 'HTTP_' prepended to
give the meta-variable name.
(source)
So whenever you are using a WSGI server, the X-Forwarded-Proto header is automatically converted to HTTP_X_FORWARDED_PROTO before it is passed in to Django. When you pass in the HTTP_X_FORWARDED_PROTO header instead, HTTP_ must still be prepended according to the specification. Thus, you end up with a header named HTTP_HTTP_X_FORWARDED_PROTO in Django.
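A small illustration of that conversion; cgi_meta_name is just a hypothetical helper written for this answer, not a Django function.
def cgi_meta_name(header_name: str) -> str:
    """Convert an HTTP header name to its CGI/WSGI meta-variable name."""
    return "HTTP_" + header_name.upper().replace("-", "_")

print(cgi_meta_name("X-Forwarded-Proto"))       # HTTP_X_FORWARDED_PROTO
print(cgi_meta_name("HTTP_X_FORWARDED_PROTO"))  # HTTP_HTTP_X_FORWARDED_PROTO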
self.client is not a WSGI server, and values passed in through the kwargs are inserted directly into the WSGI environment, without any processing. So in that case you have to do the conversion yourself and actually use the HTTP_X_FORWARDED_PROTO key:
CGI specification
The headers sent via **extra should follow CGI specification. For example, emulating a different “Host” header as sent in the HTTP request from the browser to the server should be passed as HTTP_HOST.
(source)
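So over the wire, send the real header name and let the WSGI layer do the conversion. A quick sketch against the same local dev server (assuming the response shape from the curl output above):
import requests

# The WSGI server turns X-Forwarded-Proto into HTTP_X_FORWARDED_PROTO
# before Django sees it, so SECURE_PROXY_SSL_HEADER now matches.
resp = requests.get(
    "http://localhost:8000/api/users/",
    headers={"X-Forwarded-Proto": "https"},
)
print(resp.json()[0]["url"])  # should now start with https://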
I have a working bit of PHP code that uploads a binary to a remote server I don't have shell access to. The PHP code is:
function upload($uri, $filename) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $uri);
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_POSTFIELDS, array('file' => '@' . $filename));
    curl_exec($ch);
    curl_close($ch);
}
This results in a header like:
HTTP/1.1
Host: XXXXXXXXX
Accept: */*
Content-Length: 208045596
Expect: 100-continue
Content-Type: multipart/form-data; boundary=----------------------------360aaccde050
I'm trying to port this over to Python using requests, and I cannot get the server to accept my POST. I have tried every which way to use requests.post, but the header will not mimic the above.
The following will successfully transfer the binary to the server (I can tell by watching Wireshark), but because the header is not what the server is expecting, it gets rejected. The response code is a 200, though.
files = {'bulk_test2.mov': ('bulk_test2.mov', open('bulk_test2.mov', 'rb'))}
response = requests.post(url, files=files)
The requests code results in a header of:
HTTP/1.1
Host: XXXX
Content-Length: 160
Content-Type: multipart/form-data; boundary=250852d250b24399977f365f35c4e060
Accept-Encoding: gzip, deflate, compress
Accept: */*
User-Agent: python-requests/2.2.1 CPython/2.7.5 Darwin/13.1.0
--250852d250b24399977f365f35c4e060
Content-Disposition: form-data; name="bulk_test2.mov"; filename="bulk_test2.mov"
--250852d250b24399977f365f35c4e060--
Any thoughts on how to make requests match the header that the PHP code generates?
There are two large differences:
The PHP code posts a field named file; your Python code posts a field named bulk_test2.mov.
Your Python code posts an empty file. The Content-Length header is 160 bytes, exactly the amount of space the multipart boundaries and the Content-Disposition part header take up. Either the bulk_test2.mov file is indeed empty, or you tried to post the file multiple times without rewinding or reopening the file object.
To fix the first problem, use 'file' as the key in your files dictionary:
files = {'file': open('bulk_test2.mov', 'rb')}
response = requests.post(url, files=files)
I used just the open file object as the value; requests will get the filename directly from the file object in that case.
The second issue is something only you can fix. Make sure you don't reuse files when repeatedly posting. Reopen, or use files['file'].seek(0) to rewind the read position back to the start.
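For instance, if you post the same file object repeatedly (the URL below is a placeholder), rewind it before each post, otherwise every post after the first sends an empty body:
import requests

url = "http://XXXXXXXXX/upload"  # placeholder for your target URL

with open("bulk_test2.mov", "rb") as movie:
    for attempt in range(3):
        movie.seek(0)  # rewind so the body is not empty on later posts
        response = requests.post(url, files={"file": movie})
        print(attempt, response.status_code,
              response.request.headers["Content-Length"])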
The Expect: 100-continue header is an optional client feature that asks the server to confirm that the body upload can go ahead; it is not a required header and any failure to post your file object is not going to be due to requests using this feature or not. If an HTTP server were to misbehave if you don't use this feature, it is in violation of the HTTP RFCs and you'll have bigger problems on your hands. It certainly won't be something requests can fix for you.
If you do manage to post actual file data, any small variations in Content-Length are due to the (random) boundary being a different length between Python and PHP. This is normal, and not the cause of upload problems, unless your target server is extremely broken. Again, don't try to fix such brokenness with Python.
However, I'd assume you overlooked something much simpler. Perhaps the server blacklists certain User-Agent headers, for example. You could clear some of the default headers requests sets by using a Session object:
files = {'file': open('bulk_test2.mov', 'rb')}
session = requests.Session()
del session.headers['User-Agent']
del session.headers['Accept-Encoding']
response = session.post(url, files=files)
and see if that makes a difference.
If the server fails to handle your request because it fails to handle HTTP persistent connections, you could try to use the session as a context manager to ensure that all session connections are closed:
files = {'file': open('bulk_test2.mov', 'rb')}
with requests.Session() as session:
response = session.post(url, files=files, stream=True)
and you could add:
response.raw.close()
for good measure.
I have been struggling to post a diff to ReviewBoard through their API. I've managed to log in to the server and create a new post, but I've failed to post the contents of the diff file correctly.
I'm new to writing this kind of application, but my goal is to have a one-step script to:
diff a file (pre-commit) with the svn repository,
add a review request to ReviewBoard and post the diff from the current file.
Maybe later, the script can become part of an svn pre-commit hook.
My Python attempt looks like:
import urllib.request
import urllib.parse
import os.path
... login to the reviewboard server with
urllib.request.HTTPBasicAuthHandler ...
diff_path = '/path/to/file'
diff_name = 'my.diff'
diff_path = os.path.join(diff_path, diff_name)
diff_val = open(diff_path,'r')
# load the diff into the http data POST request
diff_header = \
'-- SoMe BoUnDaRy \n' \
+ 'Content-Disposition: form-data; name=path; filename=' \
+ '"' + diff_name + '"\n\n' \
+ diff_val.read() + '\n' \
+ '-- SoMe BoUnDaRy --'
data ={'path': diff_header, 'basedir': '/path/to/file/in/rep'}
print( data['path'] )
data = urllib.parse.urlencode(data)
data = data.encode('utf-8')
opener.open( \
'http://xxx.xxx.x.xxx/api/review-requests/26/diffs/', data)
With this code I get a BAD REQUEST (400) error, specifically: "One or more fields had errors" (105).
I'm aware that there are some libraries out there that can talk to the ReviewBoard API. I'm also aware that post-review exists. I'd rather not have to distribute another Python library to the other developers, and post-review seems less flexible when diffing files from multiple locations.
Following the suggestion below, I've added the server response here:
CREATING PASSWD MANAGER...
CREATING PASSWD MANAGER... done
CREATING PASSWD HANDLER...
CREATING PASSWD HANDLER... done
CREATING URL OPENER...
CREATING URL OPENER... done
LOADING DIFF...
send: b'POST /api/review-requests/26/diffs/ HTTP/1.1\r\nAccept-Encoding:
identity\r\nContent-Length: 723\r\nHost: xxx.xxx.x.xxx\r\nContent-Type:
application/x-www-form-urlencoded\r\nConnection: close\r\nUser-Agent:
[empty no username+password] Python-urllib/3.2\r\n\r\
npath=--+SoMe+BoUnDaRy+++%...[the rest of my post]
reply: 'HTTP/1.1 401 UNAUTHORIZED\r\n'
header: Date header: Server header: Content-Language header: Expires header:
Vary header: Cache-Control header: WWW-Authenticate header:
Content-Length header: Last-Modified header: Connection header:
Content-Type send: b'POST /api/review-requests/26/diffs/
HTTP/1.1\r\nAccept-Encoding: identity\r\nContent-Length: 723\r\nHost:
xxx.xxx.x.xxx\r\nUser-Agent: Python-urllib/3.2\r\nConnection:
close\r\nContent-Type: application/x-www-form-urlencoded\r\nAuthorization:
Basic [with username+password]\r\n\r\npath=
--+SoMe+BoUnDaRy+++%0AContent-Disposition%...
reply: 'HTTP/1.1 400 BAD REQUEST\r\n'
header: Date header: Server header: Content-Language header: Expires header:
Vary header: Cache-Control header: Set-Cookie header: Content-Length header:
Last-Modified header: Connection header: Content-Type HTTPError thrown
At first glance, my guess is that something is happening to my password handler, but I'm not sure what. Just in case, this is how I generate my authentication:
manager_passwd = urllib.request.HTTPPasswordMgr()
manager_passwd.add_password(...)
handler_passwd = urllib.request.HTTPBasicAuthHandler(manager_passwd)
opener = urllib.request.build_opener(handler_passwd)
The authentication seems to be working; I've tested it by creating a new review post. So it is when I post the diff that the authentication fails.
ReviewBoard already has a Python tool for posting a diff through their API; it's called postreview.py. You can find it at:
http://reviewboard.googlecode.com/svn/trunk/wxpostreview/postreview.py
Grab their ReviewBoardServer and use it to log in and post a diff!
(In addition, your request requires not only authentication but also the cookie file. That's why you need two requests: one to log in and get the cookie, and another one to send the diff.)
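If you do want to stay with plain urllib, the sketch below shows the general shape of that two-request flow. The server address, review request id, paths, credentials, and form field names are assumptions taken from the question, not ReviewBoard's official client code; the key points are sharing one cookie jar between the calls and sending a real multipart/form-data body instead of urlencoding it.
import http.cookiejar
import urllib.request
import uuid

server = "http://xxx.xxx.x.xxx"  # placeholder server address from the question

# 1) Build one opener that keeps the session cookie for both requests.
cookie_jar = http.cookiejar.CookieJar()
password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
password_mgr.add_password(None, server, "username", "password")
opener = urllib.request.build_opener(
    urllib.request.HTTPBasicAuthHandler(password_mgr),
    urllib.request.HTTPCookieProcessor(cookie_jar),
)
opener.open(server + "/api/review-requests/26/")  # authenticated call sets the cookie

# 2) Post the diff as real multipart/form-data (do not urlencode the body).
boundary = uuid.uuid4().hex
with open("/path/to/file/my.diff", "rb") as f:
    diff_bytes = f.read()

body = b"".join([
    b"--" + boundary.encode() + b"\r\n",
    b'Content-Disposition: form-data; name="basedir"\r\n\r\n',
    b"/path/to/file/in/rep\r\n",
    b"--" + boundary.encode() + b"\r\n",
    b'Content-Disposition: form-data; name="path"; filename="my.diff"\r\n',
    b"Content-Type: text/x-patch\r\n\r\n",
    diff_bytes + b"\r\n",
    b"--" + boundary.encode() + b"--\r\n",
])

request = urllib.request.Request(
    server + "/api/review-requests/26/diffs/",
    data=body,
    headers={"Content-Type": "multipart/form-data; boundary=" + boundary},
)
print(opener.open(request).read())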