Python send packet over 443

I have looked and perhaps I missed it. I currently have a file such as the one below:
PUT /URL/TO/SEND/REQUEST
Host: 127.0.0.1
Connection: keep-alive
...
bunch of data here
This file contains the header and the data I want to send over SSL. I know that on Windows I can use Fiddler etc. to send this raw data, but I was hoping to use Python. I tried looking (maybe not hard enough) at urllib2, urllib, and httplib to see if I could just send this file as the entire request; I don't want to deal with parsing the file. Is this possible?
I did notice that in httplib I can use request(), where "body can be a file object", but from the description it seems as though it still sends the header separately and that the file is only used for the data being sent.
Thanks

It isn't documented, but it looks like you should be able to use httplib.HTTPConnection.send() for this:
In [13]: httplib.HTTPConnection.send??
Type: instancemethod
String Form:<unbound method HTTPConnection.send>
File: /usr/local/lib/python2.7/httplib.py
Definition: httplib.HTTPConnection.send(self, data)
Source:
def send(self, data):
    """Send `data' to the server."""
    if self.sock is None:
        if self.auto_open:
            self.connect()
        else:
            raise NotConnected()

    if self.debuglevel > 0:
        print "send:", repr(data)
    blocksize = 8192
    if hasattr(data, 'read') and not isinstance(data, array):
        if self.debuglevel > 0: print "sendIng a read()able"
        datablock = data.read(blocksize)
        while datablock:
            self.sock.sendall(datablock)
            datablock = data.read(blocksize)
    else:
        self.sock.sendall(data)
The request() method combines the header and body and passes it to this function, which looks like it should handle strings or file objects.
Of course you will still need to know the host so that you can create the HTTPConnection object, so your code might look something like this (untested):
import httplib
conn = httplib.HTTPConnection('127.0.0.1')
conn.send(open(filename))
response = conn.getresponse()
Edit: It turns out there is some internal state that keeps this from working as-is; here is a workaround (a full example using the Google main page), but it is a bit of a hack. Tested using Python 2.6 and 2.7; it does not appear to work on 3.x by just replacing httplib with http.client:
import httplib
conn = httplib.HTTPConnection('www.google.com')
conn.send('GET / HTTP/1.1\r\nHost: www.google.com\r\n\r\n')
conn._HTTPConnection__state = httplib._CS_REQ_SENT
response = conn.getresponse()
The key part here is setting conn.__state (a mangled name) to httplib._CS_REQ_SENT after calling send().
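Since the title mentions port 443, note that both snippets above use a plain HTTPConnection. A minimal sketch of the same raw-request idea over TLS, using Python 3's socket and ssl modules directly; the filename and host here are placeholders:
import socket
import ssl

# Read the raw request (headers + body) as bytes so the CRLF line endings survive.
with open('request.txt', 'rb') as f:
    raw_request = f.read()

context = ssl.create_default_context()
# For a local self-signed certificate you may need:
#   context.check_hostname = False
#   context.verify_mode = ssl.CERT_NONE

with socket.create_connection(('127.0.0.1', 443)) as sock:
    with context.wrap_socket(sock, server_hostname='127.0.0.1') as ssock:
        ssock.sendall(raw_request)  # send the file verbatim
        print(ssock.recv(4096))     # first part of the raw response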

Related

Trying to send Python HTTPConnection content after accepting 100-continue header

I've been trying to debug a Python script I've inherited. It's trying to POST a CSV to a website via HTTPLib. The problem, as far as I can tell, is that HTTPLib doesn't handle receiving a 100-continue response, as per the question "python http client stuck on 100 continue". Similarly to that post, this "Just Works" via Curl, but for various reasons we need this to run from a Python script.
I've tried to employ the work-around as detailed in an answer on that post, but I can't find a way to use that to submit the CSV after accepting the 100-continue response.
The general flow needs to be like this:
-> establish connection
-> send data including "expect: 100-continue" header, but not including the JSON body yet
<- receive "100-continue"
-> using the same connection, send the JSON body of the request
<- receive the 200 OK message, in a JSON response with other information
Here's the code in its current state, with the commented-out remnants of my 10+ other attempted workarounds removed:
#!/usr/bin/env python
import os
import ssl
import http.client
import binascii
import logging
import json

# classes taken from https://stackoverflow.com/questions/38084993/python-http-client-stuck-on-100-continue
class ContinueHTTPResponse(http.client.HTTPResponse):
    def _read_status(self, *args, **kwargs):
        version, status, reason = super()._read_status(*args, **kwargs)
        if status == 100:
            status = 199
        return version, status, reason

    def begin(self, *args, **kwargs):
        super().begin(*args, **kwargs)
        if self.status == 199:
            self.status = 100

    def _check_close(self, *args, **kwargs):
        return super()._check_close(*args, **kwargs) and self.status != 100


class ContinueHTTPSConnection(http.client.HTTPSConnection):
    response_class = ContinueHTTPResponse

    def getresponse(self, *args, **kwargs):
        logging.debug('running getresponse')
        response = super().getresponse(*args, **kwargs)
        if response.status == 100:
            setattr(self, '_HTTPConnection__state', http.client._CS_REQ_SENT)
            setattr(self, '_HTTPConnection__response', None)
        return response


def uploadTradeIngest(ingestFile, certFile, certPass, host, port, url):
    boundary = binascii.hexlify(os.urandom(16)).decode("ascii")
    headers = {
        "accept": "application/json",
        "Content-Type": "multipart/form-data; boundary=%s" % boundary,
        "Expect": "100-continue",
    }
    context = ssl.SSLContext(ssl.PROTOCOL_SSLv23)
    context.load_cert_chain(certfile=certFile, password=certPass)
    connection = ContinueHTTPSConnection(host, port=port, context=context)
    with open(ingestFile, "r") as fh:
        ingest = fh.read()
    ## Create form-data boundary
    ingest = "--%s\r\nContent-Disposition: form-data; " % boundary + \
             "name=\"file\"; filename=\"%s\"" % os.path.basename(ingestFile) + \
             "\r\n\r\n%s\r\n--%s--\r\n" % (ingest, boundary)
    print("pre-request")
    connection.request(method="POST", url=url, headers=headers)
    print("post-request")
    #resp = connection.getresponse()
    resp = connection.getresponse()
    if resp.status == http.client.CONTINUE:
        resp.read()
        print("pre-send ingest")
        ingest = json.dumps(ingest)
        ingest = ingest.encode()
        print(ingest)
        connection.send(ingest)
        print("post-send ingest")
    resp = connection.getresponse()
    print("response1")
    print(resp)
    print("response2")
    print(resp.read())
    print("response3")
    return resp.read()
But this simply returns a 400 "Bad Request" response. The problem (I think) lies with the formatting and type of the "ingest" variable. If I don't run it through json.dumps() and encode() then the HTTPConnection.send() method rejects it:
ERROR: Got error: memoryview: a bytes-like object is required, not 'str'
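For what it's worth, send() does need bytes, but json.dumps() is not the way to get them here: it wraps the multipart body in JSON string quoting, which the server then cannot parse as form data, which would plausibly explain the 400. A minimal sketch of the likely fix, encoding the already-built multipart string directly (note also that request() does not add a Content-Length header when called without a body, so the server may require one to be set explicitly):
body = ingest.encode("utf-8")  # plain bytes, no JSON quoting
connection.send(body)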
I had a look at using the Requests library instead, but I couldn't get it to use my local certificate bundle to accept the site's certificate. I have a full chain with an encrypted key, which I did decrypt, but still ran into constant SSL_VERIFY errors from Requests. If you have a suggestion to solve my current problem with Requests, I'm happy to go down that path too.
How can I use HTTPLib or Requests (or any other libraries) to achieve what I need to achieve?
In case anyone comes across this problem in the future, I ended up working around it with a bit of a kludge. HTTPLib, Requests, and URLLib3 all fail to handle the 100-continue header, so I just wrote a Python wrapper around Curl via the subprocess.run() function, like this:
import subprocess

# curlPath, args (with .cert/.key), and targetHost are defined elsewhere in the script.
def sendReq(upFile):
    sendFile = f"file=@{upFile}"
    completed = subprocess.run([
        curlPath,
        '--cert', args.cert,
        '--key', args.key,
        targetHost,
        '-H', 'accept: application/json',
        '-H', 'Content-Type: multipart/form-data',
        '-H', 'Expect: 100-continue',
        '-F', sendFile,
        '-s'
    ], stdout=subprocess.PIPE, universal_newlines=True)
    return completed.stdout
The only issue I had with this was that it fails if Curl was built against the NSS libraries, which I resolved by including a statically-built Curl binary with the package, the path to which is contained in the curlPath variable in the code. I obtained this binary from this Github repo.

Robot Framework: send binary data in POST request body with

I have a problem getting my test running using Robot Framework and robotframework-requests. I need to send a POST request with binary data in the body. I looked at this question already, but it's not really answered. Here's what my test case looks like:
Upload ${filename} file
    Create Session    mysession    http://${ADDRESS}
    ${data} =    Get Binary File    ${filename}
    &{headers} =    Create Dictionary    Content-Type=application/octet-stream    Accept=application/octet-stream
    ${resp} =    Post Request    mysession    ${CGIPath}    data=${data}    headers=&{headers}
    [Return]    ${resp.status_code}    ${resp.text}
The problem is that my binary data is about 250 MB. When the data is read with Get Binary File, I see memory consumption go up to 2.x GB. A few seconds later, when the Post Request is triggered, my test is killed by OOM. I already looked at the files parameter, but it seems it uses a multipart-encoded upload, which is not what I need.
My other thought was passing an open file handle directly to the underlying requests library, but I guess that would require modifying robotframework-requests. Another idea is to fall back to curl for this test only.
Am I missing something in my test? What is the better way to address this?
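For reference, the streaming behaviour the poster has in mind is what requests does when passed an open file object as data; a minimal sketch, where the url and path are placeholders:
import requests

# Passing an open file object as `data` makes requests stream the body
# from disk instead of reading it all into memory first.
with open('/path/to/large.bin', 'rb') as f:
    resp = requests.post('http://example.com/upload',
                         data=f,
                         headers={'Content-Type': 'application/octet-stream'})
print(resp.status_code)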
I proceeded with the idea of modifying robotframework-requests and added this method:
def post_request_binary(
        self,
        alias,
        uri,
        path=None,
        params=None,
        headers=None,
        allow_redirects=None,
        timeout=None):
    session = self._cache.switch(alias)
    redir = True if allow_redirects is None else allow_redirects
    self._capture_output()
    method_name = "post"
    method = getattr(session, method_name)
    with open(path, 'rb') as f:
        resp = method(self._get_url(session, uri),
                      data=f,
                      params=self._utf8_urlencode(params),
                      headers=headers,
                      allow_redirects=redir,
                      timeout=self._get_timeout(timeout),
                      cookies=self.cookies,
                      verify=self.verify)
    self._print_debug()
    # Store the last session object
    session.last_resp = resp
    self.builtin.log(method_name + ' response: ' + resp.text, 'DEBUG')
    return resp
I guess I can improve it a bit and create a pull request.

How can I read exactly one response chunk with python's http.client?

Using http.client in Python 3.3+ (or any other built-in Python HTTP client library), how can I read a chunked HTTP response exactly one HTTP chunk at a time?
I'm extending an existing test fixture (written in python using http.client) for a server which writes its response using HTTP's chunked transfer encoding. For the sake of simplicity, let's say that I'd like to be able to print a message whenever an HTTP chunk is received by the client.
My code follows a fairly standard pattern for reading a large response:
conn = http.client.HTTPConnection(...)
conn.request(...)
response = conn.getresponse()

resbody = []
while True:
    chunk = response.read(1024)
    if len(chunk):
        resbody.append(chunk)
    else:
        break

conn.close()
But this reads 1024-byte chunks regardless of whether the server is sending 10-byte chunks or 10 MiB chunks.
What I'm looking for would be something like the following:
while True:
    chunk = response.readchunk()
    if len(chunk):
        resbody.append(chunk)
    else:
        break
If this is not possible with http.client, is it possible with another built-in HTTP client library? If it's not possible with a built-in client lib, is it possible with a pip-installable module?
I found it easier to use the requests library, like so:
r = requests.post(url, data=foo, headers=bar, stream=True)
for chunk in r.raw.read_chunked():
    print(chunk)
Update:
The benefit of chunked transfer encoding is to allow the transmission of dynamically generated content. Whether an HTTP library lets you read individual chunks or not is a separate issue (see RFC 2616 - Section 3.6.1).
I can see how what you are trying to do would be useful, but the standard Python HTTP client libraries don't do what you want without some hackery (see http.client and httplib).
What you are trying to do may be fine for use in your test fixture, but in the wild there are no guarantees. It is possible for the chunking of the data read by your client to be different from the chunking of the data sent by your server; e.g. the data could have been "re-chunked" by a proxy server before it arrived (see RFC 2616 - Section 3.2 - Framing Techniques).
The trick is to tell the response object that it isn't chunked (resp.chunked = False) so that it returns the raw bytes. This allows you to parse the size and data of each chunk as it is returned.
import http.client

conn = http.client.HTTPConnection("localhost")
conn.request('GET', "/")
resp = conn.getresponse()
resp.chunked = False

def get_chunk_size():
    size_str = resp.read(2)
    while size_str[-2:] != b"\r\n":
        size_str += resp.read(1)
    return int(size_str[:-2], 16)

def get_chunk_data(chunk_size):
    data = resp.read(chunk_size)
    resp.read(2)  # consume the CRLF that terminates each chunk
    return data

respbody = ""
while True:
    chunk_size = get_chunk_size()
    if chunk_size == 0:
        break
    else:
        chunk_data = get_chunk_data(chunk_size)
        print("Chunk Received: " + chunk_data.decode())
        respbody += chunk_data.decode()

conn.close()
print(respbody)
print(respbody)

Fetch a file from a local url with Python requests?

I am using Python's requests library in one method of my application. The body of the method looks like this:
def handle_remote_file(url, **kwargs):
    response = requests.get(url, ...)
    buff = StringIO.StringIO()
    buff.write(response.content)
    ...
    return True
I'd like to write some unit tests for that method; however, I want to pass a fake local url, such as:
class RemoteTest(TestCase):
    def setUp(self):
        self.url = 'file:///tmp/dummy.txt'

    def test_handle_remote_file(self):
        self.assertTrue(handle_remote_file(self.url))
When I call requests.get with a local url, I get the KeyError exception below:
requests.get('file:///tmp/dummy.txt')
/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/requests/packages/urllib3/poolmanager.pyc in connection_from_host(self, host, port, scheme)
76
77 # Make a fresh ConnectionPool of the desired type
78 pool_cls = pool_classes_by_scheme[scheme]
79 pool = pool_cls(host, port, **self.connection_pool_kw)
80
KeyError: 'file'
The question is how can I pass a local url to requests.get?
PS: I made up the above example. It possibly contains many errors.
As @WooParadog explained, the requests library doesn't know how to handle local files. However, the current version allows you to define transport adapters.
Therefore you can simply define your own adapter which will be able to handle local files, e.g.:
import os
import requests
from requests_testadapter import Resp

class LocalFileAdapter(requests.adapters.HTTPAdapter):
    def build_response_from_file(self, request):
        file_path = request.url[7:]  # strip the leading 'file://'
        with open(file_path, 'rb') as file:
            buff = bytearray(os.path.getsize(file_path))
            file.readinto(buff)
            resp = Resp(buff)
            r = self.build_response(request, resp)
        return r

    def send(self, request, stream=False, timeout=None,
             verify=True, cert=None, proxies=None):
        return self.build_response_from_file(request)

requests_session = requests.session()
requests_session.mount('file://', LocalFileAdapter())
requests_session.get('file://<some_local_path>')
I'm using requests-testadapter module in the above example.
Here's a transport adapter I wrote which is more featureful than b1r3k's and has no additional dependencies beyond Requests itself. I haven't tested it exhaustively yet, but what I have tried seems to be bug-free.
import requests
import os, sys

if sys.version_info.major < 3:
    from urllib import url2pathname
else:
    from urllib.request import url2pathname

class LocalFileAdapter(requests.adapters.BaseAdapter):
    """Protocol Adapter to allow Requests to GET file:// URLs

    @todo: Properly handle non-empty hostname portions.
    """

    @staticmethod
    def _chkpath(method, path):
        """Return an HTTP status for the given filesystem path."""
        if method.lower() in ('put', 'delete'):
            return 501, "Not Implemented"  # TODO
        elif method.lower() not in ('get', 'head'):
            return 405, "Method Not Allowed"
        elif os.path.isdir(path):
            return 400, "Path Not A File"
        elif not os.path.isfile(path):
            return 404, "File Not Found"
        elif not os.access(path, os.R_OK):
            return 403, "Access Denied"
        else:
            return 200, "OK"

    def send(self, req, **kwargs):  # pylint: disable=unused-argument
        """Return the file specified by the given request

        @type req: C{PreparedRequest}
        @todo: Should I bother filling `response.headers` and processing
               If-Modified-Since and friends using `os.stat`?
        """
        path = os.path.normcase(os.path.normpath(url2pathname(req.path_url)))
        response = requests.Response()

        response.status_code, response.reason = self._chkpath(req.method, path)
        if response.status_code == 200 and req.method.lower() != 'head':
            try:
                response.raw = open(path, 'rb')
            except (OSError, IOError) as err:
                response.status_code = 500
                response.reason = str(err)

        if isinstance(req.url, bytes):
            response.url = req.url.decode('utf-8')
        else:
            response.url = req.url

        response.request = req
        response.connection = self

        return response

    def close(self):
        pass
(Despite the name, it was completely written before I thought to check Google, so it has nothing to do with b1r3k's.) As with the other answer, follow this with:
requests_session = requests.session()
requests_session.mount('file://', LocalFileAdapter())
r = requests_session.get('file:///path/to/your/file')
The easiest way seems to be using requests-file.
https://github.com/dashea/requests-file (available through PyPI too)
"Requests-File is a transport adapter for use with the Requests Python library to allow local filesystem access via file:// URLs."
This in combination with requests-html is pure magic :)
packages/urllib3/poolmanager.py pretty much explains it. Requests doesn't support local URLs.
pool_classes_by_scheme = {
    'http': HTTPConnectionPool,
    'https': HTTPSConnectionPool,
}
In a recent project, I had the same issue. Since requests doesn't support the "file" scheme, I patched our code to load the content locally. First, I define a function to replace requests.get:
def local_get(self, url):
    "Fetch a stream from local files."
    p_url = six.moves.urllib.parse.urlparse(url)
    if p_url.scheme != 'file':
        raise ValueError("Expected file scheme")

    filename = six.moves.urllib.request.url2pathname(p_url.path)
    return open(filename, 'rb')
Then, somewhere in test setup or decorating the test function, I use mock.patch to patch the get function on requests:
@mock.patch('requests.get', local_get)
def test_handle_remote_file(self):
    ...
This technique is somewhat brittle -- it doesn't help if the underlying code calls requests.request or constructs a Session and calls that. There may be a way to patch requests at a lower level to support file: URLs, but in my initial investigation, there didn't seem to be an obvious hook point, so I went with this simpler approach.
To load a file from a local URL, e.g. an image file, you can do this:
import urllib
from PIL import Image
Image.open(urllib.request.urlopen('file:///path/to/your/file.png'))
I think a simple solution for this is to create a temporary HTTP server using Python and fetch through it, as sketched below.
Put all your files in a temporary folder, e.g. tempFolder.
Go to that directory and start a temporary HTTP server from the terminal/cmd, as appropriate for your OS, with the command python -m http.server 8000 (note: 8000 is the port number).
This will give you a link to the HTTP server; you can access it from http://127.0.0.1:8000/
Open your desired file in the browser and copy that link as your url.
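For instance, a hedged sketch of the test side under that setup, where the port and filename are placeholders:
import requests

# With `python -m http.server 8000` running inside tempFolder,
# the file is reachable over plain HTTP instead of file://.
response = requests.get('http://127.0.0.1:8000/dummy.txt')
print(response.text)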

Python seek on remote file using HTTP

How do I seek to a particular position on a remote (HTTP) file so I can download only that part?
Let's say the bytes of a remote file were: 1234567890
I want to seek to 4 and download 3 bytes from there, so I would have: 456
Also, how do I check if a remote file exists?
I tried os.path.isfile(), but it returns False when I pass it a remote file url.
If you are downloading the remote file through HTTP, you need to set the Range header.
Check this example to see how it can be done; it looks like this:
myUrlclass.addheader("Range","bytes=%s-" % (existSize))
EDIT: I just found a better implementation. This class is very simple to use, as can be seen in the docstring.
class HTTPRangeHandler(urllib2.BaseHandler):
    """Handler that enables HTTP Range headers.

    This was extremely simple. The Range header is an HTTP feature to
    begin with so all this class does is tell urllib2 that the
    "206 Partial Content" response from the HTTP server is what we
    expected.

    Example:
        import urllib2
        import byterange

        range_handler = byterange.HTTPRangeHandler()
        opener = urllib2.build_opener(range_handler)

        # install it
        urllib2.install_opener(opener)

        # create Request and set Range header
        req = urllib2.Request('http://www.python.org/')
        req.header['Range'] = 'bytes=30-50'
        f = urllib2.urlopen(req)
    """

    def http_error_206(self, req, fp, code, msg, hdrs):
        # 206 Partial Content Response
        r = urllib.addinfourl(fp, hdrs, req.get_full_url())
        r.code = code
        r.msg = msg
        return r

    def http_error_416(self, req, fp, code, msg, hdrs):
        # HTTP's Range Not Satisfiable error
        raise RangeError('Requested Range Not Satisfiable')
Update: The "better implementation" has moved to github: excid3/urlgrabber in the byterange.py file.
I highly recommend using the requests library. It is easily the best HTTP library I have ever used. In particular, to accomplish what you have described, you would do something like:
import requests
url = "http://www.sffaudio.com/podcasts/ShellGameByPhilipK.Dick.pdf"
# Retrieve bytes between offsets 3 and 5 (inclusive).
r = requests.get(url, headers={"range": "bytes=3-5"})
# If a 4XX client error or a 5XX server error is encountered, we raise it.
r.raise_for_status()
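The question also asked how to check whether a remote file exists; os.path.isfile() only inspects the local filesystem. A hedged sketch of one common approach, using an HTTP HEAD request (the url here is a placeholder):
import requests

def remote_file_exists(url):
    # HEAD fetches only the status line and headers, not the body.
    r = requests.head(url, allow_redirects=True)
    return r.status_code == requests.codes.ok

print(remote_file_exists("http://www.example.com/index.html"))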
AFAIK, this is not possible using fseek() or similar. You need to use the HTTP Range header to achieve this. This header may or may not be supported by the server, so your mileage may vary.
import urllib2
myHeaders = {'Range':'bytes=0-9'}
req = urllib2.Request('http://www.promotionalpromos.com/mirrors/gnu/gnu/bash/bash-1.14.3-1.14.4.diff.gz',headers=myHeaders)
partialFile = urllib2.urlopen(req)
s2 = partialFile.read()
EDIT: This is of course assuming that by remote file you mean a file stored on an HTTP server...
If the file you want is on an FTP server, FTP only allows you to specify a start offset and not a range. If this is what you want, then the following code should do it (not tested!):
import ftplib
fileToRetrieve = 'somefile.zip'
fromByte = 15
ftp = ftplib.FTP('ftp.someplace.net')
outFile = open('partialFile', 'wb')
ftp.retrbinary('RETR '+ fileToRetrieve, outFile.write, rest=str(fromByte))
outFile.close()
You can use httpio to access remote HTTP files as if they were local:
pip install httpio
import zipfile
import httpio

url = "http://some/large/file.zip"
with httpio.open(url) as fp:
    zf = zipfile.ZipFile(fp)
    print(zf.namelist())
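Assuming httpio's file objects support the usual seek()/read() interface (as the example above suggests), the question's example of extracting "456" from "1234567890" would look something like the sketch below; note that file offsets are 0-based, so "456" starts at offset 3. The URL is a placeholder:
import httpio

# httpio issues HTTP Range requests under the hood,
# so only the requested bytes are downloaded.
with httpio.open("http://example.com/remote.bin") as fp:
    fp.seek(3)         # position at the byte '4'
    print(fp.read(3))  # b'456'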
