I've got a piece of code that I can't figure out how to unit test! The module pulls content from external XML feeds (twitter, flickr, youtube, etc.) with urllib2. Here's some pseudo-code for it:
params = (url, urlencode(data),) if data else (url,)
req = Request(*params)
response = urlopen(req)
#check headers, content-length, etc...
#parse the response XML with lxml...
My first thought was to pickle the response and load it for testing, but apparently urllib's response object is unserializable (it raises an exception).
Just saving the XML from the response body isn't ideal, because my code uses the header information too. It's designed to act on a response object.
And of course, relying on an external source for data in a unit test is a horrible idea.
So how do I write a unit test for this?
urllib2 has a functions called build_opener() and install_opener() which you should use to mock the behaviour of urlopen()
import urllib2
from StringIO import StringIO
def mock_response(req):
if req.get_full_url() == "http://example.com":
resp = urllib2.addinfourl(StringIO("mock file"), "mock message", req.get_full_url())
resp.code = 200
resp.msg = "OK"
return resp
class MyHTTPHandler(urllib2.HTTPHandler):
def http_open(self, req):
print "mock opener"
return mock_response(req)
my_opener = urllib2.build_opener(MyHTTPHandler)
urllib2.install_opener(my_opener)
response=urllib2.urlopen("http://example.com")
print response.read()
print response.code
print response.msg
It would be best if you could write a mock urlopen (and possibly Request) which provides the minimum required interface to behave like urllib2's version. You'd then need to have your function/method which uses it able to accept this mock urlopen somehow, and use urllib2.urlopen otherwise.
This is a fair amount of work, but worthwhile. Remember that python is very friendly to ducktyping, so you just need to provide some semblance of the response object's properties to mock it.
For example:
class MockResponse(object):
def __init__(self, resp_data, code=200, msg='OK'):
self.resp_data = resp_data
self.code = code
self.msg = msg
self.headers = {'content-type': 'text/xml; charset=utf-8'}
def read(self):
return self.resp_data
def getcode(self):
return self.code
# Define other members and properties you want
def mock_urlopen(request):
return MockResponse(r'<xml document>')
Granted, some of these are difficult to mock, because for example I believe the normal "headers" is an HTTPMessage which implements fun stuff like case-insensitive header names. But, you might be able to simply construct an HTTPMessage with your response data.
Build a separate class or module responsible for communicating with your external feeds.
Make this class able to be a test double. You're using python, so you're pretty golden there; if you were using C#, I'd suggest either in interface or virtual methods.
In your unit test, insert a test double of the external feed class. Test that your code uses the class correctly, assuming that the class does the work of communicating with your external resources correctly. Have your test double return fake data rather than live data; test various combinations of the data and of course the possible exceptions urllib2 could throw.
Aand... that's it.
You can't effectively automate unit tests that rely on external sources, so you're best off not doing it. Run an occasional integration test on your communication module, but don't include those tests as part of your automated tests.
Edit:
Just a note on the difference between my answer and #Crast's answer. Both are essentially correct, but they involve different approaches. In Crast's approach, you use a test double on the library itself. In my approach, you abstract the use of the library away into a separate module and test double that module.
Which approach you use is entirely subjective; there's no "correct" answer there. I prefer my approach because it allows me to build more modular, flexible code, something I value. But it comes at a cost in terms of additional code to write, something that may not be valued in many agile situations.
You can use pymox to mock the behavior of anything and everything in the urllib2 (or any other) package. It's 2010, you shouldn't be writing your own mock classes.
I think the easiest thing to do is to actually create a simple web server in your unit test. When you start the test, create a new thread that listens on some arbitrary port and when a client connects just returns a known set of headers and XML, then terminates.
I can elaborate if you need more info.
Here's some code:
import threading, SocketServer, time
# a request handler
class SimpleRequestHandler(SocketServer.BaseRequestHandler):
def handle(self):
data = self.request.recv(102400) # token receive
senddata = file(self.server.datafile).read() # read data from unit test file
self.request.send(senddata)
time.sleep(0.1) # make sure it finishes receiving request before closing
self.request.close()
def serve_data(datafile):
server = SocketServer.TCPServer(('127.0.0.1', 12345), SimpleRequestHandler)
server.datafile = datafile
http_server_thread = threading.Thread(target=server.handle_request())
To run your unit test, call serve_data() then call your code that requests a URL that looks like http://localhost:12345/anythingyouwant.
Why not just mock a website that returns the response you expect? then start the server in a thread in setup and kill it in the teardown. I ended up doing this for testing code that would send email by mocking an smtp server and it works great. Surely something more trivial could be done for http...
from smtpd import SMTPServer
from time import sleep
import asyncore
SMTP_PORT = 6544
class MockSMTPServer(SMTPServer):
def __init__(self, localaddr, remoteaddr, cb = None):
self.cb = cb
SMTPServer.__init__(self, localaddr, remoteaddr)
def process_message(self, peer, mailfrom, rcpttos, data):
print (peer, mailfrom, rcpttos, data)
if self.cb:
self.cb(peer, mailfrom, rcpttos, data)
self.close()
def start_smtp(cb, port=SMTP_PORT):
def smtp_thread():
_smtp = MockSMTPServer(("127.0.0.1", port), (None, 0), cb)
asyncore.loop()
return Thread(None, smtp_thread)
def test_stuff():
#.......snip noise
email_result = None
def email_back(*args):
email_result = args
t = start_smtp(email_back)
t.start()
sleep(1)
res.form["email"]= self.admin_email
res = res.form.submit()
assert res.status_int == 302,"should've redirected"
sleep(1)
assert email_result is not None, "didn't get an email"
Trying to improve a bit on #john-la-rooy answer, I've made a small class allowing simple mocking for unit tests
Should work with python 2 and 3
try:
import urllib.request as urllib
except ImportError:
import urllib2 as urllib
from io import BytesIO
class MockHTTPHandler(urllib.HTTPHandler):
def mock_response(self, req):
url = req.get_full_url()
print("incomming request:", url)
if url.endswith('.json'):
resdata = b'[{"hello": "world"}]'
headers = {'Content-Type': 'application/json'}
resp = urllib.addinfourl(BytesIO(resdata), header, url, 200)
resp.msg = "OK"
return resp
raise RuntimeError('Unhandled URL', url)
http_open = mock_response
#classmethod
def install(cls):
previous = urllib._opener
urllib.install_opener(urllib.build_opener(cls))
return previous
#classmethod
def remove(cls, previous=None):
urllib.install_opener(previous)
Used like this:
class TestOther(unittest.TestCase):
def setUp(self):
previous = MockHTTPHandler.install()
self.addCleanup(MockHTTPHandler.remove, previous)
Related
I'm not sure if I have used the correct terminology in the question.
Currently, I am trying to make a wrapper/interface around Google's Blogger API (Blog service).
[I know it has been done already, but I am using this as a project to learn OOP/python.]
I have made a method that gets a set of 25 posts from a blog:
def get_posts(self, **kwargs):
""" Makes an API request. Returns list of posts. """
api_url = '/blogs/{id}/posts'.format(id=self.id)
return self._send_request(api_url, kwargs)
def _send_request(self, url, parameters={}):
""" Sends an HTTP GET request to the Blogger API.
Returns JSON decoded response as a dict. """
url = '{base}{url}?'.format(base=self.base, url=url)
# Requests formats the parameters into the URL for me
try:
r = requests.get(url, params=parameters)
except:
print "** Could not reach url:\n", url
return
api_response = r.text
return self._jload(api_response)
The problem is, I have to specify the API key every time I call the get_posts function:
someblog = BloggerClient(url='http://someblog.blogger.com', key='0123')
someblog.get_posts(key=self.key)
Every API call requires that the key be sent as a parameter on the URL.
Then, what is the best way to do that?
I'm thinking a possible way (but probably not the best way?), is to add the key to the kwargs dictionary in the _send_request():
def _send_request(self, url, parameters={}):
""" Sends an HTTP get request to Blogger API.
Returns JSON decoded response. """
# Format the full API URL:
url = '{base}{url}?'.format(base=self.base, url=url)
# The api key will be always be added:
parameters['key']= self.key
try:
r = requests.get(url, params=parameters)
except:
print "** Could not reach url:\n", url
return
api_response = r.text
return self._jload(api_response)
I can't really get my head around what is the best way (or most pythonic way) to do it.
You could store it in a named constant.
If this code doesn't need to be secure, simply
API_KEY = '1ih3f2ihf2f'
If it's going to be hosted on a server somewhere or needs to be more secure, you could store the value in an environment variable
In your terminal:
export API_KEY='1ih3f2ihf2f'
then in your python script:
import os
API_KEY = os.environ.get('API_KEY')
The problem is, I have to specify the API key every time I call the get_posts function:
If it really is just this one method, the obvious idea is to write a wrapper:
def get_posts(blog, *args, **kwargs):
returns blog.get_posts(*args, key=key, **kwargs)
Or, better, wrap up the class to do it for you:
class KeyRememberingBloggerClient(BloggerClient):
def __init__(self, *args, **kwargs):
self.key = kwargs.pop('key')
super(KeyRememberingBloggerClient, self).__init__(*args, **kwargs)
def get_posts(self, *args, **kwargs):
return super(KeyRememberingBloggerClient, self).get_posts(
*args, key=self.key, **kwargs)
So now:
someblog = KeyRememberingBloggerClient(url='http://someblog.blogger.com', key='0123')
someblog.get_posts()
Yes, you can override or monkeypatch the _send_request method that all of the other methods use, but if there's only 1 or 2 methods that need to be fixed, why delve into the undocumented internals of the class, and fork the body of one of those methods just so you can change it in a way you clearly weren't expected to, instead of doing it cleanly?
Of course if there are 90 different methods scattered across 4 different classes, you might want to consider building these wrappers programmatically (and/or monkeypatching the classes)… or just patching the one private method, as you're doing. That seems reasonable.
I have a custom HTTP request handler that can be simplified to something like this:
# Python 3:
from http import server
class MyHandler(server.BaseHTTPRequestHandler):
def do_GET(self):
self.send_response(200)
self.send_header("Content-type", "text/html")
self.end_headers()
# Here's where all the complicated logic is done to generate HTML.
# For clarity here, replace with a simple stand-in:
html = "<html><p>hello world</p></html>"
self.wfile.write(html.encode())
I'd like to unit-test this handler (i.e. make sure that my do_GET executes without an exception) without actually starting a web server. Is there any lightweight way to mock the SimpleHTTPServer so that I can test this code?
Expanding on the answer from jakevdp, I managed to be able to check the output, too:
try:
import unittest2 as unittest
except ImportError:
import unittest
try:
from io import BytesIO as IO
except ImportError:
from StringIO import StringIO as IO
from server import MyHandlerSSL # My BaseHTTPRequestHandler child
class TestableHandler(MyHandlerSSL):
# On Python3, in socketserver.StreamRequestHandler, if this is
# set it will use makefile() to produce the output stream. Otherwise,
# it will use socketserver._SocketWriter, and we won't be able to get
# to the data
wbufsize = 1
def finish(self):
# Do not close self.wfile, so we can read its value
self.wfile.flush()
self.rfile.close()
def date_time_string(self, timestamp=None):
""" Mocked date time string """
return 'DATETIME'
def version_string(self):
""" mock the server id """
return 'BaseHTTP/x.x Python/x.x.x'
class MockSocket(object):
def getsockname(self):
return ('sockname',)
class MockRequest(object):
_sock = MockSocket()
def __init__(self, path):
self._path = path
def makefile(self, *args, **kwargs):
if args[0] == 'rb':
return IO(b"GET %s HTTP/1.0" % self._path)
elif args[0] == 'wb':
return IO(b'')
else:
raise ValueError("Unknown file type to make", args, kwargs)
class HTTPRequestHandlerTestCase(unittest.TestCase):
maxDiff = None
def _test(self, request):
handler = TestableHandler(request, (0, 0), None)
return handler.wfile.getvalue()
def test_unauthenticated(self):
self.assertEqual(
self._test(MockRequest(b'/')),
b"""HTTP/1.0 401 Unauthorized\r
Server: BaseHTTP/x.x Python/x.x.x\r
Date: DATETIME\r
WWW-Authenticate: Basic realm="MyRealm", charset="UTF-8"\r
Content-type: text/html\r
\r
<html><head><title>Authentication Failed</title></html><body><h1>Authentication Failed</h1><p>Authentication Failed. Authorised Personnel Only.</p></body></html>"""
)
def main():
unittest.main()
if __name__ == "__main__":
main()
The code I am testing returns a 401 Unauthorised for "/". Change the response as appopriate for your test case.
Here's one approach I came up with to mock the server. Note that this should be compatible with both Python 2 and python 3. The only issue is that I can't find a way to access the result of the GET request, but at least the test will catch any exceptions it comes across!
try:
# Python 2.x
import BaseHTTPServer as server
from StringIO import StringIO as IO
except ImportError:
# Python 3.x
from http import server
from io import BytesIO as IO
class MyHandler(server.BaseHTTPRequestHandler):
"""Custom handler to be tested"""
def do_GET(self):
# print just to confirm that this method is being called
print("executing do_GET") # just to confirm...
self.send_response(200)
self.send_header("Content-type", "text/html")
self.end_headers()
# Here's where all the complicated logic is done to generate HTML.
# For clarity here, replace with a simple stand-in:
html = "<html><p>hello world</p></html>"
self.wfile.write(html.encode())
def test_handler():
"""Test the custom HTTP request handler by mocking a server"""
class MockRequest(object):
def makefile(self, *args, **kwargs):
return IO(b"GET /")
class MockServer(object):
def __init__(self, ip_port, Handler):
handler = Handler(MockRequest(), ip_port, self)
# The GET request will be sent here
# and any exceptions will be propagated through.
server = MockServer(('0.0.0.0', 8888), MyHandler)
test_handler()
So this is a little tricky depending on how "deep" you want to go into the BaseHTTPRequestHandler behavior to define your unit test. At the most basic level I think you can use this example from the mock library:
>>> from mock import MagicMock
>>> thing = ProductionClass()
>>> thing.method = MagicMock(return_value=3)
>>> thing.method(3, 4, 5, key='value')
3
>>> thing.method.assert_called_with(3, 4, 5, key='value')
So if you know which methods in the BaseHTTPRequestHandler your class is going to call you could mock the results of those methods to be something acceptable. This can of course get pretty complex depending on how many different types of server responses you want to test.
I am writing an application that performs REST operations using Kenneth Reitz's requests library and I'm struggling to find a nice way to unit test these applications, because requests provides its methods via module-level methods.
What I want is the ability to synthesize the conversation between the two sides; provide a series of request assertions and responses.
It is in fact a little strange that the library has a blank page about end-user unit testing, while targeting user-friendliness and ease of use. There's however an easy-to-use library by Dropbox, unsurprisingly called responses. Here is its intro post. It says they've failed to employ httpretty, while stating no reason of the fail, and written a library with similar API.
import unittest
import requests
import responses
class TestCase(unittest.TestCase):
#responses.activate
def testExample(self):
responses.add(**{
'method' : responses.GET,
'url' : 'http://example.com/api/123',
'body' : '{"error": "reason"}',
'status' : 404,
'content_type' : 'application/json',
'adding_headers' : {'X-Foo': 'Bar'}
})
response = requests.get('http://example.com/api/123')
self.assertEqual({'error': 'reason'}, response.json())
self.assertEqual(404, response.status_code)
If you use specifically requests try httmock. It's wonderfully simple and elegant:
from httmock import urlmatch, HTTMock
import requests
# define matcher:
#urlmatch(netloc=r'(.*\.)?google\.com$')
def google_mock(url, request):
return 'Feeling lucky, punk?'
# open context to patch
with HTTMock(google_mock):
# call requests
r = requests.get('http://google.com/')
print r.content # 'Feeling lucky, punk?'
If you want something more generic (e.g. to mock any library making http calls) go for httpretty.
Almost as elegant:
import requests
import httpretty
#httpretty.activate
def test_one():
# define your patch:
httpretty.register_uri(httpretty.GET, "http://yipit.com/",
body="Find the best daily deals")
# use!
response = requests.get('http://yipit.com')
assert response.text == "Find the best daily deals"
HTTPretty is far more feature-rich - it offers also mocking status code, streaming responses, rotating responses, dynamic responses (with a callback).
You could use a mocking library such as Mocker to intercept the calls to the requests library and return specified results.
As a very simple example, consider this class which uses the requests library:
class MyReq(object):
def doSomething(self):
r = requests.get('https://api.github.com', auth=('user', 'pass'))
return r.headers['content-type']
Here's a unit test that intercepts the call to requests.get and returns a specified result for testing:
import unittest
import requests
import myreq
from mocker import Mocker, MockerTestCase
class MyReqTests(MockerTestCase):
def testSomething(self):
# Create a mock result for the requests.get call
result = self.mocker.mock()
result.headers
self.mocker.result({'content-type': 'mytest/pass'})
# Use mocker to intercept the call to requests.get
myget = self.mocker.replace("requests.get")
myget('https://api.github.com', auth=('user', 'pass'))
self.mocker.result(result)
self.mocker.replay()
# Now execute my code
r = myreq.MyReq()
v = r.doSomething()
# and verify the results
self.assertEqual(v, 'mytest/pass')
self.mocker.verify()
if __name__ == '__main__':
unittest.main()
When I run this unit test I get the following result:
.
----------------------------------------------------------------------
Ran 1 test in 0.004s
OK
Missing from these answers is requests-mock.
From their page:
>>> import requests
>>> import requests_mock
As a context manager:
>>> with requests_mock.mock() as m:
... m.get('http://test.com', text='data')
... requests.get('http://test.com').text
...
'data'
Or as a decorator:
>>> #requests_mock.mock()
... def test_func(m):
... m.get('http://test.com', text='data')
... return requests.get('http://test.com').text
...
>>> test_func()
'data'
using mocker like in srgerg's answer:
def replacer(method, endpoint, json_string):
from mocker import Mocker, ANY, CONTAINS
mocker = Mocker()
result = mocker.mock()
result.json()
mocker.count(1, None)
mocker.result(json_string)
replacement = mocker.replace("requests." + method)
replacement(CONTAINS(endpoint), params=ANY)
self.mocker.result(result)
self.mocker.replay()
For the requests library, this would intercept the request by method and endpoint you're hitting and replace the .json() on the response with the json_string passed in.
If you break out your response handler/parser into a separate function, you can work with requests.Response objects directly, without needing to mock the client-server interaction.
Code under test
from xml.dom import minidom
from requests.models import Response
def function_under_test(s3_response: Response):
doc = minidom.parseString(s3_response.text)
return (
s3_response.status_code,
doc.getElementsByTagName('Code').item(0).firstChild.data,
)
Test code
import unittest
from io import BytesIO
class Test(unittest.TestCase):
def test_it(self):
s3_response = Response()
s3_response.status_code = 404
s3_response.raw = BytesIO(b"""<?xml version="1.0" encoding="UTF-8"?>
<Error>
<Code>NoSuchKey</Code>
<Message>The resource you requested does not exist</Message>
<Resource>/mybucket/myfoto.jpg</Resource>
<RequestId>4442587FB7D0A2F9</RequestId>
</Error>
""")
parsed_response = function_under_test(s3_response)
self.assertEqual(404, parsed_response[0])
self.assertEqual("NoSuchKey", parsed_response[1])
There's a library for this, if you want to write your test server with Flask: requests-flask-adaptor
You just have to be careful with the order of imports when monkeypatching.
I am implementing a SOAP web service using tornado (and the third party tornadows module). One of the operations in my service needs to call another so I have the chain:
External request in (via SOAPUI) to operation A
Internal request (via requests module) in to operation B
Internal response from operation B
External response from operation A
Because it is all running in one service it is being blocked somewhere though. I'm not familiar with tornado's async functionality.
There is only one request handling method (post) because everything comes in on the single url and then the specific operation (method doing processing) is called based on the SOAPAction request header value. I have decorated my post method with #tornado.web.asynchronous and called self.finish() at the end but no dice.
Can tornado handle this scenario and if so how can I implement it?
EDIT (added code):
class SoapHandler(tornado.web.RequestHandler):
#tornado.web.asynchronous
def post(self):
""" Method post() to process of requests and responses SOAP messages """
try:
self._request = self._parseSoap(self.request.body)
soapaction = self.request.headers['SOAPAction'].replace('"','')
self.set_header('Content-Type','text/xml')
for operations in dir(self):
operation = getattr(self,operations)
method = ''
if callable(operation) and hasattr(operation,'_is_operation'):
num_methods = self._countOperations()
if hasattr(operation,'_operation') and soapaction.endswith(getattr(operation,'_operation')) and num_methods > 1:
method = getattr(operation,'_operation')
self._response = self._executeOperation(operation,method=method)
break
elif num_methods == 1:
self._response = self._executeOperation(operation,method='')
break
soapmsg = self._response.getSoap().toprettyxml()
self.write(soapmsg)
self.finish()
except Exception as detail:
#traceback.print_exc(file=sys.stdout)
wsdl_nameservice = self.request.uri.replace('/','').replace('?wsdl','').replace('?WSDL','')
fault = soapfault('Error in web service : {fault}'.format(fault=detail), wsdl_nameservice)
self.write(fault.getSoap().toxml())
self.finish()
This is the post method from the request handler. It's from the web services module I'm using (so not my code) but I added the async decorator and self.finish(). All it basically does is call the correct operation (as dictated in the SOAPAction of the request).
class CountryService(soaphandler.SoapHandler):
#webservice(_params=GetCurrencyRequest, _returns=GetCurrencyResponse)
def get_currency(self, input):
result = db_query(input.country, 'currency')
get_currency_response = GetCurrencyResponse()
get_currency_response.currency = result
headers = None
return headers, get_currency_response
#webservice(_params=GetTempRequest, _returns=GetTempResponse)
def get_temp(self, input):
get_temp_response = GetTempResponse()
curr = self.make_curr_request(input.country)
get_temp_response.temp = curr
headers = None
return headers, get_temp_response
def make_curr_request(self, country):
soap_request = """<soapenv:Envelope xmlns:soapenv='http://schemas.xmlsoap.org/soap/envelope/' xmlns:coun='CountryService'>
<soapenv:Header/>
<soapenv:Body>
<coun:GetCurrencyRequestget_currency>
<country>{0}</country>
</coun:GetCurrencyRequestget_currency>
</soapenv:Body>
</soapenv:Envelope>""".format(country)
headers = {'Content-Type': 'text/xml;charset=UTF-8', 'SOAPAction': '"http://localhost:8080/CountryService/get_currency"'}
r = requests.post('http://localhost:8080/CountryService', data=soap_request, headers=headers)
try:
tree = etree.fromstring(r.content)
currency = tree.xpath('//currency')
message = currency[0].text
except:
message = "Failure"
return message
These are two of the operations of the web service (get_currency & get_temp). So SOAPUI hits get_temp, which makes a SOAP request to get_currency (via make_curr_request and the requests module). Then the results should just chain back and be sent back to SOAPUI.
The actual operation of the service makes no sense (returning the currency when asked for the temperature) but i'm just trying to get the functionality working and these are the operations I have.
I don't think that your soap module, or requests is asyncronous.
I believe adding the #asyncronous decorator is only half the battle. Right now you aren't making any async requests inside of your function (every request is blocking, which ties up the server until your method finishes)
You can switch this up by using tornados AsynHttpClient. This can be used pretty much as an exact replacement for requests. From the docoumentation example:
class MainHandler(tornado.web.RequestHandler):
#tornado.web.asynchronous
def get(self):
http = tornado.httpclient.AsyncHTTPClient()
http.fetch("http://friendfeed-api.com/v2/feed/bret",
callback=self.on_response)
def on_response(self, response):
if response.error: raise tornado.web.HTTPError(500)
json = tornado.escape.json_decode(response.body)
self.write("Fetched " + str(len(json["entries"])) + " entries "
"from the FriendFeed API")
self.finish()
Their method is decorated with async AND they are making asyn http requests. This is where the flow gets a little strange. When you use the AsyncHttpClient it doesn't lock up the event loop (PLease I just started using tornado this week, take it easy if all of my terminology isn't correct yet). This allows the server to freely processs incoming requests. When your asynchttp request is finished the callback method will be executed, in this case on_response.
Here you can replace requests with the tornado asynchttp client realtively easily. For your soap service, though, things might be more complicated. You could make a local webserivce around your soap client and make async requests to it using the tornado asyn http client???
This will create some complex callback logic which can be fixed using the gen decorator
This issue was fixed since yesterday.
Pull request:
https://github.com/rancavil/tornado-webservices/pull/23
Example: here a simple webservice that doesn't take arguments and returns the version.
Notice you should:
Method declaration: decorate the method with #gen.coroutine
Returning results: use raise gen.Return(data)
Code:
from tornado import gen
from tornadows.soaphandler import SoapHandler
...
class Example(SoapHandler):
#gen.coroutine
#webservice(_params=None, _returns=Version)
def Version(self):
_version = Version()
# async stuff here, let's suppose you ask other rest service or resource for the version details.
# ...
# returns the result.
raise gen.Return(_version)
Cheers!
I am using Python's requests library in one method of my application. The body of the method looks like this:
def handle_remote_file(url, **kwargs):
response = requests.get(url, ...)
buff = StringIO.StringIO()
buff.write(response.content)
...
return True
I'd like to write some unit tests for that method, however, what I want to do is to pass a fake local url such as:
class RemoteTest(TestCase):
def setUp(self):
self.url = 'file:///tmp/dummy.txt'
def test_handle_remote_file(self):
self.assertTrue(handle_remote_file(self.url))
When I call requests.get with a local url, I got the KeyError exception below:
requests.get('file:///tmp/dummy.txt')
/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/requests/packages/urllib3/poolmanager.pyc in connection_from_host(self, host, port, scheme)
76
77 # Make a fresh ConnectionPool of the desired type
78 pool_cls = pool_classes_by_scheme[scheme]
79 pool = pool_cls(host, port, **self.connection_pool_kw)
80
KeyError: 'file'
The question is how can I pass a local url to requests.get?
PS: I made up the above example. It possibly contains many errors.
As #WooParadog explained requests library doesn't know how to handle local files. Although, current version allows to define transport adapters.
Therefore you can simply define you own adapter which will be able to handle local files, e.g.:
from requests_testadapter import Resp
import os
class LocalFileAdapter(requests.adapters.HTTPAdapter):
def build_response_from_file(self, request):
file_path = request.url[7:]
with open(file_path, 'rb') as file:
buff = bytearray(os.path.getsize(file_path))
file.readinto(buff)
resp = Resp(buff)
r = self.build_response(request, resp)
return r
def send(self, request, stream=False, timeout=None,
verify=True, cert=None, proxies=None):
return self.build_response_from_file(request)
requests_session = requests.session()
requests_session.mount('file://', LocalFileAdapter())
requests_session.get('file://<some_local_path>')
I'm using requests-testadapter module in the above example.
Here's a transport adapter I wrote which is more featureful than b1r3k's and has no additional dependencies beyond Requests itself. I haven't tested it exhaustively yet, but what I have tried seems to be bug-free.
import requests
import os, sys
if sys.version_info.major < 3:
from urllib import url2pathname
else:
from urllib.request import url2pathname
class LocalFileAdapter(requests.adapters.BaseAdapter):
"""Protocol Adapter to allow Requests to GET file:// URLs
#todo: Properly handle non-empty hostname portions.
"""
#staticmethod
def _chkpath(method, path):
"""Return an HTTP status for the given filesystem path."""
if method.lower() in ('put', 'delete'):
return 501, "Not Implemented" # TODO
elif method.lower() not in ('get', 'head'):
return 405, "Method Not Allowed"
elif os.path.isdir(path):
return 400, "Path Not A File"
elif not os.path.isfile(path):
return 404, "File Not Found"
elif not os.access(path, os.R_OK):
return 403, "Access Denied"
else:
return 200, "OK"
def send(self, req, **kwargs): # pylint: disable=unused-argument
"""Return the file specified by the given request
#type req: C{PreparedRequest}
#todo: Should I bother filling `response.headers` and processing
If-Modified-Since and friends using `os.stat`?
"""
path = os.path.normcase(os.path.normpath(url2pathname(req.path_url)))
response = requests.Response()
response.status_code, response.reason = self._chkpath(req.method, path)
if response.status_code == 200 and req.method.lower() != 'head':
try:
response.raw = open(path, 'rb')
except (OSError, IOError) as err:
response.status_code = 500
response.reason = str(err)
if isinstance(req.url, bytes):
response.url = req.url.decode('utf-8')
else:
response.url = req.url
response.request = req
response.connection = self
return response
def close(self):
pass
(Despite the name, it was completely written before I thought to check Google, so it has nothing to do with b1r3k's.) As with the other answer, follow this with:
requests_session = requests.session()
requests_session.mount('file://', LocalFileAdapter())
r = requests_session.get('file:///path/to/your/file')
The easiest way seems using requests-file.
https://github.com/dashea/requests-file (available through PyPI too)
"Requests-File is a transport adapter for use with the Requests Python library to allow local filesystem access via file:// URLs."
This in combination with requests-html is pure magic :)
packages/urllib3/poolmanager.py pretty much explains it. Requests doesn't support local url.
pool_classes_by_scheme = {
'http': HTTPConnectionPool,
'https': HTTPSConnectionPool,
}
In a recent project, I've had the same issue. Since requests doesn't support the "file" scheme, I'll patch our code to load the content locally. First, I define a function to replace requests.get:
def local_get(self, url):
"Fetch a stream from local files."
p_url = six.moves.urllib.parse.urlparse(url)
if p_url.scheme != 'file':
raise ValueError("Expected file scheme")
filename = six.moves.urllib.request.url2pathname(p_url.path)
return open(filename, 'rb')
Then, somewhere in test setup or decorating the test function, I use mock.patch to patch the get function on requests:
#mock.patch('requests.get', local_get)
def test_handle_remote_file(self):
...
This technique is somewhat brittle -- it doesn't help if the underlying code calls requests.request or constructs a Session and calls that. There may be a way to patch requests at a lower level to support file: URLs, but in my initial investigation, there didn't seem to be an obvious hook point, so I went with this simpler approach.
To load a file from a local URL, e.g. an image file you can do this:
import urllib
from PIL import Image
Image.open(urllib.request.urlopen('file:///path/to/your/file.png'))
I think simple solution for this will be creating temporary http server using python and using it.
Put all your files in temporary folder eg. tempFolder
Go to that directory and create a temporary http server in terminal/cmd as per your OS using command python -m http.server 8000 (Note 8000 is port no.)
This will you give you a link to http server. You can access it from http://127.0.0.1:8000/
Open your desired file in browser and copy the link to your url.