Web proxy in python/django? - python

I need a proxy that acts as an intermediary to fetch images. For example, my server requests domain1.com/?url=domain2.com/image.png, and the domain1.com server responds with the data from domain2.com/image.png, fetched via domain1.com.
Essentially I want to pass to the proxy the URL I want fetched, and have the proxy server respond with that resource.
Any suggestions on where to start on this?
I need something very easy to use or implement as I'm very much a beginner at all of this.
Most solutions I have found in Python and/or Django have the proxy act as a "translator", i.e. domain1.com/image.png translates to domain2.com/image.png, which is obviously not the same thing.
I currently have the following code, but fetching images results in garbled data:
import httplib2
from django.conf.urls.defaults import *
from django.http import HttpResponse

def proxy(request, url):
    conn = httplib2.Http()
    if request.method == "GET":
        url = request.GET['url']
        resp, content = conn.request(url, request.method)
        return HttpResponse(content)

Old question but for future googlers, I think this is what you want:
import urllib2
from django.http import HttpResponse

# Proxies the Google logo
def test(request):
    url = "http://www.google.com/logos/classicplus.png"
    req = urllib2.Request(url)
    response = urllib2.urlopen(req)
    return HttpResponse(response.read(), mimetype="image/png")

A very simple Django proxy view with requests and StreamingHttpResponse:
import requests
from django.http import StreamingHttpResponse

def my_proxy_view(request):
    url = request.GET['url']
    response = requests.get(url, stream=True)
    return StreamingHttpResponse(
        response.raw,
        content_type=response.headers.get('content-type'),
        status=response.status_code,
        reason=response.reason)
The advantage of this approach is that you don't need to load the complete file into memory before streaming the content to the client.
As you can see, it forwards some response headers. Depending on your needs, you may want to forward the request headers as well; for example:
response = requests.get(url, stream=True,
                        headers={'user-agent': request.headers.get('user-agent')})

If you need something more complete than my previous answer, you can use this class:
import requests
from django.http import StreamingHttpResponse

class ProxyHttpResponse(StreamingHttpResponse):
    def __init__(self, url, headers=None, **kwargs):
        upstream = requests.get(url, stream=True, headers=headers)
        kwargs.setdefault('content_type', upstream.headers.get('content-type'))
        kwargs.setdefault('status', upstream.status_code)
        kwargs.setdefault('reason', upstream.reason)
        super().__init__(upstream.raw, **kwargs)
        for name, value in upstream.headers.items():
            self[name] = value
You can use this class like so:
def my_proxy_view(request):
    url = request.GET['url']
    return ProxyHttpResponse(url, headers=request.headers)
The advantage of this version is that you can reuse it in multiple views. Also, it forwards all headers, and you can easily extend it to add or exclude some other headers.
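For example, a rough sketch of such an extension, building on the ProxyHttpResponse class above; the EXCLUDED_HEADERS set is an assumption, adjust it to your needs:

# Hypothetical variant that drops hop-by-hop headers the upstream server sent;
# the header names listed here are an assumption, not part of the original answer.
EXCLUDED_HEADERS = {'Connection', 'Keep-Alive', 'Transfer-Encoding', 'Content-Encoding'}

class FilteredProxyHttpResponse(ProxyHttpResponse):
    def __init__(self, url, headers=None, **kwargs):
        super().__init__(url, headers=headers, **kwargs)
        for name in EXCLUDED_HEADERS:
            if self.has_header(name):
                del self[name]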

If the file you're fetching and returning is an image, you'll need to set the content type of your HttpResponse object accordingly.
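For instance, a minimal sketch based on the httplib2 code from the question, passing the upstream content type through (content_type is the keyword on current Django; very old versions use mimetype instead):

import httplib2
from django.http import HttpResponse

def proxy(request):
    url = request.GET['url']
    conn = httplib2.Http()
    resp, content = conn.request(url, "GET")
    # Forward the upstream content type so the browser treats the bytes as an image
    return HttpResponse(content, content_type=resp.get('content-type', 'application/octet-stream'))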

Use mechanize; it allows you to choose a proxy and act like a browser, making it easy to change the user agent, go back and forth in the history, and handle authentication or cookies.
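A rough sketch of that idea, assuming the classic mechanize API; the proxy address and user-agent string below are placeholders:

import mechanize

br = mechanize.Browser()
br.set_proxies({"http": "proxy.example.com:3128"})  # placeholder proxy
br.addheaders = [("User-agent", "Mozilla/5.0")]     # spoof the user agent
response = br.open("http://domain2.com/image.png")  # fetch through the proxy
data = response.read()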

Related

How to make a request inside a simple mitmproxy script?

Good day,
I am currently trying to figure out a way to make non-blocking requests inside a simple mitmproxy script, but the documentation doesn't seem clear to me at first glance.
I think it's probably easiest if I show my current code and describe my issue below:
from copy import copy
from mitmproxy import http

def request(flow: http.HTTPFlow):
    headers = copy(flow.request.headers)
    headers.update({"Authorization": "<removed>", "Requested-URI": flow.request.pretty_url})
    req = http.HTTPRequest(
        first_line_format="origin_form",
        scheme=flow.request.scheme,
        port=443,
        path="/",
        http_version=flow.request.http_version,
        content=flow.request.content,
        host="my.api.xyz",
        headers=headers,
        method=flow.request.method
    )
    print(req.get_text())
    flow.response = http.HTTPResponse.make(
        200, req.content,
    )
Basically, I would like to intercept any HTTP(S) request that is made and issue a non-blocking request to an API endpoint at https://my.api.xyz/, which should take all the original headers and return a PNG screenshot of the originally requested URL.
However, the code above produces empty content, and the print statement outputs nothing either.
My issue seems to be related to: mtmproxy http get request in script and Resubmitting a request from a response in mitmproxy, but I still couldn't figure out a proper way of sending requests inside mitmproxy.
The following piece of code probably does what you are looking for:
from copy import copy
from mitmproxy import http
from mitmproxy import ctx
from mitmproxy.addons import clientplayback

def request(flow: http.HTTPFlow):
    ctx.log.info("Inside request")
    if hasattr(flow.request, 'is_custom'):
        return
    headers = copy(flow.request.headers)
    headers.update({"Authorization": "<removed>", "Requested-URI": flow.request.pretty_url})
    req = http.HTTPRequest(
        first_line_format="origin_form",
        scheme='http',
        port=8000,
        path="/",
        http_version=flow.request.http_version,
        content=flow.request.content,
        host="localhost",
        headers=headers,
        method=flow.request.method
    )
    req.is_custom = True
    playback = ctx.master.addons.get('clientplayback')
    f = flow.copy()
    f.request = req
    playback.start_replay([f])
It uses the clientplayback addon to send out the request. When this new request is sent, it will generate another request event, which would result in an infinite loop. That is the reason for the is_custom attribute I added to the request: if the request that generated this event is one we created ourselves, we don't create a new request from it.

How to test `@authenticated` handlers using tornado.testing?

I am new to unit testing using scripts. I tried to verify login with arguments in the POST data, but I am getting the login page as the response and am not getting logged in. Because of @tornado.web.authenticated I can't access other handlers without logging in, and it just responds with the login page.
import tornado
import urllib
from tornado.testing import AsyncTestCase
from tornado.httpclient import AsyncHTTPClient
from tornado.web import Application, RequestHandler
import app

class MyTestCase(AsyncTestCase):
    @tornado.testing.gen_test
    def test_http_fetch_login(self):
        data = urllib.urlencode(dict(username='admin', password=''))
        client = AsyncHTTPClient(self.io_loop)
        response = yield client.fetch("http://localhost:8888/console/login/?", method="POST", body=data)
        # Test contents of response
        self.assertIn("Automaton web console", response.body)

    @tornado.testing.gen_test
    def test_http_fetch_config(self):
        client = AsyncHTTPClient(self.io_loop)
        response = yield client.fetch("http://localhost:8888/console/configuration/?")
        self.assertIn("server-version", response.body)
To test code that uses @authenticated (unless you are testing the redirection to the login page itself), you need to pass a cookie (or whatever form of authentication you're using) that will be accepted by your get_current_user method. The details of this will vary depending on how exactly you are doing your authentication, but if you're using Tornado's secure cookies you'll probably use the create_signed_value function to encode a cookie.
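For example, a minimal sketch inside an AsyncHTTPTestCase, assuming your Application is built with a cookie_secret and that get_current_user reads a secure cookie named "user"; the cookie name, secret, and app factory below are placeholders:

import app
from tornado.testing import AsyncHTTPTestCase
from tornado.web import create_signed_value

COOKIE_SECRET = "change-me"  # must match the cookie_secret passed to your Application

class AuthenticatedTest(AsyncHTTPTestCase):
    def get_app(self):
        return app.make_app()  # hypothetical factory returning your Application

    def test_config_page(self):
        # Sign the "user" cookie the same way set_secure_cookie would,
        # so that get_current_user accepts it.
        signed = create_signed_value(COOKIE_SECRET, "user", "admin")
        response = self.fetch("/console/configuration/",
                              headers={"Cookie": "user=" + signed.decode()})
        self.assertEqual(response.code, 200)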
From documentation:
If you decorate post() methods with the authenticated decorator, and the user is not logged in, the server will send a 403 response. The @authenticated decorator is simply shorthand for if not self.current_user: self.redirect() and may not be appropriate for non-browser-based login schemes.
So you can use mock, as pointed out in this answer: https://stackoverflow.com/a/18286742/1090700
Another 403 status will be thrown on POST calls if you have the secure_cookies=True app parameter. If you want to test the code in a non-debug fashion, you can also mock the POST actions while keeping the secure_cookies=True application parameter, and then just mock the check_xsrf_cookie handler method as well, like this:
# Some previous stuff...
with mock.patch.object(YourSecuredHandler, 'get_secure_cookie') as mget:
    mget.return_value = 'user_email'
    response = self.fetch('/', method='GET')
    with mock.patch.object(
            YourSecuredHandler, 'check_xsrf_cookie') as mpost:
        mpost.return_value = None
        response = self.fetch(
            url, method="POST", body=parse.urlencode(body1),
            headers=headers)
        self.assertEqual(response.code, 201)

Python requests lib, is requests.Session equivalent to urllib2's opener?

I need to accomplish a login task in my own project. Luckily, I found that someone has done it already.
Here is the related code.
import re, urllib, urllib2, cookielib

class Login():
    cj = cookielib.LWPCookieJar()
    opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))

    def __init__(self, name='', password='', domain=''):
        self.name = name
        self.password = password
        self.domain = domain
        urllib2.install_opener(self.opener)

    def login(self):
        params = {'domain': self.domain, 'email': self.name, 'password': self.password}
        req = urllib2.Request(
            website_url,
            urllib.urlencode(params)
        )
        self.openrate = self.opener.open(req)
        print self.openrate.geturl()
        info = self.openrate.read()
I've tested the code, it works great (according to info).
Now I want to port it to Python 3 as well as using requests lib instead of urllib2.
My thoughts:
Since the original code uses an opener, I think (though I'm not sure) that its equivalent in requests is requests.Session.
Am I supposed to pass in a jar = cookiejar.CookieJar() when making the request? I'm not sure about that either.
I've tried something like
import requests
from http import cookiejar
from urllib.parse import urlencode

jar = cookiejar.CookieJar()
s = requests.Session()
s.post(
    website_url,
    data=urlencode(params),
    allow_redirects=True,
    cookies=jar
)
Also, following the answer in Putting a `Cookie` in a `CookieJar`, I tried making the same request again, but none of these worked.
That's why I'm here for help.
Will someone show me the right way to do this job? Thank you~
An opener and a Session are not entirely analogous, but for your particular use-case they match perfectly.
You do not need to pass a CookieJar when using a Session: Requests will automatically create one, attach it to the Session, and then persist the cookies to the Session for you.
You don't need to urlencode the data: requests will do that for you.
allow_redirects is True by default, so you don't need to pass that parameter.
Putting all of that together, your code should look like this:
import requests
s = requests.Session()
s.post(website_url, data = params)
Any future requests made using the Session you just created will automatically have cookies applied to them if they are appropriate.
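For instance, a later request on the same Session reuses the login cookies automatically:

# Subsequent requests on this Session send the stored cookies automatically,
# so pages behind the login should now be reachable.
profile = s.get(website_url)  # re-fetch the site (or any page behind the login)
print(profile.status_code)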

using python urlopen for a url query

Using urlopen for URL queries also seems obvious. Here is what I tried:
import urllib2
query='http://www.onvista.de/aktien/snapshot.html?ID_OSI=86627'
f = urllib2.urlopen(query)
s = f.read()
f.close()
However, for this specific URL query it fails with HTTP Error 403: Forbidden.
When entering this query in my browser, it works.
Also when using http://www.httpquery.com/ to submit the query, it works.
Do you have suggestions how to use Python right to grab the correct response?
Looks like it requires cookies (which you can handle with urllib2), but an easier way, if you're doing this, is to use requests:
import requests
session = requests.session()
r = session.get('http://www.onvista.de/aktien/snapshot.html?ID_OSI=86627')
This is generally a much easier and less-stressful method of retrieving URLs in Python.
requests will automatically store and re-use cookies for you. Creating a session is slightly overkill here, but is useful for when you need to submit data to login pages etc..., or re-use cookies across a site... etc...
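If you don't need to reuse cookies across several requests, a plain requests.get is enough; a minimal sketch:

import requests

# A single GET; requests still handles any cookies set during redirects for this request.
r = requests.get('http://www.onvista.de/aktien/snapshot.html?ID_OSI=86627')
html = r.text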
Using urllib2, it would be something like this:
import urllib2, cookielib
cookies = cookielib.CookieJar()
opener = urllib2.build_opener( urllib2.HTTPCookieProcessor(cookies) )
data = opener.open('url').read()
It appears that the urllib2 default user agent is banned by the host. You can simply supply your own user agent string:
import urllib2
url = 'http://www.onvista.de/aktien/snapshot.html?ID_OSI=86627'
request = urllib2.Request(url, headers={"User-Agent" : "MyUserAgent"})
contents = urllib2.urlopen(request).read()
print contents

Capture access_token from facebook's server-side authentication in Django

I want to capture the access_token returned by this url(below)
https://graph.facebook.com/oauth/access_token?client_id=YOUR_APP_ID&redirect_uri=YOUR_REDIRECT_URI&client_secret=YOUR_APP_SECRET&code=CODE_GENERATED_BY_FACEBOOK
But if I use HttpResponseRedirect, it takes me to a blank page with the access_token and expiry seconds printed. I want to capture the returned access_token and use it later. Below is my code:
def fb_return(request):
    code = request.GET.get('code')
    fb_id = settings.FB_ID
    fb_s = settings.FB_SECRET
    url = 'https://graph.facebook.com/oauth/access_token?client_id=%(id)s&redirect_uri=http://127.0.0.1:8000/facebook/return&client_secret=%(secret)s&code=%(code)s' % {'id': fb_id, 'secret': fb_s, 'code': code}
    return HttpResponseRedirect(url)
You can use urllib2 to perform the request yourself instead of redirecting:
import urllib2
url = 'https://graph.facebook.com/oauth/access_token?client_id=%(id)s&redirect_uri=http://127.0.0.1:8000/facebook/return&client_secret=%(secret)s&code=%(code)s'%{'id':fb_id,'secret':fb_s,'code':code}
response = urllib2.urlopen(url)
html = response.read()
If the response is JSON, you can decode it like so:
import simplejson
data = simplejson.loads(html)  # reuse the body read above; loads() parses a string
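If the body instead comes back as a query string (the question's "access_token=...&expires=..." output suggests it might), you can parse it with parse_qs; a small sketch:

import urlparse  # Python 2; on Python 3 use urllib.parse instead

# html is the body read above, e.g. "access_token=...&expires=..."
params = urlparse.parse_qs(html)
access_token = params.get('access_token', [None])[0]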
Here's a similar question dealing with this
Depending on what you are trying to do, there are probably easier ways to interact with Facebook:
If you only need to do client-side things, you can use the Facebook JavaScript SDK.
If you are creating a canvas app, you can use django-fandjango.
If you are creating a website with Facebook login, you can use django-social-auth.
If you want server-side interaction with the Graph API, you can use facepy.
