How to make a request inside a simple mitmproxy script? - python

Good day,
I am currently trying to figure out a way to make non blocking requests inside a simple script of mitmproxy, but the documentation doesn't seem to be clear for me for the first look.
I think it's probably the easiest if I show my current code and describe my issue below:
from copy import copy
from mitmproxy import http
def request(flow: http.HTTPFlow):
headers = copy(flow.request.headers)
headers.update({"Authorization": "<removed>", "Requested-URI": flow.request.pretty_url})
req = http.HTTPRequest(
first_line_format="origin_form",
scheme=flow.request.scheme,
port=443,
path="/",
http_version=flow.request.http_version,
content=flow.request.content,
host="my.api.xyz",
headers=headers,
method=flow.request.method
)
print(req.get_text())
flow.response = http.HTTPResponse.make(
200, req.content,
)
Basically I would like to intercept any HTTP(S) request done and make a non blocking request to an API endpoint at https://my.api.xyz/ which should take all original headers and return a png screenshot of the originally requested URL.
However the code above produces an empty content and the print returns nothing either.
My issue seems to be related to: mtmproxy http get request in script and Resubmitting a request from a response in mitmproxy but I still couldn't figure out a proper way of sending requests inside mitmproxy.

The following piece of code probably does what you are looking for:
from copy import copy
from mitmproxy import http
from mitmproxy import ctx
from mitmproxy.addons import clientplayback
def request(flow: http.HTTPFlow):
ctx.log.info("Inside request")
if hasattr(flow.request, 'is_custom'):
return
headers = copy(flow.request.headers)
headers.update({"Authorization": "<removed>", "Requested-URI": flow.request.pretty_url})
req = http.HTTPRequest(
first_line_format="origin_form",
scheme='http',
port=8000,
path="/",
http_version=flow.request.http_version,
content=flow.request.content,
host="localhost",
headers=headers,
method=flow.request.method
)
req.is_custom = True
playback = ctx.master.addons.get('clientplayback')
f = flow.copy()
f.request = req
playback.start_replay([f])
It uses the clientplayback addon in order to send out the request. When this new request is sent, that will generate another request event which will then be an infinite loop. That is the reason for the is_custom attribute I added to the request there. If the request that generated this event is the one that we have created, then we don't want to create a new request from it.

Related

How to intercept request with mitmproxy before response is streamed?

The request I am trying to intercept and modify is a get request with only one parameter and I try to modify it:
from mitmproxy import http
def request(flow: http.HTTPFlow) -> None:
if flow.request.pretty_url.startswith(BASE_URL):
flow.request.url = BASE_URL.replace('abc', 'def')
The above shows what I am trying to do in a nutshell. But unfortunately, according to docs,
this event fires after the entire body has been streamed.
In the end, I am not able to modify the request. Am I missing something here? Because if modify requests is not possible, then what is the point of mitmproxy?

Modify json body with mitmproxy

I am trying to intercept and modify a graphql response's body. Here is my addon code:
from mitmproxy import ctx
from mitmproxy import http
import json
def response(flow: http.HTTPFlow) -> None:
if flow.request.pretty_url == "https://my.graphql/endpoint":
request_data = json.loads(flow.request.get_text())
if request_data["operationName"] == "MyOperationName":
data = json.loads(flow.response.get_text())
data["data"]["product"]["name"] = "New Name"
flow.response.text = json.dumps(data)
I can see the modified response in mitmproxy console. But the iOS simulator I am using is still getting the original response. Does anyone know how can I pass the modified response to the device?
From the documentation
def response(self, flow: mitmproxy.http.HTTPFlow):
"""
The full HTTP response has been read.
Note: If response streaming is active, this event fires after the entire body has been streamed.
HTTP trailers, if present, have not been transmitted to the client yet and can still be modified.
"""
ctx.log(f"response: {flow=}")
It appears that you might be streaming the response body which would mean that modifications would be ignored.
Consider using def request event hook instead

Yggdrasil authentication with Python

I decided to try to make an automated login script for Minecraft. However, the new authentication API is stumping me. I can't find any mentions of the new functionality of the API on here. This is my code as it stands:
import requests
import json
data = json.dumps({"agent":{"name":"Minecraft","version":1},"username":"abcdef","password":"abcdef","clientToken":""})
headers = {'Content-Type': 'application/json'}
r = requests.post('https://authserver.mojang.com', data=data, headers=headers)
print (r.text)
Unfortunately, this returns:
{"error":"Method Not Allowed","errorMessage":"The method specified in the request is not allowed for the resource identified by the request URI"}
According to this resource on request format, this error means that I didn't correctly send a post request. However, I clearly declared requests.post(), so my first question is how am I incorrect, and what is the correct way to go about this?
My second question is, since I'm relatively new to Python and JSON, how would I replace the username and password fields with my own data, inside a variable?
You haven't specified an endpoint in your POST request, for example:
https://authserver.mojang.com/authenticate
The root of the website probably does not accept POST requests
http://wiki.vg/Authentication#Authenticate

how to encode POST data using python's urllib

I'm trying to connect to my Django server using a client side python script. Right now I'm just trying to connect to a view and retrieve the HttpResponse. The following works fine
import urllib2
import urllib
url = "http://localhost:8000/minion/serve"
request = urllib2.Request(url)
response = urllib2.urlopen(request)
html = response.read()
print html
However if I change it to
import urllib2
import urllib
url = "http://localhost:8000/minion/serve"
values = {'name': 'Bob'}
data = urllib.urlencode(values)
request = urllib2.Request(url, data)
response = urllib2.urlopen(request)
html = response.read()
print html
I get urllib2.HTTPError: HTTP Error 500: INTERNAL SERVER ERROR. Am I doing something wrong here? Here are the tutorials I was trying to follow, http://techmalt.com/?p=212 http://docs.python.org/2/howto/urllib2.html.
EDIT: tried to make the following change as per That1Guy's suggestion (other lines left the same)
request = urllib2.Request(url, data)
response = urllib2.urlopen(request, data)
This returned the same error message as before.
EDIT: It seems to work if I change the page I'm viewing so the error isn't in the client side script. So in light of that revelation, here's the server side view that's being accessed:
def serve(request):
return HttpResponse("You've been served!")
As you can see, it's very straight forward.
EDIT: Tested to see if Internal Error might be caused by missing csrf token, but csrf_exempt decorator failed to resolve error.
Finally figured it out with this. How to POST dictionary from python script to django url?
Turns out my url was missing a trailing slash. It's the little things that always get you I guess.

detect if a web page is changed

In my python application I have to read many web pages to collect data. To decrease the http calls I would like to fetch only changed pages. My problem is that my code always tells me that the pages have been changed (code 200) but in reality it is not.
This is my code:
from models import mytab
import re
import urllib2
from wsgiref.handlers import format_date_time
from datetime import datetime
from time import mktime
def url_change():
urls = mytab.objects.all()
# this is some urls:
# http://www.venere.com/it/pensioni/venezia/pensione-palazzo-guardi/#reviews
# http://www.zoover.it/italia/sardegna/cala-gonone/san-francisco/hotel
# http://www.orbitz.com/hotel/Italy/Venice/Palazzo_Guardi.h161844/#reviews
# http://it.hotels.com/ho292636/casa-del-miele-susegana-italia/
# http://www.expedia.it/Venezia-Hotel-Palazzo-Guardi.h1040663.Hotel-Information#reviews
# ...
for url in urls:
request = urllib2.Request(url.url)
if url.last_date == None:
now = datetime.now()
stamp = mktime(now.timetuple())
url.last_date = format_date_time(stamp)
url.save()
request.add_header("If-Modified-Since", url.last_date)
try:
response = urllib2.urlopen(request) # Make the request
# some actions
now = datetime.now()
stamp = mktime(now.timetuple())
url.last_date = format_date_time(stamp)
url.save()
except urllib2.HTTPError, err:
if err.code == 304:
print "nothing...."
else:
print "Error code:", err.code
pass
I do not understand what has gone wrong. Can anyone help me?
Web servers aren't required to send a 304 header as the response when you send an 'If-Modified-Since' header. They're free to send a HTTP 200 and send the entire page again.
Sending a 'If-Modified-Since' or 'If-None-Since' alerts the server that you'd like a cached response if available. It's like sending an 'Accept-Encoding: gzip, deflate' header -- you're just telling the server you'll accept something, not requiring it.
A good way to check if a site returns 304 is to use google chromes dev tools. E.g. below is an annotated example of using chrome on the bls website. Keep refreshing and you will see that the server keeps returning 304. If you force refresh with Ctrl+F5 (windows), you will see that instead it returns status code 200.
You can use this technique on your example to find out if the server does not return 304, or if you have incorrectly formatted your request headers somehow. Sometimes a webpage has a resource imported on to it which does not respect the If- headers and so it returns 200 whatever you do (If any resource on the page does not return 304, the whole page will return 200), but sometimes you are only looking at a specific part of a website and you can cheat by loading the resource directly and bypassing the whole document.

Categories

Resources