How to properly forward requests through proxies with MITMProxy? - python

I am trying to use MITMProxy to do custom forwarding of requests made from the Firefox browser, so that they go through one of several proxies selected at runtime. It is performing too slowly for our purposes. Please bear in mind we are running this in Python 2.7.
The process is as follows:
Firefox sends the request to the configured MITMProxy.
MITMProxy takes the request from Firefox, builds a requests request, and fetches the data from the target server through a given proxy (which is not controlled by us and requires authentication); see the sketch after these steps.
The response from the proxy-forwarded request gets converted into a response for the browser.
MITMProxy returns the data to the browser.
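For illustration, the forwarding step boils down to something like the following minimal sketch; the proxy URL, credentials and target URL are placeholders, not the actual values from our setup:

import requests

# Placeholder upstream proxy that requires authentication.
proxies = {
    "http": "http://user:password@upstream-proxy.example.com:8080",
    "https": "http://user:password@upstream-proxy.example.com:8080",
}

# Fetch the page the browser asked for through the authenticated proxy.
response = requests.get("http://example.com/", proxies=proxies, timeout=30)
print(response.status_code, len(response.content))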
This process is too slow, which I believe could be for a number of reasons. It could be that there are settings enabled which hurt performance (too much logging, for example), that the procedure being used is not the right one for the job (entirely plausible), or something completely different.
How can we make this run faster?
Thanks very much! Any and all suggestions will be appreciated!

In this particular case, we were using the script feature of MITMProxy, which meant every modified request was executed synchronously (i.e., we could not use proper asynchronous behavior). This naturally became an issue once we started using the scripts with more clients.
As @Puciek mentioned in his comment, this was more a design issue than a problem with the library.
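For anyone hitting the same wall: more recent mitmproxy releases (Python 3) ship a concurrent decorator that runs a hook in its own thread, so one slow upstream fetch no longer stalls every other client. A rough sketch follows; the upstream proxy values are placeholders, only the Content-Type header is copied back to keep it short, and the exact class names (HTTPResponse.make here, Response.make in newer releases) vary between versions, so treat it as the general direction rather than drop-in code for the Python 2.7 setup from the question.

import requests
from mitmproxy import http
from mitmproxy.script import concurrent

# Placeholder authenticated upstream proxy.
UPSTREAM = {
    "http": "http://user:password@upstream-proxy.example.com:8080",
    "https": "http://user:password@upstream-proxy.example.com:8080",
}

@concurrent  # run this hook in its own thread so it does not block other flows
def request(flow):
    upstream = requests.request(
        flow.request.method,
        flow.request.url,
        headers=dict(flow.request.headers),
        data=flow.request.raw_content,
        proxies=UPSTREAM,
        timeout=30,
    )
    # Answer the browser directly instead of letting mitmproxy forward the request.
    flow.response = http.HTTPResponse.make(
        upstream.status_code,
        upstream.content,
        {"Content-Type": upstream.headers.get("Content-Type", "application/octet-stream")},
    )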

Related

Pycups - create and send IPPRequest with IPPAttribute and parse response (enable IPP:Hold-New-Jobs)

In CUPS, I want to hold all new printer jobs, check the cartridge levels of my printer, and release the jobs for printing one by one. (My printer kills the cartridge chips when they fall below a certain percentage, to prevent refilling.)
I have CUPS notifications working (RSS) and can get my cartridge levels. Now I want to enable the 'hold-new-jobs' attribute on a specific printer in CUPS (.../printers/myPrinter). After this, I have to find out how to get all jobs that are on hold and release them one at a time (by job ID, or so), preferably FIFO (FHFR). For completeness: if the cartridges are below a threshold, the jobs are held until the cartridges are reset (manually) and a button in Home Assistant is pressed.
Pycups (the libcups Python bindings) seems undocumented and different enough from libcups itself to get lost in. There seems to be no example of IPPRequests for pycups. This is my shot at it (using to_pdf as a test printer):
import cups

# The response is a set of bytes. It is not clear (to me) what it means,
# or whether it is parsed or can be parsed somewhere.
def ipp_hold_new_jobs_request_handler(response):
    print("Hold_all_jobs handler called. Response: {}".format(response))

ipp_request = cups.IPPRequest(cups.IPP_OP_HOLD_NEW_JOBS)
ipp_attribute = cups.IPPAttribute(
    cups.IPP_TAG_OPERATION, cups.IPP_TAG_URI, "printer-uri",
    "http://cups.mydomain.ext:631/printers/to_pdf")
ipp_request.add(ipp_attribute)

response = ipp_request.writeIO(ipp_hold_new_jobs_request_handler)
print('response: {}'.format(response))  # Returns -1

while True:
    pass
I expect new jobs to be shown as on-hold in the CUPS web-interface if this request succeeds.
Maybe I am doing this entirely the wrong way. I hope that someone can help me get this right. Thanks.
I think it is not possible to use the pycups API like this. The correct way to send an IPP request is through the cupsDoRequest API call, but that part of the API is currently not exposed by pycups.
The writeIO method works at a much lower level and is typically used internally by the CUPS code to actually serialize a request. The callback it expects is supposed to write the serialized data to the HTTP connection, so your ipp_hold_new_jobs_request_handler function would need to do that, but it cannot, because it has no access to the underlying HTTP connection.
The closest thing to what cupsDoRequest does that could actually be attempted in this callback would be calling the writeRequestData method of a Connection object, but that will not work either: before calling writeRequestData, the appropriate HTTP headers must be sent (which is part of what cupsDoRequest does), and pycups does not expose the APIs needed for that.
In summary, pycups currently does not allow sending arbitrary IPP requests. Only the methods provided by the Connection class can be used to send requests.
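For completeness, a minimal sketch of the Connection-class route that is available; getPrinters and getJobs are part of the pycups Connection API, although whether they cover the whole hold/release workflow is a separate question, and the attribute names printed below are assumptions to check against your CUPS server.

import cups

conn = cups.Connection()

# List the configured printers (dict of printer name -> attributes).
print(conn.getPrinters().keys())

# List jobs that are not yet completed; held jobs should show up here as well.
jobs = conn.getJobs(which_jobs="not-completed")
for job_id, attrs in jobs.items():
    print(job_id, attrs.get("job-state"))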

Tornado: Websocket connection limit

I am developing a web application with Tornado and have encountered the following problem:
I can't run more than 6 instances of my application in one browser, probably because each instance creates a websocket connection to the Tornado server. I use the standard WebSocketHandler class. The connections close properly, i.e. if I close the 6th tab, then I am able to open another application tab.
Is there any way to circumvent it? I will provide any additional information if needed.
EDIT: Connection information (I have 6 identical tabs here; the 7th won't load).
Are you sure the limitation is not in the browser? I've seen the same issue (long-polling requests, the 7th or 8th won't load), but opening the URL in another browser or location works fine.
Edit: each browser does indeed have a limit on simultaneous persistent connections per server, as well as a global limit. See this question, and especially this response, which has more up-to-date values.
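If you want to rule out the server side, here is a minimal sketch using the standard Tornado API that simply counts open websocket connections; if the count keeps growing when you connect from a second browser while the first is stuck at six, the ceiling is the browser's per-host limit, not Tornado.

import tornado.ioloop
import tornado.web
import tornado.websocket

OPEN_CONNECTIONS = set()

class CountingSocket(tornado.websocket.WebSocketHandler):
    def open(self):
        OPEN_CONNECTIONS.add(self)
        print("open: %d connections" % len(OPEN_CONNECTIONS))

    def on_message(self, message):
        # Echo back, just to keep the connection busy.
        self.write_message(message)

    def on_close(self):
        OPEN_CONNECTIONS.discard(self)
        print("closed: %d connections" % len(OPEN_CONNECTIONS))

app = tornado.web.Application([(r"/ws", CountingSocket)])
app.listen(8888)
tornado.ioloop.IOLoop.current().start()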

Socket.io POST Requests from Socket.IO-Client-Swift

I am running socket.io on an Apache server through Python Flask. We're integrating it into an iOS app (using the Socket.IO-Client-Swift library) and we're having a weird issue.
From the client-side code in the app (written in Swift), I can view the actual connection log (client-side in Xcode) and see the connection established from the client's IP and the requests being made. The client never receives any information back from the socket server, even when using a global event response handler.
I wrote a very simple test script in JavaScript on an HTML page, sent requests that way, and received the proper responses back. With that said, it seems likely to be an issue with iOS. I've found these articles (but none of them helped fix the problem):
https://github.com/nuclearace/Socket.IO-Client-Swift/issues/95
https://github.com/socketio/socket.io-client-swift/issues/359
My next thought is to extend the logging of socket.io to find out exactly what data is being POSTed to the socket namespace. Is there a way to log exactly what data is coming into the server? (Bear in mind that the 'on' hook I've set up on the server side is not getting any data; I've tried to log from there, but it doesn't appear to even get that far.)
I found mod_dumpio for Linux to log all POST requests but I'm not sure how well it will play with multi-threading and a socket server.
Any ideas on how to get the exact data being posted so we can at least troubleshoot the syntax and make sure the data isn't being malformed when it's sent to the server?
Thanks!
Update
When testing locally, we got it working (it was a setting in the Swift code where the namespace wasn't being declared properly). This works fine now on localhost but we are having the exact same issues when emitting to the Apache server.
We are not using mod_wsgi (as far as I know; I'm relatively new to mod_wsgi, so apologies for any ignorance). We used to have a .wsgi file that called the main app script, but we had to change that because mod_wsgi is not compatible with Flask-SocketIO (as stated in the uWSGI Web Server section here). The way I am running the script now is by using supervisord to run the .py file as a daemon (specifically so it will autostart in the event of a server crash).
Locally, it worked great once we installed the eventlet module through pip. When I ran pip freeze on my virtual environment on the server, eventlet was installed. I uninstalled and reinstalled it just to see if that cleared anything up and that did nothing. No other Python modules that are on my local copy seem to be something that would affect this.
One other thing to keep in mind is that in the function that initializes the app, we change the port to port 80:
socketio.run(app, host='0.0.0.0', port=80)
because we have other API functions that run through a domain that is pointing to the server in this app. I'm not sure if that would affect anything but it doesn't seem to matter on the local version.
I'm at a dead end again and am trying to find anything that could help. Thanks for your assistance!
Another Update
I'm not exactly sure what was happening yet, but we went ahead and rewrote some of the code, making sure to pay extra attention to the namespace declarations within each socket 'on' event function. It's working fine now. As I get more details, I will post them here, as I figure this will be useful for others who have the same problem. This thread also has some really valuable information on how to go about debugging/logging these types of issues, although we never actually fully figured out the answer to the original question.
I assume you have verified that Apache does get the POST requests. That should be your first test: if Apache does not log the POST requests coming from iOS, then you have a different kind of problem.
If you do get the POST requests, then you can add some custom code in the middleware used by Flask-SocketIO and print the request data forwarded by Apache's mod_wsgi. This is in the file flask_socketio/__init__.py. The relevant portion is this:
class _SocketIOMiddleware(socketio.Middleware):
    # ...
    def __call__(self, environ, start_response):
        # log what you need from environ here
        environ['flask.app'] = self.flask_app
        return super(_SocketIOMiddleware, self).__call__(environ, start_response)
You can find out what's in environ in the WSGI specification. In particular, the body of the request is available in environ['wsgi.input'], which is a file-like object you read from.
Keep in mind that once you read the payload, this stream is consumed, so the WSGI server will not be able to read it again. Seeking back to the position the stream was at before the read may work on some WSGI implementations. A safer hack I've seen people use to avoid this problem is to read the whole payload into a buffer, then replace environ['wsgi.input'] with a brand new StringIO or BytesIO object, as sketched below.
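A rough sketch of that buffer-and-replace trick; the helper name and the print call are mine, not part of Flask-SocketIO, and you would call it at the top of __call__ above before delegating to the parent class.

import io

def peek_wsgi_input(environ):
    # Read the whole request body for logging, then put a fresh stream back
    # so the framework downstream can still read it.
    try:
        length = int(environ.get('CONTENT_LENGTH') or 0)
    except (TypeError, ValueError):
        length = 0
    body = environ['wsgi.input'].read(length) if length else b''
    print('POST body: %r' % body)
    environ['wsgi.input'] = io.BytesIO(body)
    return body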
Are you using Flask-SocketIO on the server side? If you are, there is a lot of debugging available in the constructor:
socketio = SocketIO(app, async_mode=async_mode, logger=True, engineio_logger=True)

Monitor the Download process in Chrome

I am trying to hack together a Python script to monitor ongoing downloads in Chrome and shut down my PC automatically after the download process completes. I know little JavaScript and am considering using the PyJs library, if required.
1) Is this the best approach? I don't need the app to be portable, just working.
2) How would you identify the download process?
3) How would you monitor the download progress? Apparently the Chrome API doesn't provide a specific function for it.
Nice question, maybe because I can relate to the need to automate the shutdown. ;)
I just googled, and there happens to be an experimental API, but only for the dev channel as of now. I am not on the dev channel to try it out, so I just hope I am pointing you in the right direction.
One approach would be:
Have a Python HTTP server listening on some port XYZ (a minimal sketch follows after these steps)
To your extension, add a permission for the URL http://localhost:XYZ/
In your extension, you could use:
chrome.downloads.search(query, function (arrayOfDownloadItem){ .. })
where query is an instance of DownloadQuery whose state property is set to in_progress.
You could then check the length of arrayOfDownloadItem.
If it is zero, create a new XMLHttpRequest to your HTTP server endpoint, and then let the server shut down your machine.
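A minimal sketch of the server side from the first step, using Python 3's http.server (on Python 2 this would be BaseHTTPServer); the port, path and shutdown command are placeholders to adapt to your OS.

import os
from http.server import BaseHTTPRequestHandler, HTTPServer

PORT = 8345  # the "XYZ" port from the steps above (placeholder)

class ShutdownHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # The extension calls this endpoint once no downloads are in progress.
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"shutting down")
        os.system("shutdown -h now")  # Linux; use "shutdown /s /t 0" on Windows

HTTPServer(("localhost", PORT), ShutdownHandler).serve_forever()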
HTH

Python: Asynchronous http requests sent in order with automatic handling of cookies?

I am coding a Python (2.6) interface to a web service. I need to communicate via HTTP such that:
Cookies are handled automatically,
The requests are asynchronous,
The order in which the requests are sent is respected (the order in which the responses to these requests are received does not matter).
I have tried what could easily be derived from the built-in libraries, and ran into different problems:
Using httplib and urllib2, the requests are synchronous unless I use threads, in which case the order is not guaranteed to be respected,
Using asyncore, there was no library to automatically deal with the cookies sent by the web service.
After some googling, it seems that there are many examples of Python scripts or libraries that match 2 out of the 3 criteria, but not all 3 of them. I am thinking of reading through the cookielib sources and adapting what I need of it to asyncore (or only to my application, in an ad hoc manner), but it seems strange that nothing like this exists yet, as I guess I am not the only one interested. If anyone has pointers about this problem, it would be greatly appreciated.
Thank you.
Edit to clarify :
What I am doing is a local proxy that interfaces my IRC client with a webchat. It creates a socket that listens for IRC connections, then upon receiving one, it logs in to the webchat via HTTP. I don't have access to the behaviour of the webchat, and it uses cookies for session IDs. When the client sends several IRC requests to my Python proxy, I have to forward them to the webchat's server via HTTP and with cookies. I also want to do this asynchronously (I don't want to wait for the HTTP response before I send the next request), and currently what happens is that the order in which the HTTP requests are sent is not the order in which the IRC commands were received.
I hope this clarifies the question, and I will of course detail more if it doesn't.
"Using httplib and urllib2, the requests are synchronous unless I use thread, in which case the order is not guaranteed to be respected"
How would you know that the order has been respected unless you get your response back from the first connection before you send the second request? After all, you don't care what order the responses come in, so it's very possible that the responses come back in the order you expect but that your requests were processed in the wrong order!
The only way you can guarantee the ordering is by waiting for confirmation that the first request has successfully arrived (e.g. you start receiving the response for it) before beginning the second request. You can do this by not launching the second thread until you reach the response-handling part of the first thread.
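A rough sketch of that idea with the Python 2.6 standard library (urllib2, cookielib, threading); the URLs are placeholders, cookies are shared through one opener, and the second request is only sent once the first response has started coming back.

import threading
import urllib2
import cookielib

# One opener shared by all threads: cookielib's CookieJar does its own locking.
cookies = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cookies))

def send(url, done, wait_for=None):
    if wait_for is not None:
        wait_for.wait()          # do not send until the previous request has a response
    response = opener.open(url)  # blocks until the response starts arriving
    done.set()                   # signal that this request reached its response
    return response.read()

first_done = threading.Event()
second_done = threading.Event()

t1 = threading.Thread(target=send, args=("http://example.com/cmd1", first_done))
t2 = threading.Thread(target=send, args=("http://example.com/cmd2", second_done, first_done))
t1.start()
t2.start()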
