Monkey patching the socket library to use a specific network interface - Python

I have been trying to make requests to a website using the requests library, but through different network interfaces. The following is a list of answers that I have tried but that did not work.
This answer describes how to achieve what I want, but it uses pycurl. I could use pycurl but I have learned about this monkey patching thing and want to give it a try.
This other answer seemed to work at first, since it does not raise any error. However, I monitored my network traffic using Wireshark and the packets were sent from my default interface. I tried to print messages inside the set_src_addr function defined by the author of the answer, but the messages never showed up. Therefore, I think it is patching a function that is never called. I also get an HTTP 200 response, which should not happen since I have bound my socket to 127.0.0.1.
import socket
real_create_conn = socket.create_connection

def set_src_addr(*args):
    address, timeout = args[0], args[1]
    source_address = ('127.0.0.1', 0)
    return real_create_conn(address, timeout, source_address)

socket.create_connection = set_src_addr

import requests
r = requests.get('http://www.google.com')
r
<Response [200]>
I have also tried this answer. I can get two kinds of errors using this method:
import socket
true_socket = socket.socket

def bound_socket(*a, **k):
    sock = true_socket(*a, **k)
    sock.bind(('127.0.0.1', 0))
    return sock

socket.socket = bound_socket

import requests
This does not allow me to create a socket and raises this error. I have also tried a variation of this answer, which looks like this:
import requests
import socket
true_socket = socket.socket

def bound_socket(*a, **k):
    sock = true_socket(*a, **k)
    sock.bind(('192.168.0.10', 0))
    print(sock)
    return sock

socket.socket = bound_socket

r = requests.get('https://www.google.com')
This also does not work and raises this error.
I have the following problem: I want each process to send requests through a specific network interface. I thought that, since threads share global memory (including libraries), I should change my code to work with processes. Now I want to apply a monkey-patching solution somewhere, in a way that lets each process use a different interface for communication. Am I missing something? Is this the best way to approach this problem?
Edit:
I would also like to know if it is possible for different processes to have different versions of the same library. If libraries are shared, how can I have different versions of a library in Python (one for each process)?

This appears to work for python3:
import urllib3

real_create_conn = urllib3.util.connection.create_connection

def set_src_addr(address, timeout, *args, **kw):
    source_address = ('127.0.0.1', 0)
    return real_create_conn(address, timeout=timeout, source_address=source_address)

urllib3.util.connection.create_connection = set_src_addr

import requests
r = requests.get('http://httpbin.org')
It fails with the following exception:
ConnectionError: HTTPConnectionPool(host='httpbin.org', port=80): Max retries exceeded with url: / (Caused by NewConnectionError("<urllib3.connection.HTTPConnection object at 0x10c4b89b0>: Failed to establish a new connection: [Errno 49] Can't assign requested address",))
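The bind itself is what fails here: 127.0.0.1 cannot be used as the source address for a route to an external host, so the exception actually shows that the patched function is now being called. As a sketch, re-running the same patch with an address that is really configured on one of the local interfaces (the 192.168.0.10 below is only a placeholder) should let the request go out on that interface:
import urllib3
import requests

real_create_conn = urllib3.util.connection.create_connection

def set_src_addr(address, timeout, *args, **kw):
    # placeholder address of a real local interface
    return real_create_conn(address, timeout=timeout,
                            source_address=('192.168.0.10', 0))

urllib3.util.connection.create_connection = set_src_addr
r = requests.get('http://httpbin.org')   # should now succeed via that interface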

I will document the solution I have found and list some problems I had in the process.
salparadise had it right. It is very similar to the first answer I had found. I am assuming that the requests module imports urllib3 and that the latter provides its own version of the connection logic on top of the socket library. Therefore, it is very likely that the requests module never calls socket.create_connection directly, but has that part of the functionality provided by the urllib3 module.
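A quick way to check that assumption (a minimal sketch; the module paths are those of current urllib3 releases):
import socket
import urllib3

# urllib3 ships its own create_connection, so patching socket.create_connection
# is never seen by requests:
print(urllib3.util.connection.create_connection is socket.create_connection)  # False

# Inside urllib3's create_connection the sockets themselves are still made via
# socket.socket(), which is why patching socket.socket (the third snippet) does work.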
I had not noticed it at first, but the third snippet in my question was actually working. The reason I got a ConnectionError is that I was trying to use a macvlan virtual interface on top of a wireless physical interface (which, if I understood correctly, drops packets whose MAC addresses do not match). Therefore, the following solution does work:
import requests
from socket import socket as backup
import socket

def socket_custom_src_ip(src_ip):
    original_socket = backup

    def bound_socket(*args, **kwargs):
        sock = original_socket(*args, **kwargs)
        sock.bind((src_ip, 0))
        print(sock)
        return sock

    return bound_socket
In my problem, I will need to change the IP address of a socket several times. One of the problems I had was that, if no backup of the socket function is made, changing it several times causes a RecursionError: maximum recursion depth exceeded. This occurs because, on the second change, the socket.socket function is no longer the original one. Therefore, my solution above keeps a copy of the original socket function to use as a backup for further bindings to different IPs.
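A usage sketch of the factory above (the addresses are placeholders for IPs actually configured on local interfaces):
socket.socket = socket_custom_src_ip('192.168.0.10')
r1 = requests.get('https://www.google.com')

socket.socket = socket_custom_src_ip('192.168.1.20')  # rebinding keeps working because
r2 = requests.get('https://www.google.com')           # bound_socket always wraps the backup

socket.socket = backup  # restore the original socket class when done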
Lastly, the following is a proof of concept of how to have multiple processes using different library instances. With this idea, I can import and monkey-patch the socket module inside each of my processes, having a differently monkey-patched version in each of them.
import importlib
import multiprocessing

class MyProcess(multiprocessing.Process):
    def __init__(self, module):
        super().__init__()
        self.module = module

    def run(self):
        i = importlib.import_module(f'{self.module}')
        print(f'{i}')

p1 = MyProcess('os')
p2 = MyProcess('sys')

p1.start()
<module 'os' from '/usr/lib/python3.7/os.py'>
p2.start()
<module 'sys' (built-in)>
This also works using the import statement and the global keyword to provide transparent access inside all functions, as follows:
import multiprocessing

def fun(self):
    global os   # declare the name global before the import binds it
    import os
    os.var = f'{repr(self)}'
    fun2()

def fun2():
    print(os.system(f'echo "{os.var}"'))

class MyProcess(multiprocessing.Process):
    def __init__(self):
        super().__init__()

    def run(self):
        if 'os' in dir():
            print('os already imported')
        fun(self)

p1 = MyProcess()
p2 = MyProcess()

p2.start()
<MyProcess(MyProcess-2, started)>
p1.start()
<MyProcess(MyProcess-1, started)>

I faced a similar issue where I wanted some localhost traffic to originate from an address other than 127.0.0.1 (I was testing an HTTPS connection over localhost).
This is how I did it using the Python core libraries ssl and http.client (cf. the docs), as it seemed cleaner than the solutions I found online using the requests library.
import json
import http.client as http
import ssl

dst = 'sever.infsec.local'   # DNS record was added to the OS
src = ('127.0.0.2', 0)       # port 0 -> let the OS select an available port

context = ssl.SSLContext()
context.load_default_certs()  # loads the OS certificate context

request = http.HTTPSConnection(dst, 443, context=context,
                               source_address=src)
request.connect()
request.request("GET", '/', json.dumps(request_data))  # request_data defined elsewhere in the original code
response = request.getresponse()
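A small follow-up to actually consume the reply (assuming the request above succeeded):
print(response.status, response.reason)
body = response.read()
request.close()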

Related

How to create nntplib objects using Multiprocessing

I have been trying for 2 days to get multiprocessing to work when creating connections to an NNTP server. Goal: make a bunch of connections (like 50) as fast as possible. Since making connections in a for loop can be slow (up to 10 seconds), I want to make them all 'at once' using multiprocessing. After creation, the connections remain open, as 10,000+ requests will be made in some future multiprocessing part relying on a similar principle.
Some simplified part of the code:
#!/usr/bin/env python3
import sys
import ssl
from nntplib import NNTP_SSL
from multiprocessing import Pool

def MakeCon(i, host, port):
    context = ssl.SSLContext(ssl.PROTOCOL_TLS)
    s = NNTP_SSL(host, port=port, ssl_context=context, readermode=True)
    print('created connection', i)  # print to see progress
    sys.stdout.flush()
    return s

def Main():
    host = 'reader.xsnews.nl'
    port = 563
    num_con = 4
    y = MakeCon(1, host, port).getwelcome()  # request some message from the NNTP host to see if it works
    print(y)

    # the actual part that has the issue:
    if __name__ == '__main__':
        cons = range(num_con)
        s = [None] * num_con
        pool = Pool()
        for con in cons:
            s[con] = pool.apply_async(MakeCon, args=(con, host, port))
        pool.close
        print(s[1])
        for con in cons:
            t = s[con].getwelcome()  # request some message from the NNTP host to see if it works
            print(t)
        print('end')

Main()
This shows that the connection to the NNTP server works, but I fail at the part where I extract the connections into objects I can use in combination with the nntplib options. I would say I am not that experienced with Python, especially not with multiprocessing.
There are a few different issues with your approach. The biggest is that it won't work to create the connections in different processes and then send them to the main process. This is because each connection opens a socket, and sockets are not serializable (picklable), so they cannot be sent between processes.
And even if that had worked, the usage of .apply_async() is not the right way to go. It is better to use .map(), which returns the output from the function call directly (as opposed to .apply_async(), which returns an object from which the return value has to be extracted).
However, in the current situation the program is I/O bound rather than CPU bound, and in such situations threading works just as well as multiprocessing, since the GIL won't hold back the execution. Thus, changing from multiprocessing to threads and from .apply_async() to .map() gives the following solution:
#!/usr/bin/env python3
import sys
import ssl
from nntplib import NNTP_SSL
from multiprocessing.pool import ThreadPool

def MakeCon(i, host, port):
    context = ssl.SSLContext(ssl.PROTOCOL_TLS)
    s = NNTP_SSL(host, port=port, ssl_context=context, readermode=True)
    print('created connection', i)  # print to see progress
    sys.stdout.flush()
    return s

def Main():
    host = 'reader.xsnews.nl'
    port = 563
    num_con = 4
    y = MakeCon(1, host, port).getwelcome()  # request some message from the NNTP host to see if it works
    print(y)

    cons = range(num_con)
    s = [None] * num_con
    pool = ThreadPool()
    s = pool.map(lambda con: MakeCon(con, host, port), cons)
    pool.close()
    return s  # the list of open connections

if __name__ == "__main__":
    Main()
A small word of advice, though. Be careful with creating too many connections, since that might not be looked upon kindly by the server, as you are draining its resources.
Also, if you are going to use your different connections to fetch articles, those calls should probably also be made in different threads.
And, as a final comment, the same effect as using threads can be achieved with asyncio. That, however, is something you probably need to study for a while before you feel comfortable using it.
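For completeness, a rough sketch of the asyncio variant (it still drives the blocking MakeCon calls through a thread-pool executor, since nntplib itself is blocking; it reuses the MakeCon function defined above):
import asyncio

async def make_all(host, port, num_con):
    loop = asyncio.get_running_loop()
    # Schedule each blocking MakeCon call on the default thread-pool executor,
    # so the connections are established concurrently, like ThreadPool.map above.
    tasks = [loop.run_in_executor(None, MakeCon, i, host, port)
             for i in range(num_con)]
    return await asyncio.gather(*tasks)

# connections = asyncio.run(make_all('reader.xsnews.nl', 563, 4))  # Python 3.7+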

Malformed DNS response packet (python + scapy)

I'm working on creating a proxy server using Python and Scapy. TCP packets seem to be working fine, but I'm running into some issues with UDP, specifically DNS requests. Essentially, when a DNS request comes in I capture it in my script, perform the DNS lookup, and try to return the response to the person who made the DNS query. The script successfully performs the lookup and returns the DNS response; however, Wireshark tells me it's a "Malformed Packet". Could someone tell me what I need to do in order to correctly return the DNS response?
#!/usr/bin/env python
from tornado.websocket import WebSocketHandler
from tornado.httpserver import HTTPServer
from tornado.web import Application
from tornado.ioloop import IOLoop
from collections import defaultdict
from scapy.all import *
import threading

outbound_udp = defaultdict(int)
connection = None

class PacketSniffer(threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self)

    def run(self):
        global connection
        while (True):
            pkt = sniff(iface="eth0", count=1)
            if pkt[0].haslayer(DNS):
                print "Returning back has UDP"
                print pkt.summary()
                ipPacket = pkt[0][IP]
                dnsPacket = pkt[0][DNS]
                if outbound_udp[(ipPacket.src, dnsPacket.id)] > 0:
                    outbound_udp[(ipPacket.src, dnsPacket.id)] -= 1
                    print "Found in outbound_udp"
                    # Modify the destination address back to the address of the TUN on the host.
                    ipPacket.dst = "10.0.0.1"
                    try:
                        del ipPacket[TCP].chksum
                        del ipPacket[IP].chksum
                        del ipPacket[UDP].chksum
                    except IndexError:
                        print ""
                    ipPacket.show2()  # Force recompute the checksum
                    if connection:
                        connection.write_message(str(ipPacket).encode('base64'))

sniffingThread = PacketSniffer()
sniffingThread.daemon = True
sniffingThread.start()
Some bugs have been fixed recently in Scapy around DNS (and other sophisticated protocols, but DNS is the most frequently seen):
https://bitbucket.org/secdev/scapy/issue/913/
https://bitbucket.org/secdev/scapy/issue/5104/
https://bitbucket.org/secdev/scapy/issue/5105/
Trying with the latest Scapy development version from the Mercurial repository (hg clone http://bb.secdev.org/scapy) should fix this.

How to achieve tcpflow functionality (follow tcp stream) purely within python

I am writing a tool in Python (platform is Linux); one of the tasks is to capture a live TCP stream and to apply a function to each line. Currently I'm using:
import subprocess

proc = subprocess.Popen(['sudo', 'tcpflow', '-C', '-i', interface, '-p', 'src', 'host', ip],
                        stdout=subprocess.PIPE)
for line in iter(proc.stdout.readline, ''):
    do_something(line)
This works quite well (with the appropriate entry in /etc/sudoers), but I would like to avoid calling an external program.
So far I have looked into the following possibilities:
flowgrep: a Python tool which looks just like what I need, BUT it uses pynids internally, which is 7 years old and seems pretty much abandoned. There is no pynids package for my Gentoo system, and it ships with a patched version of libnids which I couldn't compile without further tweaking.
scapy: this is a packet manipulation program/library for Python; I'm not sure if TCP stream reassembly is supported.
pypcap or pylibpcap as wrappers for libpcap. Again, libpcap is for packet capturing, whereas I need stream reassembly, which is not possible according to this question.
Before I dive deeper into any of these libraries I would like to know if maybe someone has a working code snippet (this seems like a rather common problem). I'm also grateful if someone can give advice about the right way to go.
Thanks
Jon Oberheide has led efforts to maintain pynids, which is fairly up to date at:
http://jon.oberheide.org/pynids/
So, this might permit you to further explore flowgrep. Pynids itself handles stream reconstruction rather elegantly. See http://monkey.org/~jose/presentations/pysniff04.d/ for some good examples.
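For orientation, the basic pynids flow looks roughly like the sketch below. It follows the classic libnids-style API used in those examples and is written from memory, so treat the attribute names as assumptions to verify against your pynids version:
import nids

def handle_tcp(tcp):
    # tcp.addr is ((src_ip, sport), (dst_ip, dport))
    if tcp.nids_state == nids.NIDS_JUST_EST:
        tcp.server.collect = 1   # collect data sent to the server (client -> server)
        tcp.client.collect = 1   # collect data sent to the client (server -> client)
    elif tcp.nids_state == nids.NIDS_DATA:
        tcp.discard(0)           # keep buffering the reassembled stream
    elif tcp.nids_state in (nids.NIDS_CLOSE, nids.NIDS_RESET, nids.NIDS_TIMEOUT):
        data = tcp.client.data[:tcp.client.count]   # server -> client stream
        for line in data.splitlines():
            do_something(line)

nids.param("device", "eth0")
nids.init()
nids.register_tcp(handle_tcp)
nids.run()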
Just as a follow-up: I abandoned the idea of monitoring the stream at the TCP layer. Instead I wrote a proxy in Python and let the connection I want to monitor (an HTTP session) connect through this proxy. The result is more stable and does not need root privileges to run. This solution depends on pymiproxy.
This goes into a standalone program, e.g. helper_proxy.py
from multiprocessing.connection import Listener
import StringIO
from httplib import HTTPResponse
import threading
import time
from miproxy.proxy import RequestInterceptorPlugin, ResponseInterceptorPlugin, AsyncMitmProxy

class FakeSocket(StringIO.StringIO):
    def makefile(self, *args, **kw):
        return self

class Interceptor(RequestInterceptorPlugin, ResponseInterceptorPlugin):
    conn = None

    def do_request(self, data):
        # do whatever you need with the sent data here, I'm only interested in responses
        return data

    def do_response(self, data):
        if Interceptor.conn:  # if the listener is connected, send the response to it
            response = HTTPResponse(FakeSocket(data))
            response.begin()
            Interceptor.conn.send(response.read())
        return data

def main():
    proxy = AsyncMitmProxy()
    proxy.register_interceptor(Interceptor)
    ProxyThread = threading.Thread(target=proxy.serve_forever)
    ProxyThread.daemon = True
    ProxyThread.start()
    print "Proxy started."
    address = ('localhost', 6000)  # family is deduced to be 'AF_INET'
    listener = Listener(address, authkey='some_secret_password')
    while True:
        Interceptor.conn = listener.accept()
        print "Accepted Connection from", listener.last_accepted
        try:
            Interceptor.conn.recv()
        except:
            time.sleep(1)
        finally:
            Interceptor.conn.close()

if __name__ == '__main__':
    main()
Start with python helper_proxy.py. This will create a proxy listening for http connections on port 8080 and listening for another python program on port 6000. Once the other python program has connected on that port, the helper proxy will send all http replies to it. This way the helper proxy can continue to run, keeping up the http connection, and the listener can be restarted for debugging.
Here is how the listener works, e.g. listener.py:
from multiprocessing.connection import Client

def main():
    address = ('localhost', 6000)
    conn = Client(address, authkey='some_secret_password')
    while True:
        print conn.recv()

if __name__ == '__main__':
    main()
This will just print all the replies. Now point your browser to the proxy running on port 8080 and establish the http connection you want to monitor.

How to bind an ip address to telnetlib in Python

The code below binds an ip address to urllib, urllib2, etc.
import socket
true_socket = socket.socket

def bound_socket(*a, **k):
    sock = true_socket(*a, **k)
    sock.bind((sourceIP, 0))
    return sock

socket.socket = bound_socket
Is it also possible to bind an IP address for telnetlib this way?
telnetlib, at least in recent Python releases, uses socket.create_connection (see telnetlib's sources here), but that should also be caught by your monkeypatch (sources here -- you'll see it uses a bare identifier socket, but that's exactly the module you're monkeypatching). Of course monkeypatching is always extremely fragile (the tiniest optimization in some future release, hoisting the global lookup of socket in create_connection, and you're toast...;-), so maybe you'll want to monkey-patch create_connection directly as a modestly stronger approach.
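In that spirit, a minimal sketch of patching create_connection itself (the source IP is a placeholder for an address configured on one of your interfaces):
import socket
import telnetlib

_real_create_connection = socket.create_connection

def _create_connection_bound(address, *args, **kwargs):
    # telnetlib calls socket.create_connection((host, port), timeout),
    # so forcing source_address here makes the Telnet socket bind first.
    kwargs['source_address'] = ('192.168.0.10', 0)   # placeholder IP
    return _real_create_connection(address, *args, **kwargs)

socket.create_connection = _create_connection_bound

tn = telnetlib.Telnet('example.com')   # now originates from 192.168.0.10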

Source interface with Python and urllib2

How do I set the source IP/interface with Python and urllib2?
Unfortunately the stack of standard library modules in use (urllib2, httplib, socket) is somewhat badly designed for the purpose -- at the key point in the operation, HTTPConnection.connect (in httplib) delegates to socket.create_connection, which in turn gives you no "hook" whatsoever between the creation of the socket instance sock and the sock.connect call, for you to insert the sock.bind just before sock.connect that is what you need to set the source IP (I'm evangelizing widely for NOT designing abstractions in such an airtight, excessively-encapsulated way -- I'll be speaking about that at OSCON this Thursday under the title "Zen and the Art of Abstraction Maintenance" -- but here your problem is how to deal with a stack of abstractions that WERE designed this way, sigh).
When you're facing such problems you only have two not-so-good solutions: either copy, paste and edit the misdesigned code into which you need to place a "hook" that the original designer didn't cater for; or, "monkey-patch" that code. Neither is GOOD, but both can work, so at least let's be thankful that we have such options (by using an open-source and dynamic language). In this case, I think I'd go for monkey-patching (which is bad, but copy and paste coding is even worse) -- a code fragment such as:
import socket
true_socket = socket.socket

def bound_socket(*a, **k):
    sock = true_socket(*a, **k)
    sock.bind((sourceIP, 0))
    return sock

socket.socket = bound_socket
Depending on your exact needs (do you need all sockets to be bound to the same source IP, or...?) you could simply run this before using urllib2 normally, or (in more complex ways of course) run it at need just for those outgoing sockets you DO need to bind in a certain way (then each time restore socket.socket = true_socket to get out of the way for future sockets yet to be created). The second alternative adds its own complications to orchestrate properly, so I'm waiting for you to clarify whether you do need such complications before explaining them all.
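For that second alternative (patch, use, restore), a rough sketch using a context manager; sourceIP is a placeholder as above:
import socket
from contextlib import contextmanager

true_socket = socket.socket

@contextmanager
def bound_to(source_ip):
    # Temporarily make every new socket bind to source_ip, then restore.
    def bound_socket(*a, **k):
        sock = true_socket(*a, **k)
        sock.bind((source_ip, 0))
        return sock
    socket.socket = bound_socket
    try:
        yield
    finally:
        socket.socket = true_socket

# with bound_to('192.168.1.10'):
#     urllib2.urlopen('http://example.com/')  # sockets created here are bound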
AKX's good answer is a variant on the "copy / paste / edit" alternative so I don't need to expand much on that -- note however that it doesn't exactly reproduce socket.create_connection in its connect method, see the source here (at the very end of the page) and decide what other functionality of the create_connection function you may want to embody in your copied/pasted/edited version if you decide to go that route.
This seems to work.
import urllib2, httplib, socket

class BindableHTTPConnection(httplib.HTTPConnection):
    def connect(self):
        """Connect to the host and port specified in __init__."""
        self.sock = socket.socket()
        self.sock.bind((self.source_ip, 0))
        if isinstance(self.timeout, float):
            self.sock.settimeout(self.timeout)
        self.sock.connect((self.host, self.port))

def BindableHTTPConnectionFactory(source_ip):
    def _get(host, port=None, strict=None, timeout=0):
        bhc = BindableHTTPConnection(host, port=port, strict=strict, timeout=timeout)
        bhc.source_ip = source_ip
        return bhc
    return _get

class BindableHTTPHandler(urllib2.HTTPHandler):
    def http_open(self, req):
        return self.do_open(BindableHTTPConnectionFactory('127.0.0.1'), req)

opener = urllib2.build_opener(BindableHTTPHandler)
opener.open("http://google.com/").read()  # Will fail, 127.0.0.1 can't reach google.com.
You'll need to figure out some way to parameterize "127.0.0.1" there, though.
Here's a further refinement that makes use of HTTPConnection's source_address argument (introduced in Python 2.7):
import functools
import httplib
import urllib2

class BoundHTTPHandler(urllib2.HTTPHandler):
    def __init__(self, source_address=None, debuglevel=0):
        urllib2.HTTPHandler.__init__(self, debuglevel)
        self.http_class = functools.partial(httplib.HTTPConnection,
                                            source_address=source_address)

    def http_open(self, req):
        return self.do_open(self.http_class, req)
This gives us a custom urllib2.HTTPHandler implementation that is source_address aware. We can add it to a new urllib2.OpenerDirector and install it as the default opener (for future urlopen() calls) with the following code:
handler = BoundHTTPHandler(source_address=("192.168.1.10", 0))
opener = urllib2.build_opener(handler)
urllib2.install_opener(opener)
I thought I'd follow up with a slightly better version of the monkey patch. If you need to be able to set different port options on some of the sockets or are using something like SSL that subclasses socket, the following code works a bit better.
_ip_address = None

def bind_outgoing_sockets_to_ip(ip_address):
    """This binds all python sockets to the passed in ip address"""
    global _ip_address
    _ip_address = ip_address

import socket
from socket import socket as s

class bound_socket(s):
    def connect(self, *args, **kwargs):
        if self.family == socket.AF_INET:
            if self.getsockname()[0] == "0.0.0.0" and _ip_address:
                self.bind((_ip_address, 0))
        s.connect(self, *args, **kwargs)

socket.socket = bound_socket
You only have to bind the socket on connect if you need to run something like a web server in the same process that needs to bind to a different IP address.
Reasoning that I should monkey-patch at the highest level available, here's an alternative to Alex's answer which patches httplib instead of socket, taking advantage of httplib.HTTPSConnection.__init__()'s source_address keyword argument (which is not exposed by urllib2, AFAICT). Tested and working on Python 2.7.2.
import httplib

HTTPSConnection_real = httplib.HTTPSConnection

class HTTPSConnection_monkey(HTTPSConnection_real):
    def __init__(*a, **kw):
        HTTPSConnection_real.__init__(*a, source_address=(SOURCE_IP, 0), **kw)

httplib.HTTPSConnection = HTTPSConnection_monkey
As of Python 2.7, httplib.HTTPConnection has a source_address parameter, allowing you to provide an (IP, port) pair to bind to.
See: http://docs.python.org/2/library/httplib.html#httplib.HTTPConnection
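So on 2.7+ you can also skip the monkey patching entirely when you control the httplib call yourself; a minimal sketch (addresses are placeholders):
import httplib

# Bind the outgoing TCP connection to a specific local address directly.
conn = httplib.HTTPConnection('example.com', 80,
                              source_address=('192.168.1.10', 0))
conn.request('GET', '/')
resp = conn.getresponse()
print resp.status, resp.reason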
