Obtaining original destination ip in cherrypy

Obtaining original destination ip in cherrypy - python

I am running a captive portal on a cherrypy server and I have set up iptables rules that REDIRECT all http traffic from unregistered MAC addresses to the portal. After a user registers with me via the portal splash page, I add an iptables exception to let their traffic through.
Now what I want to do is redirect the user to the page they were originally going for - before they got sent to the portal. I know that iptables sets a field with the original destination information for all TCP packets, and I know there is a C function called getsockopt to read that field. However, I don't know how to access the socket associated with a request in cherrypy.
Can anybody help me out? :)

I'm not an expert in low-level networking and don't know how common open Wi-Fi authorisation implementations tag its clients. But what seems true to me is that in OSI model, lower layers know nothing about upper layers. In other words IP has no idea of HTTP terms and a page URL specifically.
This way having a socket reference at hand, which I believe is possible to retrieve through customising CherryPy, will give you original IP address at most, not URL. Also mixing networking (IP) and application (HTTP) layers, and generally managing one application entity in several places, will likely result in issues of all sorts. For instance dealing with HTTP speaking agents, forward and reverse proxies for instance, which won't reserve nuances of lower layer.
Update
Okay, since you say you also have the request URL, here is how you can retrieve the raw socket. As you can see, it is deep under the hood and essentially an implementation detail that an end-user shouldn't rely onto. It is not a part of the contract and it can be changed in any next version. Thus you have a good chance to shoot oneself in the foot.
#!/usr/bin/env python
import cherrypy
config = {
'global' : {
'server.socket_host' : '127.0.0.1',
'server.socket_port' : 8080,
'server.thread_pool' : 8
},
}
class App:
#cherrypy.expose
def index(self):
'''For caveats and details on the slippery slope, take a look at ws4py
https://github.com/Lawouach/WebSocket-for-Python/blob/master/ws4py/server/cherrypyserver.py
'''
print(cherrypy.serving.request.rfile.rfile._sock)
return 'Make sure you know what you are doing.'
if __name__ == '__main__':
cherrypy.quickstart(App(), '/', config)

Related

Conditional upstream proxying with mitmproxy (PAC equivalent module/script)

I have a super special proxy i need to use to access certain hosts ( it turns all other traffic away ), and a bunch of complex libraries and applications that can only take a single http proxy configuration parameter for all their http requests. Which are of course a mix of restricted/proxied traffic and traffic that this proxy is refusing to handle.
I've found an example script showing how to manipulate the upstream proxy host/address in upstream mode, but couldn't find any indication in public API, that "breaking out" of upstream mode in a script is possible, to have mitmproxy directly handle traffic instead of sending it upstream, given certain conditions are met ( request target host mostly )
What am i missing? Should i be trying to do this in "regular" mode?
I invoke PAC in the title because it has the DIRECT keyword that allows the library/application to continue processing the request without going to a proxy.
thanks!

i've found evidence that this is in fact not possible and unlikely to be implemented https://github.com/mitmproxy/mitmproxy/issues/2042#issuecomment-280857954 although this issue and comment is very old, there are some recent related and unanswered questions such as How can I switch mitmproxy mode based on attributes of the proxied request
So instead, i'm pivoting to tinyproxy which does seem to provide this exact functionality https://github.com/tinyproxy/tinyproxy/blob/1.10.0/etc/tinyproxy.conf.in#L143
A shame because the replay/monitoring/interactive editing features of mitmproxy would've been amazing to have

How to allow access to App Engine from only specified IPs?

I am building a simple POST handler on GAE in Python that will accept a POST and write it to a Cloud SQL database.
I would like to limit access to this app to a limited number of IPs - non-GAE webservers where the POST originates. Essentially, how to allow POSTS from my IPs and disallow all other traffic?
Seems like a simple and common operation, but I haven't found a solution online that seems to fit. Most GAE authentication and routing packages are built around user auth.
Where should I look for a solution here? What Google keywords should I be using? Is this going to be written into the app itself or should I be focused on another component of GCP for IP access and routing?
Thanks!

All credit to Paul Collingwood for alerting me to the existence of request.remote_addr.
Here is my solution as of now:
ALLOWED_IP = ['173.47.xx.xx1', '173.47.xx.xx2']
class PostHandler(webapp2.RequestHandler):
def post(self):
# Read the IP of the incoming request
ip = self.request.remote_addr
# If the IP is allowed, execute our code
if ip in ALLOWED_IP:
# Execute some awesome code
# Otherwise, slam the door!
else:
self.error(403)
I'm not entirely sure that my self.error() usage is appropriate here, but this is working! POST requests made from the allowed IPs are accepted and executed. All others are given a 403.
I'm always eager to hear improvement suggestions.

Python - How to detect whether coming connections using proxy or not

I am working on a simple program written in Python which sniffs coming network packets. Then, let user use added modules like DoS detection or Ping prevention. With the help of sniffer, I can get incoming connections' IP address, MAC address, protocol flag and packet content. Now, what I want to do is adding a new module that detects whether sender using proxy or not and do some thing according to it. I was searched on the methods that can be used with Python but can not find useful one. How many ways are there to detect proxy for Python?
My sniffer code part is something like that:
.....
sock = socket.socket(socket.PF_PACKET, socket.SOCK_RAW, 8)
while True:
packet = sock.recvfrom(2048)
ipheader = packet[0][14:34]
ip_hdr = struct.unpack("!8sB3s4s4s", ipheader)
sourceIP = socket.inet_ntoa(ip_hdr[3])
tcpheader = packet[0][34:54]
tcp_hdr = struct.unpack("!HH9ss6s", tcpheader)
protoFlag = binascii.hexlify(tcp_hdr[3])
......

Firstly, you mean incoming packets.
secondly,
From the server TCP's point of view it is connected to the proxy, not the downstream client.
so your server can't identify that there is a proxy involved from the packet.
however, if you are in the application level like http proxy, there might be a X-forwarded-for header available in which there will be the original client IP. I said it might be because proxy server will decide whether or not send this header to you. If you are expecting incoming http connections to your server, you can take a look at python's urllib2 although I'm not sure if you can access the X-forwarded-for using this library.
From the docs:
urllib2.urlopen(url[, data][, timeout])
...
This function returns a file-like object with two additional methods:
geturl() — return the URL of the resource retrieved, commonly used to determine if a redirect was followed
info() — return the meta-information of the page, such as headers, in the form of an mimetools.Message instance (see Quick Reference to HTTP Headers)
so using info() will retrieve the headers. hope you find what you're looking for in there.

There aren't many ways to do this, as proxies / VPNs look like real traffic. To add to what Mid said, you can look for headers and/or user agents to help you determine if the user is using a proxy or a VPN.
The only free solution I know is getIPIntel that uses block lists, machine learning, and statistics to determine if the IP is a proxy / VPN or not.
There are other paid solutions like maxmind and blocked.
What you'll need to do is send API queries to these services and parse the results.

How do I safely get the user's ip address in Flask that has a proxy?

I am using Flask and need to get the user's IP address. This is usually done through request.remote_addr but since this app is hosted at a 3rd party (and using cloudflare) it just returns the localhost.
Flask suggests getting the X-Forwarded-Host but then they immediately say it is a security risk. Is there a safe way to get the client's real ip?

The Problem
The issue here is not that the ProxyFix itself will cause the user to get access to your system, but rather the fact that the ProxyFix will take what was once mostly reliable information and replace it instead with potentially unreliable information.
For starters, when you don't use ProxyFix, the REMOTE_ADDR attribute is most likely retrieved from the source IP address in the TCP packets. While not impossible, the source IP address in TCP packets are tough to spoof. Therefore, if you need a reliable way to retrieve the user's IP address, REMOTE_ADDR is a good way to do it; in most cases, you can rely on it to provide you something that is accurate when you do request.remote_addr.
The problem is, of course, in a reverse-proxy situation the TCP connection is not coming from the end user; instead, the end user makes a TCP connection with the reverse proxy, and the reverse proxy then makes a second TCP connection with your web app. Therefore, the request.remote_addr in your app will have the IP address of the reverse proxy rather than the original user.
A Potential Solution
ProxyFix is supposed to solve this problem so that you can make request.remote_addr have the user's IP address rather than the proxy. It does this by looking at the typical HTTP header that remote proxies (like Apache and Nginx) add into the HTTP header (X-Forwarded-For) and use the user's IP address it finds there. Note that Cloudflare uses a different HTTP Header, so ProxyFix probably won't help you; you'll need to write your own implementation of this middleware to get request.remote_addr to use the original client's IP address. However, in the rest of this answer I will continue to refer to that fix as "ProxyFix".
This solution, however, is problematic. The problem is that while the TCP header is mostly reliable, the HTTP headers are not; if a user can bypass your reverse proxy and send data right to the server, they can put whatever they want in the HTTP header. For example, they can make the IP address in the HTTP header the IP address of someone else! If you use the IP address for authentication, the user can spoof that authentication mechanism. If you store the IP address in your database and then display it in your application to another user in HTML, the user could inject SQL or Javascript into the header, potentially causing SQL injection or XSS vulnerabilities.
So, to summarize; ProxyFix takes a known mostly-safe solution to retrieve the user's IP address from a TCP packet and switches it to using the not-very-safe-by-itself solution of parsing an easily-spoofed HTTP header.
Therefore, the recomendation to use ProxyFix ONLY in reverse proxy situations means just that: don't use this if you accept connections from places that are NOT the proxy. This is often means have the reverse proxy (like Nginx or Apache) handle all your incoming traffic and have your application that actually uses ProxyFix safe behind a firewall.
You should also read this post which explains how ProxyFix was broken in the past (although is now fixed). This will also explains how ProxyFix works, and give you ideas on how to set your num_proxies argument.
A Better Solution
Let's say your user is at point A, they send the request to Cloudflare (B) which eventually sends the request to your final application (point C). Cloudflare will send the IP address of A in the CF-Connecting-IP header.
As explained above, if the user finds the IP address to point C, they could send a specially crafted HTTP request directly to point C which includes any header info they want. ProxyFix will use its logic to determine what the IP address is from the HTTP header, which of course is problematic if you rely on that value for, well, mostly anything.
Therefore, you might want to look at using something like mod_cloudflare, which allows you to do these proxy fixes directly in the Apache mod, but only when the HTTP connection comes from Cloudflare IP addresses (as defined by the TCP IP source). You can also have it only accept connections from Cloudflare. See How do I restore original visitor IP to my server logs for more info on this and help doing this with other servers (like Nginx).
This should give you a start. However, keep in mind that you're still not "safe": you've only shut down one possible attack vector, and that attack vector assumed that the attacker knew the IP address of your actual application. In that case, the malicious user could try to do a TCP attack with a spoofed Cloudflare IP address, although this would be extremely difficult. More likely, if they wanted to cause havoc, they would just DDOS your source server since they've bypassed Cloudflare. So, there are plenty more things to think about in securing, your application. Hopefully this helps you with understanding how to make one part slightly safer.

urllib2 - get resource if you already know the IP

In my python script, I am fetching pages but I already know the IP of the server.
So I could save it the hassle of doing a DNS lookup, if I can some how pass in the IP and hostname in the request.
So, if I call
http://111.111.111.111/
and then pass the hostname in the HOST attribute, I should be OK. However the issue I see is on the server side, if the user looks at the incomming request (ie REQUEST_URI) then they will see I went for the IP.
Anyone have any ideas?

First, the main idea is suspicious. Well, you can "know" IP of the server but this knowledge is temporary and its correctness time is controlled by DNS TTLs. For stable configuration, server admin can provide DNS record with long TTL (e.g. a few days) so DNS request will be always fulfilled using the nearest caching resolver or nscd. For changing configuration, TTL can be reduced to a few seconds or ever to 0 (means no caching), and it can be useful for some kind of load balancers. You try to organize your own resolver cache which is TTL ignorant, and this can lead to requests to non-functioning or wrong servers, with incorrect contents. So, I suggest not to do this.
If you are strictly sure you shall do this and you can't use external tools as custom resolver or even /etc/hosts, try to install custom "opener" (see urllib2.build_opener() function in documentation) which overrides DNS lookup. However I didn't do this ever, the knowledge is only on documentation read just now.

You can add the ip address mapping to the hosts file.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.