Python cURL specified IP address - python

I want to make a GET request to retrieve the contents of a web-page or a web service.
I want to send specific headers for this request AND
I want to set the IP address FROM WHICH this request will be sent.
(The server on which this code is running has multiple IP addresses available).
How can I achieve this with Python and its libraries?

I checked urllib2 and it won't set the source address (at least not on Python 2.7). The underlying library is httplib, which does have that feature, so you may have some luck using that directly.
From the httplib documentation:
class httplib.HTTPConnection(host[, port[, strict[, timeout[, source_address]]]])
The optional source_address parameter may be a tuple of a (host, port) to use as the source address the HTTP connection is made from.
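As a minimal sketch of using httplib directly with this parameter (the target host, header values and local IP below are placeholders, not something from the question):

import httplib

# Bind the outgoing connection to one of the machine's local addresses;
# port 0 lets the OS pick an ephemeral source port. Host and IP are examples.
conn = httplib.HTTPConnection("example.com", 80, timeout=30,
                              source_address=("192.0.2.10", 0))
conn.request("GET", "/", headers={"User-Agent": "my-client/1.0",
                                  "Accept": "text/html"})
resp = conn.getresponse()
print resp.status, resp.reason
body = resp.read()
conn.close()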
You may even be able to convince urllib2 to use this feature by creating a custom HTTPHandler class. You will need to duplicate some code from urllib2.py, because AbstractHTTPHandler is using a simpler version of this call:
class AbstractHTTPHandler(BaseHandler):
    # ...
    def do_open(self, http_class, req):
        # ...
        h = http_class(host, timeout=req.timeout) # will parse host:port
Where http_class is httplib.HTTPConnection for HTTP connections.
Probably this would work instead, if patching urllib2.py (or duplicating and renaming it) is an acceptable workaround:
h = http_class(host, timeout=req.timeout, source_address=(req.origin_req_host,0))

There are many options available to you for making HTTP requests. I don't even think there is really a commonly agreed-upon "best". You could use any of these:
urllib2: http://docs.python.org/library/urllib2.html
requests: http://docs.python-requests.org/en/v0.10.4/index.html
mechanize: http://wwwsearch.sourceforge.net/mechanize/
This list is not exhaustive. Read the docs and take your pick. Some are lower level and some offer rich browser-like features. All of them let you set headers before making a request.
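For instance, a minimal sketch with requests (the URL and header values are just placeholders):

import requests

headers = {
    "User-Agent": "my-client/1.0",   # example values only
    "Accept": "application/json",
}
response = requests.get("http://example.com/api", headers=headers, timeout=10)
print response.status_code
print response.headers.get("Content-Type")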

Related

Python - How to detect whether coming connections using proxy or not

I am working on a simple program written in Python that sniffs incoming network packets. It then lets the user enable additional modules such as DoS detection or ping prevention. With the help of the sniffer, I can get an incoming connection's IP address, MAC address, protocol flag and packet content. Now, what I want to do is add a new module that detects whether the sender is using a proxy or not and acts accordingly. I have searched for methods that can be used with Python but cannot find a useful one. How many ways are there to detect a proxy in Python?
The sniffer part of my code looks something like this:
.....
# Raw link-layer socket; 8 == htons(0x0800), i.e. the IPv4 EtherType
sock = socket.socket(socket.PF_PACKET, socket.SOCK_RAW, 8)
while True:
    packet = sock.recvfrom(2048)
    ipheader = packet[0][14:34]                    # 20-byte IP header after the 14-byte Ethernet header
    ip_hdr = struct.unpack("!8sB3s4s4s", ipheader)
    sourceIP = socket.inet_ntoa(ip_hdr[3])         # source address field
    tcpheader = packet[0][34:54]                   # 20-byte TCP header follows
    tcp_hdr = struct.unpack("!HH9ss6s", tcpheader)
    protoFlag = binascii.hexlify(tcp_hdr[3])       # TCP flags byte
......
Firstly, you mean incoming packets.
Secondly, from the server's TCP point of view it is connected to the proxy, not to the downstream client, so your server can't tell from the packet that there is a proxy involved.
However, if you are at the application level, e.g. HTTP, there might be an X-Forwarded-For header available which contains the original client IP. I say "might" because the proxy server decides whether or not to send this header to you. If you are expecting incoming HTTP connections to your server, you can take a look at Python's urllib2, although I'm not sure whether you can access X-Forwarded-For using this library.
From the docs:
urllib2.urlopen(url[, data][, timeout])
...
This function returns a file-like object with two additional methods:
geturl() — return the URL of the resource retrieved, commonly used to determine if a redirect was followed
info() — return the meta-information of the page, such as headers, in the form of an mimetools.Message instance (see Quick Reference to HTTP Headers)
So using info() will retrieve the headers. Hope you find what you're looking for in there.
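As a minimal sketch of pulling a header out of that object (the URL is a placeholder, and whether X-Forwarded-For shows up at all is entirely up to the proxy):

import urllib2

resp = urllib2.urlopen("http://example.com/")
headers = resp.info()                              # mimetools.Message instance
forwarded = headers.getheader("X-Forwarded-For")   # None if the header is absent
print forwarded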
There aren't many ways to do this, as proxies / VPNs look like real traffic. To add to what Mid said, you can look for headers and/or user agents to help you determine if the user is using a proxy or a VPN.
The only free solution I know is getIPIntel that uses block lists, machine learning, and statistics to determine if the IP is a proxy / VPN or not.
There are other paid solutions like maxmind and blocked.
What you'll need to do is send API queries to these services and parse the results.
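As a rough sketch of querying such a service (the getIPIntel endpoint and parameter names below are written from memory and should be treated as assumptions; check the service's documentation before relying on them):

import urllib2

def check_ip(ip, contact_email):
    # Assumed endpoint/parameters; verify against the getIPIntel docs.
    url = ("http://check.getipintel.net/check.php?ip=%s&contact=%s"
           % (ip, contact_email))
    result = urllib2.urlopen(url, timeout=10).read()
    # The service is documented as returning a probability between 0 and 1
    # that the IP is a proxy/VPN; values close to 1 are suspicious.
    return float(result)

print check_ip("8.8.8.8", "you@example.com")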

Python-Scapy HTTP Traffic Manipulation

I need to intercept an HTTP Response packet from the server and replace it with my own response, or at least modify that response, before it arrives to my browser.
I'm already able to sniff this response and print it, the problem is with manipulating/replacing it.
Is there a way to do so with the scapy library?
Or do I have to connect my browser through a proxy to manipulate the response?
If you want to work from your ordinary browser, then you need a proxy between the browser and the server in order to manipulate the traffic. E.g. see https://portswigger.net/burp/ which is a proxy specifically created for penetration testing, with easy replacing of responses/requests (and it is scriptable, too).
If you want to script your whole session in scapy, then you can create requests and responses to your liking, but the response does not go to the browser. Also, you can record an ordinary web session (with tcpdump/wireshark/scapy) into a pcap, then use scapy to read the pcap, modify it, and send similar requests to the server.
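A rough sketch of that pcap approach (the filename and the header rewrite are placeholders; note this prepares modified requests rather than injecting responses into a live browser session):

from scapy.all import rdpcap, IP, TCP, Raw

packets = rdpcap("session.pcap")   # previously recorded session (example name)

for pkt in packets:
    # Look at HTTP requests: TCP payload headed for port 80.
    if pkt.haslayer(Raw) and pkt.haslayer(TCP) and pkt[TCP].dport == 80:
        payload = pkt[Raw].load
        # Modify the request however you like, e.g. rewrite a header.
        payload = payload.replace("User-Agent: Mozilla", "User-Agent: scapy")
        print payload
        # To replay this against the server you still need a TCP handshake
        # (scapy does not track connection state for you), or you can hand
        # the modified payload to an ordinary socket.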

Python urllib2 anonymity through tor

I have been trying to use SocksiPy (http://socksipy.sourceforge.net/) and set my sockets with SOCKS5 and set it to go through a local tor service that I am running on my box.
I have the following:
socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, "localhost", 9050, True)
socket.socket = socks.socksocket
import urllib2
And I am doing something similar to:
workItem = "http://192.168.1.1/some/stuff" #obviously not the real url
req = urllib2.Request(workItem)
req.add_header('User-agent', 'Mozilla 5.10')
res = urllib2.urlopen(req, timeout=60)
And even using this I have been identified by the website; my understanding was that I would be coming out of a random endpoint every time and it wouldn't be able to identify me. I can confirm, by hitting whatsmyip.org with this, that my endpoint is different every time. Are there some other steps I have to take to stay anonymous? I am using an IP address in the URL, so it shouldn't be doing any DNS resolution that might give it away.
There is no such User-Agent as 'Mozilla 5.10' in reality. If the server employs even the simplest fingerprinting based on the User-Agent, it will identify you based on this uncommon setting.
And I don't think you understand Tor: it does not provide full anonymity. It only helps by hiding your real IP address. But it does not help if you give your real name on a web site or use easily detectable features such as an uncommon user agent.
You might have a look at the Design and Implementation Notes for the Tor Browser Bundle to see what kind of additional steps they take to be less detectable and where they still see open problems. You might also read about device fingerprinting, which is used to identify the seemingly anonymous peer.
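For the user-agent part at least, the fix is to send a string copied from a real browser; a minimal tweak to the question's code (the exact value below is only an illustration of the format):

import urllib2

req = urllib2.Request("http://192.168.1.1/some/stuff")
# A User-Agent taken from a real browser, not an invented one.
req.add_header('User-agent',
               'Mozilla/5.0 (Windows NT 6.1; rv:10.0) Gecko/20100101 Firefox/10.0')
res = urllib2.urlopen(req, timeout=60)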

Python urllib2 trace route

I'm using Python and urllib2 to make POST requests, and it's working successfully. However, when I make several POSTs one after the other, I sometimes get the error "502 proxy in use". Our company does use a proxy, but I'm not set up to hit the proxy since I'm working internally. Is there a way to get a trace route of how the POST request is being routed using urllib2 and Python?
Thanks
I'm not sure what you mean by "a trace route". traceroute is an IP thing, two levels below HTTP. And I doubt you want anything like that. You can find out whether there were any redirects, whether a proxy was used, etc., either by using a general-purpose sniffer or, much more simply, by just asking urllib2.
For example, let's say your code looks like this:
import urllib
import urllib2

url = 'http://example.com'
data = urllib.urlencode({'spam': 'eggs'})
req = urllib2.Request(url, data)
resp = urllib2.urlopen(req)
respdata = resp.read()
Then req.has_proxy() will tell you whether it's going to use a proxy, resp.geturl() == url will tell you whether there was a redirect, etc. Read the docs for all the info available.
Meanwhile, if you don't want a proxy, you can either disable whatever settings urllib2 picked up that made it auto-configure the proxy (e.g., unset http_proxy before running your script), override the default handler chain to make sure there's no ProxyHandler, build an explicit OpenerDirector instead of using the default one, etc.
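For example, a minimal sketch of forcing urllib2 to bypass any auto-detected proxy (the URL is a placeholder):

import urllib2

# An empty ProxyHandler overrides whatever proxies urllib2 would otherwise
# pick up from the environment (http_proxy and friends).
opener = urllib2.build_opener(urllib2.ProxyHandler({}))
resp = opener.open("http://example.com/", timeout=30)
print resp.geturl()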

Pyramid subrequests

I need to call GET, POST, PUT, etc. requests to another URI because of search, but I cannot find a way to do that internally with pyramid. Is there any way to do it at the moment?
Simply use the existing python libraries for calling other webservers.
On Python 2.x, use urllib2; on Python 3.x, use urllib.request instead. Alternatively, you could install requests.
Do note that calling external sites from your server while serving a request yourself could mean your visitors end up waiting for a third-party web server that has stopped responding. Make sure you set decent timeouts.
Pyramid uses WebOb, which has a client API as of version 1.2:
from webob import Request
r = Request.blank("http://google.com")
response = r.send()
Generally, anything you want to override for the request you just pass in as a parameter:
from webob import Request
r = Request.blank("http://facebook.com",method="DELETE")
Another handy feature is that you can see the request as the HTTP that is passed over the wire:
print r
DELETE HTTP/1.0
Host: facebook.com:80
docs
Also check the response status code: response.status_int. I use it, for example, to introspect my internal URIs and see whether or not a given relative URI is really served by the framework (e.g. to generate breadcrumbs and make intermediate paths links only if there are pages behind them).
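As a rough sketch of that breadcrumb check (the helper name and URLs are examples of mine; it assumes the app is reachable at the address passed to Request.blank):

from webob import Request

def is_served(url):
    # Treat any status below 400 as "there is a real page behind this path".
    response = Request.blank(url).send()
    return response.status_int < 400

# Only turn intermediate breadcrumb segments into links if they resolve.
for part in ("http://localhost:6543/docs", "http://localhost:6543/docs/api"):
    print part, is_served(part)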
