Looking at the documentation for urllib2, it says it supports HTTPS connections. However, what it doesn't make clear is how you enable it: do you, for example, take HTTPBasicAuth and replace the HTTP with HTTPS, or do you just need to pass an HTTPS URL when you actually open the connection?
Python < 2.7.9:
You can simply pass an HTTPS URL when you open the connection. Heed the warning in the urllib2 documentation that states:
"Warning HTTPS requests do not do any verification of the server’s certificate."
As such, I recommend using the Python Requests library, which provides a better interface and many features, including SSL cert verification and Unicode support.
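For example, a minimal sketch with urllib2 on Python 2 (example.com is just a placeholder URL):

import urllib2

# Just pass an https:// URL; note that before 2.7.9 the server's certificate is not verified
response = urllib2.urlopen('https://example.com/')
print(response.getcode())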
Update 20150120:
Python 2.7.9 added HTTPS hostname verification as standard. See the change note at https://docs.python.org/2/library/httplib.html#httplib.HTTPSConnection
Thanks to #EnnoGröper for the notice.
Related
I am trying to replicate the following client requests via python-requests.
Under client connection I see the HTTP version, which is 2.0, and the TLS version, which is 1.3. As far as I know, requests uses TLS 1.3, but my requests are failing as of now.
I also wonder if I need to pass certificates. I would like to understand how this request is different from a regular request, which would simply be:
r = requests.get('someurl')
How can I use requests to make the exact client connection shown above? I don't fully understand each pointer. How would I use the h2 ALPN with that specific cipher name? I am not expecting a solid answer to the question; an explanation would be much more helpful!
python-requests doesn't support HTTP/2 requests. You can use the httpx package instead.
HTTPX is a fully featured HTTP client for Python 3, which provides sync and async APIs, and support for both HTTP/1.1 and HTTP/2.
Example
import ssl
import httpx
# create an ssl context
ssl_context = ssl.SSLContext(protocol=ssl.PROTOCOL_TLS)
# ssl.PROTOCOL_TLS - Selects the highest protocol version that both the client and server support.
# Despite the name, this option can select both "SSL" and "TLS" protocols.
# set protocol to use
ssl_context.set_alpn_protocols(["h2"])
CIPHERS = 'ECDH+AESGCM:ECDH+CHACHA20:DH+AESGCM:DH+CHACHA20:ECDH+AES256:DH+AES256:ECDH+AES128:DH+AES:ECDH+HIGH:DH+HIGH:RSA+AESGCM:RSA+AES:RSA+HIGH:!aNULL:!eNULL:!MD5:!3DES'
# set ciphers
ssl_context.set_ciphers(CIPHERS)
# httpx verify param lets you pass a standard library ssl.SSLContext
response = httpx.get('https://example.com', verify=ssl_context)
print(response.http_version)
# outputs HTTP/2
Instead of using ssl.SSLContext, you can also use httpx.create_ssl_context() to set the ssl context.
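Note that HTTP/2 support in httpx is an optional extra: per the httpx documentation, you install it with pip install httpx[http2] and enable it explicitly on a client, for example:

import httpx

# http2=True enables the HTTP/2 transport (requires the optional 'h2' dependency)
client = httpx.Client(http2=True)
response = client.get('https://example.com')
print(response.http_version)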
As far as I know, python-requests is a library which currently¹ does not support HTTP/2.0. This question has been answered here.
However there are python libraries like Python httpx supporting HTTP/2.0!
Kind regards,
¹ Feb 16, 2021
I'm hacking together an Amazon API, and when using python requests without proxying, it prompts for a captcha. When routing the python requests traffic through Fiddler, it seems to pass without a problem. Is it possible that Amazon is fingerprinting python requests, and that Fiddler changes the fingerprint since it's a proxy?
I viewed headers sent from fiddler and python requests and they are the same.
There are no extra proxying/Fiddler rules/filters set on Fiddler that would create a change.
To be clear, all mentioned proxying is only done locally, so it will not change the public ip address.
Thank you!
The reason is that websites fingerprint your requests using the TLS ClientHello packet. Libraries like JA3 exist to generate a fingerprint for each request, and sites will intentionally block HTTP clients like requests or urllib. If you use a MITM proxy, the proxy server creates a new TLS connection with the destination, so the server only sees the proxy server's fingerprint and will not block it.
If the server only blocks certain popular HTTP libraries, you can simply change the TLS settings (for example the offered cipher suites); then you will have a different fingerprint than the default one.
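As a rough sketch of that idea (the cipher string below is only an illustration; DEFAULT_CIPHERS is the urllib3 1.x setting), changing the cipher list offered in the ClientHello changes the resulting JA3 fingerprint:

import requests

# Illustrative only: trimming or reordering the offered ciphers alters the TLS fingerprint
requests.packages.urllib3.util.ssl_.DEFAULT_CIPHERS = 'ECDHE+AESGCM:ECDHE+CHACHA20:!aNULL:!MD5'
print(requests.get('https://tls.browserleaks.com/json').json())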
If the server only allows popular real-world browsers and accepts only them as valid requests, you will need a library that can simulate browser fingerprints, one of which is curl-impersonate and its Python binding curl_cffi.
pip install curl_cffi
from curl_cffi import requests
# Notice the impersonate parameter
r = requests.get("https://tls.browserleaks.com/json", impersonate="chrome101")
print(r.json())
# output: {'ja3_hash': '53ff64ddf993ca882b70e1c82af5da49'
# the fingerprint should be the same as target browser
I am using Charles Proxy right now to monitor traffic between my devices and a website. The traffic is SSL and I am able to read it in Charles. The issue is that Charles makes the content hard to read when I am filtering through hundreds of variables in a JSON object. I created a program that filters the JSON after exporting the Charles log. My next step is to get rid of Charles completely and create my own proxy in Python that can view HTTP and HTTPS data. I was wondering if scapy or any other existing libraries would work for this? I am interested in scapy because I can save the proxy log as a pcap file.
Reading through mitmproxy would be overwhelming since it's a huge code base. If you would like to implement the proxy server from scratch, here is what I learned while developing Proxyman.
Learn how to set up a tiny proxy server: basically, open a listening socket on your port (9090, for example), accept any incoming request, and read the first line of the HTTP message. This can be done with a lightweight http-parser or any Python parser. The raw HTTP message looks like:
CONNECT google.com:443 HTTP/1.1
Parse it to get the host (google.com) and its IP, open a socket connection to the destination, and start relaying data back and forth between the client <-> the destination server (a minimal sketch of this follows these steps).
This first step is the essential part of implementing the HTTP proxy. Use http-parser to parse the rest of the HTTP message; from that you can get the headers and body of the request/response -> present them in the UI.
Learn how HTTPS and SSL work: use OpenSSL to generate a self-signed certificate, and learn how to generate the chain certificates too.
Learn how to import those certificates into the macOS keychain by using the security CLI or the Security framework from Apple.
When that's done, it's time to start the HTTPS interception: repeat the 2nd step and do the SSL handshake with the appropriate certificates on both sides (client -> your proxy server and your proxy server -> the destination).
Parse the HTTP message as usual and get the rest of the message.
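As a rough illustration of the first two steps only (a plain CONNECT tunnel with no HTTPS interception; the port 9090 and all helper names are illustrative assumptions):

import socket
import threading
import select

LISTEN_PORT = 9090  # assumed local proxy port

def handle_client(client_sock):
    # Read the first chunk and take the request line, e.g. "CONNECT google.com:443 HTTP/1.1"
    # (assumes the whole request line arrives in the first recv)
    data = client_sock.recv(4096)
    first_line = data.split(b"\r\n", 1)[0].decode()
    method, target, _version = first_line.split()
    if method != "CONNECT":
        client_sock.close()
        return
    host, _, port = target.partition(":")
    # Open a socket to the destination and tell the client the tunnel is ready
    upstream = socket.create_connection((host, int(port or 443)))
    client_sock.sendall(b"HTTP/1.1 200 Connection Established\r\n\r\n")
    # Relay bytes in both directions until either side closes
    pair = [client_sock, upstream]
    while True:
        readable, _, _ = select.select(pair, [], [])
        for src in readable:
            dst = upstream if src is client_sock else client_sock
            chunk = src.recv(4096)
            if not chunk:
                client_sock.close()
                upstream.close()
                return
            dst.sendall(chunk)

def main():
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind(("127.0.0.1", LISTEN_PORT))
    server.listen(16)
    while True:
        conn, _addr = server.accept()
        threading.Thread(target=handle_client, args=(conn,), daemon=True).start()

if __name__ == "__main__":
    main()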
Overall, there are a lot of open-source projects out there, but I suggest starting from a simple version before moving on.
Hope that could help you.
I'm having an issue tracking down why requests fails to connect to a specific host.
The following works just fine via curl, or browser:
curl https://banking4.anz.com
However if I use requests:
requests.get('https://banking4.anz.com')
I get:
SSLError: ("bad handshake: SysCallError(-1, 'Unexpected EOF')",)
On the wire, I see only the ClientHello and the server disconnects immediately, so it doesn't seem like an SSL or cipher incompatibility (I'd expect an SSL-layer error for those). What else could be the issue in this case?
I'm on python 3.6.1 with requests 2.14.2 (with security extras).
This server is broken in multiple ways.
For one, it only understands DES-CBC3-SHA, which is considered insecure and is not included in the default cipher set used by requests. Additionally, it looks like the server only checks a limited number of the ciphers offered in the ClientHello, and thus will not see that DES-CBC3-SHA is offered by the client if too many other offers come before this cipher.
A quick workaround for this broken server is to offer only the single cipher the server supports:
import requests
requests.packages.urllib3.util.ssl_.DEFAULT_CIPHERS = 'DES-CBC3-SHA'
requests.get('https://banking4.anz.com')
But note that this sets the default cipher list of requests to an insecure value. Thus, this method should not be used if you want to connect to other sites within your application. Instead, have a look at the more complex solution of using your own HTTPAdapter with specific cipher settings for the broken site, sketched below.
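A hedged sketch of that per-site adapter approach (the class name is made up here, and it relies on urllib3's create_urllib3_context helper) could look like:

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.ssl_ import create_urllib3_context

class DesCbc3Adapter(HTTPAdapter):
    # Adapter that offers only the weak cipher the broken server understands
    def init_poolmanager(self, *args, **kwargs):
        kwargs["ssl_context"] = create_urllib3_context(ciphers="DES-CBC3-SHA")
        return super().init_poolmanager(*args, **kwargs)

session = requests.Session()
# Only connections to this host use the weakened cipher list; everything else keeps the defaults
session.mount("https://banking4.anz.com", DesCbc3Adapter())
response = session.get("https://banking4.anz.com")

This keeps the insecure cipher scoped to the one broken host instead of changing requests' global defaults.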
So I'm looking at traffic using wireshark and comparing the output for a number of situations. I'm only looking at traffic between me and google.co.za.
Situation 1: Accessing google.co.za using no proxy
requests.get('http://www.google.co.za')
This returns a response with status=200 and wireshark displays info about traffic passing between my pc and google's servers. This is great so far.
Situation 2: Accessing google.co.za using valid http proxy
requests.get("http://google.co.za",proxies={'http':proxy})
This returns a response with status=200 and wireshark displays no data about traffic passing between my pc and google's servers. This is great and expected and stuff.
Situation 3: Accessing google.co.za using valid socks proxy
requests.get("http://google.co.za",proxies={'socks':proxy})
Same result as in situation 1. Hmmm.
Situation 4: same deal with https
requests.get("http://google.co.za",proxies={'https':proxy})
same result as situation 1.
Question
So it looks like when I try to use HTTPS and SOCKS proxies, requests acts as though the proxy argument is empty. Now I need to pass traffic through all sorts of proxies, and I don't want any silent failures.
My question is: Why is stuff failing silently and what can I do to fix it?
Requests simply does not yet support either SOCKS or HTTPS proxies.
They're working on it, though. See here: https://github.com/kennethreitz/requests/pull/1515
Support for HTTPS proxies has already been merged into the requests 2.0 branch, so if you like you can try that version; be wary, though, as it is currently an unstable branch.
SOCKS proxy support, on the other hand, is still being worked on in the lower-level library, urllib3: https://github.com/shazow/urllib3/pull/68
Also, regardless of that, you are using the proxies argument incorrectly. It should be of the form {protocol_of_sites_you_visit: proxy}, so once support is complete, using a SOCKS5 proxy would actually be more along the lines of {"http": "socks5://127.0.0.1:9050"}.
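For illustration (the addresses are placeholders), a correctly keyed proxies mapping looks like this:

import requests

proxies = {
    "http": "http://127.0.0.1:8080",   # proxy used for plain-HTTP sites
    "https": "http://127.0.0.1:8080",  # proxy used for HTTPS sites
    # once SOCKS support is available (nowadays via requests[socks]):
    # "https": "socks5://127.0.0.1:9050",
}
r = requests.get("http://google.co.za", proxies=proxies)
print(r.status_code)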