VPN in Python for the requests module

I need to send requests through the requests module, but keeping my address and location private is mandatory. The question is: how can I combine a VPN with Python, or provide privacy in a different way?
import requests

headers = {'user-agent': 'my user agent',
           'accept': '*/*'}
print(requests.get('https://httpbin.org/ip', headers=headers).text)
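requests has no VPN support of its own: a VPN configured at the operating-system level already covers every connection Python makes, so nothing changes in the code. Short of that, the usual approach is to route traffic through a proxy. A minimal sketch, assuming a local SOCKS5 endpoint (for example Tor on 127.0.0.1:9050, or the SOCKS port some VPN clients expose) and pip install requests[socks]:

import requests

# socks5h:// (rather than socks5://) also resolves DNS through the proxy,
# so lookups don't leak your location.
proxies = {'http': 'socks5h://127.0.0.1:9050',
           'https': 'socks5h://127.0.0.1:9050'}
print(requests.get('https://httpbin.org/ip', proxies=proxies).text)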

Related

Requests Proxy NTLM Authentication (Without Hardcoding Username and Password)

I'm trying to send requests to an HTTPS website behind a corporate proxy that requires authentication.
The below code works:
import requests

requests.get('https://google.com',
             proxies={
                 'http': 'http://myusername:mypassword@10.20.30.40:8080',
                 'https': 'http://myusername:mypassword@10.20.30.40:8080'
             },
             verify=False)
but I want to avoid hardcoding the username and password in the script file, especially since we have to reset our passwords every 60 days. I need it to authenticate automatically as the logged-in Windows user, the same behavior as a browser.
I looked online and learned that this is not possible through requests (1, 2, 3, 4), and I would have to resort to something like pycurl or px, but all examples online had the username and password provided explicitly. There was also a solution utilizing win32com.client, but I have no idea how to use it in place of requests.
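Since pycurl comes up as one of the candidates: libcurl's SSPI backend on Windows can use the logged-in user's credentials when given an empty user:password pair (the pycurl equivalent of curl -U : --proxy-ntlm). A minimal sketch, reusing the hypothetical proxy address from the snippet above; whether it works depends on libcurl being built with SSPI:

import pycurl
from io import BytesIO

buf = BytesIO()
c = pycurl.Curl()
c.setopt(pycurl.URL, 'https://google.com')
c.setopt(pycurl.PROXY, 'http://10.20.30.40:8080')
c.setopt(pycurl.PROXYAUTH, pycurl.HTTPAUTH_NTLM)
c.setopt(pycurl.PROXYUSERPWD, ':')  # empty user:pass -> use the current Windows logon via SSPI
c.setopt(pycurl.SSL_VERIFYPEER, 0)  # mirrors verify=False in the snippet above
c.setopt(pycurl.WRITEDATA, buf)
c.perform()
c.close()
print(buf.getvalue().decode())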

HTTP GET and POST requests from Python without using the requests module

I would like to access a resource at a particular URL. Let's say I only have access to a PC (without admin rights) from which I cannot use the requests module, for various reasons.
Normally, I would address an API and perform HTTP GET and HTTP POST requests with:
import requests
url = r"https://httpbin.org/json"
r = requests.get(url)
If I wanted to provide header and authorisation details, I would add
headers = {"Content-Type": "application/json"}
auth = ("username", "password")
r = requests.post(url, auth=auth, headers=headers)
as well as the payload in the data exchange format of the API (either JSON or XML).
Unfortunately, I cannot use the requests module on the aforementioned system. However, I can use the selenium module with the Internet Explorer webdriver (no Firefox and no Chrome).
I tried to access the url of the API with
from selenium import webdriver
driver = webdriver.Ie()
driver.get(url)
This does open an authentication popup, which I cannot access with selenium's switch_to functions. Ideally, I would like to perform an HTTP POST via selenium and provide authentication as well as header information. Would that be possible?
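As an aside: if the restriction is only on installing third-party packages, the standard library's urllib.request ships with Python and can perform the same GET and POST calls, including basic auth. A minimal sketch with placeholder credentials:

import json
import urllib.request

# Opener that answers HTTP basic-auth challenges (hypothetical credentials).
password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
password_mgr.add_password(None, "https://httpbin.org/", "username", "password")
opener = urllib.request.build_opener(
    urllib.request.HTTPBasicAuthHandler(password_mgr))

# GET
with opener.open("https://httpbin.org/json") as r:
    print(r.read().decode())

# POST with headers and a JSON payload
req = urllib.request.Request("https://httpbin.org/post",
                             data=json.dumps({"key": "value"}).encode(),
                             headers={"Content-Type": "application/json"})
with opener.open(req) as r:
    print(r.read().decode())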

How to get info/data from blocked web sites with BeautifulSoup?

I want to write a scraping script with Python 3.7.
I have no problems connecting to and getting data from unblocked sites, but if the site is blocked in my country, it won't work.
If I use a VPN service I can visit these blocked sites with the Chrome browser.
I tried setting a proxy in PyCharm but failed; I just got errors every time.
What's the simplest free way to solve this problem?
from urllib.request import Request, urlopen
from bs4 import BeautifulSoup as soup
req = Request('https://www.SOMEBANNEDSITE.com/', headers={'User-Agent': 'Mozilla/5.0'}) # that web site is blocked in my country
webpage = urlopen(req).read() # code stops running at this line because it can't connect to the site.
page_soup = soup(webpage, "html.parser")
There are multiple ways to scrape blocked sites. A solid way is to use a proxy service, as already mentioned.
A proxy server, also known as a "proxy", is a computer that acts as a gateway between your computer and the internet.
When you use a proxy, your requests are forwarded through it, so your IP is not directly exposed to the site you are scraping.
You can't simply take any IP (say xxx.xx.xx.xxx) and port (say yy), do
import requests

proxies = {'http': "http://xxx.xx.xx.xxx:yy",
           'https': "http://xxx.xx.xx.xxx:yy"}
r = requests.get('http://www.somebannedsite.com', proxies=proxies)
and expect to get a response.
The proxy has to be configured to accept your request and send you back a response.
So, where can you get a proxy?
a. You could buy proxies from many providers.
b. Use a list of free proxies from the internet.
You don't need to buy proxies unless you are doing some massive-scale scraping.
For now I will focus on the free proxies available on the internet. Just do a Google search for "free proxy provider" and you will find sites offering free proxies. Go to any one of them and get an IP and the corresponding port.
import requests

# replace the ip and port below with the ip and port you got from any of the free sites
# (a plain HTTP proxy is addressed with http:// for both target schemes)
proxies = {'http': "http://182.52.51.155:39236",
           'https': "http://182.52.51.155:39236"}
r = requests.get('http://www.somebannedsite.com', proxies=proxies)
print(r.text)
You should, if possible, use a proxy with 'elite' anonymity level (the anonymity level is specified on most sites providing free proxies). If interested, you could also do a Google search to find the difference between 'elite', 'anonymous' and 'transparent' proxies.
Note:
Most of these free proxies are not that reliable, so if you get an error with one IP and port combination, try a different one.
Your best option is to use a proxy via the requests library, since it can flexibly route requests through a proxy.
Here is a small example:
import requests
from bs4 import BeautifulSoup as soup

# use your working proxies here:
# replace host with your proxy IP and port with the port number
proxies = {'http': "http://host:port",
           'https': "http://host:port"}
text = requests.get('http://www.somebannedsite.com', proxies=proxies,
                    headers={'User-Agent': 'Mozilla/5.0'}).text
page_soup = soup(text, "html.parser")  # use whatever parser you prefer, maybe lxml?
If you want to use SOCKS5, you'd have to get the dependencies via pip install requests[socks] and then replace the proxies part with:
# user is your authentication username
# pass is your auth password
# host and port are as above
proxies = {'http': 'socks5://user:pass@host:port',
           'https': 'socks5://user:pass@host:port'}
If you don't have proxies at hand, you can fetch some from the free lists mentioned above.
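Since free proxies die frequently, one practical pattern is to loop over a handful of candidates and keep the first one that answers. A small sketch; the addresses are placeholders, substitute entries from any free list:

import requests

candidates = ['182.52.51.155:39236', '103.216.82.20:8080']  # placeholder entries

for hp in candidates:
    proxies = {'http': f'http://{hp}', 'https': f'http://{hp}'}
    try:
        r = requests.get('http://httpbin.org/ip', proxies=proxies, timeout=5)
        print('working proxy:', hp, r.text)
        break
    except requests.RequestException as e:
        print('failed:', hp, e)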

Python requests NTLM without password [duplicate]

How can I use automatic NTLM authentication from Python on Windows?
I want to be able to access the TFS REST API from Windows without hardcoding my password, the same as I do from the web browser (Firefox's network.automatic-ntlm-auth.trusted-uris, for example).
I found this answer which works great for me because:
I'm only going to run it from Windows, so portability isn't a problem
The response is a simple json document, so no need to store an open session
It's using the WinHTTP.WinHTTPRequest.5.1 COM object to handle authentication natively:
import win32com.client

URL = 'http://bigcorp/tfs/page.aspx'
COM_OBJ = win32com.client.Dispatch('WinHTTP.WinHTTPRequest.5.1')
COM_OBJ.SetAutoLogonPolicy(0)  # 0 = always send the logged-in user's credentials
COM_OBJ.Open('GET', URL, False)  # False = synchronous request
COM_OBJ.Send()
print(COM_OBJ.ResponseText)
You can do that with https://github.com/requests/requests-kerberos. Under the hood it uses https://github.com/mongodb-labs/winkerberos. The latter is marked as beta and I'm not sure how stable it is, but I have had requests-kerberos in use for a while without any issues.
Maybe a more stable solution would be https://github.com/brandond/requests-negotiate-sspi, which uses pywin32's SSPI implementation.
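A minimal requests-kerberos sketch, reusing the TFS URL from the answer above; mutual_authentication=OPTIONAL relaxes mutual authentication, which many IIS-hosted services need:

import requests
from requests_kerberos import HTTPKerberosAuth, OPTIONAL

# Authenticates as the logged-in user via the Negotiate mechanism.
r = requests.get('http://bigcorp/tfs/page.aspx',
                 auth=HTTPKerberosAuth(mutual_authentication=OPTIONAL))
print(r.status_code)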
I found a solution here: https://github.com/mullender/python-ntlm/issues/21
pip install requests
pip install requests_negotiate_sspi
import requests
from requests_negotiate_sspi import HttpNegotiateAuth

GetUrl = "http://servername/api/controller/Methodname"  # set your GET Web API URL here
response = requests.get(GetUrl, auth=HttpNegotiateAuth())
print("GET request output:")
print("--------------------")
print(response.content)
For requests over HTTPS:
import requests
from requests_negotiate_sspi import HttpNegotiateAuth
import urllib3

urllib3.disable_warnings()  # suppress the warning that verify=False triggers

GetUrl = "https://servername/api/controller/Methodname"  # set your GET Web API URL here
response = requests.get(GetUrl, auth=HttpNegotiateAuth(), verify=False)
print("GET request output:")
print("--------------------")
print(response.content)
NTLM credentials are based on data obtained during the interactive logon process and include a one-way hash of the password, so you have to provide the credentials.
Python has the requests_ntlm library, which allows HTTP NTLM authentication.
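A minimal requests_ntlm sketch; note that, unlike the SSPI-based options above, the domain, username and password must be supplied explicitly (placeholders here):

import requests
from requests_ntlm import HttpNtlmAuth

# Explicit NTLM credentials; requests_ntlm cannot pull them
# from the Windows logon session.
r = requests.get('http://servername/api/controller/Methodname',
                 auth=HttpNtlmAuth('DOMAIN\\username', 'password'))
print(r.status_code)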
You can reference this article to access the TFS REST API:
Python Script to Access Team Foundation Server (TFS) Rest API
If you are using TFS 2017 or VSTS, you can try using a Personal Access Token in a Basic Auth HTTP header along with your REST request.
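A minimal sketch of the PAT approach; requests builds the Basic Auth header from the auth tuple, and the username can be left empty when a PAT is used (token and URL are placeholders):

import requests

pat = '<your-personal-access-token>'  # placeholder token
url = 'https://tfsserver/DefaultCollection/_apis/projects?api-version=2.0'  # placeholder URL
r = requests.get(url, auth=('', pat))  # empty username, PAT as password
print(r.json())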

Python Requests - Emulate Opening a Browser at Work

At work, I sit behind a proxy. When I connect to the company WiFi and open up a browser, a pop-up box usually appears asking for my company credentials before it will let me navigate to any internal/external site.
I am using the Python Requests package to automate pulling data from an external site but am encountering a 401 error related to not having authenticated first. This happens when I don't authenticate first using the browser; if I authenticate with the browser first and then use Python requests, everything is fine and I'm able to navigate to any site.
My question is: how do I perform the work authentication part using Python? I want to automate this process so that I can set up a cron job that grabs data from an external source every night.
I've tried providing a blank URL:
import requests
response = requests.get('')
But requests.get() requires a properly structured URL. I want to emulate opening a browser and capture the pop-up that asks for authentication; this does not rely on any particular URL.
From the requests documentation:
import requests

proxies = {
    "http": "http://10.10.1.10:3128",
    "https": "http://10.10.1.10:1080",
}
requests.get("http://example.org", proxies=proxies)
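If the proxy also demands credentials, the requests documentation allows them in the proxy URL (or you can set the HTTP_PROXY/HTTPS_PROXY environment variables); a sketch with the same hypothetical host. Note this only covers basic proxy auth; for NTLM single sign-on see the SSPI-based answers above:

import requests

# Hypothetical credentials embedded in the proxy URL.
proxies = {
    'http': 'http://user:password@10.10.1.10:3128',
    'https': 'http://user:password@10.10.1.10:3128',
}
requests.get('http://example.org', proxies=proxies)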
