I was trying to send HTTP/HTTPS requests through a SOCKS5 proxy, but I can't tell whether the problem is in my code or in the proxy.
I tried the code below and it gives me this error:
requests.exceptions.ConnectionError: SOCKSHTTPSConnectionPool(host='www.google.com', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.contrib.socks.SOCKSHTTPSConnection object at 0x000001B656AC9608>: Failed to establish a new connection: Connection closed unexpectedly'))
This is my code:
import requests
url = "https://www.google.com"
proxies = {
"http":"socks5://fsagsa:sacesf241_country-darwedafs_session-421dsafsa@x.xxx.xxx.xx:31112",
"https":"socks5://fsagsa:sacesf241_country-darwedafs_session-421dsafsa@x.xxx.xxx.xx:31112",
}
headers = {
"Upgrade-Insecure-Requests": "1",
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.63 Safari/537.36",
"Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9",
"Sec-Gpc": "1",
"Sec-Fetch-Site": "same-origin",
"Sec-Fetch-User": "?1",
"Accept-Encoding": "gzip, deflate, br",
"Sec-Fetch-Mode": "navigate",
"Sec-Fetch-Dest": "document",
"Accept-Language": "en-GB,en;q=0.9"
}
r = requests.get(url, headers = headers, proxies = proxies)
print(r)
Then I checked the proxy with an online tool.
The tool manages to send requests through the proxy.
So is the problem in this code? I can't figure out what's wrong.
Edit (15/09/2021)
I added headers but the problem is still there.
To rule out variables external to your script, create a local server or mock to handle the request, for example with pytest (or another testing framework) and the responses library. I'm fairly sure Google will reject requests with empty headers. Also make sure you have installed the dependencies that enable SOCKS proxy support in requests (python -m pip install requests[socks]). Furthermore, if hostnames should be resolved by the proxy rather than on your machine, change socks5 to socks5h in your proxies dictionary.
References
pytest: https://docs.pytest.org/en/6.2.x/
responses: https://github.com/getsentry/responses
requests[socks]: https://docs.python-requests.org/en/master/user/advanced/#socks
In addition to basic HTTP proxies, Requests also supports proxies using the SOCKS protocol. This is an optional feature that requires that additional third-party libraries be installed before use.
You can get the dependencies for this feature from pip:
$ python -m pip install requests[socks]
Once you’ve installed those dependencies, using a SOCKS proxy is just as easy as using a HTTP one:
proxies = {
'http': 'socks5://user:pass@host:port',
'https': 'socks5://user:pass@host:port'
}
Using the scheme socks5 causes the DNS resolution to happen on the client, rather than on the proxy server. This is in line with curl, which uses the scheme to decide whether to do the DNS resolution on the client or proxy. If you want to resolve the domains on the proxy server, use socks5h as the scheme.
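As a rough sketch of the above (not the asker's exact setup), the helper below builds a proxies dictionary in the user:pass@host:port form, picks socks5h when the proxy should resolve hostnames, and percent-encodes the credentials so that characters such as @ or : cannot be mistaken for URL delimiters. The function name build_socks_proxies, the host proxy.example.com, and the credentials are made up for illustration.

```python
from urllib.parse import quote


def build_socks_proxies(user, password, host, port, remote_dns=True):
    """Return a proxies dict that routes both http and https via SOCKS5.

    Hypothetical helper: remote_dns=True selects the socks5h scheme, so
    hostnames are resolved by the proxy instead of on the client.
    """
    scheme = "socks5h" if remote_dns else "socks5"
    # Percent-encode credentials so '@', ':' or '/' cannot break the URL.
    auth = f"{quote(user, safe='')}:{quote(password, safe='')}"
    proxy_url = f"{scheme}://{auth}@{host}:{port}"
    return {"http": proxy_url, "https": proxy_url}


# Placeholder values, not a real proxy.
proxies = build_socks_proxies("user", "p@ss", "proxy.example.com", 1080)
print(proxies["https"])  # socks5h://user:p%40ss@proxy.example.com:1080
```

The resulting dictionary can then be passed as proxies=proxies to requests.get, provided requests[socks] is installed.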
Related
In PowerShell, I am currently performing this request, copied from the network tab in developer tools:
$session = New-Object Microsoft.PowerShell.Commands.WebRequestSession
$session.UserAgent = "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/106.0.0.0 Safari/537.36"
Invoke-WebRequest -UseBasicParsing -Uri "https://......?...." `
-WebSession $session `
-Headers @{
"Accept"="*/*"
"Accept-Encoding"="gzip, deflate, br"
"Accept-Language"="en-US,en;q=0.9"
"Authorization"="Basic mzYw....="
"Referer"="https://......."
"sec-Fetch-Dest"="empty"
"sec-Fetch-Mode"="cors"
"sec-Fetch-Site"="same-origin"
"sec-ch-ua"="`"Chromium`";v=`"106`", `"Google Chrome`";v=`"106`", `"Not;A=Brand`";v=`"99`""
"sec-ch-ua-mobile"="?0"
"sec-ch-ua-platform"="`"Windows`""
} `
-ContentType "application/x-www-form-urlencoded"
which returns a 200 response just fine.
However, when I tried to perform the same request with the same headers using Python requests, I got an SSL proxy-related error (see SSL_verification wrong version number even with certifi verify).
Is a proxy automatically configured for PowerShell requests? How can I find out which proxy my requests are currently routed through? Otherwise, how can I replicate the PowerShell request 1:1 in Python requests?
I have tried running the ipconfig /all command and using the Primary Dns Suffix field as the proxy argument in requests:
requests.get(url, headers=headers_in_powershell, proxies={'http': 'the_dns_suffix', 'https': 'the_dns_suffix'})
but the request just gets stuck (it waits indefinitely with no response).
For most commands, PowerShell uses the system proxy by default (or they have a -Proxy switch to tell them where it is), but some don't and have to be told to use it.
From memory, Invoke-WebRequest can be problematic as (I think) it uses the .NET web client.
Try adding this to the start of the PS script:
[System.Net.WebRequest]::DefaultWebProxy = [System.Net.WebRequest]::GetSystemWebProxy()
[System.Net.WebRequest]::DefaultWebProxy.Credentials = [System.Net.CredentialCache]::DefaultNetworkCredentials
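On the Python side, one way to see which proxy the system is configured with is urllib.request.getproxies() from the standard library, which reads the proxy environment variables and, on Windows, falls back to the system proxy settings. The sketch below is a simplification under that assumption: proxy_for is a hypothetical helper that only looks up the entry by URL scheme and ignores any no-proxy exclusions, and www.example.com is a placeholder.

```python
from urllib.parse import urlsplit
from urllib.request import getproxies


def proxy_for(url, proxies=None):
    """Return the proxy URL configured for `url`, or None if there is none.

    Hypothetical helper: matches on the URL scheme only and does not
    evaluate no_proxy / bypass lists.
    """
    proxies = getproxies() if proxies is None else proxies
    scheme = urlsplit(url).scheme
    # An "all" entry (from ALL_PROXY) acts as a fallback for any scheme.
    return proxies.get(scheme) or proxies.get("all")


# Show everything the standard library detected, then the entry for one URL.
print(getproxies())
print(proxy_for("https://www.example.com"))
```

Whatever this prints is a full proxy URL (scheme, host, and port), which is the form the proxies argument of requests expects. The Primary Dns Suffix reported by ipconfig is a domain name suffix, not a proxy address.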
1 - The target DOMAIN is https://www.dnb.com/
This website blocks access from many countries around the world, including mine (Algeria).
So the known solution is clear (use a proxy), which is what I did.
2 - Configuring the system proxy in the network settings and connecting to the website with Google Chrome works; using Firefox with the same proxy settings also works fine.
3 - Then I moved on to my code:
import requests
# 1. Initialize the proxy
proxy = "xxx.xxx.xxx.xxx:3128"
# 2. Setting the Headers (I cloned Firefox request headers)
headers = {
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:88.0) Gecko/20100101 Firefox/88.0",
"Accept-Encoding": "gzip, deflate, br",
"Connection": "keep-alive",
"Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
"Upgrade-Insecure-Requests": "1",
"Host": "www.dnb.com",
"DNT": "1"
}
# 3. URL
URL = "https://www.dnb.com/business-directory/company-profiles.bicicletas_monark_s-a.7ad1f8788ea84850ceef11444c425a52.html"
# 4. Make a get request.
r = requests.get(URL, headers=headers, proxies={"https": proxy})
# Nothing is returned and the program keeps running (like an infinite loop).
Note:
I know this keeps waiting because the default timeout is None. However, I am sure the setup works, so the requests library should return a response; a timeout would only be useful here to assess the reliability of the proxy, for example.
So what is the cause of this? It is stuck (and so am I), even though I get a response with the correct HTML content in Firefox, Chrome, and Postman with the same configuration.
I checked your code and ran it on my local machine. The issue seems to be with the proxy: I substituted a public proxy and it worked. You can confirm this by passing a timeout argument (in seconds) to requests.get. Also, if the code runs and returns any response (even a 403), the code itself is fine, which again points to the proxy.
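To separate "the proxy is unreachable" from "the target site is slow", a plain TCP connection attempt to the proxy with a short timeout can help. The following is only a sketch using the standard library: proxy_reachable is a made-up helper name, and 127.0.0.1:3128 stands in for the real proxy address.

```python
import socket


def proxy_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within `timeout`.

    This only checks that something is listening; it does not verify
    that the listener is a working proxy.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False


# Placeholder address; substitute the proxy's real host and port.
print(proxy_reachable("127.0.0.1", 3128, timeout=3.0))
```

If this returns False, requests cannot succeed through that proxy either. If it returns True, passing timeout=10 (or similar) to requests.get turns an indefinite hang into an exception that can be inspected.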
My code works when I make the request from my local machine.
When I try to make the request from AWS EC2, I get the following error:
requests.exceptions.ReadTimeout: HTTPSConnectionPool(host='www1.xyz.com', port=443): Read timed out. (read timeout=20)
I checked the URL and that was not the issue. I then tried to visit the page through the hidemyass web proxy with the location set to that of the AWS EC2 machine, and it got a 404.
The code:
import requests
# Dummy URLs
header = {
"User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.75 Safari/537.36",
"X-Requested-With": "XMLHttpRequest"
}
url = 'https://www1.xyz.com/iKeys.jsp?symbol={}&date=31DEC2020'.format(
symbol)
raw_page = requests.get(url, timeout=10, headers=header).text
I have tried setting the proxies to another IP address, which I found online:
proxies = {
"http": "http://125.99.100.193",
"https": "https://125.99.100.193",
}
raw_page = requests.get(url, timeout=10, headers=header, proxies=proxies).text
Still got the same error.
1- Do I need to specify the port in proxies? Could this be causing the error when proxy is set?
2- What could be a solution for this?
Thanks
I want to check the login status, so I wrote a program to check it:
import requests
import json
import datetime
headers = {
"Accept": "application/json, text/plain, */*",
"Accept-Encoding": "gzip, deflate",
"Accept-Language": "ko-KR,ko;q=0.9,en-US;q=0.8,en;q=0.7",
"Connection": "keep-alive",
"Content-Length": "50",
"Content-Type": "application/json;charset=UTF-8",
"Cookie": "_ga=GA1.2.290443894.1570500092; _gid=GA1.2.963761342.1579153496; JSESSIONID=A4B3165F23FBEA34B4BBE429D00F12DF",
"Host": "marke.ai",
"Origin": "http://marke",
"Referer": "http://marke/event2/login",
"User-Agent": "Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.117 Mobile Safari/537.36",
}
url = "http://mark/api/users/login"
va = {"username": "seg", "password": "egkegn"}
c = requests.post(url, data=json.dumps(va), headers=headers)
if c.status_code != 200:
    print("error")
This works very well on my local Windows machine with PyCharm,
but when I ran the code on Linux I got an error like this:
requests.exceptions.ProxyError: HTTPConnectionPool(host='marke', port=80):
Max retries exceeded with url: http://marke.ai/api/users/login (
Caused by ProxyError('Cannot connect to proxy.',
NewConnectionError('<urllib3.connection.HTTPConnection>: Failed to establish a new connection: [Errno 110] Connection timed out',)
)
)
So what is the problem? If you know the solution, please tell me!
Thank you.
According to your error, it seems you are behind a proxy.
So you have to specify your proxy parameters when building your request.
Build your proxies as a dict following this format
proxies = {
"http": "http://my_proxy:my_port",
"https": "https://my_proxy:my_port"
}
If you don't know your proxy parameters, you can get them using the urllib module:
import urllib.request
proxies = urllib.request.getproxies()
There's a proxy server configured on that Linux host, and your script can't connect to it.
Judging by the documentation, you may have an HTTP_PROXY (or http_proxy) environment variable set.
Modifying @Arkenys's answer. Please try this:
import urllib.request
proxies = urllib.request.getproxies()
# all other things
c = requests.post(url, data=json.dumps(va), headers=headers, proxies=proxies)
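Since requests picks up proxy settings from the environment by default, it can help to check which proxy variables are actually set on the Linux host before changing the request. A small sketch, assuming the usual variable names (HTTP_PROXY, HTTPS_PROXY, ALL_PROXY, NO_PROXY and their lowercase forms); proxy_env_settings is a hypothetical helper name.

```python
import os

# Variable names commonly used for proxy configuration (lowercase form).
PROXY_VARS = ("http_proxy", "https_proxy", "all_proxy", "no_proxy")


def proxy_env_settings(environ=None):
    """Return the proxy-related environment variables that are set."""
    environ = os.environ if environ is None else environ
    found = {}
    for name in PROXY_VARS:
        # Check both the lowercase and the uppercase spelling.
        for variant in (name, name.upper()):
            value = environ.get(variant)
            if value:
                found[variant] = value
    return found


# Print whatever is configured in the current process environment.
for name, value in proxy_env_settings().items():
    print(f"{name}={value}")
```

If this shows a proxy that should not be used for the request, one option is to create a requests.Session and set its trust_env attribute to False, so that environment proxy settings are ignored for that session.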
I am sending a POST request through a proxy but keep running into a proxy error.
I have already tried multiple solutions on Stack Overflow for [WinError 10061] No connection could be made because the target machine actively refused it.
I tried changing system settings and verified that the remote server exists and is running; also, no HTTP_PROXY environment variable is set on the system.
import requests
proxy = {IP_ADDRESS:PORT} #proxy
proxy = {'https': 'https://' + proxy}
#standard header
header={
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.87 Safari/537.36",
"Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8",
"Referer": "https://tres-bien.com/adidas-yeezy-boost-350-v2-black-fu9006-fw19",
"Accept-Encoding": "gzip, deflate, br",
"Accept-Language": "en-GB,en-US;q=0.9,en;q=0.8"
}
#payload to be posted
payload = {
"form_key":"1UGlG3F69LytBaMF",
"sku":"adi-fw19-003",
# above two values are dynamically populating the field; hardcoded the value here to help you replicate.
"fullname": "myname",
"email": "myemail@gmail.com",
"address": "myaddress",
"zipcode": "areacode",
"city": "mycity" ,
"country": "mycountry",
"phone": "myphonenumber",
"Size_raffle":"US_11"
}
r = requests.post(url, proxies=proxy, headers=header, verify=False, json=payload)
print(r.status_code)
Expected output: 200, alongside an email verification sent to my email address.
Actual output: requests.exceptions.ProxyError: HTTPSConnectionPool(host='tres-bien.com', port=443): Max retries exceeded with url: /adidas-yeezy-boost-350-v2-black-fu9006-fw19 (Caused by ProxyError('Cannot connect to proxy.', NewConnectionError(': Failed to establish a new connection: [WinError 10061] No connection could be made because the target machine actively refused it',)))
Quite a few things are wrong here... (After looking at the raffle page you're trying to post to, I suspect it is https://tres-bien.com/adidas-yeezy-boost-350-v2-black-fu9006-fw19 based on the exception you posted.)
1) I'm not sure what's going on with your first definition of proxy as a dict instead of a string. That said, it's probably good practice to define both http and https proxies. If your proxy can support https then it should be able to support http.
proxy = {
'http': 'http://{}:{}'.format(IP_ADDRESS, PORT),
'https': 'https://{}:{}'.format(IP_ADDRESS, PORT)
}
2) The second issue is that the raffle you're trying to submit to takes URL-encoded form data, not JSON. Thus your request should be structured like:
r = requests.post(
url=url,
headers=header,
data=payload
)
3) That page has a ReCaptcha present, which is missing from your form payload. This isn't why your request is getting a connection error, but you're not going to successfully submit a form that has a ReCaptcha field without a proper token.
4) Finally, I suspect the root of your ProxyError is you are trying to POST to the wrong url. Looking at Chrome Inspector, you should be submitting this data to
https://tres-bien.com/tbscatalog/manage/rafflepost/ whereas your exception output indicates you are POSTing to https://tres-bien.com/adidas-yeezy-boost-350-v2-black-fu9006-fw19
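To make point 2 concrete: with requests, data=payload sends a form-encoded body (Content-Type application/x-www-form-urlencoded), while json=payload sends a JSON body (Content-Type application/json). The sketch below reproduces the two body formats with the standard library only, using two of the fields from the question as sample data.

```python
import json
from urllib.parse import urlencode

# Two fields taken from the question's payload, for illustration.
payload = {"sku": "adi-fw19-003", "Size_raffle": "US_11"}

# Roughly what is sent as the body with requests.post(..., data=payload).
form_body = urlencode(payload)

# Roughly what is sent as the body with requests.post(..., json=payload).
json_body = json.dumps(payload)

print(form_body)  # sku=adi-fw19-003&Size_raffle=US_11
print(json_body)  # {"sku": "adi-fw19-003", "Size_raffle": "US_11"}
```

A server that parses form fields will generally not find sku or Size_raffle in the JSON variant, which is why the choice between data= and json= matters here.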
Good luck with the shoes.