Python requests: using proxy but get connect 'Connection aborted' - python

My VPN works well, and I can reach Google through it (from China).
The SOCKS5 port of my VPN is 1080.
But when I run the following code, I get an error.
import requests
headers = {'user-agent': ''}
proxies = {"http": "socks5://127.0.0.1:1080",'https': 'socks5://127.0.0.1:1080'}
# url = 'https://www.baidu.com/'
url = 'https://www.google.com/search?q=python' #
res = requests.get(url, headers=headers, proxies=proxies)
print("res.status_code:\n",res.status_code)
If I remove proxies=proxies and change the URL to Baidu, it works.
...
url = 'https://www.baidu.com/'
# url = 'https://www.google.com/search?q=python'
res = requests.get(url, headers=headers)
print("res.status_code:\n",res.status_code)
The error from step 3:
Traceback (most recent call last):
File "Try.py", line 17, in <module>
res = requests.get(url, headers=headers, proxies=proxies)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/requests-2.22.0-py3.7.egg/requests/api.py", line 75, in get
return request('get', url, params=params, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/requests-2.22.0-py3.7.egg/requests/api.py", line 60, in request
return session.request(method=method, url=url, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/requests-2.22.0-py3.7.egg/requests/sessions.py", line 533, in request
resp = self.send(prep, **send_kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/requests-2.22.0-py3.7.egg/requests/sessions.py", line 646, in send
r = adapter.send(request, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/requests-2.22.0-py3.7.egg/requests/adapters.py", line 498, in send
raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', OSError(0, 'Error'))
Steps 1 and 4 seem to contradict each other, and I don't really know where the problem is. I'd be extremely grateful if someone could help.

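Solved by replacing the socks5 scheme with socks5h in the proxy URLs, so that hostnames are resolved by the proxy instead of the local machine.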
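A minimal sketch of that fix, assuming the same local SOCKS proxy on 127.0.0.1:1080 as in the question. With the socks5h scheme the hostname is handed to the proxy for DNS resolution, while plain socks5 resolves it locally first, which is a likely cause of the failure on a network where local DNS for the target is unreliable. The helper names socks_proxies and fetch_status are hypothetical and only for illustration.

```python
def socks_proxies(host="127.0.0.1", port=1080, remote_dns=True):
    """Build a proxies mapping for requests.

    remote_dns=True selects the socks5h scheme (the proxy resolves hostnames);
    remote_dns=False selects socks5 (hostnames are resolved locally).
    """
    scheme = "socks5h" if remote_dns else "socks5"
    proxy_url = "{}://{}:{}".format(scheme, host, port)
    # Use the same proxy for both plain HTTP and HTTPS targets.
    return {"http": proxy_url, "https": proxy_url}


def fetch_status(url, proxies, timeout=10):
    """Fetch url through the given proxies and return the HTTP status code."""
    # Imported here so the helper above has no third-party dependency.
    # SOCKS support needs the optional extra: pip install "requests[socks]"
    import requests

    return requests.get(url, proxies=proxies, timeout=timeout).status_code
```

Calling fetch_status('https://www.google.com/search?q=python', socks_proxies()) would then send the request with DNS resolution delegated to the proxy.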

Related

Getting (requests.exceptions.ConnectionError) using requests.get(URL)

I get a requests.exceptions.ConnectionError when I run the following code:
from requests import *
from bs4 import *
URL = "https://www.ldoceonline.com/dictionary/"
response = get(URL)
But when I test it with another URL, it works. I really want to scrape this website. How can I fix this error?
Complete error message:
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "e:/amirhossein/Project/Programming/L/Longmandict.py", line 5, in <module>
response = get(URL,verify=False)
File "C:\Users\User\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\requests\api.py", line 75, in get
return request('get', url, params=params, **kwargs)
File "C:\Users\User\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\requests\api.py", line 61, in request
return session.request(method=method, url=url, **kwargs)
File "C:\Users\User\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\requests\sessions.py", line 542, in request
resp = self.send(prep, **send_kwargs)
File "C:\Users\User\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\requests\sessions.py", line 655, in send
r = adapter.send(request, **kwargs)
File "C:\Users\User\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\requests\adapters.py", line 498, in send
raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
Apparently the server needs a proper User-Agent HTTP header to be set:
import requests
from bs4 import BeautifulSoup
headers = {
    "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:92.0) Gecko/20100101 Firefox/92.0"
}
url = "https://www.ldoceonline.com/dictionary/"
soup = BeautifulSoup(requests.get(url, headers=headers).content, "html.parser")
print(soup.title.text)
Prints:
Longman English Dictionaries | Meanings, thesaurus, collocations and grammar

PUT Data on HDFS via HTTP WEB API

I'm trying to implement a PUT request on HDFS via the HDFS Web API.
So I looked up the documentation on how to do that: https://hadoop.apache.org/docs/r1.0.4/webhdfs.html#CREATE
First do a PUT without following redirects to get a 307, take the new URL from the response, and then PUT the data to that URL.
When I do my first PUT, I do get the 307, but the URL is the same as the first one, so I'm not sure whether I'm already on the "good" datanode or not.
Anyway, I take this URL and try to send data to it, but I get an error. From what I understand, it is a connection error: the host is cutting off the connection.
import requests
from requests_kerberos import OPTIONAL, HTTPKerberosAuth
from utils import settings

class HttpFS:
    def __init__(self, url=settings.HTTPFS_URL):
        self.httpfs = url
        self.auth = HTTPKerberosAuth(mutual_authentication=OPTIONAL)

    def put(self, local_file, hdfs_dir):
        url = "{}{}".format(self.httpfs, hdfs_dir)
        params = {"op": "CREATE", "overwrite": True}
        print(url)
        # Step 1: PUT without following the redirect, to obtain the 307 and its Location header
        r = requests.put(url, auth=self.auth, params=params, stream=True, verify=settings.CA_ROOT_PATH, allow_redirects=False)
        # Step 2: PUT the file data to the URL returned in the Location header
        r = requests.put(r.headers['Location'], auth=self.auth, data=open(local_file, 'rb'), params=params, stream=True, verify=settings.CA_ROOT_PATH)
Here is the error given:
r = requests.put(r.headers['Location'], auth=self.auth, data=open(local_file, 'rb'), params=params, stream=True, verify=settings.CA_ROOT_PATH)
File "/home/cdsw/.local/lib/python3.6/site-packages/requests/api.py", line 134, in put
return request('put', url, data=data, **kwargs)
File "/home/cdsw/.local/lib/python3.6/site-packages/requests/api.py", line 61, in request
return session.request(method=method, url=url, **kwargs)
File "/home/cdsw/.local/lib/python3.6/site-packages/requests/sessions.py", line 530, in request
resp = self.send(prep, **send_kwargs)
File "/home/cdsw/.local/lib/python3.6/site-packages/requests/sessions.py", line 643, in send
r = adapter.send(request, **kwargs)
File "/home/cdsw/.local/lib/python3.6/site-packages/requests/adapters.py", line 498, in send
raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', BrokenPipeError(32, 'Broken pipe'))
Edit 1:
I also tried the https://github.com/pywebhdfs/pywebhdfs library, since it is supposed to do exactly what I'm looking for, but I still get the Broken Pipe error.
from requests_kerberos import OPTIONAL, HTTPKerberosAuth
from pywebhdfs.webhdfs import PyWebHdfsClient
from utils import settings
auth = HTTPKerberosAuth(mutual_authentication=OPTIONAL)
url = f"https://{settings.HDFS_HTTPFS_HOST}:{settings.HDFS_HTTPFS_PORT}/webhdfs/v1/"
hdfs_client = PyWebHdfsClient(base_uri_pattern=url, request_extra_opts={'auth':auth, 'verify': settings.CA_ROOT_PATH})
with open(data_dir + file_name, 'rb') as file_data:
    hdfs_client.create_file(hdfs_path + file_name, file_data=file_data, overwrite=True)
Same error:
hdfs_client.create_file(hdfs_path + file_name, file_data=file_data, overwrite=True)
File "/home/cdsw/.local/lib/python3.6/site-packages/pywebhdfs/webhdfs.py", line 115, in create_file
**self.request_extra_opts)
File "/home/cdsw/.local/lib/python3.6/site-packages/requests/api.py", line 134, in put
return request('put', url, data=data, **kwargs)
File "/home/cdsw/.local/lib/python3.6/site-packages/requests/api.py", line 61, in request
return session.request(method=method, url=url, **kwargs)
File "/home/cdsw/.local/lib/python3.6/site-packages/requests/sessions.py", line 530, in request
resp = self.send(prep, **send_kwargs)
File "/home/cdsw/.local/lib/python3.6/site-packages/requests/sessions.py", line 643, in send
r = adapter.send(request, **kwargs)
File "/home/cdsw/.local/lib/python3.6/site-packages/requests/adapters.py", line 498, in send
raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', BrokenPipeError(32, 'Broken pipe'))
Edit 2:
I found out I was sending too much data at once. So now I create a file on HDFS and then append chunks of data to it. But it is slow, and I can still randomly get the same error as above. The bigger the chunks are, the higher the chance of getting a Connection aborted. My files are in the range of 200 MB, so it takes ages compared to the Hadoop binary "hdfs dfs -put".
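The create-then-append approach described in Edit 2 can be sketched roughly as follows. This is only an illustration under several assumptions: session is expected to be a requests.Session (or anything with a compatible request method) already configured with the Kerberos auth and CA bundle from the question; the function names and the 8 MiB default chunk size are hypothetical; and the Content-Type: application/octet-stream header is an assumption about what some HttpFS gateways expect on data uploads, not something confirmed by the question. Note that the URL from the Location header is used as-is, without passing the query parameters a second time.

```python
def iter_chunks(fileobj, chunk_size=8 * 1024 * 1024):
    """Yield successive chunks of at most chunk_size bytes from a binary file object."""
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        yield chunk


def two_step_request(session, method, url, op, data=None, **extra_params):
    """Run one WebHDFS write operation using the 307 redirect handshake."""
    params = dict(extra_params, op=op)
    # Step 1: send the operation without a body and without following redirects.
    first = session.request(method, url, params=params, allow_redirects=False)
    if first.status_code != 307:
        raise RuntimeError("expected a 307 redirect, got %d" % first.status_code)
    # Step 2: send the data to the redirect target. The Location URL already
    # carries the query string, so the params are not repeated here.
    return session.request(
        method,
        first.headers["Location"],
        data=data,
        # Assumption: the gateway wants this content type for uploads.
        headers={"Content-Type": "application/octet-stream"},
    )


def upload_in_chunks(session, url, fileobj, chunk_size=8 * 1024 * 1024):
    """Create the remote file from the first chunk, then append the rest."""
    chunks = iter_chunks(fileobj, chunk_size)
    first_chunk = next(chunks, b"")  # an empty local file still creates an empty remote file
    two_step_request(session, "PUT", url, "CREATE",
                     data=first_chunk, overwrite="true").raise_for_status()
    for chunk in chunks:
        two_step_request(session, "POST", url, "APPEND", data=chunk).raise_for_status()
```

Reusing one session for all chunks lets the underlying connection be kept alive between requests, which may help with the slowness, although each chunk still costs two round trips.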

SSLError: request module cannot connect via https

What am I missing?
HINT: I've also tried using urllib module
import requests
import sys
import time
import random
headers = {"User-Agent": "Mozilla/5.0 (X11; U; Linux i686) Gecko/20071127 Firefox/25.0"}
url = "HTTP LINK TO YOUTUBE VIDEO"
views = 10
videoMins = 3
videoSec = 33
refreshRate = videoMins * 60 + videoSec
proxy_list = [
{"http":"49.156.37.30:65309"}, {"http":"160.202.42.106:8080"},
{"http":"218.248.73.193:808"}, {"http":"195.246.57.154:8080"},
{"http":"80.161.30.156:80"}, {"http":"122.228.25.97:8101"},
{"http":"165.84.167.54:8080"},{"https":"178.140.216.229:8080"},
{"https":"46.37.193.74:3128"},{"https":"5.1.27.124:53281"},
{"https":"196.202.194.127:62225"},{"https":"194.243.194.51:8080"},
{"https":"206.132.165.246:8080"},{"https":"92.247.127.177:3128"}]
proxies = random.choice(proxy_list)
while True:
for view in range(views): # to loop in the number of allocated views
s = requests.Session()
s.get(url, headers=headers, proxies=proxies, stream=True, timeout=refreshRate)
s.close()
time.sleep(60) # time between loops so we appear real
sys.exit()
Here's the traceback error I got:
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "pytest.py", line 24, in <module>
s.get(url, headers=headers, proxies=proxies, stream=True,
timeout=refreshRate)
File "C:\Python\lib\site-packages\requests\sessions.py", line 521, in get
return self.request('GET', url, **kwargs)
File "C:\Python\lib\site-packages\requests\sessions.py", line 508, in
request
resp = self.send(prep, **send_kwargs)
File "C:\Python\lib\site-packages\requests\sessions.py", line 640, in send
history = [resp for resp in gen] if allow_redirects else []
File "C:\Python\lib\site-packages\requests\sessions.py", line 640, in
<listcomp>
history = [resp for resp in gen] if allow_redirects else []
File "C:\Python\lib\site-packages\requests\sessions.py", line 218, in
resolve_redirects
**adapter_kwargs
File "C:\Python\lib\site-packages\requests\sessions.py", line 618, in send
r = adapter.send(request, **kwargs)
File "C:\Python\lib\site-packages\requests\adapters.py", line 506, in send
raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='www.youtube.com',
port=443): Max retries exceed
ch?v=dHUP25DkKWo (Caused by SSLError(SSLError("bad handshake:
SysCallError(-1, 'Unexpected EOF')",),))
I suspect the max retries come from YouTube, but it's confusing because I'm connecting via random proxies. If that's the case, maybe the proxies aren't working, or no HTTPS connection was made.

Python: Requests patch method doesn't work

I have the code below which works fine and brings back what I need
import requests
from requests.auth import HTTPBasicAuth
response = requests.get('https://example/answers/331', auth=HTTPBasicAuth('username', 'password'),json={"solution": "12345"})
print(response.content)
However when I change it to a patch method, which is accepted by the server, I get the following errors. Any idea on why?
Traceback (most recent call last):
File "auth.py", line 8, in <module>
response = requests.patch('https://example/answers/331', auth=HTTPBasicAuth('username', 'password'),json={"solution": "12345"})
File "C:\Python27\lib\site-packages\requests-2.12.0-py2.7.egg\requests\api.py", line 138, in patch
return request('patch', url, data=data, **kwargs)
File "C:\Python27\lib\site-packages\requests-2.12.0-py2.7.egg\requests\api.py", line 56, in request
return session.request(method=method, url=url, **kwargs)
File "C:\Python27\lib\site-packages\requests-2.12.0-py2.7.egg\requests\sessions.py", line 488, in request
resp = self.send(prep, **send_kwargs)
File "C:\Python27\lib\site-packages\requests-2.12.0-py2.7.egg\requests\sessions.py", line 609, in send
r = adapter.send(request, **kwargs)
File "C:\Python27\lib\site-packages\requests-2.12.0-py2.7.egg\requests\adapters.py", line 473, in send
raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', BadStatusLine("''",))
Thanks
Try using a POST request with the following header: X-HTTP-Method-Override: PATCH
This is unique to the Oracle Service Cloud REST API implementation and is documented.
In cases where the browser or client application does not support PATCH requests, or network intermediaries block PATCH requests, HTTP tunneling can be used with a POST request by supplying an X-HTTP-Method-Override header.
Example:
import requests

restURL = '<Your REST URL>'  # placeholder: replace with your REST endpoint
params = {'field': 'val'}
headers = {'X-HTTP-Method-Override': 'PATCH'}
try:
    # POST is tunneled as a PATCH via the override header
    resp = requests.post(restURL, json=params, auth=('<uname>', '<pwd>'), headers=headers)
    print(resp)
except requests.exceptions.RequestException as err:
    errMsg = "Error: %s" % err
    print(errMsg)

Python Requests - SSL error for client side cert

I'm calling a REST API with requests in Python and so far have been successful when I set verify=False.
Now I have to use a client-side certificate for authentication, and I get this error every time I use the cert (.pfx). cert.pfx is password protected.
r = requests.post(url, params=payload, headers=headers,
                  data=payload, verify='cert.pfx')
This is the error I'm getting:
Traceback (most recent call last):
File "C:\Users\me\Desktop\test.py", line 65, in <module>
r = requests.post(url, params=payload, headers=headers, data=payload, verify=cafile)
File "C:\Python33\lib\site-packages\requests\api.py", line 88, in post
return request('post', url, data=data, **kwargs)
File "C:\Python33\lib\site-packages\requests\api.py", line 44, in request
return session.request(method=method, url=url, **kwargs)
File "C:\Python33\lib\site-packages\requests\sessions.py", line 346, in request
resp = self.send(prep, **send_kwargs)
File "C:\Python33\lib\site-packages\requests\sessions.py", line 449, in send
r = adapter.send(request, **kwargs)
File "C:\Python33\lib\site-packages\requests\adapters.py", line 322, in send
raise SSLError(e)
requests.exceptions.SSLError: unknown error (_ssl.c:2158)
I've also tried using openssl to extract a .pem and key, but with the .pem I get SSL: CERTIFICATE_VERIFY_FAILED.
Can someone please direct me on how to import the certs and where to place them? I tried searching but still face the same issue.
I had this same problem. The verify parameter refers to the server's certificate. You want the cert parameter to specify your client certificate.
import requests
cert_file_path = "cert.pem"
key_file_path = "key.pem"
url = "https://example.com/resource"
params = {"param_1": "value_1", "param_2": "value_2"}
cert = (cert_file_path, key_file_path)
r = requests.get(url, params=params, cert=cert)
I had the same problem. To resolve it, I learned that the root CA has to be passed along with the client certificate and its key, as shown below:
response = requests.post(url, data=your_data, cert=('path_client_certificate_file', 'path_certificate_key_file'), verify='path_rootCA')
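To make the distinction between the two parameters concrete, here is a small sketch. It assumes the .pfx has already been converted to PEM files (requests expects PEM, and to my knowledge the private key file must be unencrypted); the file names and the helper names tls_kwargs and post_with_client_cert are hypothetical.

```python
def tls_kwargs(client_cert, client_key, ca_bundle=True):
    """Build the TLS-related keyword arguments for a requests call.

    cert   -> identifies the client to the server (client certificate + key)
    verify -> how the server's certificate is checked: a CA bundle path,
              or True to use the default trusted CAs
    """
    return {"cert": (client_cert, client_key), "verify": ca_bundle}


def post_with_client_cert(url, payload, client_cert, client_key, ca_bundle=True):
    """POST payload using a client certificate while still verifying the server."""
    # Imported here so tls_kwargs stays usable without third-party packages.
    import requests

    return requests.post(
        url,
        data=payload,
        timeout=10,
        **tls_kwargs(client_cert, client_key, ca_bundle)
    )
```

For example, post_with_client_cert(url, payload, "cert.pem", "key.pem", "rootCA.pem") corresponds to the call in the answer above, with the root CA used to verify the server rather than disabling verification.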
