I am trying to run a very simple Selenium Python script in PyCharm, but it throws the following error every time I call driver = webdriver.Firefox(driver_path).
/home/kimilao/PycharmProjects/pythonOne/main.py:7: DeprecationWarning: firefox_profile has been deprecated, please pass in an Options object
driver = webdriver.Firefox(driver_path)
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 169, in _new_conn
conn = connection.create_connection(
File "/usr/lib/python3/dist-packages/urllib3/util/connection.py", line 96, in create_connection
raise err
File "/usr/lib/python3/dist-packages/urllib3/util/connection.py", line 86, in create_connection
sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 699, in urlopen
httplib_response = self._make_request(
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 394, in _make_request
conn.request(method, url, **httplib_request_kw)
File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 234, in request
super(HTTPConnection, self).request(method, url, body=body, headers=headers)
File "/usr/lib/python3.10/http/client.py", line 1282, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/usr/lib/python3.10/http/client.py", line 1328, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/usr/lib/python3.10/http/client.py", line 1277, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/usr/lib/python3.10/http/client.py", line 1037, in _send_output
self.send(msg)
File "/usr/lib/python3.10/http/client.py", line 975, in send
self.connect()
File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 200, in connect
conn = self._new_conn()
File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 181, in _new_conn
raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7fedb7be9c00>: Failed to establish a new connection: [Errno 111] Connection refused
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/kimilao/PycharmProjects/pythonOne/main.py", line 7, in <module>
driver = webdriver.Firefox(driver_path)
File "/home/kimilao/.local/lib/python3.10/site-packages/selenium/webdriver/firefox/webdriver.py", line 177, in __init__
super().__init__(
File "/home/kimilao/.local/lib/python3.10/site-packages/selenium/webdriver/remote/webdriver.py", line 277, in __init__
self.start_session(capabilities, browser_profile)
File "/home/kimilao/.local/lib/python3.10/site-packages/selenium/webdriver/remote/webdriver.py", line 370, in start_session
response = self.execute(Command.NEW_SESSION, parameters)
File "/home/kimilao/.local/lib/python3.10/site-packages/selenium/webdriver/remote/webdriver.py", line 433, in execute
response = self.command_executor.execute(driver_command, params)
File "/home/kimilao/.local/lib/python3.10/site-packages/selenium/webdriver/remote/remote_connection.py", line 344, in execute
return self._request(command_info[0], url, body=data)
File "/home/kimilao/.local/lib/python3.10/site-packages/selenium/webdriver/remote/remote_connection.py", line 366, in _request
response = self._conn.request(method, url, body=body, headers=headers)
File "/usr/lib/python3/dist-packages/urllib3/request.py", line 78, in request
return self.request_encode_body(
File "/usr/lib/python3/dist-packages/urllib3/request.py", line 170, in request_encode_body
return self.urlopen(method, url, **extra_kw)
File "/usr/lib/python3/dist-packages/urllib3/poolmanager.py", line 375, in urlopen
response = conn.urlopen(method, u.request_uri, **kw)
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 783, in urlopen
return self.urlopen(
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 783, in urlopen
return self.urlopen(
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 783, in urlopen
return self.urlopen(
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 755, in urlopen
retries = retries.increment(
File "/usr/lib/python3/dist-packages/urllib3/util/retry.py", line 574, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=57917): Max retries exceeded with url: /session (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fedb7be9c00>: Failed to establish a new connection: [Errno 111] Connection refused'))
Here's my full code:
#!/usr/bin/python3
from selenium import webdriver
from selenium.webdriver.firefox.options import Options
driver_path = "/home/kimilao/Downloads/geckodriver-v0.31.0-linux64"
driver = webdriver.Firefox(driver_path)
driver.get("https://www.python.org/")
print(driver.title)
driver.quit()
The problem is that I can run this code without errors from Sublime Text, VS Code, and the terminal, but not from PyCharm, and I wonder why. Thanks!
Python: 3.10.4
Selenium: 4.3.0
Firefox: 102.0 (64-bit)
Gecko Driver: geckodriver-v0.31.0-linux64
I'm pretty sure this happens because PyCharm automatically creates and enters a virtual environment, where you probably didn't run a pip install. To make sure, try launching these commands from vscode:
source venv/bin/activate
python your_script.py
deactivate # after executing your script, to exit the venv
If you get the same error there, then the venv was not set up properly.
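To compare what PyCharm and the terminal are actually running, a small diagnostic along these lines can be executed from both and the output compared. This is only a sketch: `describe_environment` is a made-up name, and it looks for `selenium` simply because that is the package in question.

```python
import importlib.util
import sys


def describe_environment(package="selenium"):
    """Report which interpreter is running and whether `package` is importable."""
    return {
        # Full path of the Python binary executing this script.
        "executable": sys.executable,
        # In a virtual environment, sys.prefix differs from sys.base_prefix.
        "in_venv": sys.prefix != sys.base_prefix,
        # find_spec returns None when a top-level package is not installed.
        "package_found": importlib.util.find_spec(package) is not None,
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If the two runs report different executables, or the package is found in only one of them, the interpreters differ and the packages need to be installed into the one PyCharm uses.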
I use PyCharm (albeit on Windows), and my import statements and driver object are completely different. Maybe you can give this a try and see if it works for you?
from selenium import webdriver
from selenium.webdriver.firefox.service import Service
from webdriver_manager.firefox import GeckoDriverManager
driver = webdriver.Firefox(service=Service(GeckoDriverManager().install()))
Note: this assumes that the driver has been downloaded and installed. Also, confirm your internet connection; I experienced something similar using Chrome, but my problem was the internet connection.
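If the driver binary is already on disk, the same `Service` object can point at it directly instead of going through `webdriver_manager`. The sketch below assumes Selenium 4, where the first positional argument of `webdriver.Firefox` is `firefox_profile` (which is what the DeprecationWarning in the question hints at), and assumes the downloaded archive was unpacked so that an executable named `geckodriver` sits inside the given directory. `resolve_geckodriver` and `make_firefox` are illustrative names, not part of Selenium.

```python
from pathlib import Path


def resolve_geckodriver(path):
    """Return the path of the geckodriver executable, given the file or its folder."""
    candidate = Path(path)
    if candidate.is_dir():
        # Assumption: the archive was extracted into this directory.
        candidate = candidate / "geckodriver"
    if not candidate.is_file():
        raise FileNotFoundError(f"No geckodriver executable at {candidate}")
    return str(candidate)


def make_firefox(driver_path):
    """Start Firefox with an explicit driver path (requires selenium to be installed)."""
    # Imported lazily so the path helper can be used without selenium.
    from selenium import webdriver
    from selenium.webdriver.firefox.service import Service

    # Pass the driver location by keyword through Service, never positionally.
    service = Service(executable_path=resolve_geckodriver(driver_path))
    return webdriver.Firefox(service=service)
```

Calling `make_firefox("/home/kimilao/Downloads/geckodriver-v0.31.0-linux64")` would then either start the browser or fail early with a clear message if no executable is found at that location.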
I need to access a website using Selenium and Chrome from Python.
Below is a shortened version of my code.
from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.chrome.service import Service
# PROXY='https://myproxy.com:3128'
PROXY = 'https://username:password#myproxy.com:3128'
proxyuser='username' #this is the proxy user name used in above string
proxypwd='password' #this is the proxy password used in above string
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--proxy-server=%s' % PROXY)
chrome_options.add_argument("ignore-certificate-errors")
chrome = webdriver.Chrome(options=chrome_options,service=Service(ChromeDriverManager().install()))
chrome.get("https://www.google.com")
while True:
    print('ok')
I am behind a corporate proxy server that requires authentication.
I am not sure how to pass the login credentials and proxy settings for the chromedriver installation.
When the above code is run without a proxy, it works as expected.
But when run through the proxy connection, it gives the following error:
[WDM] - ====== WebDriver manager ======
[WDM] - Current google-chrome version is 105.0.5195
[WDM] - Get LATEST chromedriver version for 105.0.5195 google-chrome
Traceback (most recent call last):
File "D:\Python\Python39\lib\site-packages\urllib3\connectionpool.py", line 703, in urlopen
httplib_response = self._make_request(
File "D:\Python\Python39\lib\site-packages\urllib3\connectionpool.py", line 386, in _make_request
self._validate_conn(conn)
File "D:\Python\Python39\lib\site-packages\urllib3\connectionpool.py", line 1040, in _validate_conn
conn.connect()
File "D:\Python\Python39\lib\site-packages\urllib3\connection.py", line 414, in connect
self.sock = ssl_wrap_socket(
File "D:\Python\Python39\lib\site-packages\urllib3\util\ssl_.py", line 449, in ssl_wrap_socket
ssl_sock = _ssl_wrap_socket_impl(
File "D:\Python\Python39\lib\site-packages\urllib3\util\ssl_.py", line 493, in _ssl_wrap_socket_impl
return ssl_context.wrap_socket(sock, server_hostname=server_hostname)
File "D:\Python\Python39\lib\ssl.py", line 500, in wrap_socket
return self.sslsocket_class._create(
File "D:\Python\Python39\lib\ssl.py", line 1040, in _create
self.do_handshake()
File "D:\Python\Python39\lib\ssl.py", line 1309, in do_handshake
self._sslobj.do_handshake()
ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote host
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "D:\Python\Python39\lib\site-packages\requests\adapters.py", line 489, in send
resp = conn.urlopen(
File "D:\Python\Python39\lib\site-packages\urllib3\connectionpool.py", line 785, in urlopen
retries = retries.increment(
File "D:\Python\Python39\lib\site-packages\urllib3\util\retry.py", line 550, in increment
raise six.reraise(type(error), error, _stacktrace)
File "D:\Python\Python39\lib\site-packages\urllib3\packages\six.py", line 769, in reraise
raise value.with_traceback(tb)
File "D:\Python\Python39\lib\site-packages\urllib3\connectionpool.py", line 703, in urlopen
httplib_response = self._make_request(
File "D:\Python\Python39\lib\site-packages\urllib3\connectionpool.py", line 386, in _make_request
self._validate_conn(conn)
File "D:\Python\Python39\lib\site-packages\urllib3\connectionpool.py", line 1040, in _validate_conn
conn.connect()
File "D:\Python\Python39\lib\site-packages\urllib3\connection.py", line 414, in connect
self.sock = ssl_wrap_socket(
File "D:\Python\Python39\lib\site-packages\urllib3\util\ssl_.py", line 449, in ssl_wrap_socket
ssl_sock = _ssl_wrap_socket_impl(
File "D:\Python\Python39\lib\site-packages\urllib3\util\ssl_.py", line 493, in _ssl_wrap_socket_impl
return ssl_context.wrap_socket(sock, server_hostname=server_hostname)
File "D:\Python\Python39\lib\ssl.py", line 500, in wrap_socket
return self.sslsocket_class._create(
File "D:\Python\Python39\lib\ssl.py", line 1040, in _create
self.do_handshake()
File "D:\Python\Python39\lib\ssl.py", line 1309, in do_handshake
self._sslobj.do_handshake()
urllib3.exceptions.ProtocolError: ('Connection aborted.', ConnectionResetError(10054, 'An existing connection was forcibly closed by the remote host', None, 10054, None))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "D:\eclipse-workspace\Essential Training\test.py", line 17, in <module>
chrome = webdriver.Chrome(options=chrome_options,service=Service(ChromeDriverManager().install()))
File "D:\Python\Python39\lib\site-packages\webdriver_manager\chrome.py", line 37, in install
driver_path = self._get_driver_path(self.driver)
File "D:\Python\Python39\lib\site-packages\webdriver_manager\core\manager.py", line 29, in _get_driver_path
binary_path = self.driver_cache.find_driver(driver)
File "D:\Python\Python39\lib\site-packages\webdriver_manager\core\driver_cache.py", line 95, in find_driver
driver_version = driver.get_version()
File "D:\Python\Python39\lib\site-packages\webdriver_manager\core\driver.py", line 42, in get_version
self.get_latest_release_version()
File "D:\Python\Python39\lib\site-packages\webdriver_manager\drivers\chrome.py", line 44, in get_latest_release_version
resp = self._http_client.get(url=latest_release_url)
File "D:\Python\Python39\lib\site-packages\webdriver_manager\core\http.py", line 31, in get
resp = requests.get(url=url, verify=self._ssl_verify, **kwargs)
File "D:\Python\Python39\lib\site-packages\requests\api.py", line 73, in get
return request("get", url, params=params, **kwargs)
File "D:\Python\Python39\lib\site-packages\requests\api.py", line 59, in request
return session.request(method=method, url=url, **kwargs)
File "D:\Python\Python39\lib\site-packages\requests\sessions.py", line 587, in request
resp = self.send(prep, **send_kwargs)
File "D:\Python\Python39\lib\site-packages\requests\sessions.py", line 701, in send
r = adapter.send(request, **kwargs)
File "D:\Python\Python39\lib\site-packages\requests\adapters.py", line 547, in send
raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', ConnectionResetError(10054, 'An existing connection was forcibly closed by the remote host', None, 10054, None))
Bypassing the proxy is not possible because of corporate policies.
request("get", url, params=params, **kwargs)
I believe this line in the traceback is where the issue lies.
It seems the driver manager is not able to determine the latest version available online, because the request made here carries no proxy settings. I cannot see the values in **kwargs, but I believe they would need to include a proxy argument, though I am not sure.
I have modified the code as follows.
Essentially, I now use a custom download manager,
and this is working:
from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
from webdriver_manager.core.download_manager import WDMDownloadManager
from webdriver_manager.core.http import HttpClient
from selenium.webdriver.chrome.service import Service
from requests import Response
import urllib3
import requests
import os
os.environ['WDM_SSL_VERIFY'] = '0'
capabilities = webdriver.DesiredCapabilities.CHROME
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
PROXY = "https://username:password#myproxy.com:3128"
proxyuser='username'
proxypwd='password'
opt = webdriver.ChromeOptions()
opt.add_argument('--proxy-server=%s' % PROXY)
opt.add_argument("ignore-certificate-errors")
class CustomHttpClient(HttpClient):
    def get(self, url, params=None) -> Response:
        proxies = {
            'http': 'http://username:password#myproxy.com:3128',
            'https': 'http://username:password#myproxy.com:3128',
        }
        return requests.get(url, params, proxies=proxies, verify=False)
http_client = CustomHttpClient()
download_manager = WDMDownloadManager(http_client)
chrome = webdriver.Chrome(service=Service(ChromeDriverManager(download_manager=download_manager).install()),options=opt)
chrome.get("https://www.google.com")
while True:
    print('ok')
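The proxy strings above write the credentials as `username:password#myproxy.com:3128`. The conventional URL form separates credentials from the host with `@`, so the `#` is presumably a placeholder. Assuming that, the sketch below builds the kind of `proxies` mapping that `requests.get` accepts, percent-encoding the credentials so that special characters in a password do not break the URL. `build_proxies` is a hypothetical helper, and the host and port are the placeholders from the question.

```python
from urllib.parse import quote


def build_proxies(user, password, host, port, scheme="http"):
    """Build a requests-style proxies mapping with percent-encoded credentials."""
    # safe='' makes quote() encode every reserved character, including '@' and ':'.
    credentials = f"{quote(user, safe='')}:{quote(password, safe='')}"
    proxy_url = f"{scheme}://{credentials}@{host}:{port}"
    # The same proxy endpoint is used for both plain and TLS traffic.
    return {"http": proxy_url, "https": proxy_url}


if __name__ == "__main__":
    print(build_proxies("username", "p@ss:word", "myproxy.com", 3128))
```

The returned mapping could be passed as `proxies=` inside the `CustomHttpClient.get` method shown above.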
I am getting an error with this basic Appium script when I try to run it on a real device. Everything works fine if I use the Appium Inspector, but once I try to run the same code in PyCharm, I get an error. Below are the capabilities in JSON format; as I'm pretty new to this library, I don't know whether I've set them up correctly.
from appium import webdriver
desired_cap = {
    "deviceName": "R9AN60B4CCJ",
    "platformName": "Android",
    "app": "C:\\Users\\John Doe\\AppData\\Local\\Android\\Sdk\\platform-tools\\airmirror2.apk"
}  # The capabilities to install an app from the apk file on your computer
driver = webdriver.Remote('https://localhost:4723/wd/hub', desired_cap)
#Similar to Selenium; Declaring the Driver. The Appium Server should already be started
The error is below:
Traceback (most recent call last):
File "C:\Users\John Doe\PycharmProjects\SelTest\venv\lib\site-packages\urllib3\connectionpool.py", line 600, in urlopen
chunked=chunked)
File "C:\Users\John Doe\PycharmProjects\SelTest\venv\lib\site-packages\urllib3\connectionpool.py", line 343, in _make_request
self._validate_conn(conn)
File "C:\Users\John Doe\PycharmProjects\SelTest\venv\lib\site-packages\urllib3\connectionpool.py", line 849, in _validate_conn
conn.connect()
File "C:\Users\John Doe\PycharmProjects\SelTest\venv\lib\site-packages\urllib3\connection.py", line 356, in connect
ssl_context=context)
File "C:\Users\John Doe\PycharmProjects\SelTest\venv\lib\site-packages\urllib3\util\ssl_.py", line 359, in ssl_wrap_socket
return context.wrap_socket(sock, server_hostname=server_hostname)
File "C:\Users\John Doe\Miniconda3\lib\ssl.py", line 423, in wrap_socket
session=session
File "C:\Users\John Doe\Miniconda3\lib\ssl.py", line 870, in _create
self.do_handshake()
File "C:\Users\John Doe\Miniconda3\lib\ssl.py", line 1139, in do_handshake
self._sslobj.do_handshake()
ssl.SSLError: [SSL: WRONG_VERSION_NUMBER] wrong version number (_ssl.c:1076)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:/Users/John Doe/PycharmProjects/SelTest/venv/FirstTest.py", line 11, in <module>
driver = webdriver.Remote('https://localhost:4723/wd/hub', desired_cap)
File "C:\Users\John Doe\PycharmProjects\SelTest\venv\lib\site-packages\appium\webdriver\webdriver.py", line 275, in __init__
AppiumConnection(command_executor, keep_alive=keep_alive), desired_capabilities, browser_profile, proxy
File "C:\Users\John Doe\PycharmProjects\SelTest\venv\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 269, in __init__
self.start_session(capabilities, browser_profile)
File "C:\Users\John Doe\PycharmProjects\SelTest\venv\lib\site-packages\appium\webdriver\webdriver.py", line 369, in start_session
response = self.execute(RemoteCommand.NEW_SESSION, parameters)
File "C:\Users\John Doe\PycharmProjects\SelTest\venv\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 423, in execute
response = self.command_executor.execute(driver_command, params)
File "C:\Users\John Doe\PycharmProjects\SelTest\venv\lib\site-packages\selenium\webdriver\remote\remote_connection.py", line 333, in execute
return self._request(command_info[0], url, body=data)
File "C:\Users\John Doe\PycharmProjects\SelTest\venv\lib\site-packages\selenium\webdriver\remote\remote_connection.py", line 355, in _request
resp = self._conn.request(method, url, body=body, headers=headers)
File "C:\Users\John Doe\PycharmProjects\SelTest\venv\lib\site-packages\urllib3\request.py", line 72, in request
**urlopen_kw)
File "C:\Users\John Doe\PycharmProjects\SelTest\venv\lib\site-packages\urllib3\request.py", line 150, in request_encode_body
return self.urlopen(method, url, **extra_kw)
File "C:\Users\John Doe\PycharmProjects\SelTest\venv\lib\site-packages\urllib3\poolmanager.py", line 322, in urlopen
response = conn.urlopen(method, u.request_uri, **kw)
File "C:\Users\John Doe\PycharmProjects\SelTest\venv\lib\site-packages\urllib3\connectionpool.py", line 667, in urlopen
**response_kw)
File "C:\Users\John Doe\PycharmProjects\SelTest\venv\lib\site-packages\urllib3\connectionpool.py", line 667, in urlopen
**response_kw)
File "C:\Users\John Doe\PycharmProjects\SelTest\venv\lib\site-packages\urllib3\connectionpool.py", line 667, in urlopen
**response_kw)
File "C:\Users\John Doe\PycharmProjects\SelTest\venv\lib\site-packages\urllib3\connectionpool.py", line 638, in urlopen
_stacktrace=sys.exc_info()[2])
File "C:\Users\John Doe\PycharmProjects\SelTest\venv\lib\site-packages\urllib3\util\retry.py", line 398, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='localhost', port=4723): Max retries exceeded with url: /wd/hub/session (Caused by SSLError(SSLError(1, '[SSL: WRONG_VERSION_NUMBER] wrong version number (_ssl.c:1076)')))
I don't know whether I need a specific version of urllib3 in this venv.
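One thing that may be worth checking before changing library versions: `WRONG_VERSION_NUMBER` is typically what a TLS client reports when the other side answers in plain HTTP, and the script connects with `https://` to a local server. Whether the Appium server here was started without TLS is an assumption, not something the traceback confirms. Under that assumption, a minimal sketch of switching the endpoint to `http://` looks like this (`to_plain_http` is an illustrative helper):

```python
from urllib.parse import urlsplit, urlunsplit


def to_plain_http(url):
    """Return `url` with an https scheme replaced by http; other URLs are unchanged."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        # SplitResult is a namedtuple, so _replace returns a modified copy.
        parts = parts._replace(scheme="http")
    return urlunsplit(parts)


if __name__ == "__main__":
    print(to_plain_http("https://localhost:4723/wd/hub"))
```

The result would then be passed to `webdriver.Remote` in place of the original URL.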
Python requests is not working for a website API, although the same URL works in a browser (Chrome). It used to work, but for the past two months it has not. The API I am using is the yts.mx API.
I'll attach the code snippet below:
import requests
a = requests.get('https://yts.mx/api/v2/list_movies.json')
This is the error I am getting:
Traceback (most recent call last):
File "D:\Programs\Python\Python39\lib\site-packages\urllib3\connectionpool.py", line 670, in urlopen
httplib_response = self._make_request(
File "D:\Programs\Python\Python39\lib\site-packages\urllib3\connectionpool.py", line 426, in _make_request
six.raise_from(e, None)
File "<string>", line 3, in raise_from
File "D:\Programs\Python\Python39\lib\site-packages\urllib3\connectionpool.py", line 421, in _make_request
httplib_response = conn.getresponse()
File "D:\Programs\Python\Python39\lib\http\client.py", line 1347, in getresponse
response.begin()
File "D:\Programs\Python\Python39\lib\http\client.py", line 307, in begin
version, status, reason = self._read_status()
File "D:\Programs\Python\Python39\lib\http\client.py", line 268, in _read_status
line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
File "D:\Programs\Python\Python39\lib\socket.py", line 704, in readinto
return self._sock.recv_into(b)
File "D:\Programs\Python\Python39\lib\ssl.py", line 1241, in recv_into
return self.read(nbytes, buffer)
File "D:\Programs\Python\Python39\lib\ssl.py", line 1099, in read
return self._sslobj.read(len, buffer)
TimeoutError: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "D:\Programs\Python\Python39\lib\site-packages\requests\adapters.py", line 439, in send
resp = conn.urlopen(
File "D:\Programs\Python\Python39\lib\site-packages\urllib3\connectionpool.py", line 726, in urlopen
retries = retries.increment(
File "D:\Programs\Python\Python39\lib\site-packages\urllib3\util\retry.py", line 410, in increment
raise six.reraise(type(error), error, _stacktrace)
File "D:\Programs\Python\Python39\lib\site-packages\urllib3\packages\six.py", line 734, in reraise
raise value.with_traceback(tb)
File "D:\Programs\Python\Python39\lib\site-packages\urllib3\connectionpool.py", line 670, in urlopen
httplib_response = self._make_request(
File "D:\Programs\Python\Python39\lib\site-packages\urllib3\connectionpool.py", line 426, in _make_request
six.raise_from(e, None)
File "<string>", line 3, in raise_from
File "D:\Programs\Python\Python39\lib\site-packages\urllib3\connectionpool.py", line 421, in _make_request
httplib_response = conn.getresponse()
File "D:\Programs\Python\Python39\lib\http\client.py", line 1347, in getresponse
response.begin()
File "D:\Programs\Python\Python39\lib\http\client.py", line 307, in begin
version, status, reason = self._read_status()
File "D:\Programs\Python\Python39\lib\http\client.py", line 268, in _read_status
line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
File "D:\Programs\Python\Python39\lib\socket.py", line 704, in readinto
return self._sock.recv_into(b)
File "D:\Programs\Python\Python39\lib\ssl.py", line 1241, in recv_into
return self.read(nbytes, buffer)
File "D:\Programs\Python\Python39\lib\ssl.py", line 1099, in read
return self._sslobj.read(len, buffer)
urllib3.exceptions.ProtocolError: ('Connection aborted.', TimeoutError(10060, 'A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond', None, 10060, None))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "D:\Programs\Python\Python39\test.py", line 8, in <module>
a = r.get('https://yts.mx/api/v2/list_movies.json')
File "D:\Programs\Python\Python39\lib\site-packages\requests\api.py", line 76, in get
return request('get', url, params=params, **kwargs)
File "D:\Programs\Python\Python39\lib\site-packages\requests\api.py", line 61, in request
return session.request(method=method, url=url, **kwargs)
File "D:\Programs\Python\Python39\lib\site-packages\requests\sessions.py", line 530, in request
resp = self.send(prep, **send_kwargs)
File "D:\Programs\Python\Python39\lib\site-packages\requests\sessions.py", line 643, in send
r = adapter.send(request, **kwargs)
File "D:\Programs\Python\Python39\lib\site-packages\requests\adapters.py", line 498, in send
raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', TimeoutError(10060, 'A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond', None, 10060, None))
Sometimes I instead get an error saying that an existing connection was forcibly closed. This is the trace:
Traceback (most recent call last):
File "D:\Programs\Python\Python39\lib\site-packages\urllib3\connectionpool.py", line 670, in urlopen
httplib_response = self._make_request(
File "D:\Programs\Python\Python39\lib\site-packages\urllib3\connectionpool.py", line 381, in _make_request
self._validate_conn(conn)
File "D:\Programs\Python\Python39\lib\site-packages\urllib3\connectionpool.py", line 978, in _validate_conn
conn.connect()
File "D:\Programs\Python\Python39\lib\site-packages\urllib3\connection.py", line 362, in connect
self.sock = ssl_wrap_socket(
File "D:\Programs\Python\Python39\lib\site-packages\urllib3\util\ssl_.py", line 386, in ssl_wrap_socket
return context.wrap_socket(sock, server_hostname=server_hostname)
File "D:\Programs\Python\Python39\lib\ssl.py", line 500, in wrap_socket
return self.sslsocket_class._create(
File "D:\Programs\Python\Python39\lib\ssl.py", line 1040, in _create
self.do_handshake()
File "D:\Programs\Python\Python39\lib\ssl.py", line 1309, in do_handshake
self._sslobj.do_handshake()
ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote host
This used to work perfectly a month or so ago. I have tried various methods found on the web.
UPDATE
It turns out the issue is with the IP. When I use a VPN connection, it works perfectly. However, I don't need a VPN to access it from my web browser. Any idea why this is happening?
Is there a way to bypass this?
Since you mentioned that the browser request goes through but the Python request does not, it crossed my mind that the website itself might block non-browser requests. I found another question that dealt with that problem. The solution was to add a custom header to the Python request to make it look like a browser request:
import requests
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.76 Safari/537.36'} # This is chrome, you can set whatever browser you like
url = 'https://yts.mx/api/v2/list_movies.json'
a = requests.get(url, headers=headers)
print(a.content)
In your request, you then pass the custom header through the headers keyword argument of the GET request (the second positional parameter of requests.get is params, not headers). I tried the script above and it worked for me, but your original script also worked on my connection.
I'm trying to install an application developed by someone else and when running a manage.py command I'm getting this error, which I do not understand. I wonder if anyone can tell me where to look to investigate first - thank you.
POST http://elasticsearch:9200/_bulk?refresh=true [status:N/A request:0.001s]
Traceback (most recent call last):
File "/home/henry/.virtualenvs/hjsm/lib/python3.6/site-packages/django/db/models/query.py", line 486, in get_or_create
return self.get(**lookup), False
File "/home/henry/.virtualenvs/hjsm/lib/python3.6/site-packages/djmoney/models/managers.py", line 209, in wrapper
queryset = func(*args, **kwargs)
File "/home/henry/.virtualenvs/hjsm/lib/python3.6/site-packages/django/db/models/query.py", line 399, in get
self.model._meta.object_name
providers.models.Organisation.DoesNotExist: Organisation matching query does not exist.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/henry/.virtualenvs/hjsm/lib/python3.6/site-packages/urllib3/connection.py", line 160, in _new_conn
(self._dns_host, self.port), self.timeout, **extra_kw)
File "/home/henry/.virtualenvs/hjsm/lib/python3.6/site-packages/urllib3/util/connection.py", line 57, in create_connection
for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
File "/usr/lib/python3.6/socket.py", line 745, in getaddrinfo
for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -2] Name or service not known
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/henry/.virtualenvs/hjsm/lib/python3.6/site-packages/elasticsearch/connection/http_urllib3.py", line 95, in perform_request
response = self.pool.urlopen(method, url, body, retries=False, headers=self.headers, **kw)
File "/home/henry/.virtualenvs/hjsm/lib/python3.6/site-packages/urllib3/connectionpool.py", line 641, in urlopen
_stacktrace=sys.exc_info()[2])
File "/home/henry/.virtualenvs/hjsm/lib/python3.6/site-packages/urllib3/util/retry.py", line 344, in increment
raise six.reraise(type(error), error, _stacktrace)
File "/home/henry/.virtualenvs/hjsm/lib/python3.6/site-packages/urllib3/packages/six.py", line 686, in reraise
raise value
File "/home/henry/.virtualenvs/hjsm/lib/python3.6/site-packages/urllib3/connectionpool.py", line 603, in urlopen
chunked=chunked)
File "/home/henry/.virtualenvs/hjsm/lib/python3.6/site-packages/urllib3/connectionpool.py", line 355, in _make_request
conn.request(method, url, **httplib_request_kw)
File "/usr/lib/python3.6/http/client.py", line 1254, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/usr/lib/python3.6/http/client.py", line 1300, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/usr/lib/python3.6/http/client.py", line 1249, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/usr/lib/python3.6/http/client.py", line 1036, in _send_output
self.send(msg)
File "/usr/lib/python3.6/http/client.py", line 974, in send
self.connect()
File "/home/henry/.virtualenvs/hjsm/lib/python3.6/site-packages/urllib3/connection.py", line 183, in connect
conn = self._new_conn()
File "/home/henry/.virtualenvs/hjsm/lib/python3.6/site-packages/urllib3/connection.py", line 169, in _new_conn
self, "Failed to establish a new connection: %s" % e)
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7f506c988eb8>: Failed to establish a new connection: [Errno -2] Name or service not known
EDIT
Added manage.py as requested
#!/usr/bin/env python
import os
import sys
if __name__ == "__main__":
    os.environ.setdefault('DEPLOY_ENV', 'local')
    os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'core.settings.%s' % os.environ['DEPLOY_ENV'])
    from django.core.management import execute_from_command_line  # noqa
    execute_from_command_line(sys.argv)
As Goran said, I was trying to connect to elasticsearch:9200 but testing localhost:9200
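The `Name or service not known` error in the trace comes from hostname resolution, so a quick way to tell the two cases apart is to check whether each name resolves from the machine running `manage.py`. A minimal sketch using only the standard library follows; `can_resolve` is an illustrative name, and the host names to check are taken from the command line.

```python
import socket
import sys


def can_resolve(host, port):
    """Return True if `host` resolves to at least one address, False otherwise."""
    try:
        return bool(socket.getaddrinfo(host, port, type=socket.SOCK_STREAM))
    except socket.gaierror:
        # Raised for failures such as "[Errno -2] Name or service not known".
        return False


if __name__ == "__main__":
    # Hosts come from the command line; only localhost is checked by default.
    for name in sys.argv[1:] or ["localhost"]:
        print(name, can_resolve(name, 9200))
```

Saved under a file name of your choice and run with `localhost elasticsearch` as arguments, it shows whether the name used in the application settings is known on the machine where the command is executed.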
I have created a single node HDFS in a VM (hadoop.master, IP: 192.168.12.52). The file etc/hadoop/core-site.xml has the following configuration for the namenode:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://master.hadoop:9000/</value>
</property>
</configuration>
I want to read a file from the HDFS on my local, physical desktop. For that, this is my code, which I've saved in a file named hdfs_read.py:
from hdfs import InsecureClient
client = InsecureClient('http://192.168.12.52:9000')
with client.read('/opt/hadoop/LICENSE.txt') as reader:
    features = reader.read()
print(features)
Now when I run it, I get the following error:
$ python3 hdfs_read.py
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 137, in _new_conn
(self.host, self.port), self.timeout, **extra_kw)
File "/usr/lib/python3/dist-packages/urllib3/util/connection.py", line 91, in create_connection
raise err
File "/usr/lib/python3/dist-packages/urllib3/util/connection.py", line 81, in create_connection
sock.connect(sa)
OSError: [Errno 113] No route to host
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 560, in urlopen
body=body, headers=headers)
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 354, in _make_request
conn.request(method, url, **httplib_request_kw)
File "/usr/lib/python3.6/http/client.py", line 1239, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/usr/lib/python3.6/http/client.py", line 1285, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/usr/lib/python3.6/http/client.py", line 1234, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/usr/lib/python3.6/http/client.py", line 1026, in _send_output
self.send(msg)
File "/usr/lib/python3.6/http/client.py", line 964, in send
self.connect()
File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 162, in connect
conn = self._new_conn()
File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 146, in _new_conn
self, "Failed to establish a new connection: %s" % e)
requests.packages.urllib3.exceptions.NewConnectionError: <requests.packages.urllib3.connection.HTTPConnection object at 0x7f2d88cef2b0>: Failed to establish a new connection: [Errno 113] No route to host
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/requests/adapters.py", line 376, in send
timeout=timeout
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 610, in urlopen
_stacktrace=sys.exc_info()[2])
File "/usr/lib/python3/dist-packages/urllib3/util/retry.py", line 273, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
requests.packages.urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='192.168.12.52', port=9000): Max retries exceeded with url: /webhdfs/v1/home/edhuser/testdata.txt?user.name=embs&offset=0&op=OPEN (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x7f2d88cef2b0>: Failed to establish a new connection: [Errno 113] No route to host',))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "hdfs_read_local.py", line 3, in <module>
with client.read('/home/edhuser/testdata.txt') as reader:
File "/usr/lib/python3.6/contextlib.py", line 81, in __enter__
return next(self.gen)
File "/home/embs/.local/lib/python3.6/site-packages/hdfs/client.py", line 678, in read
buffersize=buffer_size,
File "/home/embs/.local/lib/python3.6/site-packages/hdfs/client.py", line 118, in api_handler
raise err
File "/home/embs/.local/lib/python3.6/site-packages/hdfs/client.py", line 107, in api_handler
**self.kwargs
File "/home/embs/.local/lib/python3.6/site-packages/hdfs/client.py", line 207, in _request
**kwargs
File "/usr/lib/python3/dist-packages/requests/sessions.py", line 468, in request
resp = self.send(prep, **send_kwargs)
File "/usr/lib/python3/dist-packages/requests/sessions.py", line 576, in send
r = adapter.send(request, **kwargs)
File "/usr/lib/python3/dist-packages/requests/adapters.py", line 437, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='192.168.12.52', port=9000): Max retries exceeded with url: /webhdfs/v1/home/edhuser/testdata.txt?user.name=embs&offset=0&op=OPEN (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x7f2d88cef2b0>: Failed to establish a new connection: [Errno 113] No route to host',))
Error in sys.excepthook:
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/apport_python_hook.py", line 63, in apport_excepthook
from apport.fileutils import likely_packaged, get_recent_crashes
File "/usr/lib/python3/dist-packages/apport/__init__.py", line 5, in <module>
from apport.report import Report
File "/usr/lib/python3/dist-packages/apport/report.py", line 30, in <module>
import apport.fileutils
File "/usr/lib/python3/dist-packages/apport/fileutils.py", line 23, in <module>
from apport.packaging_impl import impl as packaging
File "/usr/lib/python3/dist-packages/apport/packaging_impl.py", line 23, in <module>
import apt
File "/usr/lib/python3/dist-packages/apt/__init__.py", line 23, in <module>
import apt_pkg
ModuleNotFoundError: No module named 'apt_pkg'
Original exception was:
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 137, in _new_conn
(self.host, self.port), self.timeout, **extra_kw)
File "/usr/lib/python3/dist-packages/urllib3/util/connection.py", line 91, in create_connection
raise err
File "/usr/lib/python3/dist-packages/urllib3/util/connection.py", line 81, in create_connection
sock.connect(sa)
OSError: [Errno 113] No route to host
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 560, in urlopen
body=body, headers=headers)
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 354, in _make_request
conn.request(method, url, **httplib_request_kw)
File "/usr/lib/python3.6/http/client.py", line 1239, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/usr/lib/python3.6/http/client.py", line 1285, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/usr/lib/python3.6/http/client.py", line 1234, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/usr/lib/python3.6/http/client.py", line 1026, in _send_output
self.send(msg)
File "/usr/lib/python3.6/http/client.py", line 964, in send
self.connect()
File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 162, in connect
conn = self._new_conn()
File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 146, in _new_conn
self, "Failed to establish a new connection: %s" % e)
requests.packages.urllib3.exceptions.NewConnectionError: <requests.packages.urllib3.connection.HTTPConnection object at 0x7f2d88cef2b0>: Failed to establish a new connection: [Errno 113] No route to host
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/requests/adapters.py", line 376, in send
timeout=timeout
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 610, in urlopen
_stacktrace=sys.exc_info()[2])
File "/usr/lib/python3/dist-packages/urllib3/util/retry.py", line 273, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
requests.packages.urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='192.168.12.52', port=9000): Max retries exceeded with url: /webhdfs/v1/home/edhuser/testdata.txt?user.name=embs&offset=0&op=OPEN (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x7f2d88cef2b0>: Failed to establish a new connection: [Errno 113] No route to host',))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "hdfs_read.py", line 3, in <module>
with client.read('/home/edhuser/testdata.txt') as reader:
File "/usr/lib/python3.6/contextlib.py", line 81, in __enter__
return next(self.gen)
File "/home/embs/.local/lib/python3.6/site-packages/hdfs/client.py", line 678, in read
buffersize=buffer_size,
File "/home/embs/.local/lib/python3.6/site-packages/hdfs/client.py", line 118, in api_handler
raise err
File "/home/embs/.local/lib/python3.6/site-packages/hdfs/client.py", line 107, in api_handler
**self.kwargs
File "/home/embs/.local/lib/python3.6/site-packages/hdfs/client.py", line 207, in _request
**kwargs
File "/usr/lib/python3/dist-packages/requests/sessions.py", line 468, in request
resp = self.send(prep, **send_kwargs)
File "/usr/lib/python3/dist-packages/requests/sessions.py", line 576, in send
r = adapter.send(request, **kwargs)
File "/usr/lib/python3/dist-packages/requests/adapters.py", line 437, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='192.168.12.52', port=9000): Max retries exceeded with url: /webhdfs/v1/home/edhuser/testdata.txt?user.name=embs&offset=0&op=OPEN (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x7f2d88cef2b0>: Failed to establish a new connection: [Errno 113] No route to host',))
How can I fix this connection issue? Am I using the wrong port? I thought the port the namenode uses is the one specified in core-site.xml, which, as shown above, is 9000. In any case, I have tried all the default ports (50070, 8020, 8048) mentioned in the Hadoop installation docs for various purposes, and I still get the same error. Instead of client = InsecureClient('http://192.168.12.52:9000'), should I be using client = InsecureClient('hdfs://192.168.12.52:9000'), or maybe client = InsecureClient('file:///192.168.12.52:9000'), or something like that? I have seen these schemes used elsewhere.
I can access the HDFS in web by the way, as shown in the screenshot below:
Also, even if it connects successfully, I think I may not be giving the right file path (/opt/hadoop/README.txt). I used this path because it is what I see when I list the files and directories in the Hadoop installation directory, /opt/hadoop:
$ ls /opt/hadoop/
bin                 lib          read_from_hdfs.py  write_to_hdfs_2.py
connect_to_hdfs.py  libexec      README.txt         write_to_hdfs3.py
etc                 LICENSE.txt  sbin               write_to_hdfs.py
hdfs_read_write.py  logs         share
include             NOTICE.txt   test_storage
But I know that HDFS is separate, and maybe I copied the contents of my HDFS onto the local machine by running hdfs dfs -get /test_storage/ ./ earlier, which is why it's showing these files. But when I list the files in the namenode's directory, I see only some unreadable files:
$ ls /opt/volume/namenode/current/
edits_0000000000000000001-0000000000000000002
edits_0000000000000000003-0000000000000000010
edits_0000000000000000011-0000000000000000012
edits_0000000000000000013-0000000000000000015
edits_0000000000000000016-0000000000000000023
edits_0000000000000000024-0000000000000000025
edits_0000000000000000026-0000000000000000032
edits_0000000000000000033-0000000000000000033
edits_0000000000000000034-0000000000000000035
edits_0000000000000000036-0000000000000000037
edits_0000000000000000038-0000000000000000039
edits_0000000000000000040-0000000000000000041
edits_0000000000000000042-0000000000000000043
edits_0000000000000000044-0000000000000000045
edits_0000000000000000046-0000000000000000047
edits_0000000000000000048-0000000000000000049
edits_0000000000000000050-0000000000000000051
edits_0000000000000000052-0000000000000000053
edits_0000000000000000054-0000000000000000055
edits_0000000000000000056-0000000000000000057
edits_0000000000000000058-0000000000000000059
edits_0000000000000000060-0000000000000000061
edits_0000000000000000062-0000000000000000063
edits_0000000000000000064-0000000000000000065
edits_0000000000000000066-0000000000000000067
edits_0000000000000000068-0000000000000000070
edits_0000000000000000071-0000000000000000072
edits_0000000000000000073-0000000000000000074
edits_0000000000000000075-0000000000000000076
edits_0000000000000000077-0000000000000000078
edits_inprogress_0000000000000000079
fsimage_0000000000000000076
fsimage_0000000000000000076.md5
fsimage_0000000000000000078
fsimage_0000000000000000078.md5
seen_txid
VERSION
So, if I am specifying the file path to read wrongly, what is the correct file path to use?
EDIT: Upon changing the port to 50070 (i.e., client = InsecureClient('http://192.168.12.52:50070')), I get the following error:
$ python3 hdfs_read_local.py
Traceback (most recent call last):
File "hdfs_read.py", line 3, in <module>
with client.read('/opt/hadoop/LICENSE.txt') as reader:
File "/usr/lib/python3.6/contextlib.py", line 81, in __enter__
return next(self.gen)
File "/home/embs/.local/lib/python3.6/site-packages/hdfs/client.py", line 678, in read
buffersize=buffer_size,
File "/home/embs/.local/lib/python3.6/site-packages/hdfs/client.py", line 112, in api_handler
raise err
File "/home/embs/.local/lib/python3.6/site-packages/hdfs/client.py", line 107, in api_handler
**self.kwargs
File "/home/embs/.local/lib/python3.6/site-packages/hdfs/client.py", line 210, in _request
_on_error(response)
File "/home/embs/.local/lib/python3.6/site-packages/hdfs/client.py", line 50, in _on_error
raise HdfsError(message, exception=exception)
hdfs.util.HdfsError: File /opt/hadoop/LICENSE.txt not found.
EDIT2: Upon modifying the file path from /opt/hadoop/LICENSE.txt to /test_storage/LICENSE.txt, which seems to be the correct HDFS path, and running the Python script, I get the following error:
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 137, in _new_conn
(self.host, self.port), self.timeout, **extra_kw)
File "/usr/lib/python3/dist-packages/urllib3/util/connection.py", line 91, in create_connection
raise err
File "/usr/lib/python3/dist-packages/urllib3/util/connection.py", line 81, in create_connection
sock.connect(sa)
OSError: [Errno 113] No route to host
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 560, in urlopen
body=body, headers=headers)
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 354, in _make_request
conn.request(method, url, **httplib_request_kw)
File "/usr/lib/python3.6/http/client.py", line 1239, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/usr/lib/python3.6/http/client.py", line 1285, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/usr/lib/python3.6/http/client.py", line 1234, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/usr/lib/python3.6/http/client.py", line 1026, in _send_output
self.send(msg)
File "/usr/lib/python3.6/http/client.py", line 964, in send
self.connect()
File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 162, in connect
conn = self._new_conn()
File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 146, in _new_conn
self, "Failed to establish a new connection: %s" % e)
requests.packages.urllib3.exceptions.NewConnectionError: <requests.packages.urllib3.connection.HTTPConnection object at 0x7f2e87867400>: Failed to establish a new connection: [Errno 113] No route to host
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/requests/adapters.py", line 376, in send
timeout=timeout
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 610, in urlopen
_stacktrace=sys.exc_info()[2])
File "/usr/lib/python3/dist-packages/urllib3/util/retry.py", line 273, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
requests.packages.urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='pr2.embs', port=50075): Max retries exceeded with url: /webhdfs/v1/test_storage/LICENSE.txt?op=OPEN&user.name=embs&namenoderpcaddress=192.168.12.52:9000&offset=0 (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x7f2e87867400>: Failed to establish a new connection: [Errno 113] No route to host',))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "hdfs_read_local.py", line 3, in <module>
with client.read('/test_storage/LICENSE.txt') as reader:
File "/usr/lib/python3.6/contextlib.py", line 81, in __enter__
return next(self.gen)
File "/home/embs/.local/lib/python3.6/site-packages/hdfs/client.py", line 678, in read
buffersize=buffer_size,
File "/home/embs/.local/lib/python3.6/site-packages/hdfs/client.py", line 118, in api_handler
raise err
File "/home/embs/.local/lib/python3.6/site-packages/hdfs/client.py", line 107, in api_handler
**self.kwargs
File "/home/embs/.local/lib/python3.6/site-packages/hdfs/client.py", line 207, in _request
**kwargs
File "/usr/lib/python3/dist-packages/requests/sessions.py", line 468, in request
resp = self.send(prep, **send_kwargs)
File "/usr/lib/python3/dist-packages/requests/sessions.py", line 597, in send
history = [resp for resp in gen] if allow_redirects else []
File "/usr/lib/python3/dist-packages/requests/sessions.py", line 597, in <listcomp>
history = [resp for resp in gen] if allow_redirects else []
File "/usr/lib/python3/dist-packages/requests/sessions.py", line 195, in resolve_redirects
**adapter_kwargs
File "/usr/lib/python3/dist-packages/requests/sessions.py", line 576, in send
r = adapter.send(request, **kwargs)
File "/usr/lib/python3/dist-packages/requests/adapters.py", line 437, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='pr2.embs', port=50075): Max retries exceeded with url: /webhdfs/v1/test_storage/LICENSE.txt?op=OPEN&user.name=embs&namenoderpcaddress=192.168.12.52:9000&offset=0 (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x7f2e87867400>: Failed to establish a new connection: [Errno 113] No route to host',))
As stated here (link), this Python library uses WebHDFS. To test that both the host and the file path are correct, you can run curl -i 'http://192.168.12.52:50070/webhdfs/v1/<PATH>?op=LISTSTATUS', which lists a directory in HDFS. If that works, you can use the same "config" in Python:
from hdfs import InsecureClient
client = InsecureClient('http://192.168.12.52:50070')
with client.read('<hdfs_path>') as reader:
    features = reader.read()
print(features)
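The curl check above can also be scripted. The following is a minimal sketch, not part of the hdfs library: webhdfs_url is a hypothetical helper that only assembles the same kind of URL as the curl command, so you can paste it into curl or a browser. The /test_storage path is taken from the question and may differ on your cluster.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(namenode, hdfs_path, op, **params):
    """Build a WebHDFS REST URL like the one used with curl above.

    `namenode` is the NameNode HTTP address (not the 9000 RPC address),
    `hdfs_path` is an absolute path inside HDFS, `op` is a WebHDFS operation
    such as LISTSTATUS or OPEN.
    """
    if not hdfs_path.startswith("/"):
        raise ValueError("HDFS paths must be absolute")
    query = urlencode({"op": op, **params})
    return f"{namenode.rstrip('/')}/webhdfs/v1{quote(hdfs_path)}?{query}"


# Address from the question; the path is assumed for illustration.
print(webhdfs_url("http://192.168.12.52:50070", "/test_storage", "LISTSTATUS"))
```

If the printed URL returns a JSON listing when fetched with curl, the same host, port and path should work with InsecureClient.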
http://192.168.12.52:9000
9000 is the NameNode RPC port. 50070 is the default NameNode HTTP port (in Hadoop 2.x), and that is the one WebHDFS uses.
You might get No route to host if WebHDFS is disabled, if the datanode is not exposing port 50075 (the datanode HTTP address) because it's down, or if you changed that property.
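To see which of these applies, it can help to check from the client machine whether each host and port is reachable at all. Below is a rough sketch using only the standard library; host_port and can_connect are hypothetical helper names, and the example URL is the datanode address that appears in the EDIT2 traceback. A failed check only tells you the TCP connection could not be opened (name not resolvable, firewall, service down), not which of those is the cause.

```python
import socket
from urllib.parse import urlsplit


def host_port(url):
    """Extract (hostname, port) from a URL such as a WebHDFS redirect target."""
    parts = urlsplit(url)
    return parts.hostname, parts.port


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused, unreachable, timeout and DNS failures
        return False


# Example (run from the desktop that executes the Python client):
# for url in ("http://192.168.12.52:50070", "http://pr2.embs:50075"):
#     host, port = host_port(url)
#     print(host, port, can_connect(host, port))
```

In the question's EDIT2, the namenode answered on 50070 and redirected the read to pr2.embs:50075, so that second host and port must also be resolvable and reachable from the client.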
client.read('/opt/hadoop/LICENSE.txt')
You're running HDFS in pseudo-distributed mode, but you're trying to read a local file. /opt does not exist in HDFS by default, and you've only run a local ls. You should instead use hadoop fs -ls /opt to see which files exist at the path you're trying to open.
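The same lookup can be done from Python once the client connects. This is a sketch under the assumption that the hdfs package's Client.walk yields (path, dirs, files) tuples, as its documentation describes; find_in_hdfs is a hypothetical helper, not part of the library.

```python
import posixpath


def find_in_hdfs(client, filename, root="/"):
    """Return the full HDFS paths under `root` whose basename is `filename`.

    `client` is assumed to expose walk(root) yielding (path, dirs, files),
    which is how hdfs.InsecureClient.walk is documented to behave.
    """
    matches = []
    for dirpath, _dirs, files in client.walk(root):
        if filename in files:
            matches.append(posixpath.join(dirpath, filename))
    return matches


# Hypothetical usage against the cluster from the question:
# from hdfs import InsecureClient
# client = InsecureClient('http://192.168.12.52:50070')
# print(find_in_hdfs(client, 'LICENSE.txt'))
```

Whatever paths this returns are HDFS paths, which is what client.read expects, as opposed to paths on the VM's local filesystem.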
But when I search for the files in the path of the namenode, it returns some illegible files:
Your files are not stored in the namenode; only their metadata is.
Your files are stored in the datanode data directories, but as blocks, not as human-readable content.
You can run this command to get a list of all the blocks and their locations:
hdfs fsck /path/to/file.txt -files -blocks -locations
There could be a problem with the network configuration. Try this tweaked code for the time being:
from hdfs import InsecureClient
client = InsecureClient('http://0.0.0.0:50070')
with client.read('/test_storage/LICENSE.txt') as reader:
    features = reader.read()
print(features)
Read about IP Address 0.0.0.0
Hi, I faced a similar issue. It looks like the port is right. In my case I was able to list directories but couldn't write any data. The problem was my VPN, which blocked some of the ports, and reads and writes use a different port than listing does.