This is probably (hopefully) a simple beginner's mistake, yet I cannot find any explanation for what I'm doing wrong.
I want to scrape some content for a YouTube video with bs4 and requests_html. Both libraries are installed. However, when I run the code, it does not work. PyCharm tells me that the 'html' attribute, as in r.html.render(sleep=1) and soup = bs(r.html.html, "html.parser"), is the problem. Could anyone please have a quick look and tell me what I might be doing wrong?
from bs4 import BeautifulSoup as bs
from requests_html import HTMLSession
video_id = "v8Yh_4oE-Fs"
video_root = "https://www.youtube.com/watch?v="
video_url = "".join((video_root, video_id))
session = HTMLSession()
r = session.get(video_url)
r.html.render(sleep=1)
soup = bs(r.html.html, "html.parser")
etc
edit: This is the full error message.
[W:pyppeteer.chromium_downloader] start chromium download. Download may take a few minutes.
Traceback (most recent call last):
File "C:\Program Files\Python\lib\site-packages\urllib3\contrib\pyopenssl.py", line 488, in wrap_socket
cnx.do_handshake()
File "C:\Program Files\Python\lib\site-packages\OpenSSL\SSL.py", line 1934, in do_handshake
self._raise_ssl_error(self._ssl, result)
File "C:\Program Files\Python\lib\site-packages\OpenSSL\SSL.py", line 1671, in _raise_ssl_error
_raise_current_error()
File "C:\Program Files\Python\lib\site-packages\OpenSSL\_util.py", line 54, in exception_from_error_queue
raise exception_type(errors)
OpenSSL.SSL.Error: [('SSL routines', 'tls_process_server_certificate', 'certificate verify failed')]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Program Files\Python\lib\site-packages\urllib3\connectionpool.py", line 677, in urlopen
chunked=chunked,
File "C:\Program Files\Python\lib\site-packages\urllib3\connectionpool.py", line 381, in _make_request
self._validate_conn(conn)
File "C:\Program Files\Python\lib\site-packages\urllib3\connectionpool.py", line 976, in _validate_conn
conn.connect()
File "C:\Program Files\Python\lib\site-packages\urllib3\connection.py", line 370, in connect
ssl_context=context,
File "C:\Program Files\Python\lib\site-packages\urllib3\util\ssl_.py", line 377, in ssl_wrap_socket
return context.wrap_socket(sock, server_hostname=server_hostname)
File "C:\Program Files\Python\lib\site-packages\urllib3\contrib\pyopenssl.py", line 494, in wrap_socket
raise ssl.SSLError("bad handshake: %r" % e)
ssl.SSLError: ("bad handshake: Error([('SSL routines', 'tls_process_server_certificate', 'certificate verify failed')])",)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:/Users/Administrator.WSW41/Documents/PycharmProjects/YoutubeData/try_out.py", line 10, in <module>
r.html.render(sleep=1)
File "C:\Program Files\Python\lib\site-packages\requests_html.py", line 586, in render
self.browser = self.session.browser # Automatically create a event loop and browser
File "C:\Program Files\Python\lib\site-packages\requests_html.py", line 730, in browser
self._browser = self.loop.run_until_complete(super().browser)
File "C:\Program Files\Python\lib\asyncio\base_events.py", line 584, in run_until_complete
return future.result()
File "C:\Program Files\Python\lib\site-packages\requests_html.py", line 714, in browser
self._browser = await pyppeteer.launch(ignoreHTTPSErrors=not(self.verify), headless=True, args=self.__browser_args)
File "C:\Program Files\Python\lib\site-packages\pyppeteer\launcher.py", line 306, in launch
return await Launcher(options, **kwargs).launch()
File "C:\Program Files\Python\lib\site-packages\pyppeteer\launcher.py", line 119, in __init__
download_chromium()
File "C:\Program Files\Python\lib\site-packages\pyppeteer\chromium_downloader.py", line 146, in download_chromium
extract_zip(download_zip(get_url()), DOWNLOADS_FOLDER / REVISION)
File "C:\Program Files\Python\lib\site-packages\pyppeteer\chromium_downloader.py", line 85, in download_zip
data = http.request('GET', url, preload_content=False)
File "C:\Program Files\Python\lib\site-packages\urllib3\request.py", line 76, in request
method, url, fields=fields, headers=headers, **urlopen_kw
File "C:\Program Files\Python\lib\site-packages\urllib3\request.py", line 97, in request_encode_url
return self.urlopen(method, url, **extra_kw)
File "C:\Program Files\Python\lib\site-packages\urllib3\poolmanager.py", line 336, in urlopen
response = conn.urlopen(method, u.request_uri, **kw)
File "C:\Program Files\Python\lib\site-packages\urllib3\connectionpool.py", line 765, in urlopen
**response_kw
File "C:\Program Files\Python\lib\site-packages\urllib3\connectionpool.py", line 765, in urlopen
**response_kw
File "C:\Program Files\Python\lib\site-packages\urllib3\connectionpool.py", line 765, in urlopen
**response_kw
File "C:\Program Files\Python\lib\site-packages\urllib3\connectionpool.py", line 725, in urlopen
method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
File "C:\Program Files\Python\lib\site-packages\urllib3\util\retry.py", line 439, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='storage.googleapis.com', port=443): Max retries exceeded with url: /chromium-browser-snapshots/Win_x64/588429/chrome-win32.zip (Caused by SSLError(SSLError("bad handshake: Error([('SSL routines', 'tls_process_server_certificate', 'certificate verify failed')])")))
Process finished with exit code 1
Thank you in advance!
Related
When I try to read data from stooq through pandas_datareader.data this error keeps coming through:
Traceback (most recent call last):
File "/Users/matteo/PycharmProjects/pythonProject/venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 700, in urlopen
self._prepare_proxy(conn)
File "/Users/matteo/PycharmProjects/pythonProject/venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 996, in _prepare_proxy
conn.connect()
File "/Users/matteo/PycharmProjects/pythonProject/venv/lib/python3.11/site-packages/urllib3/connection.py", line 414, in connect
self.sock = ssl_wrap_socket(
^^^^^^^^^^^^^^^^
File "/Users/matteo/PycharmProjects/pythonProject/venv/lib/python3.11/site-packages/urllib3/util/ssl_.py", line 449, in ssl_wrap_socket
ssl_sock = _ssl_wrap_socket_impl(
^^^^^^^^^^^^^^^^^^^^^^
File "/Users/matteo/PycharmProjects/pythonProject/venv/lib/python3.11/site-packages/urllib3/util/ssl_.py", line 493, in _ssl_wrap_socket_impl
return ssl_context.wrap_socket(sock, server_hostname=server_hostname)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/ssl.py", line 517, in wrap_socket
return self.sslsocket_class._create(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/ssl.py", line 1075, in _create
self.do_handshake()
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/ssl.py", line 1346, in do_handshake
self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:992)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/matteo/PycharmProjects/pythonProject/venv/lib/python3.11/site-packages/requests/adapters.py", line 489, in send
resp = conn.urlopen(
^^^^^^^^^^^^^
File "/Users/matteo/PycharmProjects/pythonProject/venv/lib/python3.11/site-packages/urllib3/connectionpool.py", line 787, in urlopen
retries = retries.increment(
^^^^^^^^^^^^^^^^^^
File "/Users/matteo/PycharmProjects/pythonProject/venv/lib/python3.11/site-packages/urllib3/util/retry.py", line 592, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='stooq.com', port=443): Max retries exceeded with url: /q/d/l/?s=%5EDJI&i=d&d1=20180216&d2=20230215 (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:992)')))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/matteo/PycharmProjects/pythonProject/stock_data.py", line 3, in <module>
f = web.DataReader('^DJI', 'stooq')
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/matteo/PycharmProjects/pythonProject/venv/lib/python3.11/site-packages/pandas/util/_decorators.py", line 211, in wrapper
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/Users/matteo/PycharmProjects/pythonProject/venv/lib/python3.11/site-packages/pandas_datareader/data.py", line 432, in DataReader
).read()
^^^^^^
File "/Users/matteo/PycharmProjects/pythonProject/venv/lib/python3.11/site-packages/pandas_datareader/base.py", line 253, in read
df = self._read_one_data(self.url, params=self._get_params(self.symbols))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/matteo/PycharmProjects/pythonProject/venv/lib/python3.11/site-packages/pandas_datareader/base.py", line 108, in _read_one_data
out = self._read_url_as_StringIO(url, params=params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/matteo/PycharmProjects/pythonProject/venv/lib/python3.11/site-packages/pandas_datareader/base.py", line 119, in _read_url_as_StringIO
response = self._get_response(url, params=params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/matteo/PycharmProjects/pythonProject/venv/lib/python3.11/site-packages/pandas_datareader/base.py", line 155, in _get_response
response = self.session.get(
^^^^^^^^^^^^^^^^^
File "/Users/matteo/PycharmProjects/pythonProject/venv/lib/python3.11/site-packages/requests/sessions.py", line 600, in get
return self.request("GET", url, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/matteo/PycharmProjects/pythonProject/venv/lib/python3.11/site-packages/requests/sessions.py", line 587, in request
resp = self.send(prep, **send_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/matteo/PycharmProjects/pythonProject/venv/lib/python3.11/site-packages/requests/sessions.py", line 701, in send
r = adapter.send(request, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/matteo/PycharmProjects/pythonProject/venv/lib/python3.11/site-packages/requests/adapters.py", line 563, in send
raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='stooq.com', port=443): Max retries exceeded with url: /q/d/l/?s=%5EDJI&i=d&d1=20180216&d2=20230215 (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:992)')))
I ran the same code in JupyterLab and got the same error; in Google Colab, by contrast, I get the correct result. Here is the code from Google Colab:
Code:
import pandas_datareader.data as web
f = web.DataReader('^DJI', 'stooq')
f[:10]
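No answer was posted for this one. The traceback does go through _prepare_proxy and complains about a self-signed certificate in the chain, so one possible reading (an assumption, not confirmed by the poster) is that the local machine sits behind a proxy that re-signs HTTPS traffic, while Colab does not. A minimal diagnostic sketch, using only the standard library, that shows which CA locations and override variables the local Python sees; the function name tls_trust_report is made up for illustration:

```python
import os
import ssl


def tls_trust_report():
    """Collect the CA settings that influence certificate verification."""
    paths = ssl.get_default_verify_paths()
    return {
        # Environment variables that requests / OpenSSL consult when set.
        "REQUESTS_CA_BUNDLE": os.environ.get("REQUESTS_CA_BUNDLE"),
        "SSL_CERT_FILE": os.environ.get("SSL_CERT_FILE"),
        # Locations the standard-library ssl module falls back to.
        "default_cafile": paths.cafile,
        "default_capath": paths.capath,
    }


for name, value in tls_trust_report().items():
    print(name, "=", value)
```

If the network really does use its own root certificate, pointing REQUESTS_CA_BUNDLE at a PEM file that contains it is usually safer than turning verification off.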
I'm getting an error from this basic Appium script when I try to run it on a real device. Everything works fine if I use the Appium Inspector, but once I try to run the same code in PyCharm I get an error. Below are the capabilities in JSON format; as I'm pretty new to this library, I don't know if I've done it correctly or not.
from appium import webdriver
desired_cap = {
    "deviceName": "R9AN60B4CCJ",
    "platformName": "Android",
    "app": "C:\\Users\\John Doe\\AppData\\Local\\Android\\Sdk\\platform-tools\\airmirror2.apk"
}  # The capabilities to install an app from the APK file on your computer
driver = webdriver.Remote('https://localhost:4723/wd/hub', desired_cap)
#Similar to Selenium; Declaring the Driver. The Appium Server should already be started
The error is below:
Traceback (most recent call last):
File "C:\Users\John Doe\PycharmProjects\SelTest\venv\lib\site-packages\urllib3\connectionpool.py", line 600, in urlopen
chunked=chunked)
File "C:\Users\John Doe\PycharmProjects\SelTest\venv\lib\site-packages\urllib3\connectionpool.py", line 343, in _make_request
self._validate_conn(conn)
File "C:\Users\John Doe\PycharmProjects\SelTest\venv\lib\site-packages\urllib3\connectionpool.py", line 849, in _validate_conn
conn.connect()
File "C:\Users\John Doe\PycharmProjects\SelTest\venv\lib\site-packages\urllib3\connection.py", line 356, in connect
ssl_context=context)
File "C:\Users\John Doe\PycharmProjects\SelTest\venv\lib\site-packages\urllib3\util\ssl_.py", line 359, in ssl_wrap_socket
return context.wrap_socket(sock, server_hostname=server_hostname)
File "C:\Users\John Doe\Miniconda3\lib\ssl.py", line 423, in wrap_socket
session=session
File "C:\Users\John Doe\Miniconda3\lib\ssl.py", line 870, in _create
self.do_handshake()
File "C:\Users\John Doe\Miniconda3\lib\ssl.py", line 1139, in do_handshake
self._sslobj.do_handshake()
ssl.SSLError: [SSL: WRONG_VERSION_NUMBER] wrong version number (_ssl.c:1076)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:/Users/John Doe/PycharmProjects/SelTest/venv/FirstTest.py", line 11, in <module>
driver = webdriver.Remote('https://localhost:4723/wd/hub', desired_cap)
File "C:\Users\John Doe\PycharmProjects\SelTest\venv\lib\site-packages\appium\webdriver\webdriver.py", line 275, in __init__
AppiumConnection(command_executor, keep_alive=keep_alive), desired_capabilities, browser_profile, proxy
File "C:\Users\John Doe\PycharmProjects\SelTest\venv\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 269, in __init__
self.start_session(capabilities, browser_profile)
File "C:\Users\John Doe\PycharmProjects\SelTest\venv\lib\site-packages\appium\webdriver\webdriver.py", line 369, in start_session
response = self.execute(RemoteCommand.NEW_SESSION, parameters)
File "C:\Users\John Doe\PycharmProjects\SelTest\venv\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 423, in execute
response = self.command_executor.execute(driver_command, params)
File "C:\Users\John Doe\PycharmProjects\SelTest\venv\lib\site-packages\selenium\webdriver\remote\remote_connection.py", line 333, in execute
return self._request(command_info[0], url, body=data)
File "C:\Users\John Doe\PycharmProjects\SelTest\venv\lib\site-packages\selenium\webdriver\remote\remote_connection.py", line 355, in _request
resp = self._conn.request(method, url, body=body, headers=headers)
File "C:\Users\John Doe\PycharmProjects\SelTest\venv\lib\site-packages\urllib3\request.py", line 72, in request
**urlopen_kw)
File "C:\Users\John Doe\PycharmProjects\SelTest\venv\lib\site-packages\urllib3\request.py", line 150, in request_encode_body
return self.urlopen(method, url, **extra_kw)
File "C:\Users\John Doe\PycharmProjects\SelTest\venv\lib\site-packages\urllib3\poolmanager.py", line 322, in urlopen
response = conn.urlopen(method, u.request_uri, **kw)
File "C:\Users\John Doe\PycharmProjects\SelTest\venv\lib\site-packages\urllib3\connectionpool.py", line 667, in urlopen
**response_kw)
File "C:\Users\John Doe\PycharmProjects\SelTest\venv\lib\site-packages\urllib3\connectionpool.py", line 667, in urlopen
**response_kw)
File "C:\Users\John Doe\PycharmProjects\SelTest\venv\lib\site-packages\urllib3\connectionpool.py", line 667, in urlopen
**response_kw)
File "C:\Users\John Doe\PycharmProjects\SelTest\venv\lib\site-packages\urllib3\connectionpool.py", line 638, in urlopen
_stacktrace=sys.exc_info()[2])
File "C:\Users\John Doe\PycharmProjects\SelTest\venv\lib\site-packages\urllib3\util\retry.py", line 398, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='localhost', port=4723): Max retries exceeded with url: /wd/hub/session (Caused by SSLError(SSLError(1, '[SSL: WRONG_VERSION_NUMBER] wrong version number (_ssl.c:1076)')))
I don't know whether I need a specific version of urllib3 in this venv.
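No answer was posted for this one either. WRONG_VERSION_NUMBER commonly shows up when a TLS client connects to a port that speaks plain HTTP. Assuming the local Appium server was started without TLS (an assumption; the question does not say how it was started), the server URL would need the http scheme rather than https. A small sketch of that change; use_plain_http is a hypothetical helper, not part of Appium:

```python
from urllib.parse import urlsplit, urlunsplit


def use_plain_http(server_url):
    """Return server_url with an https scheme switched to http."""
    parts = urlsplit(server_url)
    if parts.scheme == "https":
        # Only the scheme changes; host, port and path stay the same.
        parts = parts._replace(scheme="http")
    return urlunsplit(parts)


server_url = use_plain_http("https://localhost:4723/wd/hub")
print(server_url)
```

The resulting URL would then be passed to webdriver.Remote in place of the https one. If the server is actually configured for TLS, this is not the cause and the URL should stay as it is.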
I have a web scraping project that is stuck.
I am using the Requests package and its get() method to download the website's HTML, after which I want to use Beautiful Soup on it.
It works fine on my laptop, but when I upload the program to my AWS Ubuntu EC2 instance, I run into an error. I have tried other websites and they all work; I only run into this problem with this site.
Does anyone know why this is happening?
Based on the error messages I suspected SSL issues, but even with the verify=False parameter it still won't work.
The code:
import requests
url = "http://www.kino.dk"
r = requests.get(url, verify=False)
print(r.text)
The error message:
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/urllib3/contrib/pyopenssl.py", line 485, in wrap_socket
cnx.do_handshake()
File "/usr/lib/python3/dist-packages/OpenSSL/SSL.py", line 1915, in do_handshake
self._raise_ssl_error(self._ssl, result)
File "/usr/lib/python3/dist-packages/OpenSSL/SSL.py", line 1647, in _raise_ssl_error
_raise_current_error()
File "/usr/lib/python3/dist-packages/OpenSSL/_util.py", line 54, in exception_from_error_queue
raise exception_type(errors)
OpenSSL.SSL.Error: [('SSL routines', 'tls12_check_peer_sigalg', 'wrong signature type')]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 665, in urlopen
httplib_response = self._make_request(
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 376, in _make_request
self._validate_conn(conn)
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 996, in _validate_conn
conn.connect()
File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 352, in connect
self.sock = ssl_wrap_socket(
File "/usr/lib/python3/dist-packages/urllib3/util/ssl_.py", line 370, in ssl_wrap_socket
return context.wrap_socket(sock, server_hostname=server_hostname)
File "/usr/lib/python3/dist-packages/urllib3/contrib/pyopenssl.py", line 491, in wrap_socket
raise ssl.SSLError("bad handshake: %r" % e)
ssl.SSLError: ("bad handshake: Error([('SSL routines', 'tls12_check_peer_sigalg', 'wrong signature type')])",)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/requests/adapters.py", line 439, in send
resp = conn.urlopen(
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 719, in urlopen
retries = retries.increment(
File "/usr/lib/python3/dist-packages/urllib3/util/retry.py", line 436, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='www.kino.dk', port=443): Max retries exceeded with url: / (Caused by SSLError(SSLError("bad handshake: Error([('SSL routines', 'tls12_check_peer_sigalg', 'wrong signature type')])")))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "test.py", line 6, in <module>
response = requests.get(url, verify=False)
File "/usr/lib/python3/dist-packages/requests/api.py", line 75, in get
return request('get', url, params=params, **kwargs)
File "/usr/lib/python3/dist-packages/requests/api.py", line 60, in request
return session.request(method=method, url=url, **kwargs)
File "/usr/lib/python3/dist-packages/requests/sessions.py", line 533, in request
resp = self.send(prep, **send_kwargs)
File "/usr/lib/python3/dist-packages/requests/sessions.py", line 668, in send
history = [resp for resp in gen] if allow_redirects else []
File "/usr/lib/python3/dist-packages/requests/sessions.py", line 668, in <listcomp>
history = [resp for resp in gen] if allow_redirects else []
File "/usr/lib/python3/dist-packages/requests/sessions.py", line 239, in resolve_redirects
resp = self.send(
File "/usr/lib/python3/dist-packages/requests/sessions.py", line 646, in send
r = adapter.send(request, **kwargs)
File "/usr/lib/python3/dist-packages/requests/adapters.py", line 514, in send
raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='www.kino.dk', port=443): Max retries exceeded with url: / (Caused by SSLError(SSLError("bad handshake: Error([('SSL routines', 'tls12_check_peer_sigalg', 'wrong signature type')])")))
I am trying to download a YouTube video using the pytube package, but I am unable to do so.
I am using Python 3.5 and pytube 9.2.2.
urllib doesn't give me an error in other programs, but it is giving me errors here.
The code I am using is:
#importing the module
from pytube import YouTube
#where to save
SAVE_PATH = "E:/" #to_do
#link of the video to be downloaded
link="https://www.youtube.com/watch?v=xWOoBJUqlbI"
yt = YouTube(link)
#filters out all the files with "mp4" extension
mp4files = yt.filter('mp4')
yt.set_filename('GeeksforGeeks Video') #to set the name of the file
#get the video with the extension and resolution passed in the get() function
d_video = yt.get(mp4files[-1].extension,mp4files[-1].resolution)
try:
    #downloading the video
    d_video.download(SAVE_PATH)
except:
    print("Some Error!")
print('Task Completed!')
The errors that I am getting are:
Traceback (most recent call last):
File "C:\Users\hp\AppData\Local\Programs\Python\Python35\lib\urllib\request.py", line 1240, in do_open
h.request(req.get_method(), req.selector, req.data, headers)
File "C:\Users\hp\AppData\Local\Programs\Python\Python35\lib\http\client.py", line 1083, in request
self._send_request(method, url, body, headers)
File "C:\Users\hp\AppData\Local\Programs\Python\Python35\lib\http\client.py", line 1128, in _send_request
self.endheaders(body)
File "C:\Users\hp\AppData\Local\Programs\Python\Python35\lib\http\client.py", line 1079, in endheaders
self._send_output(message_body)
File "C:\Users\hp\AppData\Local\Programs\Python\Python35\lib\http\client.py", line 911, in _send_output
self.send(msg)
File "C:\Users\hp\AppData\Local\Programs\Python\Python35\lib\http\client.py", line 854, in send
self.connect()
File "C:\Users\hp\AppData\Local\Programs\Python\Python35\lib\http\client.py", line 1237, in connect
server_hostname=server_hostname)
File "C:\Users\hp\AppData\Local\Programs\Python\Python35\lib\ssl.py", line 376, in wrap_socket
_context=self)
File "C:\Users\hp\AppData\Local\Programs\Python\Python35\lib\ssl.py", line 747, in __init__
self.do_handshake()
File "C:\Users\hp\AppData\Local\Programs\Python\Python35\lib\ssl.py", line 983, in do_handshake
self._sslobj.do_handshake()
File "C:\Users\hp\AppData\Local\Programs\Python\Python35\lib\ssl.py", line 628, in do_handshake
self._sslobj.do_handshake()
ssl.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:646)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "2TryingPyTube.py", line 10, in <module>
yt = YouTube(link)
File "C:\Users\hp\AppData\Local\Programs\Python\Python35\lib\site-packages\pytube\__main__.py", line 87, in __init__
self.prefetch_init()
File "C:\Users\hp\AppData\Local\Programs\Python\Python35\lib\site-packages\pytube\__main__.py", line 95, in prefetch_init
self.prefetch()
File "C:\Users\hp\AppData\Local\Programs\Python\Python35\lib\site-packages\pytube\__main__.py", line 158, in prefetch
self.watch_html = request.get(url=self.watch_url)
File "C:\Users\hp\AppData\Local\Programs\Python\Python35\lib\site-packages\pytube\request.py", line 21, in get
response = urlopen(url)
File "C:\Users\hp\AppData\Local\Programs\Python\Python35\lib\urllib\request.py", line 162, in urlopen
return opener.open(url, data, timeout)
File "C:\Users\hp\AppData\Local\Programs\Python\Python35\lib\urllib\request.py", line 465, in open
response = self._open(req, data)
File "C:\Users\hp\AppData\Local\Programs\Python\Python35\lib\urllib\request.py", line 483, in _open
'_open', req)
File "C:\Users\hp\AppData\Local\Programs\Python\Python35\lib\urllib\request.py", line 443, in _call_chain
result = func(*args)
File "C:\Users\hp\AppData\Local\Programs\Python\Python35\lib\urllib\request.py", line 1283, in https_open
context=self._context, check_hostname=self._check_hostname)
File "C:\Users\hp\AppData\Local\Programs\Python\Python35\lib\urllib\request.py", line 1242, in do_open
raise URLError(err)
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:646)>
I wish someone could help me.
PS - I am only a beginner in using urllib, http packages. Please don't write harsh comments, thank you :0
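The traceback stops at yt = YouTube(link), inside urlopen, so the failure happens before any of the download code runs. One way to check whether the problem lies in Python's certificate store rather than in pytube is to request the same URL directly with an explicit TLS context. This is only a sketch: the function names are made up, and cafile is a hypothetical path to a PEM bundle that you would have to supply yourself.

```python
import ssl
import urllib.request


def make_verified_context(cafile=None):
    """Build a TLS context that still verifies certificates.

    cafile is an optional (hypothetical) path to a PEM bundle; None means
    the default trust store of this Python installation.
    """
    return ssl.create_default_context(cafile=cafile)


def fetch(url, cafile=None, timeout=30):
    """Download url with certificate verification left switched on."""
    context = make_verified_context(cafile)
    with urllib.request.urlopen(url, context=context, timeout=timeout) as response:
        return response.read()
```

If fetch("https://www.youtube.com/watch?v=xWOoBJUqlbI") fails with the same CERTIFICATE_VERIFY_FAILED, the issue is in the certificate setup (or something on the network intercepting HTTPS), not in the pytube calls.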
When I try to follow the guide here: https://automatetheboringstuff.com/chapter11/ my script fails:
import requests
res = requests.get('https://automatetheboringstuff.com/files/rj.txt')
type(res)
res.raise_for_status()
requests is installed.
After a very long wait I get the following error messages, which only appear with HTTPS URLs. The same thing happens on two Windows 10 64-bit machines, with Python 3.6.3 64-bit and Python 3.6.4 64-bit:
"C:\Program Files\Python36\python.exe" "C:/Users/user.name/Google Drive/Automation/RoHSWebScraper/main.py"
Traceback (most recent call last):
File "C:\Program Files\Python36\lib\site-packages\urllib3\contrib\pyopenssl.py", line 441, in wrap_socket
cnx.do_handshake()
File "C:\Program Files\Python36\lib\site-packages\OpenSSL\SSL.py", line 1716, in do_handshake
self._raise_ssl_error(self._ssl, result)
File "C:\Program Files\Python36\lib\site-packages\OpenSSL\SSL.py", line 1449, in _raise_ssl_error
raise SysCallError(-1, "Unexpected EOF")
OpenSSL.SSL.SysCallError: (-1, 'Unexpected EOF')
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Program Files\Python36\lib\site-packages\urllib3\connectionpool.py", line 601, in urlopen
chunked=chunked)
File "C:\Program Files\Python36\lib\site-packages\urllib3\connectionpool.py", line 346, in _make_request
self._validate_conn(conn)
File "C:\Program Files\Python36\lib\site-packages\urllib3\connectionpool.py", line 850, in _validate_conn
conn.connect()
File "C:\Program Files\Python36\lib\site-packages\urllib3\connection.py", line 326, in connect
ssl_context=context)
File "C:\Program Files\Python36\lib\site-packages\urllib3\util\ssl_.py", line 329, in ssl_wrap_socket
return context.wrap_socket(sock, server_hostname=server_hostname)
File "C:\Program Files\Python36\lib\site-packages\urllib3\contrib\pyopenssl.py", line 448, in wrap_socket
raise ssl.SSLError('bad handshake: %r' % e)
ssl.SSLError: ("bad handshake: SysCallError(-1, 'Unexpected EOF')",)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Program Files\Python36\lib\site-packages\requests\adapters.py", line 440, in send
timeout=timeout
File "C:\Program Files\Python36\lib\site-packages\urllib3\connectionpool.py", line 639, in urlopen
_stacktrace=sys.exc_info()[2])
File "C:\Program Files\Python36\lib\site-packages\urllib3\util\retry.py", line 388, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='automatetheboringstuff.com', port=443): Max retries exceeded with url: /files/rj.txt (Caused by SSLError(SSLError("bad handshake: SysCallError(-1, 'Unexpected EOF')",),))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:/Users/user.name/Google Drive/Automation/RoHSWebScraper/main.py", line 3, in <module>
res = requests.get('https://automatetheboringstuff.com/files/rj.txt', verify=False)
File "C:\Program Files\Python36\lib\site-packages\requests\api.py", line 72, in get
return request('get', url, params=params, **kwargs)
File "C:\Program Files\Python36\lib\site-packages\requests\api.py", line 58, in request
return session.request(method=method, url=url, **kwargs)
File "C:\Program Files\Python36\lib\site-packages\requests\sessions.py", line 508, in request
resp = self.send(prep, **send_kwargs)
File "C:\Program Files\Python36\lib\site-packages\requests\sessions.py", line 618, in send
r = adapter.send(request, **kwargs)
File "C:\Program Files\Python36\lib\site-packages\requests\adapters.py", line 506, in send
raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='automatetheboringstuff.com', port=443): Max retries exceeded with url: /files/rj.txt (Caused by SSLError(SSLError("bad handshake: SysCallError(-1, 'Unexpected EOF')",),))
Process finished with exit code 1
Can anyone help me with this infuriating problem!!?
You can try urllib:
Python 2:
import urllib
data = urllib.urlopen('https://automatetheboringstuff.com/files/rj.txt').read()
Python 3:
import urllib.request
data = urllib.request.urlopen('https://automatetheboringstuff.com/files/rj.txt').read()
So it turns out the computers on my corporate network are using proxy servers, which was preventing my HTTP and HTTPS requests from connecting properly.
I followed the answer from Lelouchzqy here to determine what my HTTP and HTTPS proxy servers were.
I then followed the answer from Roland Smith here to tell requests which proxies to use.
Hopefully this will help someone in the future if they have the same issue!
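For anyone landing here, the two steps above can be sketched roughly as follows. This is not the code from the linked answers; build_proxies is a made-up helper and the proxy address in the usage note is a placeholder. It uses urllib.request.getproxies() to read whatever proxy settings the system advertises and lets you override them by hand:

```python
import urllib.request


def build_proxies(http_proxy=None, https_proxy=None):
    """Return a proxies mapping in the shape requests accepts."""
    # Start from the proxy settings the OS / environment already exposes.
    detected = urllib.request.getproxies()
    proxies = {
        scheme: detected[scheme]
        for scheme in ("http", "https")
        if scheme in detected
    }
    # Explicit arguments win over the detected values.
    if http_proxy:
        proxies["http"] = http_proxy
    if https_proxy:
        proxies["https"] = https_proxy
    return proxies
```

The result can then be handed to requests, for example requests.get(url, proxies=build_proxies(https_proxy="http://proxy.example.com:8080")), where the address stands in for your own proxy.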