'urllib' on an infinite loop - python

For a while, I have been trying to use Cloud9 to help with repetitive tasks at my job. Most of those tasks involve web scraping, and I am having a hard time using the urllib library.
It does not work even for simple code: the script keeps running forever, and I can't find the reason. I would appreciate some tips...
from urllib.request import urlopen
import json
import ssl
# Ignore SSL certificate errors
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE
url = ('Enter - ')
html = urlopen(url, context=ctx).read()
str_data = open(html).read()
json_data = json.loads(str_data)
for entry in json_data:
    name = entry[0]
    title = entry[1]
    print((name, title))

Without more debugging information, the most likely explanation is that the connection is hanging: you are not passing a timeout to urlopen, so the call can wait a very long time for a server that never answers.
To avoid this, pass a reasonable timeout to urlopen so that the call fails instead of running indefinitely.
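A minimal sketch of that suggestion, using only the standard library. The helper name fetch_json and the 10-second default are illustrative choices, not something taken from the question:

```python
import json
import socket
from urllib.error import URLError
from urllib.request import urlopen


def fetch_json(url, timeout=10):
    """Fetch `url` and parse the body as JSON, giving up after `timeout` seconds.

    Returns the parsed object, or None if the request failed.
    """
    try:
        # `timeout` bounds each blocking socket operation (connect, read)
        with urlopen(url, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except socket.timeout:
        # Raised when the server accepts the connection but stops sending data
        print(f"No response from {url} within {timeout} s")
    except URLError as exc:
        # Raised for refused connections, DNS failures, connect timeouts, HTTP errors
        print(f"Could not reach {url}: {exc.reason}")
    return None
```

With a timeout in place, an unreachable host should produce an error message after roughly the timeout period rather than a script that appears to hang.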

Do yourself a favor and replace that with Requests. Your example simplifies to:
import requests
url = "..."
json_data = requests.get(url).json()
for entry in json_data:
    name = entry[0]
    title = entry[1]
    print(name, title)
That also removes a whole awful lot of potential for mistakes from your code.
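Note that Requests does not apply a timeout by default either, so the hang described in the question can still occur. A hedged sketch that adds one; the function name get_json and the timeout values are assumptions chosen for illustration:

```python
import requests


def get_json(url, timeout=(5, 30)):
    """Return the decoded JSON body of `url`, or None if the request fails.

    `timeout` is (connect seconds, read seconds); both values are illustrative.
    """
    try:
        response = requests.get(url, timeout=timeout)
        response.raise_for_status()  # turn 4xx/5xx responses into exceptions
        return response.json()
    except requests.exceptions.Timeout:
        print(f"Timed out talking to {url}")
    except (requests.exceptions.RequestException, ValueError) as exc:
        # ValueError covers a body that is not valid JSON on older Requests releases
        print(f"Request to {url} failed: {exc}")
    return None
```

A failed connection then surfaces as a short message and a None result instead of a long wait followed by a traceback.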

New code
import requests
import json
url = "-"
json_data = requests.get(url).json()
info = json.loads(json_data)
print('User count:', len(info))
print(info)
Debug Mode:
[IKP3db-g] 15:43:52,971733 - INFO - IKP3db 1.4.1 - Inouk Python Debugger for CPython 3.6+
[IKP3db-g] 15:43:52,972714 - INFO - IKP3db listening on 127.0.0.1:15471
[IKP3db-g] 15:43:53,003016 - INFO - Connected with 127.0.0.1:50936
Traceback (most recent call last):
File "/home/ubuntu/.local/lib/python3.6/site-packages/urllib3/connection.py", line 157, in _new_conn
(self._dns_host, self.port), self.timeout, **extra_kw
File "/home/ubuntu/.local/lib/python3.6/site-packages/urllib3/util/connection.py", line 84, in create_connection
raise err
File "/home/ubuntu/.local/lib/python3.6/site-packages/urllib3/util/connection.py", line 74, in create_connection
sock.connect(sa)
TimeoutError: [Errno 110] Connection timed out
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/ubuntu/.local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 672, in urlopen
chunked=chunked,
File "/home/ubuntu/.local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 376, in _make_request
self._validate_conn(conn)
File "/home/ubuntu/.local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 994, in _validate_conn
conn.connect()
File "/home/ubuntu/.local/lib/python3.6/site-packages/urllib3/connection.py", line 334, in connect
conn = self._new_conn()
File "/home/ubuntu/.local/lib/python3.6/site-packages/urllib3/connection.py", line 169, in _new_conn
self, "Failed to establish a new connection: %s" % e
urllib3.exceptions.NewConnectionError: <urllib3.connection.VerifiedHTTPSConnection object at 0x7fa2bb9729e8>: Failed to establish a new connection: [Errno 110] Connection timed out
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/ubuntu/.local/lib/python3.6/site-packages/requests/adapters.py", line 449, in send
timeout=timeout
File "/home/ubuntu/.local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 720, in urlopen
method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
File "/home/ubuntu/.local/lib/python3.6/site-packages/urllib3/util/retry.py", line 436, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='ot.cloud.mapsfinancial.com', port=443): Max retries exceeded with url: /pegasus/api/consulta/caixa/saldos/carteira/DVG1%20FIA/2018-01-08 (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7fa2bb9729e8>: Failed to establish a new connection: [Errno 110] Connection timed out',))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/ikp3db.py", line 2047, in main
File "/usr/local/lib/python3.6/dist-packages/ikp3db.py", line 1526, in _runscript
exec(statement, globals, locals)
File "<string>", line 1, in <module>
File "/home/ubuntu/environment/Testing.py", line 5, in <module>
json_data = requests.get(url).json()
File "/home/ubuntu/.local/lib/python3.6/site-packages/requests/api.py", line 75, in get
return request('get', url, params=params, **kwargs)
File "/home/ubuntu/.local/lib/python3.6/site-packages/requests/api.py", line 60, in request
return session.request(method=method, url=url, **kwargs)
File "/home/ubuntu/.local/lib/python3.6/site-packages/requests/sessions.py", line 533, in request
resp = self.send(prep, **send_kwargs)
File "/home/ubuntu/.local/lib/python3.6/site-packages/requests/sessions.py", line 646, in send
r = adapter.send(request, **kwargs)
File "/home/ubuntu/.local/lib/python3.6/site-packages/requests/adapters.py", line 516, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='ot.cloud.mapsfinancial.com', port=443): Max retries exceeded with url: xxxxxxxx (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7fa2bb9729e8>: Failed to establish a new connection: [Errno 110] Connection timed out',))
[IKP3db-g] 15:46:04,668520 - INFO - Uncaught exception. Entering post mortem debugging

Related

how to download this zip file using python requests?

I am trying to download a zip file that is stored here:
http://e4ftl01.cr.usgs.gov/MEASURES/SRTMGL1.003/2000.02.11/N45W074.SRTMGL1.hgt.zip
If you paste this into the browser and hit enter, it will download the .zip file.
If you inspect the browser's network activity while this is happening, you will see that there is an internal redirect going on, and eventually the zip gets downloaded.
I am trying to automate this downloading using the python requests library by doing the following:
import requests
requests.get(url,
             allow_redirects=True,
             headers={'User-Agent': 'Chrome/107.0.0.0'})
I've tried many combinations: using the full header string from the browser inspection, forcing verify=True, with and without redirects, and adding HTTPBasicAuth credentials, which the site says are required even though the file seems to download fine without them.
Honestly, I have no clue what I'm missing; this is not my area of expertise. I keep getting this error:
>>> requests.get(url,
...              allow_redirects=True,
...              headers={'User-Agent': 'Chrome/107.0.0.0'})
Traceback (most recent call last):
File "C:\Users\dere\Miniconda3\envs\blender\lib\site-packages\urllib3\connection.py", line 174, in _new_conn
conn = connection.create_connection(
File "C:\Users\dere\Miniconda3\envs\blender\lib\site-packages\urllib3\util\connection.py", line 95, in create_connection
raise err
File "C:\Users\dere\Miniconda3\envs\blender\lib\site-packages\urllib3\util\connection.py", line 85, in create_connection
sock.connect(sa)
ConnectionRefusedError: [WinError 10061] No connection could be made because the target machine actively refused it
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\dere\Miniconda3\envs\blender\lib\site-packages\urllib3\connectionpool.py", line 703, in urlopen
httplib_response = self._make_request(
File "C:\Users\dere\Miniconda3\envs\blender\lib\site-packages\urllib3\connectionpool.py", line 398, in _make_request
conn.request(method, url, **httplib_request_kw)
File "C:\Users\dere\Miniconda3\envs\blender\lib\site-packages\urllib3\connection.py", line 239, in request
super(HTTPConnection, self).request(method, url, body=body, headers=headers)
File "C:\Users\dere\Miniconda3\envs\blender\lib\http\client.py", line 1282, in request
self._send_request(method, url, body, headers, encode_chunked)
File "C:\Users\dere\Miniconda3\envs\blender\lib\http\client.py", line 1328, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "C:\Users\dere\Miniconda3\envs\blender\lib\http\client.py", line 1277, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "C:\Users\dere\Miniconda3\envs\blender\lib\http\client.py", line 1037, in _send_output
self.send(msg)
File "C:\Users\dere\Miniconda3\envs\blender\lib\http\client.py", line 975, in send
self.connect()
File "C:\Users\dere\Miniconda3\envs\blender\lib\site-packages\urllib3\connection.py", line 205, in connect
conn = self._new_conn()
File "C:\Users\dere\Miniconda3\envs\blender\lib\site-packages\urllib3\connection.py", line 186, in _new_conn
raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x0000016B76A6FE50>: Failed to establish a new connection: [WinError 10061] No connection could be made because the target machine actively refused it
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\dere\Miniconda3\envs\blender\lib\site-packages\requests\adapters.py", line 489, in send
resp = conn.urlopen(
File "C:\Users\dere\Miniconda3\envs\blender\lib\site-packages\urllib3\connectionpool.py", line 787, in urlopen
retries = retries.increment(
File "C:\Users\dere\Miniconda3\envs\blender\lib\site-packages\urllib3\util\retry.py", line 592, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='e4ftl01.cr.usgs.gov', port=80): Max retries exceeded with url: /MEASURES/SRTMGL1.003/2000.02.11/N45W074.SRTMGL1.hgt.zip (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x0000016B76A6FE50>: Failed to establish a new connection: [WinError 10061] No connection could be made because the target machine actively refused it'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\dere\Miniconda3\envs\blender\lib\site-packages\requests\api.py", line 73, in get
return request("get", url, params=params, **kwargs)
File "C:\Users\dere\Miniconda3\envs\blender\lib\site-packages\requests\api.py", line 59, in request
return session.request(method=method, url=url, **kwargs)
File "C:\Users\dere\Miniconda3\envs\blender\lib\site-packages\requests\sessions.py", line 587, in request
resp = self.send(prep, **send_kwargs)
File "C:\Users\dere\Miniconda3\envs\blender\lib\site-packages\requests\sessions.py", line 701, in send
r = adapter.send(request, **kwargs)
File "C:\Users\dere\Miniconda3\envs\blender\lib\site-packages\requests\adapters.py", line 565, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='e4ftl01.cr.usgs.gov', port=80): Max retries exceeded with url: /MEASURES/SRTMGL1.003/2000.02.11/N45W074.SRTMGL1.hgt.zip (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x0000016B76A6FE50>: Failed to establish a new connection: [WinError 10061] No connection could be made because the target machine actively refused it'))
Can someone help me arrive at code that results in a successful request? I know how to write the response to a zip file afterwards.
Change http to https; this should work:
import requests
# download zip file from url
url = "https://e4ftl01.cr.usgs.gov/MEASURES/SRTMGL1.003/2000.02.11/N45W074.SRTMGL1.hgt.zip"
r = requests.get(url)
with open("N45W074.SRTMGL1.hgt.zip", "wb") as f:
    f.write(r.content)
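For larger archives, it may be preferable not to hold the whole file in memory, as r.content does. A sketch of a streaming variant follows; the helper name download_file, the chunk size, and the timeout are illustrative assumptions, and it does not address any login the server may require:

```python
import requests


def download_file(url, dest_path, chunk_size=64 * 1024, timeout=30):
    """Stream `url` to `dest_path` and return the number of bytes written."""
    written = 0
    with requests.get(url, stream=True, timeout=timeout) as response:
        # Fail loudly on 401/403/404 instead of saving an error page as a .zip
        response.raise_for_status()
        with open(dest_path, "wb") as out:
            for chunk in response.iter_content(chunk_size=chunk_size):
                out.write(chunk)
                written += len(chunk)
    return written
```

Because raise_for_status() runs before the file is opened, a rejected request leaves no partial file behind.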
Thanks to @cnemri for the tip.
It was a combination of changing http to https and also including this specific cookie header in the request header:
headers = {'Cookie': '_gid=GA1.2.48775707.1669266346; _hjSessionUser_606685=eyJpZCI6IjQ1MzEzM2QzLTI3MWEtNWM0YS04M2YzLWRmMmMzNDk4NjY1ZSIsImNyZWF0ZWQiOjE2NjkyNjYzNDU2MjUsImV4aXN0aW5nIjp0cnVlfQ==; ERS_production_2=b7dfade669180a6d6d250b030ffc3cf2UZZa8%2F471MfcV%2FaNuFtu6Pli%2BpP8jKOIwR4JvQjm%2B6DLPjl679vVf5SCDk7C5TLQsj7qckIev6lmtGb6Mes5RKDHUs%2BBp3EAjKW2%2BMCUpV%2Fnx0z1pdaCQQ%3D%3D; EROS_SSO_production_secure=eyJjcmVhdGVkIjoxNjY5MjY5MjEzLCJ1cGRhdGVkIjoiMjAyMi0xMS0yMyAyMzo1MzoyOCIsImF1dGhUeXBlIjoiRVJTIiwiYXV0aFNlcnZpY2UiOiJFUk9TIiwidmVyc2lvbiI6MS4xLCJzdGF0ZSI6ImVjMDY3YmMzYmRhMzBiZTUyNTkxYTNiZTYwMTMwZWNmNjAwMWU1Y2JlMGMxZmNkYTU4Y2Y4OTY0YjRlNTJkOTEiLCJpZCI6InY5U1Q4YVl3M1hRKCsyIiwic2VjcmV0IjoiPl8lMCZPYiY9MUVkclRLZTt4fHVZLVcyU1VvTiA1SE17KiQqaG12OSxvPl5%2BdyJ9; _ga_0YWDZEJ295=GS1.1.1669269458.1.0.1669269460.0.0.0; _ga=GA1.2.1055277358.1669266346; _ga_71JPYV1CCS=GS1.1.1669306300.2.1.1669306402.0.0.0; DATA=Y3_gg9uD6LmunsfDHneR9wAAARQ'}
then, this worked:
requests.get(url, headers=headers)
I'm not sure if this is the "solution" or just a workaround I've discovered. Still appreciate any input. I think it may just be credentials that are stored?

New to Python requests library, getting '[Errno 61] Connection refused'

I've searched high and low and can't find a fix that seems to fit my use case. I've had some Python3 install issues on my M1 mac but think it's now running okay.
Just trying to learn / work with the requests library, but when I try to run this very simple python code:
import requests
response = requests.get("https://api.open-notify.org/this-api-doesnt-exist")
print(response.status_code)
I get the below error...
Any advice on how to fix this? I'm pulling my hair out.
Thank you!
/Users/myname/venv/python310/bin/python /Users/myname/Documents/2022_api/api01.py
Traceback (most recent call last):
File "/Users/myname/venv/python310/lib/python3.10/site-packages/urllib3/connection.py", line 174, in _new_conn
conn = connection.create_connection(
File "/Users/myname/venv/python310/lib/python3.10/site-packages/urllib3/util/connection.py", line 95, in create_connection
raise err
File "/Users/myname/venv/python310/lib/python3.10/site-packages/urllib3/util/connection.py", line 85, in create_connection
sock.connect(sa)
ConnectionRefusedError: [Errno 61] Connection refused
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/myname/venv/python310/lib/python3.10/site-packages/urllib3/connectionpool.py", line 703, in urlopen
httplib_response = self._make_request(
File "/Users/myname/venv/python310/lib/python3.10/site-packages/urllib3/connectionpool.py", line 386, in _make_request
self._validate_conn(conn)
File "/Users/myname/venv/python310/lib/python3.10/site-packages/urllib3/connectionpool.py", line 1040, in _validate_conn
conn.connect()
File "/Users/myname/venv/python310/lib/python3.10/site-packages/urllib3/connection.py", line 358, in connect
conn = self._new_conn()
File "/Users/myname/venv/python310/lib/python3.10/site-packages/urllib3/connection.py", line 186, in _new_conn
raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPSConnection object at 0x10f5a0850>: Failed to establish a new connection: [Errno 61] Connection refused
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/myname/venv/python310/lib/python3.10/site-packages/requests/adapters.py", line 440, in send
resp = conn.urlopen(
File "/Users/myname/venv/python310/lib/python3.10/site-packages/urllib3/connectionpool.py", line 785, in urlopen
retries = retries.increment(
File "/Users/myname/venv/python310/lib/python3.10/site-packages/urllib3/util/retry.py", line 592, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='api.open-notify.org', port=443): Max retries exceeded with url: /this-api-doesnt-exist (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x10f5a0850>: Failed to establish a new connection: [Errno 61] Connection refused'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/myname/Documents/2022_api/api01.py", line 3, in <module>
response = requests.get("https://api.open-notify.org/this-api-doesnt-exist")
File "/Users/myname/venv/python310/lib/python3.10/site-packages/requests/api.py", line 75, in get
return request('get', url, params=params, **kwargs)
File "/Users/myname/venv/python310/lib/python3.10/site-packages/requests/api.py", line 61, in request
return session.request(method=method, url=url, **kwargs)
File "/Users/myname/venv/python310/lib/python3.10/site-packages/requests/sessions.py", line 529, in request
resp = self.send(prep, **send_kwargs)
File "/Users/myname/venv/python310/lib/python3.10/site-packages/requests/sessions.py", line 645, in send
r = adapter.send(request, **kwargs)
File "/Users/myname/venv/python310/lib/python3.10/site-packages/requests/adapters.py", line 519, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='api.open-notify.org', port=443): Max retries exceeded with url: /this-api-doesnt-exist (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x10f5a0850>: Failed to establish a new connection: [Errno 61] Connection refused'))
Process finished with exit code 1
Connection Refused means that the host (api.open-notify.org) is not listening on the port (https is on port 443) in your request.
I tried connecting to port 80 (plain http) instead and that worked for me:
>>> response = requests.get("http://api.open-notify.org/this-api-doesnt-exist")
>>> print(response.status_code)
404
For laughs, I tried connecting to the base url
>>> response = requests.get("http://api.open-notify.org")
>>> print(response.status_code)
500
>>> response
<Response [500]>
>>> response.text
'<html>\r\n<head><title>500 Internal Server Error</title></head>\r\n<body bgcolor="white">\r\n<center><h1>500 Internal Server Error</h1></center>\r\n<hr><center>nginx/1.10.3</center>\r\n</body>\r\n</html>\r\n'
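To check which ports a host accepts connections on before involving Requests at all, a plain socket probe is enough. This is a sketch using only the standard library; the helper name port_is_open and the 5-second timeout are illustrative:

```python
import socket


def port_is_open(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures alike
        return False
```

Calling port_is_open("api.open-notify.org", 443) and port_is_open("api.open-notify.org", 80) would show which of the two schemes is worth trying.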

I'm stuck in loading the data from xml file

import requests
try:
    import xml.etree.cElementTree as et
except ImportError:
    import xml.etree.ElementTree as et
user_key = authorized_key
doc_name = "F-C0032-001"
api_link = "http://opendata.cwb.gov.tw/opendataapi?dataid=%s&authorizationkey=%s" % (doc_name, user_key)
report = requests.get(url=api_link).text  # The problem is here
Error:
Traceback (most recent call last):
File "C:\Users\domin\AppData\Local\Programs\Python\Python39\lib\site-packages\urllib3\connection.py", line 169, in _new_conn
conn = connection.create_connection(
File "C:\Users\domin\AppData\Local\Programs\Python\Python39\lib\site-packages\urllib3\util\connection.py", line 96, in create_connection
raise err
File "C:\Users\domin\AppData\Local\Programs\Python\Python39\lib\site-packages\urllib3\util\connection.py", line 86, in create_connection
sock.connect(sa)
ConnectionRefusedError: [WinError 10061]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\domin\AppData\Local\Programs\Python\Python39\lib\site-packages\urllib3\connectionpool.py", line 699, in urlopen
httplib_response = self._make_request(
File "C:\Users\domin\AppData\Local\Programs\Python\Python39\lib\site-packages\urllib3\connectionpool.py", line 394, in _make_request
conn.request(method, url, **httplib_request_kw)
File "C:\Users\domin\AppData\Local\Programs\Python\Python39\lib\site-packages\urllib3\connection.py", line 234, in request
super(HTTPConnection, self).request(method, url, body=body, headers=headers)
File "C:\Users\domin\AppData\Local\Programs\Python\Python39\lib\http\client.py", line 1255, in request
self._send_request(method, url, body, headers, encode_chunked)
File "C:\Users\domin\AppData\Local\Programs\Python\Python39\lib\http\client.py", line 1301, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "C:\Users\domin\AppData\Local\Programs\Python\Python39\lib\http\client.py", line 1250, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "C:\Users\domin\AppData\Local\Programs\Python\Python39\lib\http\client.py", line 1010, in _send_output
self.send(msg)
File "C:\Users\domin\AppData\Local\Programs\Python\Python39\lib\http\client.py", line 950, in send
self.connect()
File "C:\Users\domin\AppData\Local\Programs\Python\Python39\lib\site-packages\urllib3\connection.py", line 200, in connect
conn = self._new_conn()
File "C:\Users\domin\AppData\Local\Programs\Python\Python39\lib\site-packages\urllib3\connection.py", line 181, in _new_conn
raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x000001F0DC29F940>: Failed to establish a new connection: [WinError 10061]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\domin\AppData\Local\Programs\Python\Python39\lib\site-packages\requests\adapters.py", line 439, in send
resp = conn.urlopen(
File "C:\Users\domin\AppData\Local\Programs\Python\Python39\lib\site-packages\urllib3\connectionpool.py", line 755, in urlopen
retries = retries.increment(
File "C:\Users\domin\AppData\Local\Programs\Python\Python39\lib\site-packages\urllib3\util\retry.py", line 573, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='opendata.cwb.gov.tw', port=80): Max retries exceeded with url: /opendataapi?dataid=F-C0032-001&authorizationkey= (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x000001F0DC29F940>: Failed to establish a new connection: [WinError 10061]'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "D:/Python Books/Python/Ch17 Flask Web API/weatherdata.py", line 14, in <module>
report = requests.get(url=api_link,headers=headers).text
File "C:\Users\domin\AppData\Local\Programs\Python\Python39\lib\site-packages\requests\api.py", line 76, in get
return request('get', url, params=params, **kwargs)
File "C:\Users\domin\AppData\Local\Programs\Python\Python39\lib\site-packages\requests\api.py", line 61, in request
return session.request(method=method, url=url, **kwargs)
File "C:\Users\domin\AppData\Local\Programs\Python\Python39\lib\site-packages\requests\sessions.py", line 542, in request
resp = self.send(prep, **send_kwargs)
File "C:\Users\domin\AppData\Local\Programs\Python\Python39\lib\site-packages\requests\sessions.py", line 655, in send
r = adapter.send(request, **kwargs)
File "C:\Users\domin\AppData\Local\Programs\Python\Python39\lib\site-packages\requests\adapters.py", line 516, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='opendata.cwb.gov.tw', port=80): Max retries exceeded with url: /opendataapi?dataid=F-C0032-001&authorizationkey= (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x000001F0DC29F940>: Failed to establish a new connection: [WinError 10061]'))
My actual aim is to read the data from api_link and select the entries for a given location. However, the file is very large. Is there a good way to read the relevant data from the XML file without hitting these connection errors? I do not understand what the error messages are telling me, so any pointers to a solution would be appreciated.
It looks like the error does not come from reading the XML but from sending the request to the server. This error:
requests.exceptions.ConnectionError: HTTPConnectionPool(host='opendata.cwb.gov.tw', port=80): Max retries exceeded with url: /opendataapi?dataid=F-C0032-001&authorizationkey=
is raised whenever the connection cannot be established. One possible cause is sending too many requests from the same IP address in a short period of time.
Have you tried this?
https://stackoverflow.com/a/47475019/17044313
It should look like this:
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry
user_key = '<your authorization key>'  # placeholder
doc_name = 'F-C0032-001'
api_link = 'http://opendata.cwb.gov.tw/opendataapi?dataid=%s&authorizationkey=%s' % (doc_name, user_key)
session = requests.Session()
# Retry failed connection attempts up to 3 times, backing off between tries
retry = Retry(connect=3, backoff_factor=0.5)
adapter = HTTPAdapter(max_retries=retry)
session.mount('http://', adapter)
session.mount('https://', adapter)
session.get(api_link)
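Once the request succeeds, picking one location out of the XML could look roughly like this. The element names location and locationName are assumptions about the feed's layout (check them against the real document), and find_location is a hypothetical helper:

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a '{namespace}' prefix from an element tag, if present."""
    return tag.rsplit("}", 1)[-1]


def find_location(xml_data, wanted_name):
    """Return the first <location> element whose <locationName> equals `wanted_name`.

    The element names are assumed, not taken from the feed's documentation.
    Returns None when no matching element exists.
    """
    root = ET.fromstring(xml_data)
    for element in root.iter():
        if local_name(element.tag) != "location":
            continue
        for child in element:
            if (local_name(child.tag) == "locationName"
                    and (child.text or "").strip() == wanted_name):
                return element
    return None
```

Passing the raw bytes of the response (response.content rather than response.text) lets the parser honour the encoding declared in the XML itself.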

Python3 Requests Proxy (Socks5)

Edit: I don't know if this is relevant, but this often happens when I've been testing a lot and have run many Python scripts. Usually none are active at the same time, but I will have run a few of them in a short space of time. I don't know if that's why the error starts happening.
OK, here's my dilemma.
I've got a NordVPN account and I want to randomly loop through the IPs for the requests I'm making to Google. This works fine... sometimes. For some reason, after a few requests my program starts slowing down and then I start getting 'Max retries exceeded' errors, and as far as I can tell the IPs are fine.
I started thinking that Google was blocking the IP, but when I send requests to other websites the same error continues. Then I thought that maybe the proxy server only allows a certain number of requests in a certain time frame, so I tried some free proxies, but I get the same errors again.
Either way, if I leave it alone for a while it seems to work fine, and then a few dozen requests later the errors start again.
The only thing I can think of, and I'm not sure how to test it or even whether it can be tested remotely (work router), is that some connections remain open and affect my code, and that once the backlog of open connections is cleared I can resume as normal. I started looking into whether I needed to "close" my connection after my GET request, but it doesn't seem necessary (although I wasn't even sure what I was looking for).
This is one version of my code. I've tried without sessions, and I've tried a different way of writing the sessions. All seem to behave the same way:
import requests, random
username = 'xxx'
password = 'yyy'
with open('proxies.txt', 'r') as p:
    proxy_list = p.read().splitlines()
with open('data.txt', 'r') as d:
    data = d.read().splitlines()
for i in data:
    result = None
    while result is None:
        try:
            proxy = proxy_list[random.randint(0, len(proxy_list) - 1)] + '.nordvpn.com'
            prox = 'socks5://{}:{}@{}:{}'.format(username, password, proxy, 1080)  # I've also tried socks5h
            with requests.Session() as r:
                r.proxies['http'] = prox
                r.proxies['https'] = prox
                result = r.get('http://icanhazip.com')
                print(result.text.strip())
        except requests.exceptions.RequestException:
            # Assumed handler (the posted snippet stops at the try block):
            # result stays None, so the loop retries with another proxy
            continue
If anyone has any idea whatsoever, any help is appreciated. I've reached a point where I'm struggling to think of new ideas to try.
This is an example of one of the errors I get during this whole process:
Traceback (most recent call last):
File "C:\Users*\AppData\Local\Programs\Python\Python37\lib\site-packages\socks.py", line 809, in connect
negotiate(self, dest_addr, dest_port)
File "C:\Users*\AppData\Local\Programs\Python\Python37\lib\site-packages\socks.py", line 444, in _negotiate_SOCKS5
self, CONNECT, dest_addr)
File "C:\Users*\AppData\Local\Programs\Python\Python37\lib\site-packages\socks.py", line 503, in _SOCKS5_request
raise SOCKS5AuthError("SOCKS5 authentication failed")
socks.SOCKS5AuthError: SOCKS5 authentication failed
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users*\AppData\Local\Programs\Python\Python37\lib\site-packages\urllib3\contrib\socks.py", line 88, in _new_conn
**extra_kw
File "C:\Users*\AppData\Local\Programs\Python\Python37\lib\site-packages\socks.py", line 209, in create_connection
raise err
File "C:\Users*\AppData\Local\Programs\Python\Python37\lib\site-packages\socks.py", line 199, in create_connection
sock.connect((remote_host, remote_port))
File "C:\Users*\AppData\Local\Programs\Python\Python37\lib\site-packages\socks.py", line 47, in wrapper
return function(*args, **kwargs)
File "C:\Users*\AppData\Local\Programs\Python\Python37\lib\site-packages\socks.py", line 814, in connect
raise GeneralProxyError("Socket error", error)
socks.GeneralProxyError: Socket error: SOCKS5 authentication failed
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users*\AppData\Local\Programs\Python\Python37\lib\site-packages\urllib3\connectionpool.py", line 600, in urlopen
chunked=chunked)
File "C:\Users*\AppData\Local\Programs\Python\Python37\lib\site-packages\urllib3\connectionpool.py", line 354, in _make_request
conn.request(method, url, **httplib_request_kw)
File "C:\Users*\AppData\Local\Programs\Python\Python37\lib\http\client.py", line 1244, in request
self._send_request(method, url, body, headers, encode_chunked)
File "C:\Users*\AppData\Local\Programs\Python\Python37\lib\http\client.py", line 1290, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "C:\Users*\AppData\Local\Programs\Python\Python37\lib\http\client.py", line 1239, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "C:\Users*\AppData\Local\Programs\Python\Python37\lib\http\client.py", line 1026, in _send_output
self.send(msg)
File "C:\Users*\AppData\Local\Programs\Python\Python37\lib\http\client.py", line 966, in send
self.connect()
File "C:\Users*\AppData\Local\Programs\Python\Python37\lib\site-packages\urllib3\connection.py", line 181, in connect
conn = self._new_conn()
File "C:\Users*\AppData\Local\Programs\Python\Python37\lib\site-packages\urllib3\contrib\socks.py", line 110, in _new_conn
"Failed to establish a new connection: %s" % error
urllib3.exceptions.NewConnectionError: : Failed to establish a new connection: SOCKS5 authentication failed
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users*\AppData\Local\Programs\Python\Python37\lib\site-packages\requests\adapters.py", line 449, in send
timeout=timeout
File "C:\Users*\AppData\Local\Programs\Python\Python37\lib\site-packages\urllib3\connectionpool.py", line 638, in urlopen
_stacktrace=sys.exc_info()[2])
File "C:\Users*\AppData\Local\Programs\Python\Python37\lib\site-packages\urllib3\util\retry.py", line 399, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: SOCKSHTTPConnectionPool(host='icanhazip.com', port=80): Max retries exceeded with url: / (Caused by NewConnectionError(': Failed to establish a new connection: SOCKS5 authentication failed'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users*\Documents_Scripts\Find Proxies\FindProxies.py", line 23, in
result = requests.get('http://icanhazip.com', proxies=proxies)
File "C:\Users*\AppData\Local\Programs\Python\Python37\lib\site-packages\requests\api.py", line 75, in get
return request('get', url, params=params, **kwargs)
File "C:\Users*\AppData\Local\Programs\Python\Python37\lib\site-packages\requests\api.py", line 60, in request
return session.request(method=method, url=url, **kwargs)
File "C:\Users*\AppData\Local\Programs\Python\Python37\lib\site-packages\requests\sessions.py", line 533, in request
resp = self.send(prep, **send_kwargs)
File "C:\Users*\AppData\Local\Programs\Python\Python37\lib\site-packages\requests\sessions.py", line 646, in send
r = adapter.send(request, **kwargs)
File "C:\Users*\AppData\Local\Programs\Python\Python37\lib\site-packages\requests\adapters.py", line 516, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: SOCKSHTTPConnectionPool(host='icanhazip.com', port=80): Max retries exceeded with url: / (Caused by NewConnectionError(': Failed to establish a new connection: SOCKS5 authentication failed'))
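Are you also closing the sessions again after use?
Try disabling keep-alive. In Requests releases before 1.0 this was done with:
r = requests.session(config={'keep_alive': False})
The config argument was removed in Requests 1.0. With current releases, a common substitute is to ask the server to close the connection after each response:
r = requests.Session()
r.headers['Connection'] = 'close'
Inspired by:
python-requests-close-http-connection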
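Separately, since the traceback in the question ends in "SOCKS5 authentication failed", it may be worth ruling out the proxy URL itself. If the username or password contains characters such as @ or :, they need to be percent-encoded before being placed in the URL. A sketch under that assumption; build_proxy_url, proxies_for, and the default port are illustrative, and this is not confirmed to be the cause of the error:

```python
from urllib.parse import quote


def build_proxy_url(username, password, host, port=1080, scheme="socks5"):
    """Assemble a proxy URL with percent-encoded credentials."""
    # safe="" makes quote() encode every reserved character, including '/' and '@'
    user = quote(username, safe="")
    secret = quote(password, safe="")
    return f"{scheme}://{user}:{secret}@{host}:{port}"


def proxies_for(proxy_url):
    """Build the mapping that Requests expects in its `proxies` argument."""
    return {"http": proxy_url, "https": proxy_url}
```

The result can be passed straight to a request, for example requests.get(url, proxies=proxies_for(build_proxy_url(username, password, proxy)), timeout=10).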

list index out of range when listing all projects using python-gitlab

I'm trying to use python-gitlab to list all my GitLab projects, but the following code fails with a list index out of range error.
# file name: poc.py
import gitlab
gl = gitlab.Gitlab(
    'http://my-gitlab.com:10080',
    private_token='<my_private_token>'
)
projects = gl.projects.list(all=True)
for project in projects:
    print(project.attributes['name'])
My gitlab server has more than 300 projects.
The following is the complete error message:
Traceback (most recent call last):
File "/Users/MyUserName/.pyenv/versions/3.6.5/lib/python3.6/site-packages/gitlab/__init__.py", line 783, in next
item = self._data[self._current]
IndexError: list index out of range
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/MyUserName/.pyenv/versions/3.6.5/lib/python3.6/site-packages/urllib3/connection.py", line 159, in _new_conn
(self._dns_host, self.port), self.timeout, **extra_kw)
File "/Users/MyUserName/.pyenv/versions/3.6.5/lib/python3.6/site-packages/urllib3/util/connection.py", line 80, in create_connection
raise err
File "/Users/MyUserName/.pyenv/versions/3.6.5/lib/python3.6/site-packages/urllib3/util/connection.py", line 70, in create_connection
sock.connect(sa)
ConnectionRefusedError: [Errno 61] Connection refused
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/MyUserName/.pyenv/versions/3.6.5/lib/python3.6/site-packages/urllib3/connectionpool.py", line 600, in urlopen
chunked=chunked)
File "/Users/MyUserName/.pyenv/versions/3.6.5/lib/python3.6/site-packages/urllib3/connectionpool.py", line 354, in _make_request
conn.request(method, url, **httplib_request_kw)
File "/Users/MyUserName/.pyenv/versions/3.6.5/lib/python3.6/http/client.py", line 1239, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/Users/MyUserName/.pyenv/versions/3.6.5/lib/python3.6/http/client.py", line 1285, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/Users/MyUserName/.pyenv/versions/3.6.5/lib/python3.6/http/client.py", line 1234, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/Users/MyUserName/.pyenv/versions/3.6.5/lib/python3.6/http/client.py", line 1026, in _send_output
self.send(msg)
File "/Users/MyUserName/.pyenv/versions/3.6.5/lib/python3.6/http/client.py", line 964, in send
self.connect()
File "/Users/MyUserName/.pyenv/versions/3.6.5/lib/python3.6/site-packages/urllib3/connection.py", line 181, in connect
conn = self._new_conn()
File "/Users/MyUserName/.pyenv/versions/3.6.5/lib/python3.6/site-packages/urllib3/connection.py", line 168, in _new_conn
self, "Failed to establish a new connection: %s" % e)
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x107784be0>: Failed to establish a new connection: [Errno 61] Connection refused
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/MyUserName/.local/lib/python3.6/site-packages/requests/adapters.py", line 449, in send
timeout=timeout
File "/Users/MyUserName/.pyenv/versions/3.6.5/lib/python3.6/site-packages/urllib3/connectionpool.py", line 638, in urlopen
_stacktrace=sys.exc_info()[2])
File "/Users/MyUserName/.pyenv/versions/3.6.5/lib/python3.6/site-packages/urllib3/util/retry.py", line 398, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='my-gitlab.com', port=80): Max retries exceeded with url: /api/v4/projects?membership=false&order_by=created_at&owned=false&page=2&per_page=20&simple=false&sort=desc&starred=false&statistics=false&with_custom_attributes=false&with_issues_enabled=false&with_merge_requests_enabled=false (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x107784be0>: Failed to establish a new connection: [Errno 61] Connection refused',))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "poc.py", line 15, in <module>
projects = gl.projects.list(as_list=False, all=True)
File "/Users/MyUserName/.pyenv/versions/3.6.5/lib/python3.6/site-packages/gitlab/exceptions.py", line 255, in wrapped_f
return f(*args, **kwargs)
File "/Users/MyUserName/.pyenv/versions/3.6.5/lib/python3.6/site-packages/gitlab/mixins.py", line 133, in list
obj = self.gitlab.http_list(path, **data)
File "/Users/MyUserName/.pyenv/versions/3.6.5/lib/python3.6/site-packages/gitlab/__init__.py", line 597, in http_list
return list(GitlabList(self, url, query_data, **kwargs))
File "/Users/MyUserName/.pyenv/versions/3.6.5/lib/python3.6/site-packages/gitlab/__init__.py", line 779, in __next__
return self.next()
File "/Users/MyUserName/.pyenv/versions/3.6.5/lib/python3.6/site-packages/gitlab/__init__.py", line 788, in next
self._query(self._next_url)
File "/Users/MyUserName/.pyenv/versions/3.6.5/lib/python3.6/site-packages/gitlab/__init__.py", line 716, in _query
**kwargs)
File "/Users/MyUserName/.pyenv/versions/3.6.5/lib/python3.6/site-packages/gitlab/__init__.py", line 498, in http_request
result = self.session.send(prepped, timeout=timeout, **settings)
File "/Users/MyUserName/.local/lib/python3.6/site-packages/requests/sessions.py", line 646, in send
r = adapter.send(request, **kwargs)
File "/Users/MyUserName/.local/lib/python3.6/site-packages/requests/adapters.py", line 516, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='my-gitlab.com', port=80): Max retries exceeded with url: /api/v4/projects?membership=false&order_by=created_at&owned=false&page=2&per_page=20&simple=false&sort=desc&starred=false&statistics=false&with_custom_attributes=false&with_issues_enabled=false&with_merge_requests_enabled=false (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x107784be0>: Failed to establish a new connection: [Errno 61] Connection refused',))
The final error message, requests.exceptions.ConnectionError: HTTPConnectionPool(host='my-gitlab.com', port=80): Max retries exceeded, shows that the request went to port 80 instead of 10080. It looks as if this follows from the IndexError raised at line 825 of gitlab/__init__.py, after which python-gitlab tried port 80 to see whether that port works.
I've checked that the length of self._data is 20, and I don't know why self._current was incremented to 20. self._data[20] raises IndexError because the highest valid index is 19.
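Judging only from the traceback above (item = self._data[self._current] fails, and the handler then calls self._query(self._next_url)), the IndexError appears to be how the iterator notices that the current page of 20 items is used up before it requests the next page. The following is a simplified illustration of that pattern, not python-gitlab's actual source; the PagedList class and its in-memory pages are hypothetical.

```python
class PagedList:
    """Toy paginated iterator (hypothetical, for illustration only)."""

    def __init__(self, pages):
        self._pages = pages              # stand-in for server-side pages
        self._page_index = 0
        self._data = pages[0] if pages else []
        self._current = 0

    def __iter__(self):
        return self

    def __next__(self):
        try:
            item = self._data[self._current]
            self._current += 1
            return item
        except IndexError:
            # Not a failure: the index ran past the end of the current
            # page, which is the signal to load the next one.
            pass
        if self._page_index + 1 < len(self._pages):
            self._page_index += 1
            self._data = self._pages[self._page_index]  # "fetch" next page
            self._current = 0
            return next(self)
        raise StopIteration


for name in PagedList([["project-a", "project-b"], ["project-c"]]):
    print(name)
```

Under this reading, self._current reaching len(self._data) is expected, and the step that actually fails is the request for page 2.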
For now, I use the following workaround to list all the projects:
import gitlab
gl = gitlab.Gitlab(
    'http://my-gitlab.com:10080',
    private_token='<my_private_token>'
)
current_page = 1
# Read the page count first; per_page must match the loop below,
# otherwise total_pages is computed for a different page size.
projects = gl.projects.list(as_list=False, per_page=100)
total_pages = projects.total_pages
while current_page <= total_pages:
    # Request one explicit page at a time.
    projects = gl.projects.list(as_list=False, per_page=100, page=current_page)
    for project in projects:
        print(project.attributes['name'])
    current_page += 1
But this is an inconvenient way to list all projects.
The documentation here doesn't mention anything about this error.
Does anyone know how to list all projects using python-gitlab?
Is this a bug in python-gitlab?
Is len(self._data) == 20 expected behavior when calling gl.projects.list(all=True)?
python-gitlab ends up using port 80 to communicate with the GitLab server because of this GitLab bug.
For now, I can only use the workaround or wait for GitLab to fix the bug.
import gitlab
gl = gitlab.Gitlab(
    'http://my-gitlab.com:10080',
    private_token='<my_private_token>'
)
current_page = 1
# Read the page count first; per_page must match the loop below,
# otherwise total_pages is computed for a different page size.
projects = gl.projects.list(as_list=False, per_page=100)
total_pages = projects.total_pages
while current_page <= total_pages:
    # Request one explicit page at a time.
    projects = gl.projects.list(as_list=False, per_page=100, page=current_page)
    for project in projects:
        print(project.attributes['name'])
    current_page += 1
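The traceback shows the page 2 request going to my-gitlab.com on port 80, which would be consistent with the server handing back a next-page URL that omits the custom port. If you ever follow such URLs yourself, one possible approach is to rebase them onto the address you configured. This is only a sketch using the standard library; rebase_url is a made-up helper and not part of python-gitlab.

```python
from urllib.parse import urlsplit, urlunsplit


def rebase_url(next_url, base_url):
    """Return next_url with scheme, host and port taken from base_url.

    Path, query string and fragment of next_url are kept unchanged.
    """
    base = urlsplit(base_url)
    nxt = urlsplit(next_url)
    return urlunsplit((base.scheme, base.netloc, nxt.path, nxt.query, nxt.fragment))


# Example with the host names from the question:
fixed = rebase_url(
    'http://my-gitlab.com/api/v4/projects?page=2&per_page=20',
    'http://my-gitlab.com:10080',
)
print(fixed)
```

The page-by-page workaround above avoids the problem in a different way, by never relying on a URL supplied by the server.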
