Suddenly I'm getting the following error from my code; some time ago it was running just fine. Is it caused by the internet connection?
import requests
import json

# Get data checkpoint size
url = 'http://url:8080/vrio/blk'
r = requests.get(url)
data = r.json()

def counterVolume_one():
    wanted = {'Bytes_Written', 'IO_Operation'}
    for d in data['Block Devices'].itervalues():
        values = {k: v for k, v in d.iteritems() if k in wanted}
        print json.dumps(values)
Traceback (most recent call last):
File "hyper.py", line 31, in <module>
r = requests.get(url)
File "/usr/lib/python2.7/dist-packages/requests/api.py", line 55, in get
return request('get', url, **kwargs)
File "/usr/lib/python2.7/dist-packages/requests/api.py", line 44, in request
return session.request(method=method, url=url, **kwargs)
File "/usr/lib/python2.7/dist-packages/requests/sessions.py", line 455, in request
resp = self.send(prep, **send_kwargs)
File "/usr/lib/python2.7/dist-packages/requests/sessions.py", line 558, in send
r = adapter.send(request, **kwargs)
File "/usr/lib/python2.7/dist-packages/requests/adapters.py", line 378, in send
raise ConnectionError(e)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='http://url', port=8080): Max retries exceeded with url: /vrio/blk (Caused by <class 'socket.error'>: [Errno 101] Network is unreachable)
Try declaring a timeout on the requests call, i.e.:
r = requests.get(url, timeout=20)
Doing this will most probably get rid of that error :)
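If the box occasionally loses its network connection, you can also combine the timeout with a small retry loop. This is just a rough sketch reusing the url from the question; the retry count and sleep duration are arbitrary:

import time
import requests

url = 'http://url:8080/vrio/blk'  # placeholder endpoint, as in the question

data = None
for attempt in range(3):  # arbitrary retry count
    try:
        r = requests.get(url, timeout=20)
        r.raise_for_status()       # treat HTTP errors as failures too
        data = r.json()
        break                      # success, stop retrying
    except requests.exceptions.RequestException as e:
        print('Attempt %d failed: %s' % (attempt + 1, e))
        time.sleep(5)              # brief pause before retrying

if data is None:
    raise SystemExit('Could not reach the server after 3 attempts')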
I am creating a web scraper to go over almost 400k records. Basically, it works like this: I have a CSV of part numbers that need to be searched on this site. The site has an exposed API, so I am able to skip the frontend and make a direct request to the site after logging in. I created one function called GetPdcResults() which takes in a list of parts and a start number. The start argument is so that, if the scraper stops for any reason, I can start it back up at the same point it left off in the parts list. The main loop of the scraper enumerates over each part in the list, builds a payload for that part, and requests the information. There is some error handling for when I have a network error or a cookie error, which only happens when my user's session has expired. Then it calls the CleanPdcResults() function, which cleans the response returned from the site and saves the relevant information to a CSV for exporting.
To my understanding, recursion is when a function calls itself repeatedly; there is a limit to this in Python and it is more resource intensive. Iteration is when you use a loop to repeat a set of actions.
I think I want iteration in my app, not recursion, because currently I am getting this error I have never seen before.
RecursionError: maximum recursion depth exceeded while calling a Python object
I'm assuming it's because there is recursion happening in my functions instead of iteration, but I can't seem to pinpoint it. The only time a function calls itself is when there is a cookie error and the GetPdcResults() function is called again, but that shouldn't happen so many times that the limit is reached.
Can someone help me find where recursion is happening in my scraper and how I can convert it to iteration to stop this error? Any help is appreciated!
import requests
import pandas as pd
from math import nan          # assumed source of nan; could also be numpy.nan
from os.path import exists
from tqdm import tqdm

# url, headers (with the session cookie) and PartsLogger are defined elsewhere in the project

def GetPdcResults(parts, start=0):
    logger = PartsLogger()
    logger.log_cookie(headers['cookie'])
    logger.log_num_parts(parts)
    for (i, part) in tqdm(enumerate(parts[start:], start), total=len(parts[start:])):
        if part == nan:
            break
        logger.log_cur_part(i, part)
        payload = "{\"catalogId\":\"2\",\"locatorService\":\"Panda\""
        payload += f',"partNumber":"{part}", "cacheKey":"{part}_2_en-US_7497fea0-4fb6-4b28-b0e8-62e3e4204cc5"{"}"}'
        try:
            response = requests.request("POST", url, headers=headers, data=payload)
        except requests.exceptions.RequestException as e:
            print('\n[-] Request Error')
            print(e)
            logger.log_error(str(e), part=part)
        if response.status_code == 401:
            logger.log_error('[-] Cookie Error', part=part)
            print('\n[-] Cookie Error')
            GetPdcResults(parts, start=i)
            break
        CleanPdcResults(response.json(), i, part, logger)
def CleanPdcResults(resp, index, part, logger):
    try:
        pdc_results = resp['d']['PdcResults']
        pdc92 = {}
        for pdc in pdc_results:
            if '92' in pdc['LocationName']:
                pdc92.update(pdc)
                break
        if bool(pdc92):
            foundPart = [{'': index, 'Part': part, 'Qty': pdc92['Quantity']}]
            df = pd.DataFrame(foundPart)
            if not exists('Parts.csv'):
                df.to_csv('Parts.csv', index=False)
            df.to_csv('Parts.csv', mode='a', index=False, header=False)
        else:
            print('\n[-] Part Not Found')
    except Exception as e:
        logger.log_error(str(e), part=part, response=resp)
Traceback (most recent call last):
File "c:\Users\carte\OneDrive\Documents\GrayTeck\Chad S\CleanPartsCSV.py", line 30, in run
GetPdcResults(partsList, start=startIndex)
File "c:\Users\carte\OneDrive\Documents\GrayTeck\Chad S\GetPDCRes.py", line 57, in GetPdcResults
GetPdcResults(parts, start=i)
File "c:\Users\carte\OneDrive\Documents\GrayTeck\Chad S\GetPDCRes.py", line 57, in GetPdcResults
GetPdcResults(parts, start=i)
File "c:\Users\carte\OneDrive\Documents\GrayTeck\Chad S\GetPDCRes.py", line 57, in GetPdcResults
GetPdcResults(parts, start=i)
[Previous line repeated 973 more times]
File "c:\Users\carte\OneDrive\Documents\GrayTeck\Chad S\GetPDCRes.py", line 48, in GetPdcResults
response = requests.request("POST", url, headers=headers, data=payload)
File "C:\Python310\lib\site-packages\requests\api.py", line 61, in request
return session.request(method=method, url=url, **kwargs)
File "C:\Python310\lib\site-packages\requests\sessions.py", line 529, in request
resp = self.send(prep, **send_kwargs)
File "C:\Python310\lib\site-packages\requests\sessions.py", line 645, in send
r = adapter.send(request, **kwargs)
File "C:\Python310\lib\site-packages\requests\adapters.py", line 440, in send
resp = conn.urlopen(
File "C:\Python310\lib\site-packages\urllib3\connectionpool.py", line 703, in urlopen
httplib_response = self._make_request(
File "C:\Python310\lib\site-packages\urllib3\connectionpool.py", line 449, in _make_request
six.raise_from(e, None)
File "<string>", line 3, in raise_from
File "C:\Python310\lib\site-packages\urllib3\connectionpool.py", line 444, in _make_request
httplib_response = conn.getresponse()
File "C:\Python310\lib\http\client.py", line 1374, in getresponse
response.begin()
File "C:\Python310\lib\http\client.py", line 337, in begin
self.headers = self.msg = parse_headers(self.fp)
File "C:\Python310\lib\http\client.py", line 236, in parse_headers
return email.parser.Parser(_class=_class).parsestr(hstring)
File "C:\Python310\lib\email\parser.py", line 67, in parsestr
return self.parse(StringIO(text), headersonly=headersonly)
File "C:\Python310\lib\email\parser.py", line 56, in parse
feedparser.feed(data)
File "C:\Python310\lib\email\feedparser.py", line 176, in feed
self._call_parse()
File "C:\Python310\lib\email\feedparser.py", line 180, in _call_parse
self._parse()
File "C:\Python310\lib\email\feedparser.py", line 295, in _parsegen
if self._cur.get_content_maintype() == 'message':
File "C:\Python310\lib\email\message.py", line 594, in get_content_maintype
ctype = self.get_content_type()
File "C:\Python310\lib\email\message.py", line 578, in get_content_type
value = self.get('content-type', missing)
File "C:\Python310\lib\email\message.py", line 471, in get
return self.policy.header_fetch_parse(k, v)
File "C:\Python310\lib\email\_policybase.py", line 316, in header_fetch_parse
return self._sanitize_header(name, value)
File "C:\Python310\lib\email\_policybase.py", line 287, in _sanitize_header
if _has_surrogates(value):
File "C:\Python310\lib\email\utils.py", line 57, in _has_surrogates
s.encode()
RecursionError: maximum recursion depth exceeded while calling a Python object
Python's default maximum recursion depth is 1000, but you can check yours with print(sys.getrecursionlimit()) or set a new one with
import sys

new_recursion_limit = 2000  # set as you prefer
sys.setrecursionlimit(new_recursion_limit)
print('recursion limit is now', sys.getrecursionlimit())
but this is considered a risky approach, since it only raises the ceiling instead of removing the recursion.
Instead, you should consider adding extra parameters to GetPdcResults (or any other recursive function), something like
def GetPdcResults(parts, start=0, maxDepth=999, curDepth=0):
and then increment curDepth with every recursive call, for example:
if response.status_code == 401:
    logger.log_error('[-] Cookie Error', part=part)
    print('\n[-] Cookie Error')
    if curDepth < maxDepth:
        GetPdcResults(parts, start=i, maxDepth=maxDepth, curDepth=curDepth + 1)
    # else: print(curDepth, 'is too deep')  # if you want an alternate action...
    break
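If you would rather remove the recursion entirely, as you mention, one rough, untested sketch is to replace the recursive retry with a while loop that re-tries the current index. It reuses url, headers, PartsLogger and CleanPdcResults from your code, and the retry cap is an arbitrary assumption:

def GetPdcResultsIterative(parts, start=0, max_cookie_retries=5):
    # Same setup as the original function
    logger = PartsLogger()
    logger.log_cookie(headers['cookie'])
    logger.log_num_parts(parts)

    i = start
    cookie_retries = 0
    while i < len(parts):
        part = parts[i]
        logger.log_cur_part(i, part)
        # Payload built exactly as in the question
        payload = "{\"catalogId\":\"2\",\"locatorService\":\"Panda\""
        payload += f',"partNumber":"{part}", "cacheKey":"{part}_2_en-US_7497fea0-4fb6-4b28-b0e8-62e3e4204cc5"{"}"}'
        try:
            response = requests.request("POST", url, headers=headers, data=payload)
        except requests.exceptions.RequestException as e:
            logger.log_error(str(e), part=part)
            i += 1                      # skip this part and move on
            continue
        if response.status_code == 401:
            logger.log_error('[-] Cookie Error', part=part)
            cookie_retries += 1
            if cookie_retries > max_cookie_retries:
                break                   # give up instead of retrying forever
            continue                    # retry the same part; i is not advanced
        cookie_retries = 0
        CleanPdcResults(response.json(), i, part, logger)
        i += 1

Because the loop only advances the index on success or on a skipped part, it behaves like the recursive restart but cannot blow the recursion limit.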
This is a small API request that is throwing a requests.exceptions.ChunkedEncodingError:
import requests

def categories_list():
    categories = []
    response = requests.get("https://fr.openfoodfacts.org/categories&json=1")
    data = response.json()
    i = 0
    for category in data["tags"]:
        if category["products"] >= 1200:
            name = category["name"]
            categories.append(name)
            i += 1
    print("It's ok, imported %s" % i)

categories_list()
Error code:
File "exception.py", line 18, in <module>
categories_list()
File "exception.py", line 6, in categories_list
response = requests.get("https://fr.openfoodfacts.org/categories&json=1")
File "/home/pi/Documents/venv/lib/python3.7/site-packages/requests/api.py", line 76, in get
return request('get', url, params=params, **kwargs)
File "/home/pi/Documents/venv/lib/python3.7/site-packages/requests/api.py", line 61, in request
return session.request(method=method, url=url, **kwargs)
File "/home/pi/Documents/venv/lib/python3.7/site-packages/requests/sessions.py", line 530, in request
resp = self.send(prep, **send_kwargs)
File "/home/pi/Documents/venv/lib/python3.7/site-packages/requests/sessions.py", line 683, in send
r.content
File "/home/pi/Documents/venv/lib/python3.7/site-packages/requests/models.py", line 829, in content
self._content = b''.join(self.iter_content(CONTENT_CHUNK_SIZE)) or b''
File "/home/pi/Documents/venv/lib/python3.7/site-packages/requests/models.py", line 754, in generate
raise ChunkedEncodingError(e)
requests.exceptions.ChunkedEncodingError: ('Connection broken: IncompleteRead(3573 bytes read, 6667 more expected)', IncompleteRead(3573 bytes read, 6667 more expected))
Could it possibly be my internet connection? Similar queries worked for me yesterday...
Below is my code, run in the Raspberry Pi's Python (Thonny IDE).
Kindly ignore the URL, it is not the real address.
Code
from firebase import firebase

firebase = firebase.FirebaseApplication('https://testing123123-iot.firebaseio.com', authentication=None)

data = {
    'Name': 'Hi',
    'Email': 'hihi.com',
    'Phone': 512232131
}

result = firebase.post('/testing123123-iot:/Customer', data)
print(result)
Error
Traceback (most recent call last):
File "/home/pi/Documents/PythonCode/TestingFirebase-1.py", line 17, in
result = firebase.post('/testing-iot:/Customer', data)
File "/usr/local/lib/python3.7/dist-packages/firebase/decorators.py", line 19, in wrapped
return f(*args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/firebase/firebase.py", line 329, in post
connection=connection)
File "/usr/local/lib/python3.7/dist-packages/firebase/decorators.py", line 19, in wrapped
return f(*args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/firebase/firebase.py", line 97, in make_post_request
timeout=timeout)
File "/usr/local/lib/python3.7/dist-packages/requests/sessions.py", line 340, in post
return self.request('POST', url, data=data, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/requests/sessions.py", line 279, in request
resp = self.send(prep, stream=stream, timeout=timeout, verify=verify, cert=cert, proxies=proxies)
File "/usr/local/lib/python3.7/dist-packages/requests/sessions.py", line 374, in send
r = adapter.send(request, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/requests/adapters.py", line 174, in send
timeout=timeout
File "/usr/local/lib/python3.7/dist-packages/requests/packages/urllib3/connectionpool.py", line 417, in urlopen
conn = self._get_conn(timeout=pool_timeout)
File "/usr/local/lib/python3.7/dist-packages/requests/packages/urllib3/connectionpool.py", line 232, in _get_conn
return conn or self._new_conn()
File "/usr/local/lib/python3.7/dist-packages/requests/packages/urllib3/connectionpool.py", line 547, in _new_conn
strict=self.strict)
TypeError: init() got an unexpected keyword argument 'strict'
Use json.dumps:
import json

data = {
    'Name': 'Hi',
    'Email': 'hihi.com',
    'Phone': 512232131
}

sent = json.dumps(data)
result = firebase.post('/testing123123-iot:/Customer', sent)
print(result)
On my production server, the python-requests library suddenly throws this error when making an HTTPS connection to a 3rd-party application.
raised unexpected: SSLError(SSLError(SSLError("bad ca_certs: '/home/ubuntu/.virtualenvs/api/local/lib/python2.7/site-packages/requests/cacert.pem'", Error([('system library', 'fopen', 'No such file or directory'), ('BIO routines', 'BIO_new_file', 'no such file'), ('x509 certificate routines', 'X509_load_cert_crl_file', 'system lib')],)),),)
Please suggest some solutions and the reason behind this exception.
Edit:
Traceback (most recent call last):
File "/home/ubuntu/.virtualenvs/api/local/lib/python2.7/site-packages/celery/app/trace.py", line 240, in trace_task
R = retval = fun(*args, **kwargs)
File "/home/ubuntu/.virtualenvs/api/local/lib/python2.7/site-packages/celery/app/trace.py", line 438, in __protected_call__
return self.run(*args, **kwargs)
File "/opt/vogo/api/api/api/celery.py", line 146, in spreadsheets_management
ss = gs.open_by_key("10K_0JGCVZjngWo7dr1CTeRJ1rAhMWxAGxHbl_NworeU")
File "/home/ubuntu/.virtualenvs/api/local/lib/python2.7/site-packages/gspread/client.py", line 105, in open_by_key
feed = self.get_spreadsheets_feed()
File "/home/ubuntu/.virtualenvs/api/local/lib/python2.7/site-packages/gspread/client.py", line 155, in get_spreadsheets_feed
r = self.session.get(url)
File "/home/ubuntu/.virtualenvs/api/local/lib/python2.7/site-packages/gspread/httpsession.py", line 73, in get
return self.request('GET', url, params=params, **kwargs)
File "/home/ubuntu/.virtualenvs/api/local/lib/python2.7/site-packages/gspread/httpsession.py", line 65, in request
response = func(url, data=data, params=params, headers=request_headers, files=files, json=json)
File "/home/ubuntu/.virtualenvs/api/local/lib/python2.7/site-packages/requests/sessions.py", line 480, in get
"""
File "/home/ubuntu/.virtualenvs/api/local/lib/python2.7/site-packages/requests/sessions.py", line 468, in request
:param allow_redirects: (optional) Set to True by default.
File "/home/ubuntu/.virtualenvs/api/local/lib/python2.7/site-packages/requests/sessions.py", line 576, in send
File "/home/ubuntu/.virtualenvs/api/local/lib/python2.7/site-packages/requests/adapters.py", line 447, in send
python version: 2.7.6
python-requests version: 2.18.4
Edit: My Code
def request(self, method, url, data=None, params=None, headers=None, files=None, json=None):
    if data and isinstance(data, bytes):
        data = data.decode()
    if data and not isinstance(data, basestring):
        data = urlencode(data)
    if data is not None:
        data = data.encode('utf8')
    # If we have data and Content-Type is not set, set it...
    if data and not headers.get('Content-Type', None):
        headers['Content-Type'] = 'application/x-www-form-urlencoded'
    request_headers = self.headers.copy()
    if headers:
        for k, v in headers.items():
            if v is None:
                del request_headers[k]
            else:
                request_headers[k] = v
    try:
        # self.requests_session = requests.Session()  # this is initialized earlier
        func = getattr(self.requests_session, method.lower())
    except AttributeError:
        raise RequestError("HTTP method '{}' is not supported".format(method))
    response = func(url, data=data, params=params, headers=request_headers, files=files, json=json)
    if response.status_code > 399:
        raise RequestError(response.status_code, "{0}: {1}".format(
            response.status_code, response.content))
    return response
My Python code contains a method as follows:
import tagme

def concept_extraction():
    tagme.GCUBE_TOKEN = "702e87ce-3750-4069-900d-92d12a17cda4-843334162"
    string = "Sachin is running in the Kolkata stadium."
    print string
    lunch_annotations = tagme.annotate(string)
    # Print annotations with a score higher than 0.1
    for ann in lunch_annotations.get_annotations(0.1):
        print ann
    return
For a given string, I want to determine the entities present in it, but I am getting the following error:
File "/home/krishnendu/Python_workspace/QG/concept_extraction.py", line 8, in concept_extraction
lunch_annotations = tagme.annotate(string)
File "/usr/local/lib/python2.7/dist-packages/tagme/__init__.py", line 201, in annotate
json_response = _issue_request(api, payload, gcube_token)
File "/usr/local/lib/python2.7/dist-packages/tagme/__init__.py", line 271, in _issue_request
res = requests.post(api, data=payload)
File "/usr/local/lib/python2.7/dist-packages/requests-1.2.3-py2.7.egg/requests/api.py", line 88, in post
return request('post', url, data=data, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/requests-1.2.3-py2.7.egg/requests/api.py", line 44, in request
return session.request(method=method, url=url, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/requests-1.2.3-py2.7.egg/requests/sessions.py", line 335, in request
resp = self.send(prep, **send_kwargs)
File "/usr/local/lib/python2.7/dist-packages/requests-1.2.3-py2.7.egg/requests/sessions.py", line 454, in send
history = [resp for resp in gen] if allow_redirects else []
File "/usr/local/lib/python2.7/dist-packages/requests-1.2.3-py2.7.egg/requests/sessions.py", line 87, in resolve_redirects
raise TooManyRedirects('Exceeded %s redirects.' % self.max_redirects)
requests.exceptions.TooManyRedirects: Exceeded 30 redirects.
Question: The Python version is 2.7 and the tagme version is 0.1.3. How can I handle this issue?