urrlib2 inside a loop

urrlib2 inside a loop - python

I'm trying to use urllib2 inside a loop with a try/except but when one iterate enter in the except, all the next iterations enter in the except too:
for machine_id in machines:
machine = Machine.objects.get(id=machine_id)
r2 = urllib2.Request('http://localhost:9191/run')
r2.add_header('Accept', 'application/json')
r2.add_header('Content-Type', 'application/json')
data = json.dumps({"client":"ssh", "tgt":machine.name, "fun": "state.sls", "arg":["update2", "nacl"]})
r2.add_data(data)
try:
resp2 = urllib2.urlopen(r2)
json_response = json.load(resp2)['return'][0]
json_response_m7 = json_response[machine.name]
try:
json_response_m7 = json_response_m7['return']
for key, value in json_response_m7.items():
if(value['result'] == False):
print(key)
print("\n")
print(value['changes']['stderr'])
data_return['key'].append(key)
data_return['error'].append(value['changes']['stderr'])
data_return['machine'].append(machine.name)
#data_return = {"key":key, "error": value['changes']['stderr']}
except:
print("except!!!!!!")
print(json_response_m7['stderr'])
data_return['key'].append('stderr:')
data_return['error'].append(json_response_m7['stderr'])
data_return['machine'].append(machine.name)
except (IOError, httplib.HTTPException):
print("Errooor")
data_return['key'].append('stderr: ')
data_return['error'].append('This machine is not added to the roster file')
data_return['machine'].append(machine.name)
the problem is with the first try/except. Can anyone help me please?
Thanks!

Related

Changing variable for parameters for http json get requests

Pretty new to python so go easy on me :). This code works below but I was wondering if there is a way to change the indcode parameter by doing a loop so I do not have to repeat the requests.get.
paraD = dict()
paraD["area"] = "123"
paraD["periodtype"] = "2"
paraD["indcode"] = "722"
paraD["$limit"]=1000
#Open URL and get data for business indcode 722
document_1 = requests.get(dataURL, params=paraD)
bizdata_1 = document_1.json()
#Open URL and get data for business indcode 445
paraD["indcode"] = "445"
document_2 = requests.get(dataURL, params=paraD)
bizdata_2 = document_2.json()
#Open URL and get data for business indcode 311
paraD["indcode"] = "311"
document_3 = requests.get(dataURL, params=paraD)
bizdata_3 = document_3.json()
#Combine the three lists
output = bizdata_1 + bizdata_2 + bizdata_3

Since indcode is the only parameter that changes for each request, we will put that in a list and make the web requests inside a loop.
data_url = ""
post_params = dict()
post_params["area"] = "123"
post_params["periodtype"] = "2"
post_params["$limit"]=1000
# The list of indcode values
ind_codes = ["722", "445", "311"]
output = []
# Loop on indcode values
for code in ind_codes:
# Change indcode parameter value in the loop
post_params["indcode"] = code
try:
response = requests.get(data_url, params=post_params)
data1 = response.json()
output.append(data1)
except:
print("web request failed")
# More error handling / retry if required
print(output)

Assuming you're using Python 3.9+ you can combine dictionaries using the | operator. However, you need to be sure that you understand exactly what this will do. It's more likely that the code to combine dictionaries will be more complex.
When using the requests module it is very important to check the HTTP status code returned from the function (HTTP verb) you're calling.
Here's an approach to the stated problem that may work (depending on how the dictionary merge is effected).
from urllib.error import HTTPError
from requests import get as GET
from requests.exceptions import Timeout, TooManyRedirects, RequestException
from sys import stderr
# base parameters
params = {'area': '123', 'periodtype': '2', '$limit': 1000}
# the indcodes
indcodes = ('722', '445', '311')
# gets the JSON response (as a Python dictionary)
def getjson(url, params):
try:
(r := GET(url, params, timeout=1.0)).raise_for_status()
return r.json() # all good
# if we get any of these exceptions, report to stderr and return an empty dictionary
except (HTTPError, ConnectionError, Timeout, TooManyRedirects, RequestException) as e:
print(e, file=stderr)
return {}
# any exception here is not associated with requests/urllib. Report and raise
except Exception as f:
print(f, file=stderr)
raise
# an empty dictionary
target = {}
# build the target dictionary
# May not produce desired results depending on how the dictionary merge should be carried out
for indcode in indcodes:
target |= getjson('https://httpbin.org/json', params | {'indcode' : indcode})

How to apply changes to ontology saved on SQLite Database?

Everytime I create a new instance on my ontology, something goes wrong If I try to read from the same database again.
ps - these are all part of different views on Django
This is how I am adding instances to my ontology:
# OWLREADY2
try:
myworld = World(filename='backup.db', exclusive=False)
kiposcrum = myworld.get_ontology(os.path.dirname(__file__) + '/kipo.owl').load()
except:
print("Error opening ontology")
# Sync
#--------------------------------------------------------------------------
sync_reasoner()
seed = str(time.time())
id_unico = faz_id(seed)
try:
with kiposcrum:
# here I am creating my instance, these are all strings I got from the user
kiposcrum[input_classe](input_nome + id_unico)
if input_observacao != "":
kiposcrum[input_nome + id_unico].Observacao.append(input_observacao)
sync_reasoner()
status = "OK!"
myworld.close()
myworld.save()
except:
print("Mistakes were made!")
status = "Error!"
input_nome = "Mistakes were made!"
input_classe = "Mistakes were made!"
finally:
print(input_nome + " " + id_unico)
print(input_classe)
print(status)
This is how I am reading stuff from It:
# OWLREADY2
try:
myworld = World(filename='backup.db', exclusive=False)
kiposcrum = myworld.get_ontology(os.path.dirname(__file__) + '/kipo_fialho.owl').load()
except:
print("Error")
sync_reasoner()
try:
with kiposcrum:
num_inst = 0
# gets a list of properties given an instance informed by the user
propriedades = kiposcrum[instancia].get_properties()
num_prop = len(propriedades)
myworld.close()
I am 100% able to read from my ontology, but If I try to create an instance and then try to read the database again, something goes wrong.

Calling 2 functions within another function

I'm trying to call the extract function and the extract_url function within a function. I get name error: name 'endpoint' and name 'agg_key' is not defined. I'm doing this so I can call a script from another script so I don't need to run the command line. How would I go about doing this?
Function I'm trying to call:
def scrape_all_products(URL):
extract(endpoint, agg_key, page_range=None)
extract_url(args)
Functions I'm calling:
def extract(endpoint, agg_key, page_range=None):
r_list = list(range(page_range[0], page_range[1]+1)) if page_range else []
page = 1
agg_data = []
while True:
page_endpoint = endpoint + f'?page={str(page)}'
response = requests.get(page_endpoint, timeout=(
int(os.environ.get('REQUEST_TIMEOUT', 0)) or 10))
response.raise_for_status()
if response.url != page_endpoint: # to handle potential redirects
p_endpoint = urlparse(response.url) # parsed URL
endpoint = p_endpoint.scheme + '://' + p_endpoint.netloc + p_endpoint.path
if not response.headers['Content-Type'] == 'application/json; charset=utf-8':
raise Exception('Incorrect response content type')
data = response.json()
page_has_products = agg_key in data and len(
data[agg_key]) > 0
page_in_range = page in r_list or page_range is None
# break loop if empty or want first page
if not page_has_products or not page_in_range:
break
agg_data += data[agg_key]
page += 1
return agg_data
Other function:
def extract_url(args):
p = format_url(args.url, scheme='https', return_type='parse_result')
formatted_url = p.geturl()
agg_key = 'products'
if args.collections:
agg_key = 'collections'
fp = os.path.join(
args.dest_path, f'{p.netloc}.{agg_key}.{args.output_type}')
if args.file_path:
fp = os.path.join(
args.dest_path, f'{args.file_path}.{args.output_type}')
endpoint = f'{formatted_url}/{agg_key}.json'
ret = {
'endpoint_attempted': endpoint,
'collected_at': str(datetime.now()),
'success': False,
'error': ''
}
try:
data = extract(endpoint, agg_key, args.page_range)
except requests.exceptions.HTTPError as err:
ret['error'] = str(err)
except json.decoder.JSONDecodeError as err:
ret['error'] = str(err)
except Exception as err:
ret['error'] = str(err)
else:
ret['success'] = True
ret[agg_key] = data
if ret['success']:
ret['file_path'] = str(fp)
save_to_file(fp, data, args.output_type)
return ret

The scrape_all_products function only knows about variables created inside of that function and variables passed to it (which in this case is URL). endpoint and agg_key were both created inside of a different function. You have to pass those variables to scrape_all_products the same way you are passing URL. So do:
def scrape_all_products(URL, endpoint, agg_key, args):
And then you would have to appropriately modify anywhere scrape_all_products is called.

How can I get Google Calendar API status_code in Python when get list events?

I try to use Google Calendar API
events_result = service.events().list(calendarId=calendarId,
timeMax=now,
alwaysIncludeEmail=True,
maxResults=100, singleEvents=True,
orderBy='startTime').execute()
Everything is ok, when I have permission to access the calendarId, but it will be errors if wrong when I don't have calendarId permission.
I build an autoload.py function with schedule python to load events every 10 mins, this function will be stopped if error come, and I have to use SSH terminal to restart autoload.py manually
So i want to know:
How can I get status_code, example, if it is 404, python will PASS

Answer:
You can use a try/except block within a loop to go through all your calendars, and skip over accesses which throw an error.
Code Example:
To get the error code, make sure to import json:
import json
and then you can get the error code out of the Exception:
calendarIds = ["calendar ID 1", "calendar ID 2", "calendar Id 3", "etc"]
for i in calendarIds:
try:
events_result = service.events().list(calendarId=i,
timeMax=now,
alwaysIncludeEmail=True,
maxResults=100, singleEvents=True,
orderBy='startTime').execute()
except Exception as e:
print(json.loads(e.content)['error']['code'])
continue
Further Reading:
Python Try Except - w3schools
Python For Loops - w3schools

Thanks to #Rafa Guillermo, I uploaded the full code to the autoload.py program, but I also wanted to know, how to get response json or status_code for request Google API.
The solution:
try:
code here
except Exception as e:
continue
import schedule
import time
from datetime import datetime
import dir
import sqlite3
from project.function import cmsCalendar as cal
db_file = str(dir.dir) + '/admin.sqlite'
def get_list_shop_from_db(db_file):
cur = sqlite3.connect(db_file).cursor()
query = cur.execute('SELECT * FROM Shop')
colname = [ d[0] for d in query.description ]
result_list = [ dict(zip(colname, r)) for r in query.fetchall() ]
cur.close()
cur.connection.close()
return result_list
def auto_load_google_database(list_shop, calendarError=False):
shopId = 0
for shop in list_shop:
try:
shopId = shopId+1
print("dang ghi vao shop", shopId)
service = cal.service_build()
shop_step_time_db = list_shop[shopId]['shop_step_time']
shop_duration_db = list_shop[shopId]['shop_duration']
slot_available = list_shop[shopId]['shop_slots']
slot_available = int(slot_available)
workers = list_shop[shopId]['shop_workers']
workers = int(workers)
calendarId = list_shop[shopId]['shop_calendarId']
if slot_available > workers:
a = workers
else:
a = slot_available
if shop_duration_db == None:
shop_duration_db = '30'
if shop_step_time_db == None:
shop_step_time_db = '15'
shop_duration = int(shop_duration_db)
shop_step_time = int(shop_step_time_db)
shop_start_time = list_shop[shopId]['shop_start_time']
shop_start_time = datetime.strptime(shop_start_time, "%H:%M:%S.%f").time()
shop_end_time = list_shop[shopId]['shop_end_time']
shop_end_time = datetime.strptime(shop_end_time, "%H:%M:%S.%f").time()
# nang luc moi khung gio lay ra tu file Json WorkShop.js
booking_status = cal.auto_load_listtimes(service, shopId, calendarId, shop_step_time, shop_duration, a,
shop_start_time,
shop_end_time)
except Exception as e:
continue
def main():
list_shop = get_list_shop_from_db(db_file)
auto_load_google_database(list_shop)
if __name__ == '__main__':
main()
schedule.every(5).minutes.do(main)
while True:
# Checks whether a scheduled task
# is pending to run or not
schedule.run_pending()
time.sleep(1)

cookie_str = match.group(1).AttributeError: 'NoneType' object has no attribute 'group'

I am working on Stock predicting project.I want to download historical data from yahoo finance and save them in CSV format.
Since I am beginner in Python I am unable to correct the error.
My code is as follows:
import re
import urllib2
import calendar
import datetime
import getopt
import sys
import time
crumble_link = 'https://finance.yahoo.com/quote/{0}/history?p={0}'
crumble_regex = r'CrumbStore":{"crumb":"(.*?)"}'
cookie_regex = r'Set-Cookie: (.*?); '
quote_link = 'https://query1.finance.yahoo.com/v7/finance/download/{}?period1={}&period2={}&interval=1d&events=history&crumb={}'
def get_crumble_and_cookie(symbol):
link = crumble_link.format(symbol)
response = urllib2.urlopen(link)
match = re.search(cookie_regex, str(response.info()))
cookie_str = match.group(1)
text = response.read()
match = re.search(crumble_regex, text)
crumble_str = match.group(1)
return crumble_str, cookie_str
def download_quote(symbol, date_from, date_to):
time_stamp_from = calendar.timegm(datetime.datetime.strptime(date_from, "%Y-%m-%d").timetuple())
time_stamp_to = calendar.timegm(datetime.datetime.strptime(date_to, "%Y-%m-%d").timetuple())
attempts = 0
while attempts < 5:
crumble_str, cookie_str = get_crumble_and_cookie(symbol)
link = quote_link.format(symbol, time_stamp_from, time_stamp_to, crumble_str)
#print link
r = urllib2.Request(link, headers={'Cookie': cookie_str})
try:
response = urllib2.urlopen(r)
text = response.read()
print "{} downloaded".format(symbol)
return text
except urllib2.URLError:
print "{} failed at attempt # {}".format(symbol, attempts)
attempts += 1
time.sleep(2*attempts)
return ""
if __name__ == '__main__':
print get_crumble_and_cookie('KO')
from_arg = "from"
to_arg = "to"
symbol_arg = "symbol"
output_arg = "o"
opt_list = (from_arg+"=", to_arg+"=", symbol_arg+"=")
try:
options, args = getopt.getopt(sys.argv[1:],output_arg+":",opt_list)
except getopt.GetoptError as err:
print err
for opt, value in options:
if opt[2:] == from_arg:
from_val = value
elif opt[2:] == to_arg:
to_val = value
elif opt[2:] == symbol_arg:
symbol_val = value
elif opt[1:] == output_arg:
output_val = value
print "downloading {}".format(symbol_val)
text = download_quote(symbol_val, from_val, to_val)
with open(output_val, 'wb') as f:
f.write(text)
print "{} written to {}".format(symbol_val, output_val)
And the Error message that I am getting is :
File "C:/Users/Murali/PycharmProjects/generate/venv/tcl/generate2.py", line
49, in <module>
print get_crumble_and_cookie('KO')
File "C:/Users/Murali/PycharmProjects/generate/venv/tcl/generate2.py", line
19, in get_crumble_and_cookie
cookie_str = match.group(1)
AttributeError: 'NoneType' object has no attribute 'group'
So how can we resolve this problem that has popped up?

Look at these two commands:
match = re.search(cookie_regex, str(response.info()))
cookie_str = match.group(1)
The first one takes the string response.info() does a regular expression search to match cookie_regex. Then match.group(1) is supposed to take the match from it. The problem however is that if you do a print match in between these commands, you'll see that the re.search() returned nothing. This means match.group() has nothing to "group", which is why it errors out.
If you take a closer look at response.info() (you could just add a print response.info() command in your script to see it), you'll see that there's a line in response code that starts with "set-cookie:", the code after which you're trying to capture. However, you have your cookie_regex string set to look for a line with "Set-Cookie:". Note the capital letters. When I change that string to all lower-case, the error goes away:
cookie_regex = r'set-cookie: (.*?); '
I did run into another error after that, where print "downloading {}".format(symbol_val) stops because symbol_val hasn't been defined. It seems that this variable is only declared and assigned when opt[2:] == symbol_arg:. So you may want to rewrite that part to cover all cases.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

urrlib2 inside a loop - python

Related

Changing variable for parameters for http json get requests

How to apply changes to ontology saved on SQLite Database?

Calling 2 functions within another function

How can I get Google Calendar API status_code in Python when get list events?

cookie_str = match.group(1).AttributeError: 'NoneType' object has no attribute 'group'

Categories

Resources